Risky Business - Snake Oilers: Realm Security, Horizon3 and Persona
Episode Date: October 7, 2025. In this edition of the Snake Oilers podcast, three vendors pop in to pitch you all on their wares: Realm Security: A security-focused, AI-first data pipeline platform. Horizon3: AI hackers! Pentesting robots!! They're coming fer yur jerbs! Persona: Verify customer and staff identities with live capture. This episode is also available on YouTube. Show notes
Transcript
Hi, everyone and welcome to another edition of the Snake Oilers podcast.
I'm Patrick Gray.
The idea behind these Snake Oilers podcasts is vendors can pay us to come on and pitch their products to you, the listener.
And we've got three very interesting pitches for you today.
We are going to be hearing from Realm Security, which has built a data pipeline platform, which
I guess is similar to Cribl, but much more focused on security.
We're going to hear from Horizon 3, and they do AI pen testing.
You know, at massive scale, I find this a really, really interesting idea.
There's a few companies in this space, so definitely wanted to get one of them into one of these segments to talk about it.
And then we're going to be chatting with Persona.
And Persona do that sort of identity verification through live capture stuff, right?
So they can help regulated industries verify and identify their customers to make sure they are who they say they are, by doing live capture of, like, a government ID and then their face.
But the funniest thing is happening, which is that with all of these North Korean people turning up for jobs in Western companies, this sort of technology is actually becoming quite useful for detecting that sort of stuff.
And it's also a pretty effective control against various forms of social engineering, at least in some contexts.
So we'll be chatting to them a little bit later.
But let's get into it now with Realm Security.
And we're going to be chatting with its chief executive, Pete Martin, and also their data science guy, Colin Germain, popped along in this interview as well.
As I mentioned earlier, Realm is a data pipeline platform similar to Cribl,
but very much tailored for security applications, right?
And it's heavy on AI.
They describe it as an AI native data pipeline platform.
And, you know, for those who are unfamiliar,
these sort of platforms, they are like waystations for data, right?
You get to decide from a log stream, you know,
which ones go over here into the SIEM,
which ones go into an archive for later,
and which ones just get, you know, evaporated.
Which ones do you just let boil off sort of thing?
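The routing decision described above can be sketched in a few lines. This is a hypothetical illustration only: the field names, severity threshold, and destinations are invented for the example, not Realm's (or any vendor's) actual API.

```python
# Hypothetical sketch of the routing decision a security data pipeline makes:
# each incoming log event goes to the SIEM, to a cheap archive, or is dropped.

def route_event(event: dict) -> str:
    """Return 'siem', 'archive', or 'drop' for a single log event."""
    if event.get("type") == "heartbeat":      # no security value: let it boil off
        return "drop"
    if event.get("severity", 0) >= 7:         # high-severity: straight to the SIEM
        return "siem"
    return "archive"                          # everything else: keep it cheap, query later

events = [
    {"type": "heartbeat", "severity": 0},
    {"type": "auth_failure", "severity": 8},
    {"type": "dns_query", "severity": 2},
]
print([route_event(e) for e in events])  # ['drop', 'siem', 'archive']
```

Real pipelines make this decision per route with much richer rules, but the shape of the problem is exactly this three-way fork.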
And yeah, so Pete Martin is the first person you're going to hear from in this interview.
And I just got him to start off by explaining exactly what Realm Security is.
Here he is.
Realm Security is a Boston-based cybersecurity company that offers an AI native security pipeline,
which I realize could mean anything.
So what I mean by that is instead of being a bytes in, bytes out pipeline,
we've built a transparent layer into all of the individual pipelines themselves,
which gives us the ability to gather deep statistics and an understanding of the actual composition of the log.
And then we can take further activities and implement machine learning and LLMs throughout the pipeline.
So when we think of these data pipeline sort of products, you know, we usually think of Cribl, right?
Which isn't a security-first company, but it is the 800-pound gorilla in the market.
So I guess what you're saying is it's a bit like that.
But instead of it just being in and out, there's something in the middle.
there that enables you to actually do useful things with the data flowing through the data pipeline
product. Yeah, I mean, absolutely. So we've gone head-to-head with Cribl. We've displaced Cribl.
And I think the number one thing is time to value. So within seven days of deploying our product,
we'll get you onboarded and filtered with legitimate cost savings. And when you compare that to
Cribl, it typically takes two to four months to get the product stood up, and it requires
professional services. And I think one of the things that has continually come up in the market
is that the data footprint for an organization is not static. So anytime you change endpoint vendors,
you change firewall vendors, you then either need to have the ability to manually use
Cribl and change the configurations, or you need to pay more for professional services. And so I think
security teams are super educated to the point where they just don't want to be reliant on anyone else to
achieve their outcomes. And so we truly enable security practitioners to use a security-enabled
pipeline and migrate away from a legacy pipeline like Cribl. If you take one of our
public case studies, Vensure Employer Solutions, which is a 10,000-person benefits payroll
provider, they have, I believe, 20 to 25 data sources. But the reality is, 90% of the logs going
into their SIEM are from network, endpoint, and firewall.
And so when you actually look at the problem
and you realize it's concentrated within those three data
sources, having a solution that can very quickly
onboard, analyze, and not just recommend filtering rules
based on generic data source recommendations,
but specific custom-created filtering rules
using machine learning and LLMs to filter their data,
as opposed to just a generic pack that focuses on
that broad vendor. We were able to reduce 83% of their firewall logs within seven days,
saving them $250,000 annually. So I guess the idea here is instead of just watching the output
as it filters into your SIEM and saying, well, we want to pull that back, and sort of trying to
adjust the input, you can do it at the point where it's
in transit. Exactly. Yeah. All right. So also joining us is Colin Germain, who runs data
science with Realm. So Colin, what's the magic here? Like, tell us how it works. You know, as Pete mentioned,
it's really all about scale. How do we deal with the large volume of security data and find,
you know, the right pieces for volume reduction? You know, what can go to
your archive but doesn't necessarily need to go to your SIEM? That's kind of the core question that
we're solving and providing value out of the box
with. And so in terms of, you know, how does it work, right? It starts really with
online statistical learning techniques that allow us to really understand the composition of the
data, what's in there, and what are the opportunities for impactful volume reduction.
On top of that, we use gen AI systems that are able to actually reason about the meaning of the
field, the information, and the filtering rules that we can apply. So being able to speed up
that process. But, you know, as a company ultimately, we want to give the best recommendations
and we do also use human in the loop expertise to review those final recommendations to make
sure that they're, you know, really sort of hitting the mark in terms of reducing the volume
without impacting detections. Well, you've just said something that every single company that's
using Gen AI to do this sort of thing has said to me, which is it gets you 90% of the way there.
And then that last 10%, it's a human who's always got to take
the couple of dumb things the LLM has generated and fix them.
You know, part of it is, you know, our mission is to give control to, you know,
those SOC teams and be able to have them, you know, make, make the decision.
And it's really important to be able to, as you're building these AI systems, think about
where does the human need to come in and really be that oversight.
And I think that, you know, the thing is this is instead of a professional services, you know,
contract, something that could be very expensive and take a very long time to do.
So we're able to accelerate this process quite significantly with, you know, still that
oversight and still the expertise to make sure that it's really got the best result.
So I mean, like when we thought about bringing a solution to market and ultimately putting
in checks and balances to make sure customers can be comfortable, we really thought about
it as the depth test run methodology.
And so when you think about having a human in the loop, it's not necessarily a human needing to
analyze and free-form their way into a solution. It's a human in the loop to make sure no errors
have occurred and that they are fully conscious of what it is that we're doing. So there really isn't
any creation of a solution. It's more of a check-the-box to make sure you're aware of what it
is that we've prescribed to them. It's like a code review. I think that's the way that I
always think about it. It's being able to come back and make sure the quality is there.
I mean, you've got all those automated tests, you've got CICD, you know, and it's kind of coming back with that.
Now, we were talking before we got recording, Pete, about, you know, changes to the SIEM market, right?
Or changes to SIEM methodology.
And it gets interesting, because you just mentioned, okay, you know, you can cut back the amount of firewall logs you're sending to your SIEM.
There's an entire industry around trying to minimize Splunk costs, right?
And this product certainly fits into all of that.
But you also said, okay, you're going to direct some of the most critical logs into your SIEM, and you're going to dump some into the archive.
The question becomes, once you've got a decently structured store of this sort of information and you can query it, why do you need the SIEM anymore, right?
And I understand that people are very welded to their SIEMs, but we've seen all of these interesting products pop up over the years.
I'm thinking of products like Panther, which, you know, people are using to do detection as code and tune stuff that goes into
their SIEM. But even with them, I'm like, well, why do you need the SIEM? If you can dump all of
this stuff into Snowflake and query it, why do you actually need Splunk? You know, I'm guessing
that you're thinking along similar lines with something like this, which is, hey, you know,
you can use it to tune stuff going into your SIEM, but it's also an excellent way to get a lot of
this data into a structured archive that you can then do whatever you want with. Yeah. I mean,
you know, I think most of your listeners are very technical, right? And so
this might be something everybody's well aware of, but we're all realizing the cost of data,
right? And like the cost of data is not going away. It's only going to increase. And so even if
your end goal is to dump everything in Snowflake and run, you know, detection as code on top of that,
no matter what, everyone's going to need to be thoughtful on what data they're sending to what
destination. And so having a pipeline that helps you right size the data and only send what's needed,
both from a cost perspective, but also from a noise perspective in eliminating false positives, is critical.
But to the bigger point you're making, I think we've all taken for granted that SIEM
and detection requires storage to be combined with analysis.
And to your point, like, that's not the case anymore.
And there's so many different places you can dump data.
But as you think about Realm and our mission, like our mission is to help customers
ultimately alleviate themselves from unneeded cost.
in the SIEM and unneeded risks resulting from the data that's flowing to the SIEM.
But as the security market evolves, we see a much bigger opportunity for customers to have
somebody as their true data broker, which will enable them to more safely and easily adopt
the agentic solutions that could potentially end up, you know, taking over the SIEM market.
So I guess that would mean, what, various Model Context Protocol servers being able to query your
data store? Is that kind of what we're thinking?
Yeah, potentially. I mean, Colin, we've been talking a bunch about this, yeah.
Yeah, and I think part of it is also structuring the data through normalization.
We look at things like OCSF mapping, being able to bring the data into a structure that is easy to see across many different sources.
But from the AI perspective, yeah, it's thinking about, you know, Model Context Protocol servers, being able to make this more accessible to the agents in other systems, and thinking about those interactions.
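The normalization Colin describes can be sketched as a simple mapping from a vendor-specific log shape into a common, OCSF-style structure. This is an illustrative stand-in only: the raw field names and the simplified output schema are invented for the example, not the real OCSF class definitions or Realm's mapping code.

```python
# Illustrative sketch of normalizing a vendor-specific firewall log into a
# common schema in the spirit of OCSF. Field names are simplified stand-ins.

RAW = {"src": "10.0.0.5", "dst": "8.8.8.8", "act": "deny", "ts": 1728259200}

def to_common_schema(raw: dict) -> dict:
    """Map a raw firewall event into a normalized, queryable structure."""
    return {
        "class_name": "Network Activity",
        "time": raw["ts"],
        "src_endpoint": {"ip": raw["src"]},
        "dst_endpoint": {"ip": raw["dst"]},
        "action": "blocked" if raw["act"] == "deny" else "allowed",
    }

print(to_common_schema(RAW)["action"])  # blocked
```

Once every source lands in one structure like this, cross-source querying (and handing the data to agents over something like MCP) becomes far simpler than juggling each vendor's native format.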
Yeah, right. So, you know, I've got to ask, right, because you're out there rolling this product out into some big, big enterprises. Like, what's the low-hanging fruit of stuff that you can just yeet from the SIEM? Like, even if someone weren't to go and become a customer of yours, right, if you had to tell them to eliminate one particular type of log source, or one particular type of event that's very voluminous, what would it be?
I mean, I'd say at a high level, it falls into three categories: endpoint, network, and firewall.
I mean, as far as the individual events and things we'd remove, Colin, what would you say?
I mean, we've done a bunch of work across those three pillars.
I mean, I have one, but I'm curious, Patrick, if you have a guess as to which one it would be.
I have got no idea, man.
That's why I'm asking, I'm genuinely, because there's so much waste, right?
Like, there's so many dumb events, so many, like, things that just aren't even useful.
for context, and I'm just like, where would you even start? But I'm thinking there's got to be, like, a top
three, right? Yeah, I mean, there's definitely those, you know, process-in-time normal event types or
heartbeats that are clearly non-security-relevant information. But then there's also areas of,
you know, data that is useful for security information, but isn't necessarily something that
needs to be applied for your particular detection suite. So one of the areas we found, actually,
is DNS traffic for specific domains that are well-known,
which is actually a very large chunk of volume, typically.
And so DNS, no surprise, I mean, it's a pretty big protocol.
A user looked up Google.com kind of thing?
So a broader set of that and personalized to the customer in addition.
So, you know, I think we've seen a lot of, you know, out-of-the-box value for that.
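The DNS example above boils down to a per-customer allow-list filter: drop lookups of domains you already trust, keep everything else for the SIEM. A minimal sketch, with a hypothetical allow-list (the domains and the function name are invented for illustration):

```python
# Sketch of the DNS volume reduction described above: drop queries for
# customer-specific known-good domains, forward the rest to the SIEM.
KNOWN_GOOD = {"google.com", "office365.com", "slack.com"}  # hypothetical allow-list

def keep_for_siem(query: str) -> bool:
    """True if this DNS query should still go to the SIEM."""
    domain = query.lower().rstrip(".")
    # keep anything that is not a known-good domain or a subdomain of one
    return not any(domain == d or domain.endswith("." + d) for d in KNOWN_GOOD)

queries = ["google.com", "mail.google.com", "evil-c2.example"]
print([q for q in queries if keep_for_siem(q)])  # ['evil-c2.example']
```

The interesting part in practice is building that allow-list per customer (and keeping the dropped events in a cheap archive), rather than the filter itself, which is trivial.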
All right, we're going to wrap it up.
Pete, any final words?
I mean, you know, I think the general thesis, based on what we're doing and what we would love to share with your listeners, is that the activities they deal with on a regular basis that revolve around resolving challenges with data ingestion and data delivery, like, they don't need to be challenges.
And everybody talks about AI and everybody likes to think about all the different things AI can do.
And I think the number one thing that we've all realized is it can remove manual activities.
And it can remove a lot of the, like, painstaking things that aren't advancing anyone's
understanding or advancing security programs.
And so that's what we do.
We basically get rid of all the nitty-gritty blocking and tackling of running logging pipelines.
And we use the latest and greatest technology to do it.
I think the other thing I'd say is we are running a risky business promotion.
And so what I'd say is, if there's any one of your listeners that is struggling with data
or SIEM costs, we'd love to talk to you.
And if within 10 days of using our product, we're not able to reduce your firewall, end point,
or network traffic by 50%, we'll give you the product for free for six months.
And so we would love the opportunity to earn the right to keep that promise if we have
the opportunity to do so.
All right, Colin Germain, Pete Martin.
Thank you so much for joining us on the Risky Biz Snake Oilers showcase.
It's great to chat to you both.
us. Great to be here. Appreciate the time. That was Pete Martin and Colin Germain there from
Realm Security. Big thanks to them for that. It is time for our next snake oiler now, and we are
chatting with Snehal Antani, who is the CEO of Horizon3, which is an AI
pen testing platform. Now, I find this stuff really interesting, right? Because throughout my career,
I've known so many pen testers
and just extremely skilled pen testers
and I think it's going to be very difficult
to automate away a lot of the skills
that they have that make them good testers.
Now, that said,
the testers that I've known through my career
are not really representative
of the typical industry pen tester, right?
So there's a lot of by the numbers
kind of pen testing out there
and I think AI is going to be massively
massively disruptive to that sort of stuff.
And given that it's automated,
I also think it's going to be quite disruptive
to the vulnerability scanning
and attack surface management platforms, right?
So I think a few things are sort of converging on one here
and, you know, it's probably going to put us in a better place, if I'm honest.
So Snehal joined me for this interview
where he gets to pitch Horizon3,
and here's what he said Horizon3 does.
We pioneered the whole concept
of AI hackers, this idea of autonomous pen testing. And the idea there is, with no knowledge of
the environment, how can we point, click, shoot, hack, and prove what's exploitable, exactly how to
fix it, and then enable you to run a retest to verify that you're good to go. The bulk of my users
aren't pen testers, actually. They're IT admins and network engineers that just want to go home
early, and they want to make sure they're fixing problems that are exploitable, that lead to
consequence, and they know exactly how to retest and verify that they've actually solved it.
So we should really define what you mean in terms of scope when you talk about automated
pen testing, right? Because we've seen this before. We've seen vulnerability scanning
companies saying they were automated pen test 15 years ago. Is this just a better vuln scan?
What's the difference? It's a fantastic question. I just did a
LinkedIn post on this, on how our AI hacker, NodeZero, after
initial compromise and initial access, was able to exploit HPE iLO, chain a bunch of issues
together, defeat an EDR, and then use that to pull the auth token from Microsoft Outlook
and get Global Entra Admin rights. So if you think about that as the attack path, chaining
multiple issues across multiple machines, the goal of the pen test is to show you the
impact. A vuln scanner will tell you that HPE iLO is a problem. It won't show you the consequence
of what the attacker can do if they abuse iLO to achieve an objective. So the pen test is about
showing consequence. And the next part is the goal of the pen test isn't to find problems. It's to
quickly fix problems that matter. And so when you think about pen test helps show consequence,
consequence is how you prioritize. And then you're able to fix issues that are actually going to
put you in the news. I'll end with what I think about vulnerability scanners. I was a CIO at GE Capital,
I spent time as a CTO within DOD, and I was the largest vuln scanning customer in the world for several
of the major vendors. And the hardest part of my job was deciding what not to fix, because I would get
a list of vulnerabilities from these scanners, apply a bunch of math, and make those lists slightly less
crappy. And because they're full of noise, I wasn't sure that they were exploitable; maybe I had
compensating controls in place. So being vulnerable doesn't mean you're exploitable. And that's why
understanding if you're exploitable and the consequence of exploitability is super important in
prioritization. Sure, but I just asked you why you're different to a vulnerability scanner,
and then you just described how it's a better vulnerability scanner because it shows you consequence,
right? Yeah. So the most specific part is this: a vuln scanner does scatter-gather. It can
assess potential vulnerabilities on a single machine. What it cannot do is chain together a variety
of issues across machines. So that's number one. It's about lateral movement. It's about chaining
multiple issues, CVEs, misconfigurations, harvested credentials and other techniques that lead
to an impact, whether it's domain admin, sensitive data exposure, and so on. And so when you think
about the fundamental difference, a potential issue on one host is not telling you,
much. Understanding how issues can be chained together across hosts that lead to impact is the really
important part of the insight into how you're exploitable. I mean, what I'm hearing here is maybe
pen testers were just better vuln scanners all along. Maybe that's the lesson.
Here's another question: what is a vulnerability? By nature, we assume a vulnerability is a
CVE, but that's actually not accurate, right? A vulnerability can be a variety of things that can be
abused in a way that leads to some sort of consequence. Yeah, a misconfiguration is not a CVE,
right? Like, I totally get what you're saying there. Why don't you talk us through the
sorts of things that your platform actually finds, right? Like, you spoke about one
particularly cool case, right, where it went from here, went from there, you know, grabbed a token, did
this, got global rights. But what does a typical exercise look like?
How is it deployed? Is this a continuous process? Is this a point-click-go one-time thing?
You know, just walk us through a little bit more how this stuff is used.
Yeah, excellent. So let's first talk through the process of actually running a pen test.
So break pen tests up into either breaking in from the outside or assuming breach and assuming initial access.
And what is the blast radius from that initial access point?
So you can start with zero access, or you can, what, you can give this AI agent a shell
on a box and say, go from there?
That's exactly right.
At the end of the day,
good organizations assume the attacker's going to get in
because there's so many doors and windows,
whether it's rarely a zero day in your custom web app,
more likely it's a CISA KEV on an edge gateway,
or a misconfigured Jenkins server that's exposed to the web
or they purchase access off the dark web or so on.
They're going to get in.
So every cyber attack really starts with shell on a single host.
So if you want to run an internal pen test,
you can come through our portal,
configure the scope, which is basically, what IP range should we operate within?
What IP ranges should we not touch?
Do you want us to be aggressive or gentle?
Go.
And then, of course, there's more advanced options like,
would you like us to auto-deploy honey tokens along the way,
which has never been done before?
Think of that as, while breaking into your house, I'll install ring cameras along the way.
So we'll auto-deploy fake AWS credentials, fake Azure tokens, fake SQL dump files
during the pen test.
So that becomes part of the scope of the configuration.
You hit Go, it'll generate a curl command
that's a single-use Docker container
that gets downloaded.
That Docker container gets initiated
on whatever initial access point you want in your network.
Assume breach from the DMZ,
spin up that curl command on the DMZ.
Assume breach from your customer support network.
Assume breach, you know, spin up the container there.
That container will connect to our brains in the cloud,
which is a dedicated virtual private cloud session
just for that pen test.
It starts to instruct the Docker container what to do,
conduct recon enumeration first, which is pretty common.
After that, it's next best actions.
Based on what's been discovered,
should I go after the router, the printer, or the television next?
That depends on discovered ports and services,
historical record of success,
likelihood of achieving the objective of domain admin
or sensitive data exposure.
And it continues to iterate through this next best action process
until it's exhaustively and comprehensively
tested your environment.
And at the end, the Docker container shuts down
and you can delete it.
The VPC shuts down and gets completely destroyed.
So you have no persistent footprint
that you've got to go off and install and manage.
It's truly point, click, shoot.
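The "next best action" loop described above is essentially a scoring problem: rank each discovered target by its services, historical success rate, and likelihood of advancing the objective, then act on the best one and repeat. Here's a minimal sketch under invented assumptions; the weights, field names, and service values are illustrative, not NodeZero's actual algorithm.

```python
# Hypothetical sketch of a "next best action" selection step: should the
# agent go after the router, the printer, or the television next?

def score(target: dict) -> float:
    """Combine service value, historical success, and objective likelihood."""
    service_value = {"smb": 0.8, "http": 0.5, "snmp": 0.3}   # invented weights
    base = max((service_value.get(s, 0.1) for s in target["services"]), default=0.0)
    return base * target["historical_success"] * target["objective_likelihood"]

discovered = [
    {"host": "printer",    "services": ["snmp"], "historical_success": 0.9, "objective_likelihood": 0.2},
    {"host": "dc01",       "services": ["smb"],  "historical_success": 0.6, "objective_likelihood": 0.9},
    {"host": "television", "services": ["http"], "historical_success": 0.3, "objective_likelihood": 0.1},
]
best = max(discovered, key=score)
print(best["host"])  # dc01
```

A real system would re-score after every action as new hosts, credentials, and services are discovered, iterating until the environment is exhaustively covered.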
Yeah. Okay.
So this is a, you run it like a pen test exercise.
This isn't like a continuous scanning model, for example.
Now, the next part is you can run it ad hoc.
Like, I just want to run a pen test right now.
See, the problem in pen testing is the absorptive capacity for fixing stuff.
So if you run a pen test all the time, that's great.
But if you don't have the capacity to actually fix the issues, you're not making yourself any better.
So as customers improve their absorptive capacity to fix, they start running more frequently.
So they shift from one or two pen tests a year to 40 or 50 pen tests a month, assuming breach from different points of view, constantly,
finding, fixing, and verifying. So for our more advanced customers, in any given moment,
there's at least one or two pen tests running, assessing different parts of the environment,
and that's because they've become really good at quickly remediating or mitigating the findings
from these pen tests. I mean, I don't know, you know, you're very optimistic about the ability
of customers to remediate, right? Like, I've spent my entire career surrounded by black-pilled
pen testers who just dump horrifying reports onto CISOs' desks over and over and over.
And quite often, they will do two reports a year apart looking at the same slice of a network
or the same application.
It'll have the same findings in it a year later.
So where are you finding all of these customers who just can't wait to fix what these
reports are surfacing?
It must be amazing to be you.
Yeah.
So, I mean, you think about the company: we are the fastest-growing cybersecurity company in the
industry, and we've been audited for that recognition. We have 4,300 companies globally using
us, and we're doubling to tripling our customer count every year as a result. There's a massive
market. But there's very different types of customers. So you've got the really large
Fortune 500 types. And the simple sales qualification question is, with an unlimited budget,
how many pen tests would you run a year? One to two, your compliance focus. Cool. Nothing wrong with
that. Four or more, that's my sweet spot. And mostly they've got to run four or more because of
PCI, DORA, GDPR, NIS2. There's some reason compliance-wise, but they also want to be
resilient. So the big companies use us in that way. MSSPs are actually my fastest-growing
segment. And so 80% of those 4,300 companies are fully serviced by MSSPs that white-label
or OEM us, and they bundle us into their MDR offering, their SOC-as-a-service offering, as a way
to audit and improve their security controls and build their IR muscle memory.
Yeah, and I guess for them, anything that results out of these things, that's more billable
hours for them too, I'd imagine, right?
Well, they tend to also bundle remediation services with the findings,
because, once again, especially in the long tail, the mid-market and beyond...
And, you know, at Black Hat, I keynoted this topic with Bailey Bickley from the NSA Cybersecurity Collaboration
Center: the mid-market and the long tail, whether it's the defense industrial base or
advanced manufacturing or most businesses in the world, they just want to build antennas or do
welding or whatever their core business is. They don't have the capacity to fix. So MSSPs end up
bundling remediation services with our pen test findings. Yeah. Okay. So I've got a couple more
questions here. One is, you know, you've spoken about the network-based side of like pen testing
and whatnot. You know, are you doing much around identity as well? Because it seems like identity
attacks these days, you don't even need to get shell, right? You just get the right identity.
You buy it. Off you go. You can pivot, pivot, pivot, pivot, all your way to your great victory.
Are you doing much around that? Yeah, in fact, our probably core expertise is credential-based
attacks. Attackers don't have to hack in with zero days. They log in with credentials that they
found. I think we all know that as practitioners. So that's one thing that we're really, really
focused on and really good at. The multitude of ways to pilfer credentials. And then the
way to abuse those credentials across an organization that's both efficient and production
safe. I can't just arbitrarily spray and pray. I'll lock out accounts and take production systems
down. So for me to be able to be really good at credential attacks against production systems,
not only do I have to understand the credentials I've found, I've got to understand and be smart
about what systems they likely have access to. I've got to introspect your credential lockout
policies, whether it's local or global and all the nuances around that.
And I've got to use that to be thoughtful in how I'm going to abuse credentials across the environment.
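The production-safety constraint he describes — never push an account past its lockout threshold — can be sketched as a simple attempt budget. The policy numbers, safety margin, and function name here are invented for illustration, not Horizon3's actual logic.

```python
# Sketch of lockout-aware credential testing: compute how many more guesses
# can safely be made against one account without locking it out.

LOCKOUT_THRESHOLD = 5   # hypothetical policy: failed attempts before lockout
SAFETY_MARGIN = 2       # always stay this far below the threshold

def attempts_allowed(observed_failures: int) -> int:
    """Remaining safe guesses for an account with this many recent failures."""
    budget = LOCKOUT_THRESHOLD - SAFETY_MARGIN - observed_failures
    return max(budget, 0)

print(attempts_allowed(0))  # 3
print(attempts_allowed(2))  # 1
print(attempts_allowed(4))  # 0 -- skip this account entirely
```

A real implementation would also have to introspect whether the policy is local or domain-wide, track the lockout observation window, and reset budgets as counters expire, which is where the hard engineering lives.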
Okay.
Now, my last question is, where does it make sense not to do this?
Where does it make sense to stick with human brains when it comes to pen testing?
Because there is a great temptation among people such as yourself to say,
we've completely revolutionized the pen test market.
You don't need humans anymore.
I personally don't believe that's true, which also does not mean that I don't see the value in what you're doing.
I guess I'm just asking, like, as the science is now, where is that line?
I agree with you, actually.
So there are two areas that are uniquely human.
The first area is finding logic flaws in custom code.
That is uniquely human, especially in bespoke systems.
And that area of pen testing, I think,
is going to be uniquely human for quite a while.
The other area that's uniquely human
are the long-tail bespoke OT/ICS systems.
I can't guarantee production safety of a nuclear
power plant because I don't have one. I can't own one. I can't buy one. I can't verify production
safety against that. And so I think the long tail of OT/ICS will be uniquely human for a very
long time. However, I think that hybrid cloud infrastructure, assuming, you know, starting
external, gaining initial access internal, pivoting to the cloud, compromising auth tokens,
dumping data from Slack, that area is primarily a graph analytics problem that is uniquely
machine-centric, and algorithms are far superior at testing at scale than humans are. But algorithms
generally suck at finding logic flaws in custom code, and it's incredibly difficult to do long-tail
exploitation of bespoke OT/ICS systems. What about identifying previously unknown misconfigurations,
things like that. I mean, because they're essentially logic bugs as well. Yeah, I view that as
logic flaws, right, to some degree. Yeah. And I think humans are going to be uniquely gifted there.
To be honest with you, right? So NCC is a customer of ours.
I did an interview podcast with them.
Their guys focus on things that will put them on stage at DefCon.
That's uniquely human.
Mastering the art of recon enumeration,
dancing on domain controllers, pilfering credentials, and all that stuff.
Machines are better at that today.
Let the humans focus on the really bespoke things that are DefCon stageworthy.
All right.
Snehal Antani, thank you so much for joining me to pitch this stuff.
I mean, as someone who's been in the industry for a long time,
I do find this fascinating.
Yeah, thanks for joining us.
No, I appreciate the time.
Thank you.
That was Snehal Antani there from Horizon 3.
Big thanks to him for that.
And I really enjoyed that interview
because I got to push him a little bit there
and he gave great answers.
So who knows where all of this goes.
Okay, time for our third and final snake oiler today
and we're chatting with Persona,
and indeed Dimitri Greco from Persona,
all about their platform.
Now, their platform is all about identifying people remotely, right?
And they do this usually through some sort of live camera capture of a person's ID and of their face.
Now, these sorts of identity-proving platforms have been useful for regulated industries historically.
And then, you know, any sort of business where they have to prove to a reasonable degree that someone is who they say they are,
whether that's a, you know, Uber Eats delivery driver.
I don't know if Uber's a customer, but whether it's a delivery driver or, you know, a banking customer, whatever.
It's that sort of thing.
But what's interesting lately is that there are enterprise use cases for this sort of stuff now, right?
There's people getting socially engineered by Scattered Spider, you know, doing better identity checks using a technology like this, you know, when someone's ringing into the call center might be helpful, for example.
And then there's the, you know, the threat of North Korean IT workers infiltrating.
Western organizations. As you'll hear, Dimitri says, people are using this sort of stuff now
to do spot checks on people's identities to make sure that the person who did the interview
and did the job and did the first week's work is still around and still actually the person
doing the job. So I started off here by asking Dimitri to actually describe the user experience
when someone's being prompted for an ID challenge by the Persona platform. Here's what he had to say.
Enjoy. Yeah, so the typical UX is, typically, you are going to provide a live capture
of your government ID. You are going to move on to a live capture of a selfie verification. We run
biometric comparisons. We run liveness checks to ensure you're not injecting a deepfake. You
haven't pre-recorded a video. And then you pass sort of the ultimate test of are you actually
a human in front of that camera? You're not sort of a bad actor who's, you know, pre-recorded
something. You send the user an electric shock through the handset and gauge their reaction and see
if it's real, if the reaction time's correct. But okay, so here's the thing, right?
that's always going to be an arms race, isn't it?
When you're doing these sorts of tests,
you know, I've had identity verification,
you know, apps doing identity verification flash a whole bunch of colors at me,
for example, so that I can make sure that those colors are bouncing back off my face.
And, you know, but all of these things are going to be tricks
that attackers are going to be trying to reverse engineer to try to fool.
I mean, is that the experience for you guys?
Like, on your end, do you see people trying to do this yet?
Or do they just move on to easier targets once they realize that someone's actually trying to do proper ID verification?
Yeah, we see incredibly sophisticated fraud attempts.
And a lot of the time, we're not ignorant of the fact that it is an arms race.
So we take the approach to layer on different signals, right?
Instead of just relying on does this face look legitimate, we are collecting other signals.
What is the device?
What's the frame rates of the camera?
What's the label of the camera?
You know, are they on VPNs? Are they on Tor? Have we seen session information from this, you know, browser before? We look at velocity attacks. So how often have we seen a selfie like this being submitted to sort of the Persona ecosystem? And then we look at other things like similar backgrounds. You might have the best deepfake in the world. But if you rotate through 10 faces with the exact same background, we can start to detect other signals that way. So we are relying on so-called passive signals, just
as much as we are on, you know, active details that, you know, an individual is providing.
Yeah, so I guess it's like a, you know, classic case of like risk scoring, right,
with all of these different signals and like face matches, but background weird,
that might be one score and then, you know, they're using, they're coming from a Tor exit node,
et cetera, et cetera.
Exactly.
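That layered, risk-scoring approach can be sketched as combining weighted signals into one bounded score. The signal names, weights, and threshold below are all hypothetical; a real system would tune these against labeled fraud data rather than hard-code them:

```python
# Hypothetical signal weights -- illustrative only, not Persona's model.
SIGNAL_WEIGHTS = {
    "face_mismatch": 0.40,
    "background_reuse": 0.25,   # same couch, different face
    "velocity_spike": 0.25,     # similar selfie seen many times recently
    "tor_exit_node": 0.15,
    "vpn": 0.05,
    "known_clean_session": -0.10,  # previously seen, clean session lowers risk
}

def risk_score(signals):
    """Combine the signals observed on a submission into a score in [0, 1]."""
    raw = sum(SIGNAL_WEIGHTS[s] for s in signals if s in SIGNAL_WEIGHTS)
    return max(0.0, min(1.0, raw))

def verdict(signals, threshold=0.5):
    """Route the submission: pass it, or hold it for review."""
    return "review" if risk_score(signals) >= threshold else "pass"

# Face matches, but the background is reused, traffic is from Tor,
# and we've seen a burst of similar submissions:
print(verdict({"background_reuse", "tor_exit_node", "velocity_spike"}))
# → review
```

Any one weak signal passes on its own; it's the combination that pushes a submission over the line, which matches the "layer on different signals" idea.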
And we've seen interesting things where, and we, I almost call this, like the delivery use
case where there's tons of photos of individuals in a car that look like they're taking
a selfie about to like work for Uber Eats or Deliveroo or something.
and the actual device that is taking that photo is a desktop, right? It's a laptop. But who is actually ever taking a selfie with a laptop in their car? Those things just don't add up. So those are things that we see, you know, clear giveaways that it's, uh, you know, a video-injected, you know, deepfake.
So, uh, what sort of customers do you have? Because you can use this any way you want, right? It can be an SDK, it can be, like, a pop-up from a website or whatever to do it.
You know, like there's multiple different ways to use this technology.
So what are the sectors that are really using this and what are they using it for?
I mean, you know, you mentioned to me earlier that, you know, I guess this North Korea stuff, right?
Like for HR verification, you know, it's kind of turning into a useful case there.
But I'm guessing also people offering financial services and, you know, all sorts, right?
But why don't you just walk us through where the, you know, what the key verticals are for you there?
Yeah. And honestly, we are industry agnostic. When I first came into this, the origins of identity verification were obviously regulated industries. So financial services, you're signing up, you're verifying individuals to open bank accounts, and sort of fintech led that drive. Now we are seeing, you know, trust and safety across marketplaces. You go to Airbnb, you want to validate that, you know, the host who's hosting you is safe, and the person who's staying in your house is also safe. Dating applications are also
popular, right? Verifying that you're not getting catfished, that it's the exact person you're hoping to
meet. And then on the workforce side, it's like we are making sure that it's not an identity
mule. So a North Korean ring hasn't paid some inconspicuous individual to go through an
application for you. So validating individuals when they onboard. So just last year, we had a large
announcement with Okta. So they use Persona internally to validate all of their hires day one
and continuously monitor those individuals
as they go through password resets
and authenticator resets and all that.
It's funny, right?
I had someone on the show recently,
from Okta actually,
talking about how they will have,
yeah, they get one guy to do the interview
and then it's someone completely different
who turns up to the job, right?
So how does that continuous monitoring piece
plug into it?
Like, is the user prompted
to go through the verification again?
Because I can imagine
they could just pass it back off
to the guy who did the interview
or is it something where there is, you know, endpoint software on the company issued laptop
that occasionally takes a photo or how does that work?
Yeah, so it is the idea of like sort of what we consider re-verification.
So at a later point in time, you do verify and it will verify you against the original selfie.
To your point, yes, you could potentially find that individual who did take the selfie for you.
Other times people are using identity mules.
They'll pay someone $10 on the dark web to go through a verification service for them.
That person's gone. It's hard to get back in contact with that same exact person.
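The re-verification step described here, checking a later selfie against the original enrollment selfie, typically reduces to comparing face embeddings. A minimal sketch assuming precomputed embedding vectors and an illustrative threshold; Persona's actual matching pipeline is not public:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def reverify(enrollment_embedding, new_embedding, threshold=0.85):
    """Accept the re-verification selfie only if it matches enrollment.

    The 0.85 threshold is hypothetical; real systems calibrate it against
    false-accept/false-reject rates.
    """
    return cosine_similarity(enrollment_embedding, new_embedding) >= threshold

# Toy 3-dimensional embeddings (real ones are hundreds of dimensions):
enrolled     = [0.9, 0.1, 0.4]   # selfie taken at hiring
same_person  = [0.88, 0.12, 0.41]
identity_mule = [0.1, 0.9, 0.2]  # a different face at re-verification

print(reverify(enrolled, same_person))   # → True
print(reverify(enrolled, identity_mule)) # → False
```

The key property is that the comparison is against the original enrollment capture, so swapping in a different person later fails even if their selfie would pass a standalone liveness check.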
Yeah, they're doing something else. They're on vacation, whatever, right? Like, yeah, I get it.
I get it. Exactly. And we see that a lot with like our gig economy workers. So you sign up for an account,
say on DoorDash, you're re-verified before you start your shift. That's from a legal perspective
and also from a safety perspective, making sure the right person's actually delivering your food,
picking you up. And it's not, like, say, a family member or a friend,
you know, whose account you're sharing.
I mean, it's a story as old as time.
I remember in Melbourne, back when we used to take taxis and not ride shares,
there would be the picture of the driver in a laminated thing on the dashboard, right?
And I'm telling you, more often than not they would not match the guy sitting next to you.
And I say sitting next to you because in Australia we actually sit in the front seat of taxis,
but that's a whole other topic.
So I guess, you know, I was querying you before about the arms race in terms of people
being able to fool this system.
But there is the opposite problem, right?
Which is that, say, I am a legitimate identity trying to verify myself for some important
service, and persona says, no, this is a dodgy identity.
How do you tackle the false positive problem?
Because I imagine, you know, as with detection just generally,
whether it's threats or bad identities,
quite often the problem is not the false negative problem.
It is the false positive problem.
So, I mean, is that where you spend a lot of your efforts?
Yeah, so I mean, the way we've designed our platform
is we do provide feedback in the event of, you know, quality issues.
So if someone is going through a submission and there's glare, there's blur,
there's, you know, like a specific parameter you're looking for like a date of birth or a last name
that's not available on the ID for whatever reason, we will always try to,
you know, allow the user to retry
and get through a process.
Our platform is extremely customizable.
So some customers we work with don't offer retries.
Some offer more than others.
And for things like fraud-related false positives,
it's a little bit more difficult
because we obviously don't relay back to the user.
Hey, this ID looks fraudulent.
It's more obfuscated.
And typically, you know, at some points in time,
especially on like the workforce side,
there'll always be some component of manual review.
The idea is that a lot of the verifications that are happening right now, it's like
90% of that is all manual review. So if we can cut that down drastically, it's saving people
time. Can you think of a time where you've had to change something rapidly in response to
something a threat actor is doing? Yes. So we have this, it's funny because it's sort of lore
inside Persona, but there was one customer we were working with, and we can start to evaluate
in an anonymized fashion. Are there threat actors going across our ecosystem? So they're trying
to, you know, attack one Persona customer and go across another Persona customer. And there was this
guy called the couch guy. And he basically had a couch in the background and would rotate
through numerous different deep fakes. So his couch, his shirt, everything was identical, but his faces
were, you know, to the naked eye, it was very, very difficult to tell that it wasn't a legitimate
person. And that started our background similarity detection. So we realized actually, instead
of looking at the face, let's look at everything else. And in a velocity, you know,
environment, can we detect if a similar background is being submitted, you know, 50, 60 times
in a single environment? And that was sort of the formation of a similar background detection.
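That similar-background detection in a velocity environment can be sketched as counting near-duplicate perceptual hashes of the background region inside a sliding window. All the parameters below (64-bit hashes, the bit-distance cutoff, window size, and flag threshold) are hypothetical, chosen only to illustrate the idea:

```python
from collections import deque

def hamming(h1, h2):
    """Bit distance between two 64-bit perceptual hashes of the background."""
    return bin(h1 ^ h2).count("1")

class BackgroundVelocityDetector:
    """Flag a submission when too many near-identical backgrounds arrive
    within a recent window -- the 'couch guy' pattern: same couch, same
    shirt, rotating deepfaked faces."""

    def __init__(self, max_distance=6, window=100, threshold=5):
        self.max_distance = max_distance      # bits of hash difference allowed
        self.recent = deque(maxlen=window)    # last N background hashes
        self.threshold = threshold            # matches needed to flag

    def submit(self, bg_hash):
        """Return True (suspicious) if this background is a near-duplicate
        of too many recent submissions."""
        matches = sum(1 for h in self.recent
                      if hamming(h, bg_hash) <= self.max_distance)
        self.recent.append(bg_hash)
        return matches >= self.threshold

detector = BackgroundVelocityDetector(threshold=3)
couch = 0xDEADBEEFCAFEF00D  # hypothetical hash of the couch background
# Five submissions of the same couch with tiny per-capture variations:
flags = [detector.submit(couch ^ (1 << i)) for i in range(5)]
print(flags)  # → [False, False, False, True, True]
```

The face can be a flawless deepfake, but because the check never looks at the face, rotating 10 faces in front of one couch trips the detector anyway.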
But, uh, couch guy will live on in Persona.
Yeah, right. Couch guy's lore, I guess, too. Uh, an interesting thought that I had when you were talking about this: you're essentially showing your government ID to a camera, then you're showing the face that is on the government ID to a camera. I mean, at that point the computer, the camera, is just a proxy for someone who's at a physical presence at a, you know, like a bank branch or whatever. And you can always show someone at a bank branch a fake ID as well. So it's sort of, I guess, you know, the problem you're solving or attempting to solve is to get that proximity. You know, it's like an equivalence to physical proximity
to someone who may or may not be who they say they are. Yeah, I will also challenge you in
the fact that although it is a proxy for someone who can visually check it, it's very hard
for someone in real time to compute connected similarities. So if I submit a driver's license to
Persona, and I'm on a specific device, and that device is... Yeah, yeah, no, I get it, I get it. The person at the bank branch isn't going to realize that 50 people have submitted IDs of a man wearing the exact same tie in the last 10 seconds.
Exactly. Yeah.
Yeah. So you mentioned ride share, you mentioned that, you know, financial institutions and whatever can use this for their customers. You also mentioned the thing with Okta. And this whole enterprise use case, is the enterprise use case for this kind of the new one? You know, for workforce verification, is that
like the new thing? Because when I think of services like yours, right? It's very much like the bank
wants to verify a customer sort of thing or, you know, there was the IRS with the whole ID.me thing
a couple years ago, right? Like that's what that's what most people think of. But now with
these North Koreans doing what they do, I would think that the enterprise thing is a whole new
practice for you, right? Yeah. For identity verification, workforce verification is very nascent, I would
say within the last 12 to 18 months. It's becoming more prevalent. And that's off of sort of the
tailwinds of like COVID, where remote work was flourishing and people would sign up and go to a
company and get recruited fully remote and you'd meet your hiring manager during the process and
maybe again virtually. So that's opened a lot of threat vectors. And even just with remote,
I guess remote first or remote flexible companies, the idea of people getting locked out of their
accounts, it opens up a whole vector for like social engineering. That's how most of these hackers are
actually getting in. If you talk about the young hackers, like Scattered Spider that took over
MGM and Caesars and Visa and Marks and Spencer, it's like most of the time, you know, they're just
calling and sort of finessing their way in. They're not doing any sort of like hacking to really
get in. All right, Dimitri Greco, thank you so much for joining us to have a chat about persona.
I think it's a real interesting cat-and-mouse, arms-race kind of gig you've got there.
Fascinating to talk to you about it. Thanks a lot. I appreciate it, Patrick. Thank you.
That was Dimitri Greco there from Persona. Big thanks to him for that. And that actually concludes this edition of the Snake Oilers podcast. There are links to all of the vendors in the show notes for this podcast, in the post for this podcast. I do hope you enjoyed all that. I'll be back soon with more security news and analysis. But until then, I've been Patrick Gray. Thanks for listening.