Risky Business - Snake Oilers: Realm Security, Horizon3 and Persona
Episode Date: October 7, 2025. In this edition of the Snake Oilers podcast, three vendors pop in to pitch you all on their wares: Realm Security: A security-focused, AI-first data pipeline platform. Horizon3: AI hackers! Pentesting robots!! They're coming fer yur jerbs! Persona: Verify customer and staff identities with live capture. This episode is also available on YouTube. Show notes
Transcript
Hi, everyone and welcome to another edition of the Snake Oilers podcast.
I'm Patrick Gray.
The idea behind these Snake Oilers podcasts is vendors can pay us to come on and pitch their products to you, the listener.
And we've got three very interesting pitches for you today.
We are going to be hearing from Realm Security, which has built a data pipeline platform, which
I guess is similar to Cribl, but much more focused on security.
We're going to hear from Horizon 3, and they do AI pen testing.
You know, at massive scale, I find this a really, really interesting idea.
There's a few companies in this space, so definitely wanted to get one of them into one of these segments to talk about it.
And then we're going to be chatting with Persona.
And Persona do that sort of identity verification through live capture stuff, right?
So they can help regulated industries verify and identify their customers to make sure they are who they say they are, by doing live capture of, like, a government ID and then their face.
But the funniest thing is happening, which is that with all of these North Korean people turning up for jobs in Western companies, this sort of technology is actually becoming quite useful for detecting that sort of stuff.
And it's also a pretty effective control against various forms of social engineering, at least in some contexts.
So we'll be chatting to them a little bit later.
But let's get into it now with Realm Security.
And we're going to be chatting with its chief executive, Pete Martin, and also their data science guy, Colin Germain, popped along in this interview as well.
As I mentioned earlier, Realm is a data pipeline platform similar to Cribl,
but very much tailored for security applications, right?
And it's heavy on AI.
They describe it as an AI native data pipeline platform.
And, you know, for those who are unfamiliar,
these sort of platforms, they are like waystations for data, right?
You get to decide from a log stream, you know,
which ones go over here into the SIEM,
which ones go into an archive for later,
and which ones just get, you know, evaporated.
Which ones do you just let boil off sort of thing?
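The routing decision described above can be sketched in a few lines. This is a hypothetical illustration only: the field names, severity threshold, and destinations are invented for the example, not Realm's (or any vendor's) actual API.

```python
# Hypothetical sketch of the routing decision a security data pipeline makes:
# each incoming log event goes to the SIEM, to a cheap archive, or is dropped.

def route_event(event: dict) -> str:
    """Return 'siem', 'archive', or 'drop' for a single log event."""
    if event.get("type") == "heartbeat":      # no security value: let it boil off
        return "drop"
    if event.get("severity", 0) >= 7:         # high-severity: straight to the SIEM
        return "siem"
    return "archive"                          # everything else: keep it cheap, query later

events = [
    {"type": "heartbeat", "severity": 0},
    {"type": "auth_failure", "severity": 8},
    {"type": "dns_query", "severity": 2},
]
print([route_event(e) for e in events])  # ['drop', 'siem', 'archive']
```

Real pipelines make this decision per route with much richer rules, but the shape of the problem is exactly this three-way fork.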
And yeah, so Pete Martin is the first person you're going to hear from in this interview.
And I just got him to start off by explaining exactly what Realm Security is.
Here he is.
Realm Security is a Boston-based cybersecurity company that offers an AI native security pipeline,
which I realize could mean anything.
So what I mean by that is instead of being a bytes in, bytes out pipeline,
we've built a transparent layer into all of the individual pipelines themselves,
which gives us the ability to gather deep statistics and an understanding of the actual composition of the log.
And then we can take further activities and implement machine learning and LLMs throughout the pipeline.
So when we think of these data pipeline sort of products, you know, we usually think of Cribl, right?
Which isn't a security-first company, but it is the 800-pound gorilla in the market.
So I guess what you're saying is it's a bit like that.
But instead of it just being in and out, there's something in the middle.
there that enables you to actually do useful things with the data flowing through the data pipeline
product. Yeah, I mean, absolutely. So we've gone head-to-head with Cribl. We've displaced Cribl.
And I think the number one thing is time to value. So within seven days of deploying our product,
we'll get you onboarded and filtered with legitimate cost savings. And when you compare that to
Cribl, it typically takes two to four months to get the product stood up, and it requires
professional services. And I think one of the things that has continually come up in the market
is that the data footprint for an organization is not static. So anytime you change endpoint vendors,
you change firewall vendors, you then either need to have the ability to manually use
Cribl and change the configurations, or you need to pay more for professional services. And so I think
security teams are super educated to the point where they just don't want to be reliant on anyone else to
achieve their outcomes. And so we truly enable security practitioners to use a security-enabled
pipeline and migrate away from a legacy pipeline like Cribl. If you take one of our
public case studies, Vensure Employer Solutions, which is a 10,000-person benefits payroll
provider, they have, I believe, 20 to 25 data sources. But the reality is, 90% of the logs going
into their SIEM are from network, endpoint, and firewall.
And so when you actually look at the problem
and you realize it's concentrated within those three data
sources, having a solution that can very quickly
onboard, analyze, and not just recommend filtering rules
based on generic data source recommendations,
but specific custom-created filtering rules
using machine learning and LLMs to filter their data,
as opposed to just a generic pack that focuses on
that broad vendor. We were able to reduce 83% of their firewall logs within seven days,
saving them $250,000 annually. So I guess the idea here is instead of just watching the output
as it filters into your SIEM and saying, well, we want to pull that back, and sort of trying to
adjust the input, you can do it at the point where it's
in transit. Exactly. Yeah. All right. So also joining us is Colin Germain, who runs data
science with Realm. So Colin, what's the magic here? Like, tell us how it works. You know, as Pete mentioned,
it's really all about scale. How do we deal with the large volume of security data and find,
you know, the right pieces for volume reduction? You know, what can go to
your archive but doesn't necessarily need to go to your SIEM? That's kind of the core question that
we're solving and providing value out of the box
with. And so in terms of, you know, how does it work, right? It starts really with
online statistical learning techniques that allow us to really understand the composition of the
data, what's in there, and what are the opportunities for impactful volume reduction.
On top of that, we use gen AI systems that are able to actually reason about the meaning of the
field, the information, and the filtering rules that we can apply. So being able to speed up
that process. But, you know, as a company ultimately, we want to give the best recommendations
and we do also use human in the loop expertise to review those final recommendations to make
sure that they're, you know, really sort of hitting the mark in terms of reducing the volume
without impacting detections. Well, you've just said something that every single company that's
using Gen AI to do this sort of thing has said to me, which is it gets you 90% of the way there.
And then that last 10%, it's a human who's always got to take
the couple of dumb things the LLM has generated and fix them.
You know, part of it is, you know, our mission is to give control to, you know,
those SOC teams and be able to have them, you know, make, make the decision.
And it's really important to be able to, as you're building these AI systems, think about
where does the human need to come in and really be that oversight.
And I think that, you know, the thing is this is instead of a professional services, you know,
contract, something that could be very expensive and take a very long time to do.
So we're able to accelerate this process quite significantly with, you know, still that
oversight and still the expertise to make sure that it's really got the best result.
So I mean, like when we thought about bringing a solution to market and ultimately putting
in checks and balances to make sure customers can be comfortable, we really thought about
it as the depth test run methodology.
And so when you think about having a human in the loop, it's not necessarily a human needing to
analyze and free-form their way into a solution. It's a human in the loop to make sure no errors
have occurred and that they are fully conscious of what it is that we're doing. So there really isn't
any creation of a solution. It's more of a check-the-box to make sure you're aware of what it
is that we've prescribed to them. It's like a code review. I think that's the way that I
always think about it. It's being able to come back and make sure the quality is there.
I mean, you've got all those automated tests, you've got CICD, you know, and it's kind of coming back with that.
Now, we were talking before we got recording, Pete, about, you know, changes to the SIEM market, right?
Or changes to SIEM methodology.
And it gets interesting, because you just mentioned, okay, you know, you can cut back the amount of firewall logs you're sending to your SIEM.
There's an entire industry around trying to minimize Splunk costs, right?
And this product certainly fits into all of that.
But you also said, okay, you're going to direct some of the most critical logs into your SIEM, and you're going to dump some into the archive.
The question becomes, once you've got a decently structured store of this sort of information and you can query it, why do you need the SIEM anymore, right?
And I understand that people are very welded to their SIEMs, but we've seen all of these interesting products pop up over the years.
I'm thinking of products like Panther, which, you know, people are using to do detection as code and tune stuff that goes into
their SIEM. But even with them, I'm like, well, why do you need the SIEM? If you can dump all of
this stuff into Snowflake and query it, why do you actually need Splunk? You know, I'm guessing
that you're thinking along similar lines with something like this, which is, hey, you know,
you can use it to tune stuff going into your SIEM, but it's also an excellent way to get a lot of
this data into a structured archive that you can then do whatever you want with. Yeah. I mean,
you know, I think most of your listeners are very technical, right? And so
this might be something everybody's well aware of, but we're all realizing the cost of data,
right? And like the cost of data is not going away. It's only going to increase. And so even if
your end goal is to dump everything in Snowflake and run, you know, detection as code on top of that,
no matter what, everyone's going to need to be thoughtful on what data they're sending to what
destination. And so having a pipeline that helps you right size the data and only send what's needed,
both from a cost perspective, but also from a noise perspective in eliminating false positives, is critical.
But to the bigger point you're making, I think we've all taken for granted that SIEM
and detection requires storage to be combined with analysis.
And to your point, like, that's not the case anymore.
And there's so many different places you can dump data.
But as you think about Realm and our mission, like our mission is to help customers
ultimately alleviate themselves from unneeded cost.
in the SIEM and unneeded risks resulting from the data that's flowing to the SIEM.
But as the security market evolves, we see a much bigger opportunity for customers to have
somebody as their true data broker, which will enable them to more safely and easily adopt
the agentic solutions that could potentially end up, you know, taking over the SIEM market.
So I guess that would mean, what, various Model Context Protocol servers being able to query your
data store? Is that kind of what we're thinking?
Yeah, potentially. I mean, Colin, we've been talking a bunch about this, yeah.
Yeah, and I think part of it is also structuring the data through normalization.
We look at things like OCSF mapping, being able to bring the data into a structure that is easy to see across many different sources.
But from the AI perspective, yeah, it's thinking about, you know, Model Context Protocol servers, being able to make this more accessible to the agents in other systems, and thinking about those interactions.
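The normalization Colin describes can be sketched as a simple mapping from a vendor-specific log shape into a common, OCSF-style structure. This is an illustrative stand-in only: the raw field names and the simplified output schema are invented for the example, not the real OCSF class definitions or Realm's mapping code.

```python
# Illustrative sketch of normalizing a vendor-specific firewall log into a
# common schema in the spirit of OCSF. Field names are simplified stand-ins.

RAW = {"src": "10.0.0.5", "dst": "8.8.8.8", "act": "deny", "ts": 1728259200}

def to_common_schema(raw: dict) -> dict:
    """Map a raw firewall event into a normalized, queryable structure."""
    return {
        "class_name": "Network Activity",
        "time": raw["ts"],
        "src_endpoint": {"ip": raw["src"]},
        "dst_endpoint": {"ip": raw["dst"]},
        "action": "blocked" if raw["act"] == "deny" else "allowed",
    }

print(to_common_schema(RAW)["action"])  # blocked
```

Once every source lands in one structure like this, cross-source querying (and handing the data to agents over something like MCP) becomes far simpler than juggling each vendor's native format.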
Yeah, right. So, you know, I've got to ask, right, because you're out there rolling this product out into some big, big enterprises. Like, what's the low-hanging fruit of stuff that you can just yeet from the SIEM? Like, even if someone weren't to go and become a customer of yours, right, if you had to tell them to eliminate one particular type of log source, or one particular type of event that's very voluminous, what would it be?
I mean, I'd say at a high level, it falls into three categories: endpoint, network, and firewall.
I mean, as far as the individual events and things we'd remove, Colin, what would you say?
I mean, we've done a bunch of work across those three pillars.
I mean, I have one, but I'm curious, Patrick, if you have a guess as to which one it would be.
I have got no idea, man.
That's why I'm asking, I'm genuinely, because there's so much waste, right?
Like, there's so many dumb events, so many, like, things that just aren't even useful.
for context, and I'm just like, where would you even start? But I'm thinking there's got to be, like, a top
three, right? Yeah, I mean, there's definitely those, you know, process-in-time normal event types or
heartbeats that are clearly non-security-relevant information. But then there's also areas of,
you know, data that is useful for security information, but isn't necessarily something that
needs to be applied for your particular detection suite. So one of the areas we found, actually,
is DNS traffic for specific domains that are well-known,
which is actually a very large chunk of volume, typically.
And so DNS, no surprise, I mean, it's a pretty big protocol.
A user looked up Google.com kind of thing?
So a broader set of that and personalized to the customer in addition.
So, you know, I think we've seen a lot of, you know, out-of-the-box value for that.
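The DNS example above boils down to a per-customer allow-list filter: drop lookups of domains you already trust, keep everything else for the SIEM. A minimal sketch, with a hypothetical allow-list (the domains and the function name are invented for illustration):

```python
# Sketch of the DNS volume reduction described above: drop queries for
# customer-specific known-good domains, forward the rest to the SIEM.
KNOWN_GOOD = {"google.com", "office365.com", "slack.com"}  # hypothetical allow-list

def keep_for_siem(query: str) -> bool:
    """True if this DNS query should still go to the SIEM."""
    domain = query.lower().rstrip(".")
    # keep anything that is not a known-good domain or a subdomain of one
    return not any(domain == d or domain.endswith("." + d) for d in KNOWN_GOOD)

queries = ["google.com", "mail.google.com", "evil-c2.example"]
print([q for q in queries if keep_for_siem(q)])  # ['evil-c2.example']
```

The interesting part in practice is building that allow-list per customer (and keeping the dropped events in a cheap archive), rather than the filter itself, which is trivial.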
All right, we're going to wrap it up.
Pete, any final words?
I mean, you know, I think the general thesis, based on what we're doing and what we would love to share with your listeners, is that the activities they deal with on a regular basis that revolve around resolving challenges with data ingestion and data delivery, like, they don't need to be challenges.
And everybody talks about AI and everybody likes to think about all the different things AI can do.
And I think the number one thing that we've all realized is it can remove manual activities.
And it can remove a lot of the, like, painstaking things that aren't advancing anyone's
understanding or advancing security programs.
And so that's what we do.
We basically get rid of all the nitty-gritty blocking and tackling of running logging pipelines.
And we use the latest and greatest technology to do it.
I think the other thing I'd say is we are running a risky business promotion.
And so what I'd say is, if there's any one of your listeners that is struggling with data
or SIEM costs, we'd love to talk to you.
And if within 10 days of using our product, we're not able to reduce your firewall, end point,
or network traffic by 50%, we'll give you the product for free for six months.
And so we would love the opportunity to earn the right to keep that promise if we have
the opportunity to do so.
All right, Colin Germain, Pete Martin.
Thank you so much for joining us on the Risky Biz Snake Oilers showcase.
It's great to chat to you both.
us. Great to be here. Appreciate the time. That was Pete Martin and Colin Germain there from
Realm Security. Big thanks to them for that. It is time for our next snake oiler now, and we are
chatting with Snehal Antani, who is the CEO of Horizon3, which is an AI
pen testing platform. Now, I find this stuff really interesting, right? Because throughout my career,
I've known so many pen testers
and just extremely skilled pen testers
and I think it's going to be very difficult
to automate away a lot of the skills
that they have that make them good testers.
Now, that said,
the testers that I've known through my career
are not really representative
of the typical industry pen tester, right?
So there's a lot of by the numbers
kind of pen testing out there
and I think AI is going to be massively
massively disruptive to that sort of stuff.
And given that it's automated,
I also think it's going to be quite disruptive
to the vulnerability scanning
and attack surface management platforms, right?
So I think a few things are sort of converging on one here
and, you know, it's probably going to put us in a better place, if I'm honest.
So Snehal joined me for this interview
where he gets to pitch Horizon3,
and here's what he said Horizon3 does.
We pioneered the whole concept
of AI hackers, this idea of autonomous pen testing. And the idea there is, with no knowledge of
the environment, how can we point, click, shoot, hack, and prove what's exploitable, exactly how to
fix it, and then enable you to run a retest to verify that you're good to go. The bulk of my users
aren't pen testers, actually. They're IT admins and network engineers that just want to go home
early, and they want to make sure they're fixing problems that are exploitable, that lead to
consequence, and they know exactly how to retest and verify that they've actually solved it.
So we should really define what you mean in terms of scope when you talk about automated
pen testing, right? Because we've seen this before. We've seen vulnerability scanning
companies saying they were automated pen test 15 years ago. Is this just a better vuln scan?
What's the difference? It's a fantastic question. I just did a
LinkedIn post on this, on how our AI hacker, NodeZero, after
initial compromise and initial access, was able to exploit HPE iLO, chain a bunch of issues
together, defeat an EDR, and then use that to pull the auth token from Microsoft Outlook
and get Global Entra Admin rights. So if you think about that as the attack path, chaining
multiple issues across multiple machines, the goal of the pen test is to show you the
impact. A vuln scanner will tell you that HPE iLO is a problem. It won't show you the consequence
of what the attacker can do if they abuse iLO to achieve an objective. So the pen test is about
showing consequence. And the next part is the goal of the pen test isn't to find problems. It's to
quickly fix problems that matter. And so when you think about pen test helps show consequence,
consequence is how you prioritize. And then you're able to fix issues that are actually going to
put you in the news. I'll end with what I think about vulnerability scanners. I was a CIO at GE Capital,
I spent time as a CTO within DOD, and I was the largest vuln scanning customer in the world for several
of the major vendors. And the hardest part of my job was deciding what not to fix, because I would get
a list of vulnerabilities from these scanners, apply a bunch of math, and make those lists slightly less
crappy. And because they're full of noise, I wasn't sure that they were exploitable; maybe I had
compensating controls in place. So being vulnerable doesn't mean you're exploitable. And that's why
understanding if you're exploitable and the consequence of exploitability is super important in
prioritization. Sure, but I just asked you why you're different to a vulnerability scanner,
and then you just described how it's a better vulnerability scanner because it shows you consequence,
right? Yeah. So the most specific part is this: a vuln scanner does scatter-gather. It can
assess potential vulnerabilities on a single machine. What it cannot do is chain together a variety
of issues across machines. So that's number one. It's about lateral movement. It's about chaining
multiple issues, CVEs, misconfigurations, harvested credentials and other techniques that lead
to an impact, whether it's domain admin, sensitive data exposure, and so on. And so when you think
about the fundamental difference, a potential issue on one host is not telling you,
much. Understanding how issues can be chained together across hosts that lead to impact is the really
important part of the insight into how you're exploitable. I mean, what I'm hearing here is maybe
pen testers were just better vuln scanners all along. Maybe that's the lesson.
Here's another question: what is a vulnerability? By nature, we assume a vulnerability is a
CVE, but that's actually not accurate, right? A vulnerability can be a variety of things that can be
abused in a way that leads to some sort of consequence. Yeah, a misconfiguration is not a CVE,
right? Like, I totally get what you're saying there. Why don't you talk us through the
sorts of things that your platform actually finds, right? Like, you spoke about one
particularly cool case, right, where it went from here, went from there, you know, grabbed a token, did
this, got global rights. But what does a typical exercise look like?
How is it deployed? Is this a continuous process? Is this a point-click-go one-time thing?
You know, just walk us through a little bit more how this stuff is used.
Yeah, excellent. So let's first talk through the process of actually running a pen test.
So break pen tests up into either breaking in from the outside or assuming breach and assuming initial access.
And what is the blast radius from that initial access point?
So you can start with zero access, or you can, what, you can give this AI agent a shell
on a box and say, go from there?
That's exactly right.
At the end of the day,
good organizations assume the attacker's going to get in
because there's so many doors and windows,
whether it's rarely a zero day in your custom web app,
more likely it's a CISA KEV on an edge gateway,
or a misconfigured Jenkins server that's exposed to the web
or they purchase access off the dark web or so on.
They're going to get in.
So every cyber attack really starts with shell on a single host.
So if you want to run an internal pen test,
you can come through our portal,
configure the scope, which is basically, what IP range should we operate within?
What IP ranges should we not touch?
Do you want us to be aggressive or gentle?
Go.
And then, of course, there's more advanced options like,
would you like us to auto-deploy honey tokens along the way,
which has never been done before?
Think of that as, while breaking into your house, I'll install ring cameras along the way.
So we'll auto-deploy fake AWS credentials, fake Azure tokens, fake SQL dump files
during the pen test.
So that becomes part of the scope of the configuration.
You hit Go, it'll generate a curl command
that's a single-use Docker container
that gets downloaded.
That Docker container gets initiated
on whatever initial access point you want in your network.
Assume breach from the DMZ,
spin up that curl command on the DMZ.
Assume breach from your customer support network.
Assume breach, you know, spin up the container there.
That container will connect to our brains in the cloud,
which is a dedicated virtual private cloud session
just for that pen test.
It starts to instruct the Docker container what to do,
conduct recon enumeration first, which is pretty common.
After that, it's next best actions.
Based on what's been discovered,
should I go after the router, the printer, or the television next?
That depends on discovered ports and services,
historical record of success,
likelihood of achieving the objective of domain admin
or sensitive data exposure.
And it continues to iterate through this next best action process
until it's exhaustively and comprehensively
tested your environment.
And at the end, the Docker container shuts down
and you can delete it.
The VPC shuts down and gets completely destroyed.
So you have no persistent footprint
that you've got to go off and install and manage.
It's truly point, click, shoot.
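The "next best action" loop described above is essentially a scoring problem: rank each discovered target by its services, historical success rate, and likelihood of advancing the objective, then act on the best one and repeat. Here's a minimal sketch under invented assumptions; the weights, field names, and service values are illustrative, not NodeZero's actual algorithm.

```python
# Hypothetical sketch of a "next best action" selection step: should the
# agent go after the router, the printer, or the television next?

def score(target: dict) -> float:
    """Combine service value, historical success, and objective likelihood."""
    service_value = {"smb": 0.8, "http": 0.5, "snmp": 0.3}   # invented weights
    base = max((service_value.get(s, 0.1) for s in target["services"]), default=0.0)
    return base * target["historical_success"] * target["objective_likelihood"]

discovered = [
    {"host": "printer",    "services": ["snmp"], "historical_success": 0.9, "objective_likelihood": 0.2},
    {"host": "dc01",       "services": ["smb"],  "historical_success": 0.6, "objective_likelihood": 0.9},
    {"host": "television", "services": ["http"], "historical_success": 0.3, "objective_likelihood": 0.1},
]
best = max(discovered, key=score)
print(best["host"])  # dc01
```

A real system would re-score after every action as new hosts, credentials, and services are discovered, iterating until the environment is exhaustively covered.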
Yeah. Okay.
So this is a, you run it like a pen test exercise.
This isn't like a continuous scanning model, for example.
Now, the next part is you can run it ad hoc.
Like, I just want to run a pen test right now.
See, the problem in pen testing is the absorptive capacity for fixing stuff.
So if you run a pen test all the time, that's great.
But if you don't have the capacity to actually fix the issues, you're not making yourself any better.
So as customers improve their absorptive capacity to fix, they start running more frequently.
So they shift from one or two pen tests a year to 40 or 50 pen tests a month, assuming breach from different points of view, constantly,
finding, fixing, and verifying. So for our more advanced customers, in any given moment,
there's at least one or two pen tests running, assessing different parts of the environment,
and that's because they've become really good at quickly remediating or mitigating the findings
from these pen tests. I mean, I don't know, you know, you're very optimistic about the ability
of customers to remediate, right? Like, I've spent my entire career surrounded by black-pilled
pen testers who just dump horrifying reports onto CISOs' desks over and over and over.
And quite often, they will do two reports a year apart looking at the same slice of a network
or the same application.
It'll have the same findings in it a year later.
So where are you finding all of these customers who just can't wait to fix what these
reports are surfacing?
It must be amazing to be you.
Yeah.
So, I mean, you think about the company: we are the fastest-growing cybersecurity company in the
industry, and we've been audited for that recognition. We have 4,300 companies globally using
us, and we're doubling to tripling our customer count every year as a result. There's a massive
market. But there's very different types of customers. So you've got the really large
Fortune 500 types. And the simple sales qualification question is, with an unlimited budget,
how many pen tests would you run a year? One to two, your compliance focus. Cool. Nothing wrong with
that. Four or more, that's my sweet spot. And mostly they've got to run four or more because of
PCI, DORA, GDPR, NIS2. There's some reason compliance-wise, but they also want to be
resilient. So the big companies use us in that way. MSSPs are actually my fastest-growing
segment. And so 80% of those 4,300 companies are fully serviced by MSSPs that white-label
or OEM us, and they bundle us into their MDR offering, their SOC-as-a-service offering, as a way
to audit and improve their security controls and build their IR muscle memory.
Yeah, and I guess for them, anything that results out of these things, that's more billable
hours for them too, I'd imagine, right?
Well, they tend to also bundle remediation services with the findings,
because, once again, especially in the long tail, the mid-market and beyond...
And, you know, at Black Hat, I keynoted this topic with Bailey Bickley from the NSA Cybersecurity Collaboration
Center: the mid-market and the long tail, whether it's the defense industrial base or
advanced manufacturing or most businesses in the world, they just want to build antennas or do
welding or whatever their core business is. They don't have the capacity to fix. So MSSPs end up
bundling remediation services with our pen test findings. Yeah. Okay. So I've got a couple more
questions here. One is, you know, you've spoken about the network-based side of like pen testing
and whatnot. You know, are you doing much around identity as well? Because it seems like identity
attacks these days, you don't even need to get shell, right? You just get the right identity.
You buy it. Off you go. You can pivot, pivot, pivot, pivot, all your way to your great victory.
Are you doing much around that? Yeah, in fact, our probably core expertise is credential-based
attacks. Attackers don't have to hack in with zero days. They log in with credentials that they
found. I think we all know that as practitioners. So that's one thing that we're really, really
focused on and really good at. The multitude of ways to pilfer credentials. And then the
way to abuse those credentials across an organization that's both efficient and production
safe. I can't just arbitrarily spray and pray. I'll lock out accounts and take production systems
down. So for me to be able to be really good at credential attacks against production systems,
not only do I have to understand the credentials I've found, I've got to understand and be smart
about what systems they likely have access to. I've got to introspect your credential lockout
policies, whether it's local or global and all the nuances around that.
And I've got to use that to be thoughtful in how I'm going to abuse credentials across the environment.
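The production-safety constraint he describes — never push an account past its lockout threshold — can be sketched as a simple attempt budget. The policy numbers, safety margin, and function name here are invented for illustration, not Horizon3's actual logic.

```python
# Sketch of lockout-aware credential testing: compute how many more guesses
# can safely be made against one account without locking it out.

LOCKOUT_THRESHOLD = 5   # hypothetical policy: failed attempts before lockout
SAFETY_MARGIN = 2       # always stay this far below the threshold

def attempts_allowed(observed_failures: int) -> int:
    """Remaining safe guesses for an account with this many recent failures."""
    budget = LOCKOUT_THRESHOLD - SAFETY_MARGIN - observed_failures
    return max(budget, 0)

print(attempts_allowed(0))  # 3
print(attempts_allowed(2))  # 1
print(attempts_allowed(4))  # 0 -- skip this account entirely
```

A real implementation would also have to introspect whether the policy is local or domain-wide, track the lockout observation window, and reset budgets as counters expire, which is where the hard engineering lives.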
Okay.
Now, my last question is, where does it make sense not to do this?
Where does it make sense to stick with human brains when it comes to pen testing?
Because there is a great temptation among people such as yourself to say,
we've completely revolutionized the pen test market.
You don't need humans anymore.
I personally don't believe that's true, which also does not mean that I don't see the value in what you're doing.
I guess I'm just asking, like, as the science is now, where is that line?
I agree with you, actually.
So there are two areas that are uniquely human.
The first area is finding logic flaws in custom code.
That is uniquely human, especially in bespoke systems.
And that area of pen testing, I think,
is going to be uniquely human for quite a while.
The other area that's uniquely human
are the long-tail bespoke OT/ICS systems.
I can't guarantee production safety of a nuclear
power plant because I don't have one. I can't own one. I can't buy one. I can't verify production
safety against that. And so I think the long tail of OT/ICS will be uniquely human for a very
long time. However, I think that hybrid cloud infrastructure, assuming, you know, starting
external, gaining initial access internal, pivoting to the cloud, compromising auth tokens,
dumping data from Slack, that area is primarily a graph analytics problem that is uniquely
machine-centric, and algorithms are far superior at testing at scale than humans are. But algorithms
generally suck at finding logic flaws in custom code, and it's incredibly difficult to do long-tail
exploitation of bespoke OT/ICS systems. What about identifying previously unknown misconfigurations,
things like that. I mean, because they're essentially logic bugs as well. Yeah, I view that as
logic flaws, right, to some degree. Yeah. And I think humans are going to be uniquely gifted there.
To be honest with you, right? So NCC is a customer of ours.
I did an interview podcast with them.
Their guys focus on things that will put them on stage at DefCon.
That's uniquely human.
Mastering the art of recon enumeration,
dancing on domain controllers, pilfering credentials, and all that stuff.
Machines are better at that today.
Let the humans focus on the really bespoke things that are DefCon stageworthy.
All right.
Snehal Antani, thank you so much for joining me to pitch this stuff.
I mean, as someone who's been in the industry for a long time,
I do find this fascinating.
Yeah, thanks for joining us.
No, I appreciate the time.
Thank you.
That was Snehal Antani there from Horizon 3.
Big thanks to him for that.
And I really enjoyed that interview
because I got to push him a little bit there
and he gave great answers.
So who knows where all of this goes.
Okay, time for our third and final snake oiler today
and we're chatting with Persona,
and indeed Dimitri Greco from Persona,
all about their platform.
Now, their platform is all about identifying people remotely, right?
And they do this usually through some sort of live camera capture of a person's ID and of their face.
Now, these sorts of identity-proving platforms have been useful for regulated industries historically.
And then, you know, any sort of business where they have to prove to a reasonable degree that someone is who they say they are,
whether that's a, you know, Uber Eats delivery driver.
I don't know if Uber's a customer, but whether it's a delivery driver or, you know, a banking customer, whatever.
It's that sort of thing.
But what's interesting lately is that there are enterprise use cases for this sort of stuff now, right?
There's people getting socially engineered by Scattered Spider, you know, doing better identity checks using a technology like this, you know, when someone's ringing into the call center might be helpful, for example.
And then there's the, you know, the threat of North Korean IT workers infiltrating.
Western organizations. As you'll hear, Dimitri says, people are using this sort of stuff now
to do spot checks on people's identities to make sure that the person who did the interview
and did the job and did the first week's work is still around and still actually the person
doing the job. So I started off here by asking Dimitri to actually describe the user experience
when someone's being prompted for an ID challenge by the Persona platform. Here's what he had to say.
Enjoy. Yeah, so the typical UX is, typically, you are going to provide a live capture
of your government ID. You are going to move on to a live capture of a selfie verification. We run
biometric comparisons. We run liveness checks to ensure you're not injecting a deepfake. You
haven't pre-recorded a video. And then you pass sort of the ultimate test of are you actually
a human in front of that camera? You're not sort of a bad actor who's, you know, pre-recorded
something. You send the user an electric shock through the handset and gauge their reaction and see
if it's real, if the reaction time's correct. But okay, so here's the thing, right?
that's always going to be an arms race, isn't it?
When you're doing these sorts of tests,
you know, I've had identity verification,
you know, apps doing identity verification flash a whole bunch of colors at me,
for example, so that I can make sure that those colors are bouncing back off my face.
And, you know, but all of these things are going to be tricks
that attackers are going to be trying to reverse engineer to try to fool.
I mean, is that the experience for you guys?
Like, on your end, do you see people trying to do this yet?
Or do they just move on to easier targets once they realize that someone's actually trying to do proper ID verification?
Yeah, we see incredibly sophisticated fraud attempts.
And a lot of the time, we're not ignorant of the fact that it is an arms race.
So we take the approach to layer on different signals, right?
Instead of just relying on does this face look legitimate, we are collecting other signals.
What is the device?
What's the frame rates of the camera?
What's the label of the camera?
You know, are they on VPNs? Are they on Tor? Have we seen session information from this, you know, browser before? We look at velocity attacks. So how often have we seen a selfie like this being submitted to sort of the Persona ecosystem? And then we look at other things like similar backgrounds. You might have the best deepfake in the world. But if you rotate through 10 faces with the exact same background, we can start to detect other signals that way. So we are relying on so-called passive signals, just
as much as we are on, you know, active details that, you know, an individual is providing.
Yeah, so I guess it's like a, you know, classic case of like risk scoring, right,
with all of these different signals and like face matches, but background weird,
that might be one score and then, you know, they're using, they're coming from a Tor exit node,
et cetera, et cetera.
Exactly.
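That layered, risk-scoring approach can be sketched as combining weighted signals into one bounded score. The signal names, weights, and threshold below are all hypothetical; a real system would tune these against labeled fraud data rather than hard-code them:

```python
# Hypothetical signal weights -- illustrative only, not Persona's model.
SIGNAL_WEIGHTS = {
    "face_mismatch": 0.40,
    "background_reuse": 0.25,   # same couch, different face
    "velocity_spike": 0.25,     # similar selfie seen many times recently
    "tor_exit_node": 0.15,
    "vpn": 0.05,
    "known_clean_session": -0.10,  # previously seen, clean session lowers risk
}

def risk_score(signals):
    """Combine the signals observed on a submission into a score in [0, 1]."""
    raw = sum(SIGNAL_WEIGHTS[s] for s in signals if s in SIGNAL_WEIGHTS)
    return max(0.0, min(1.0, raw))

def verdict(signals, threshold=0.5):
    """Route the submission: pass it, or hold it for review."""
    return "review" if risk_score(signals) >= threshold else "pass"

# Face matches, but the background is reused, traffic is from Tor,
# and we've seen a burst of similar submissions:
print(verdict({"background_reuse", "tor_exit_node", "velocity_spike"}))
# → review
```

Any one weak signal passes on its own; it's the combination that pushes a submission over the line, which matches the "layer on different signals" idea.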
And we've seen interesting things where, and we, I almost call this, like the delivery use
case where there's tons of photos of individuals in a car that look like they're taking
a selfie about to like work for Uber Eats or Deliveroo or something.
and the actual device that is taking that photo is a desktop, right? It's a laptop. But who is actually ever taking a selfie with a laptop in their car? Those things just don't add up. So those are things that we see, you know, clear giveaways that it's, uh, you know, a video-injected, you know, deepfake.
So, uh, what sort of customers do you have? Because you can use this any way you want, right? It can be an SDK, it can be, like, a pop-up from a website or whatever to do it.
You know, like there's multiple different ways to use this technology.
So what are the sectors that are really using this and what are they using it for?
I mean, you know, you mentioned to me earlier that, you know, I guess this North Korea stuff, right?
Like for HR verification, you know, it's kind of turning into a useful case there.
But I'm guessing also people offering financial services and, you know, all sorts, right?
But why don't you just walk us through where the, you know, what the key verticals are for you there?
Yeah. And honestly, we are industry agnostic. When I first came into this, the origins of identity verification were obviously regulated industries. So financial services, you're signing up, you're verifying individuals to open bank accounts, and sort of fintech led that drive. Now we are seeing, you know, trust and safety across marketplaces. You go to Airbnb, you want to validate that, you know, the host who's hosting you is safe, and the person who's staying in your house is also safe. Dating applications are also
popular, right? Verifying that you're not getting catfished, that it's the exact person you're hoping to
meet. And then on the workforce side, it's like we are making sure that it's not an identity
mule. So a North Korean ring hasn't paid some inconspicuous individual to go through an
application for you. So validating individuals when they onboard. So just last year, we had a large
announcement with Okta. So they use Persona internally to validate all of their hires day one
and continuously monitor those individuals
as they go through password resets
and authenticator resets and all that.
It's funny, right?
I had someone on the show recently,
from Okta actually,
talking about how they will have,
yeah, they get one guy to do the interview
and then it's someone completely different
who turns up to the job, right?
So how does that continuous monitoring piece
plug into it?
Like, is the user prompted
to go through the verification again?
Because I can imagine
they could just pass it back off
to the guy who did the interview
or is it something where there is, you know, endpoint software on the company issued laptop
that occasionally takes a photo or how does that work?
Yeah, so it is the idea of like sort of what we consider re-verification.
So at a later point in time, you do verify and it will verify you against the original selfie.
To your point, yes, you could potentially find that individual who did take the selfie for you.
Other times people are using identity mules.
They'll pay someone $10 on the dark web to go through a verification service for them.
That person's gone. It's hard to get back in contact with that same exact person.
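The re-verification step described here, checking a later selfie against the original enrollment selfie, typically reduces to comparing face embeddings. A minimal sketch assuming precomputed embedding vectors and an illustrative threshold; Persona's actual matching pipeline is not public:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def reverify(enrollment_embedding, new_embedding, threshold=0.85):
    """Accept the re-verification selfie only if it matches enrollment.

    The 0.85 threshold is hypothetical; real systems calibrate it against
    false-accept/false-reject rates.
    """
    return cosine_similarity(enrollment_embedding, new_embedding) >= threshold

# Toy 3-dimensional embeddings (real ones are hundreds of dimensions):
enrolled     = [0.9, 0.1, 0.4]   # selfie taken at hiring
same_person  = [0.88, 0.12, 0.41]
identity_mule = [0.1, 0.9, 0.2]  # a different face at re-verification

print(reverify(enrolled, same_person))   # → True
print(reverify(enrolled, identity_mule)) # → False
```

The key property is that the comparison is against the original enrollment capture, so swapping in a different person later fails even if their selfie would pass a standalone liveness check.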
Yeah, they're doing something else. They're on vacation, whatever, right? Like, yeah, I get it.
I get it. Exactly. And we see that a lot with like our gig economy workers. So you sign up for an account,
say on DoorDash, you're re-verified before you start your shift. That's from a legal perspective
and also from a safety perspective, making sure the right person's actually delivering your food,
picking you up. And it's not, like, say, a family member or a friend,
you know, whose account you're sharing.
I mean, it's a story as old as time.
I remember in Melbourne, back when we used to take taxis and not ride shares,
there would be the picture of the driver in a laminated thing on the dashboard, right?
And I'm telling you, more often than not they would not match the guy sitting next to you.
And I say sitting next to you because in Australia we actually sit in the front seat of taxis,
but that's a whole other topic.
So I guess, you know, I was querying you before about the arms race in terms of people
being able to fool this system.
But there is the opposite problem, right?
Which is that, say, I am a legitimate identity trying to verify myself for some important
service, and persona says, no, this is a dodgy identity.
How do you tackle the false positive problem?
Because I imagine, you know, as with detection just generally,
whether it's threats or bad identities,
quite often the problem is not the false negative problem.
It is the false positive problem.
So, I mean, is that where you spend a lot of your efforts?
Yeah, so I mean, the way we've designed our platform
is we do provide feedback in the event of, you know, quality issues.
So if someone is going through a submission and there's glare, there's blur,
there's, you know, like a specific parameter you're looking for like a date of birth or a last name
that's not available on the ID for whatever reason, we will always try to,
you know, allow the user to retry
and get through a process.
Our platform is extremely customizable.
So some customers we work with don't offer retries.
Some offer more than others.
And for things like fraud-related false positives,
it's a little bit more difficult
because we obviously don't relay back to the user.
Hey, this ID looks fraudulent.
It's more obfuscated.
And typically, you know, at some points in time,
especially on like the workforce side,
there'll always be some component of manual review.
The idea is that a lot of the verifications that are happening right now, it's like
90% of that is all manual review. So if we can cut that down drastically, it's saving people
time. Can you think of a time where you've had to change something rapidly in response to
something a threat actor is doing? Yes. So we have this, it's funny because it's sort of lore
inside Persona, but there was one customer we were working with, and we can start to evaluate
in an anonymized fashion. Are there threat actors going across our ecosystem? So they're trying
to, you know, attack one Persona customer and go across another Persona customer. And there was this
guy called the couch guy. And he basically had a couch in the background and would rotate
through numerous different deep fakes. So his couch, his shirt, everything was identical, but his faces
were, you know, to the naked eye, it was very, very difficult to tell that it wasn't a legitimate
person. And that started our background similarity detection. So we realized actually, instead
of looking at the face, let's look at everything else. And in a velocity, you know,
environment, can we detect if a similar background is being submitted, you know, 50, 60 times
in a single environment? And that was sort of the formation of a similar background detection.
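That similar-background detection in a velocity environment can be sketched as counting near-duplicate perceptual hashes of the background region inside a sliding window. All the parameters below (64-bit hashes, the bit-distance cutoff, window size, and flag threshold) are hypothetical, chosen only to illustrate the idea:

```python
from collections import deque

def hamming(h1, h2):
    """Bit distance between two 64-bit perceptual hashes of the background."""
    return bin(h1 ^ h2).count("1")

class BackgroundVelocityDetector:
    """Flag a submission when too many near-identical backgrounds arrive
    within a recent window -- the 'couch guy' pattern: same couch, same
    shirt, rotating deepfaked faces."""

    def __init__(self, max_distance=6, window=100, threshold=5):
        self.max_distance = max_distance      # bits of hash difference allowed
        self.recent = deque(maxlen=window)    # last N background hashes
        self.threshold = threshold            # matches needed to flag

    def submit(self, bg_hash):
        """Return True (suspicious) if this background is a near-duplicate
        of too many recent submissions."""
        matches = sum(1 for h in self.recent
                      if hamming(h, bg_hash) <= self.max_distance)
        self.recent.append(bg_hash)
        return matches >= self.threshold

detector = BackgroundVelocityDetector(threshold=3)
couch = 0xDEADBEEFCAFEF00D  # hypothetical hash of the couch background
# Five submissions of the same couch with tiny per-capture variations:
flags = [detector.submit(couch ^ (1 << i)) for i in range(5)]
print(flags)  # → [False, False, False, True, True]
```

The face can be a flawless deepfake, but because the check never looks at the face, rotating 10 faces in front of one couch trips the detector anyway.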
But, uh, couch guy will live on in Persona.
Yeah, right. Couch guy's lore, I guess, too. Uh, an interesting thought that I had when you were talking about this: you're essentially showing your government ID to a camera, then you're showing the face that is on the government ID to a camera. I mean, at that point the computer, the camera, is just a proxy for someone who's at a physical presence at a, you know, like a bank branch or whatever. And you can always show someone at a bank branch a fake ID as well. So it's sort of, I guess, you know, the problem you're solving or attempting to solve is to get that proximity. You know, it's like an equivalence to physical proximity
to someone who may or may not be who they say they are. Yeah, I will also challenge you in
the fact that although it is a proxy for someone who can visually check it, it's very hard
for someone in real time to compute connected similarities. So if I submit a driver's license to
Persona, and I'm on a specific device, and that device is... Yeah, yeah, no, I get it, I get it. The person at the bank branch isn't going to realize that 50 people have submitted IDs of a man wearing the exact same tie in the last 10 seconds.
Exactly. Yeah.
Yeah. So you mentioned ride share, you mentioned that, you know, financial institutions and whatever can use this for their customers. You also mentioned the thing with Okta. And this whole enterprise use case, is the enterprise use case for this kind of the new one? You know, for workforce verification, is that
like the new thing? Because when I think of services like yours, right? It's very much like the bank
wants to verify a customer sort of thing or, you know, there was the IRS with the whole ID.me thing
a couple years ago, right? Like that's what that's what most people think of. But now with
these North Koreans doing what they do, I would think that the enterprise thing is a whole new
practice for you, right? Yeah. For identity verification, workforce verification is very nascent, I would
say within the last 12 to 18 months. It's becoming more prevalent. And that's off of sort of the
tailwinds of like COVID, where remote work was flourishing and people would sign up and go to a
company and get recruited fully remote and you'd meet your hiring manager during the process and
maybe again virtually. So that's opened a lot of threat vectors. And even just with remote,
I guess remote first or remote flexible companies, the idea of people getting locked out of their
accounts, it opens up a whole vector for like social engineering. That's how most of these hackers are
actually getting in. If you talk about the young hackers, like Scattered Spider that took over
MGM and Caesars and Visa and Marks and Spencer, it's like most of the time, you know, they're just
calling and sort of finessing their way in. They're not doing any sort of like hacking to really
get in. All right, Dimitri Greco, thank you so much for joining us to have a chat about persona.
I think it's a real interesting cat-and-mouse, arms-race kind of gig you've got there.
Fascinating to talk to you about it. Thanks a lot. I appreciate it, Patrick. Thank you.
That was Dimitri Greco there from Persona. Big thanks to him for that. And that actually concludes this edition of the Snake Oilers podcast. There are links to all of the vendors in the show notes for this podcast, in the post for this podcast. I do hope you enjoyed all that. I'll be back soon with more security news and analysis. But until then, I've been Patrick Gray. Thanks for listening.