Risky Business - Snake Oilers: Burp AI, Sondera and Truffle Security

Episode Date: April 9, 2026

In this edition of the Snake Oilers podcast three vendors stop by to pitch the audience on their products: Burp AI and DAST: The founder of PortSwigger and creator of legendary security software Burp Suite, Dafydd Stuttard, drops by to pitch listeners on Burp AI and Burp Suite DAST. Sondera: Josh Devon talks about Sondera, a technology designed to intervene when AI models start doing the wrong thing by statefully tracking their trajectories. This isn’t a permissions suite for AI agents, it’s a way to stick agents in a harness and make sure they adhere to hard policy boundaries. Truffle Security: Dylan Ayrey, the founder of Truffle Security, joins Risky Business again to talk through the latest bells and whistles in Trufflehog, a security tool that searches for exposed secrets and validates them. The Truffle team has done a lot of work on the remediation part of their product over the last few years, and Dylan tells us all about it! This episode is also available on YouTube.

Transcript
Starting point is 00:00:04 Hi everyone and welcome to another edition of the Snake Oilers podcast series. My name is Patrick Gray. And for those of you who don't know, the Snake Oilers podcasts are where vendors come along. They give us some money and then they pitch their products to you, the listeners, who run a skeptical ear over them and then decide whether or not you want them. But today we are going to be hearing from three vendors who've got some awesome stuff for you. We've got PortSwigger, makers of course of Burp Suite. They also have a DAST product, which you're going to hear about in just a moment. We are also going to hear from Sondera today, which is a company that I'm very happily an advisor to, and they're making
Starting point is 00:00:49 What would you call it? It's not really guardrails. It's like deterministic controls for AI agents while they're in flight, sort of mid-trajectory, like proper controls of AI agents for organizations that need that, right, which is frankly most of them, but only some of them realize it just right now. But yeah, basically Sondera has created a harness that you can use to instrument your AI agents and make sure that they're not doing stuff that they should not be doing, and they've done it in a way that's a little bit different. There's a lot of snake oil in that particular area at the moment, so Josh Devon will come along a little bit later
Starting point is 00:01:28 on to explain that one. Then we're going to be hearing from Trufflehog and Dylan Ayrey, who was on the show pitching this stuff quite a while back. But yeah, we're going to hear from him now on, you know, where Trufflehog's at these days. Trufflehog, of course, does secrets discovery. You can throw it against your repos, throw it against Slack, wherever. Throw it against network shares, wherever data is stored, basically, and it will go and find things like API keys, cred pairs, all sorts of stuff. And not just find them, but it will actually validate them, help you remediate them and whatnot.
Starting point is 00:02:03 It's a very advanced bit of software these days, and Dylan joins us a little bit later on to talk through that. We're going to kick things off now with our first guest, Daf Stuttard, who is the founder of PortSwigger and the creator of Burp Suite, which is a very well-known tool in the security discipline. If you're a security tester, you are familiar with Burp Suite. Now, what's interesting is PortSwigger have, you know, made some moves in the last couple of years to really sort of AI-ify Burp Suite,
Starting point is 00:02:33 and in a way that is not crazy, in a way that makes a lot of sense, in a way that's going to help testers do more testing and also help people who might not be testers do some testing. So really, it's just about making itself very useful to human operators. So Daf is going to fill us in on that. And he's also going to talk through one of PortSwigger's lesser-known products, which is a DAST tool. The customers who use it certainly love it. So here is Daf Stuttard filling us in on all things Burp. Yeah, I created Burp Suite way back in 2003, and this was a tool I built for my own use as a pen tester. And here we are.
Starting point is 00:03:11 More than 20 years later, Burp Suite Pro is used by 80,000 security professionals in over 20,000 organizations around the globe. And we're now at this interesting point where we are connecting the pen tester's world, the world of manual testing tools on the desktop with Burp Suite Pro, connecting that with enterprise-scale automation through Burp Suite DAST, and AI is accelerating everything that we can do there. So I'm excited to talk about that. Yeah, so I'm really curious to understand what you're doing with AI around Burp Suite in particular because this is a tool that is used by security testers when they're looking at web applications. I mean, it is the industry standard tool. Everybody who works in security testing knows Burp, but it does strike me as one of those tools where
Starting point is 00:03:58 you know, it's very much process driven and testers are going to have their own process and whatever. It seems something that's quite well suited to having an AI agent sort of bolted to it to automate a lot of the work. I'm guessing that's where your focus has been with it. Yeah, so we launched our first AI features in early '25, a little over a year ago. And our mindset was to start that way with some kind of copilot features to accelerate human testing, to drive faster productivity, but very much in the workflows that humans were doing. So some examples that our users have described where they got value, Jule and Grido described how we went from a couple of interesting bits of evidence,
Starting point is 00:04:43 a couple of curious requests to a fully working proof-of-concept exploit in under a minute, costing a few cents. Adash Kumar described how Burp AI was able to kind of orchestrate testing against a bunch of endpoints for access control vulnerabilities, which is often quite a laborious, manual, repetitive job for a human, found a really juicy IDOR vulnerability, saved him a bunch of time. A company like Orange Cyberdefense, one of the biggest pen testing suppliers in Europe, deployed Burp AI to all of their pen testing team, found that they're able to go generally between two and five times faster in their work
Starting point is 00:05:24 and paid for itself in the first two or three engagements. Yeah, no, I mean, that absolutely would not surprise me in that, yeah, I mean, it does seem like it is extremely well suited to having a bit of AI dust sprinkled upon it. But I guess the question is where to from here, right? Like we joked, and we were talking about this before we got recording, that we were joking on the weekly show that one day, you know, PortSwigger is going to make basically James Kettle in a box, James Kettle, of course, being a security researcher who works with PortSwigger and develops a lot of really cool new attacks and whatever.
Starting point is 00:06:01 So eventually you could get to the point where that automation is kicked up even further. Are you at the moment just still working on that line between like what's left for the humans, what's left for the agents? Like, that's hard because if you automate it too much, it's like, it's almost like, you know, who buys it anymore? I don't know. It's a confusing time. Yeah, sure. Well, I think if anyone confidently tells you where we're going to be in two or three years with AI, they're probably speculating. I think our view of the current tech is it's very much an accelerator for that human activity and it can allow people to deliver more, deliver it faster, be more consistent.
Starting point is 00:06:43 It isn't a replacement. It is more like having a skilled colleague or someone else to bounce ideas off. Another example from Christy Blad was he was able to join together a few, like, low-grade kind of paper-cut type vulnerabilities, the kind of stuff that pen testers are a bit embarrassed to report, like username enumeration, was able with AI to kind of join the dots between some of those leading to a critical account takeover. So that's the kind of thing that, you know, a skilled human has always been able to do if they had enough time and patience and maybe got lucky with the right endpoints. And it's really able to provide that acceleration to try a lot of things at once and guide the user
Starting point is 00:07:24 where to look. I think there's a couple of reasons why we still see humans being necessary in the loop. One is around kind of coverage and accuracy. I think we all know LLMs can make stuff up. They can go off-piste. They can make mistakes. And pretty much anything we do with AI, there is still that need for that human in the loop to keep it on track and make sure it's doing the right thing. Same as in any domain. But particularly with offensive AppSec, you are giving the LLM access to dangerous tools that can do real damage if something goes wrong. That might involve hitting the wrong parts of your application and doing damage. It might involve even hitting third parties.
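One deterministic guard against the out-of-scope risks described here is a scope check that runs before any AI-suggested request is sent. The following is a minimal Python sketch, not PortSwigger's implementation; the host allowlist, the blocked path prefixes and the function name are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical engagement scope; these values are illustrative only
# and do not correspond to any Burp Suite setting.
IN_SCOPE_HOSTS = {"app.example.test", "api.example.test"}
BLOCKED_PATH_PREFIXES = ("/admin/delete", "/billing")


def is_request_allowed(url: str) -> bool:
    """Return True only for in-scope hosts and non-destructive paths."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if host not in IN_SCOPE_HOSTS:
        # Third-party or unknown host: refuse, whatever the model asked for.
        return False
    # str.startswith accepts a tuple of prefixes.
    return not parts.path.startswith(BLOCKED_PATH_PREFIXES)
```

Because the check is ordinary code, a manipulated prompt cannot talk its way past it; a rejected request could then be dropped or handed to a human for approval.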
Starting point is 00:08:07 If it's vulnerable to prompt manipulation, it might even involve leaking sensitive data or vulnerability data to an adversary. You're just answering my question, which is like, well, why aren't people getting this AI-enabled Burp Suite and then just giving that to like Claude Code and saying, go on, do me a pen test. And I think you just answered that, which is like, you probably don't want to arm, you know, Claude Code with the Burp Suite AI chainsaw and just say, be careful. Absolutely.
Starting point is 00:08:36 I mean, we've all seen examples that people have shared publicly of like crazy ninja stuff that AI can do. And for all the things it can do that work out right, it can probably do some that will go wrong. You know, we've got lab capabilities where, you know, the power is extreme, so is the danger. So I think the part there really for us is to invest in ensuring the kind of safety guards around what the AI is doing. Some of it is deterministic code. Some of it is human
Starting point is 00:09:06 in the loop hooks in the right place so that we can provide that power to our customers safely and give them the confidence to use it. No, 100%. Makes sense. I guess though the question is, like, are you trying to broaden the market for Burp with the AI push? Is the idea that, well, you don't quite need to be an extremely talented security tester anymore to get some value out of Burp? Is that one of the ideas behind doing an AI integration, to, yeah, as I say, broaden out the appeal and increase the number of people who might want to buy and use it? Absolutely. I think that will be one side of it. It's the same way that, you know, people today who've never been a software engineer can vibe code applications and get some good stuff working. People who have a bit of
Starting point is 00:09:53 an interest in pen testing. This might be, you know, one of these tiny IT shops with only one or two people in an SME and they are tasked with securing their applications and they don't have time to kind of get Burp Suite certified and fully develop that craft. This will be an accelerant to do a lot of the essentials for them that still have them steering. But it is also a huge force multiplier right up to the expert end. You know, I can tell you, James Kettle is using Burp AI and a bunch of his own AI creations to turbocharge his research and his work as well. So I think it's the full spectrum from beginner through to expert. And are you seeing that manifest now in your sales, right? Like are you seeing a whole bunch more sales come in from
Starting point is 00:10:39 some of those smaller teams? And, you know, is this a work in progress? I think we're seeing strong reach from Burp AI. I think when people do, people who've been certainly used to Burp Pro and working manually and maybe for a lot of pen testers have assumed that this is just a human expert craft that needs them. I think when they do discover the value, they're not worried about it. They're not worried about being displaced. They just realize they can go faster. And in fact, you know, when we talk to more enterprise customers, people who've got an
Starting point is 00:11:11 AppSec team with a bunch of red teamers, security engineers, the story they're generally telling us is they, you know, they just have too much attack surface that's moving too quickly, you know, continuous deployment is meaning multiple releases a day, you know, we're way past the stage of a pen test every quarter for an app. And now people are, of course, using AI to generate the code as well and just like yeet it into prod instantly. Absolutely. So this is another reason why, you know, we don't see the human tester going away just because
Starting point is 00:11:40 there is so much, so many things to test, so much attack surface being generated. Yeah. Yeah. I mean, look, this very much vibes with my view on AI, which is that it's a, you know, and I've been saying this literally for like a couple of years, which is that it's a productivity booster. And it's my feeling it won't be quite as devastating to skilled jobs as people think it will. We're running out of time, though. This is a great conversation. But you also have other enterprise products where the awareness isn't that great.
Starting point is 00:12:10 And, you know, in that this is a segment in which you can promote your enterprise products, you wanted to mention them as well. And one is that PortSwigger actually makes a DAST tool that, you know, people aren't really that aware of. Like, well, not to the same degree as Burp Suite anyway. So tell us about PortSwigger's DAST tool and how people are using it. And why you still need something like that in the AI age, because that's an interesting aspect to this as well. Yeah, absolutely.
Starting point is 00:12:37 So, yeah, we make a DAST product and it has the same core scanning technology, the same core engine that people are used to in Burp Suite Pro. And what we find talking to customers is that in those AppSec teams they will generally have a bunch of humans doing some testing on the desktop with Burp Suite Pro, and they will also have some flavor of scaled automation, some flavor of DAST scanner, scanning at scale, embedding in CI/CD and the rest of it. What they tell us is that for their testers, when they're taking the findings out of the DAST product and triaging them, replicating them and escalating them on the desktop, there's a kind of cognitive
Starting point is 00:13:16 load of transferring between the two worlds. It might be a different scanning engine, different issue taxonomy, different evidence model. And they have to kind of take that and replicate it in the tooling they're used to. So they wanted server-side Burp, basically. Absolutely. This is what customers are asking for when it works in the same way, when it's familiar, when the handover between the two is just much more seamless. But also for any, you know, any experienced pen tester, any AppSec team, they will build up over the years a bunch of their own custom configurations, extensions, scan checks that they have made that work for them
Starting point is 00:13:53 and maybe for their application infrastructure. And what they find is with both products the same, they can develop those and test them in Burp Suite Pro and then deploy them into Burp Suite DAST to spray them at scale right across their estates. One great example of that is React2Shell. When that huge bug dropped, we were able to release a custom
Starting point is 00:14:16 scan check pretty much instantly. That enabled security engineers to test it on the desktop, validate it worked for them in their stack, in their estate, and then they could deploy it straight away. It appeared in Burp Suite DAST, and they were able to use it at scale. Yeah. Now, I did allude to this earlier when I said, you know, it's still something that's going to be needed in an AI-based world, because you pointed out to me that, like, although
Starting point is 00:14:40 AI models are really good at static analysis, there's a lot of stuff that doesn't actually manifest until an application is actually up and running, like cache poisoning attacks and whatnot. Like, you're not going to find that with SAST. Absolutely. I think there's, you know, the direction of travel for SAST and SCA, you know, there's one path ahead where some of that gets eaten by AI. AI is actually generating the code. AI is reasonably good at being trained to follow and align with the patterns that it needs
Starting point is 00:15:10 to follow. What that leaves is all the vulnerabilities that are not present in the code and cannot be seen there and only arise when that code is deployed. And James Kettle, our head of research, has spent the last decade or more uncovering a whole series of these critical, like, new vulnerability classes where they only arise when the code is running in a modern cloud stack, things like request smuggling and cache poisoning.
Starting point is 00:15:36 The only way to really find those vulnerabilities is to deploy the application and see how it behaves. As well as that, modern applications are just so heavily stateful and data laden. It can be really almost impossible to just look at their static code and figure out what their behavior will be. You really need to run them with realistic runtime data, interact with them, and that's when the behavior emerges and you can interrogate it. All righty. Well, Daf Stuttard, CEO of PortSwigger, a pleasure to chat to you, and great to meet you. Like, you know, we talk about Burp Suite a lot, like we're fans of everything you do, and obviously
Starting point is 00:16:12 we were followers of the Daily Swig back in the days when you operated your own media outlet as well. But yeah, look, just terrific to meet you and thanks for joining me to pitch me on some of your technology. All the best with it. Thanks very much. Great to meet you.
Starting point is 00:16:28 That was Daf Stuttard there with a chat about Burp Suite and all other things PortSwigger. I do hope you enjoyed that. Great to meet Daf as well, a bit of infosec royalty there, so that was exciting. Now we're going to hear from Josh Devon, who is the co-founder of Sondera, and Sondera is an interesting company. Full disclosure,
Starting point is 00:16:46 I'm doing some advisory work with Sondera but basically they make a harness for AI agents that allows you to control them while they're in flight you can actually put deterministic controls onto these models now a lot of other people a lot of other vendors they talk about sort of governance they talk about guardrails but when you actually look at the tech I mean some of that might just be putting another LLM in front of prompts right to make sure you know, in front of prompts or in front of the returns just to make sure that everything's, everything's okay. And it's like, I don't know, using something non-deterministic to try to solve a non-determinism problem doesn't seem right to me.
Starting point is 00:17:23 Right? So the idea behind Sondera is that they can put deterministic, you know, concrete, provable, deterministic controls on AI agents. And they do this with a harness and they do this by trying to, you know, get into the trajectory of these agents while they're mid-flight, understand where they're going, stateful sort of tracking of models and what they're doing and so on and so forth. So Josh explains this a lot better than I do. So in this interview to pitch the Sondera tech, I asked Josh to start off by explaining to all of you out there in listener land what the actual harness is. Like what is the actual bare bones tech of Sondera. And then from there we sort of talk about how that can wind up enabling sort of policy simulation and enforcement across like a whole bunch of different AI agents
Starting point is 00:18:15 in very large enterprises. So here is Josh Devon with all of this, talking about Sondera and in particular starting off by talking about the Sondera harness. Enjoy. What we've built is an agent harness and a control plane. And I will explain what those are in a second. So your listeners might have heard of what's called like the agent scaffold. The scaffold is the thing that wraps an LLM and gives it agency. So, you know, Claude Code around Opus 4.6 is a scaffold. When you hear about a harness, like, when you hear like, hey, this LLM passed like Humanity's Last Exam or whatever, basically what these researchers are doing is putting a light scaffold on it. Hey, it can fill out these forms and figure out these math
Starting point is 00:19:05 problems, and then they have a harness that effectively is observing the agent's behavior and like monitoring these things. And so effectively what the harness does, to put it in security terms, is man-in-the-middle, like, the agent trajectory. So what our harness does in the way that it's instrumented is in different ways. If you are building your own agent, we have an SDK, you know, Pythonic, we can do it in TypeScript. If you're using a framework like LangGraph, if you're rolling your own framework, you basically do like an import Sondera and, you know, we've open sourced our agent harness so you can go, you know, play with that. That instruments you into an agent that you're building. For third party agents, we've built hooks that are easily deployable for enterprises
Starting point is 00:19:52 or individuals. So think like, you know, in Claude Code, I can do like slash plugin install like Sondera or, you know, if you're, you know, deploying this to a fleet, you know, you can kind of use device management to push out, you know, the software into the agent, like Claude Code. And we have really great hooks into all the major coding agents from, you know, Codex to, you know, Claude Code, to GitHub Copilot CLI to Gemini CLI. And in general, the frontier labs are doing a really great job of providing folks deep hooks into the agents. When it comes to other types of agents that I would call like the walled gardens that don't have great hooks, those are a little bit more challenging to do as easily. So there's kind of like a dragon's tail.
Starting point is 00:20:47 And we have a lot of experience like reverse engineering, you know, a lot of this stuff. And as an example, you know, we've got great hooks into Claude Desktop right now, which Anthropic has kind of matured when it comes to Claude Cowork. There's kind of hooks that we have in and like we've almost got that like working, but it's a little bit harder than like some of these others that give you that capability. So that harness is instrumented into these first party and third party agents. And what that effectively does is monitor what's happening every step of the trajectory as it's happening. And so what we do is we look at, you know, before the model decides to make a decision and do something with a tool,
Starting point is 00:21:28 we're inspecting what's happening. And then after the agent decides to make a decision with the tool, we're inspecting that. And so, you know, we might beforehand say, you know, and I guess a good point to make here is that like we're doing the stateful inspection of the trajectory with the harness. So a lot of other folks are sort of focused on turn by turn. Yeah, every step. Like, but you're trying to build a little bit of context there and keep it like a stateful, I guess, tree where you could look through, step through it and go, well, is the trajectory here okay? Because in, in isolation, this step looks okay, but you take all of this together. That's wrong turn. Wrong way. Go back.
Starting point is 00:22:08 100%. And like that's where you have like these context splitting attacks, which is like when the Chinese, you know, were leveraging, you know, Anthropic's, you know, Claude to like hack all these, you know, companies. They were doing this context splitting attack, which for lack of a better, you know, way of explaining it, right? Like if I put a whole credit card number in one step, I'll probably detect that. But if I only put like one number in 16 steps, right, then it's going to be much harder to detect if I'm doing it turn by turn only. I mean, this is just the modern equivalent
Starting point is 00:22:39 of packet fragmentation attacks that were designed to bypass Snort like 20 years ago, right? Like it's the same thing. Exactly. The other place that this is really important is that what we can do with the harness is what's called like tainting the trajectory
Starting point is 00:22:53 and what we mean by that is like if you have an agent that say pulls PII or GDPR sensitive data or something sensitive, you have to know that and retain that context that you're dealing with sensitive information for the entire rest of the trajectory so that if I pull that in step three, in step 73, I still have to know that. And I might have to cut off a class of tooling, like open web search, for example, which Claude Code will use all the time. And if I'm pulling GDPR sensitive data and doing an open web search, I'm getting fined, right?
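The stateful inspection and tainting ideas above can be sketched in a few lines of Python. This is an illustration only, not Sondera's harness or SDK: the class, the tool names and the deliberately crude 16-digit pattern are assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Simplified stand-in for a card number: 16 consecutive digits.
CARD_RE = re.compile(r"\d{16}")


@dataclass
class TrajectoryMonitor:
    """Hypothetical per-trajectory state kept across every agent step."""
    digits_seen: str = ""   # digits accumulated from all outgoing text
    tainted: bool = False   # set once sensitive data enters the trajectory
    blocked_when_tainted: frozenset = frozenset({"web_search"})

    def check_output(self, text: str) -> bool:
        """Allow outgoing text unless the accumulated digits form a card."""
        self.digits_seen += re.sub(r"\D", "", text)
        self.digits_seen = self.digits_seen[-32:]  # keep a bounded window
        return CARD_RE.search(self.digits_seen) is None

    def check_tool_call(self, tool: str, reads_sensitive: bool = False) -> bool:
        """Allow a tool call unless the trajectory is tainted and the tool is risky."""
        if reads_sensitive:
            self.tainted = True  # taint persists for the rest of the trajectory
        return not (self.tainted and tool in self.blocked_when_tainted)
```

A turn-by-turn filter applying `CARD_RE` to each message alone would pass sixteen one-digit messages, while `check_output` flags the sixteenth. Likewise, once a step is marked `reads_sensitive=True`, a `web_search` call is refused no matter how many steps later it arrives. Accumulating every digit is a blunt heuristic that would produce false positives in practice; it is used here only to show why state matters.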
Starting point is 00:23:29 Like so the statefulness is really important. And that harness is, yeah. Let me just pause just for a second there. So basically you've got your harness, and where you can't put your harness in, you've got some sort of hacky workarounds there. But the basic idea here is that you're getting visibility into what these models are doing, the trajectory that they're taking. And then you have, I'm guessing also through the harness,
Starting point is 00:23:53 you have the ability to sort of jump in and stop certain things from happening. Is that about right? Exactly. So, and this is where things get cool. So, you know, we are not using another model to judge the behavior of another model. What we're doing is using policy as code in real time to evaluate the agent's behavior. And so that allows us to, you know, for example, like say you're dealing with like a payment processor agent that's going to like maybe give customer refunds, right? Well, you know, there's going to be all kinds of people who are going to say, you know, I want a refund of five million dollars and the customer is always right, so give me my five million dollar refund. If you just put in the prompt, you know,
Starting point is 00:24:42 of your agent, you know, please, please, please, if, you know, if anything is more than 50 dollars, like always ask a human or never do it, exclamation point, two exclamation points, XML tags, capital letters, you know, they're still only suggestions to the model. What we do is we assume prompt injection, we assume emergent behavior, and we're going to see a tool call that is going to try to send, you know, more than, say, $50 if that's what your limit is. And that rule is written in policy as code. We use the Cedar policy language in order to do that.
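To make the refund example concrete, here is a hedged Python sketch of a rule enforced in code on the tool call rather than requested in the prompt. Sondera writes such rules in Cedar; this is not Cedar syntax and not Sondera's API, and the tool name `issue_refund`, the `amount_usd` argument and the three outcomes are hypothetical.

```python
from dataclasses import dataclass

REFUND_LIMIT_USD = 50.0  # the limit used in the conversation's example


@dataclass
class ToolCall:
    """Hypothetical shape of a tool call proposed by the agent."""
    tool: str
    arguments: dict


def evaluate(call: ToolCall) -> str:
    """Return "allow", "escalate" or "deny" for a proposed tool call."""
    if call.tool != "issue_refund":
        return "allow"  # simplification: this sketch only governs refunds
    amount = call.arguments.get("amount_usd")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return "deny"  # missing or malformed amount
    if 0 < amount <= REFUND_LIMIT_USD:
        return "allow"
    if amount > REFUND_LIMIT_USD:
        return "escalate"  # over the limit: hand to a human
    return "deny"  # zero, negative or NaN
```

The point of the design is that the check sees the actual tool call, so it gives the same answer whether the oversized refund came from a persuasive customer, a prompt injection or emergent behavior.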
Starting point is 00:25:17 And the real magic is like what we're doing is we use a process called auto-formalization, where we take the natural language that exists in your system prompt, in your CLAUDE.md files, in your standard operating procedures. I've had enterprises ask me how do I apply my employee handbook, how do I apply the EU AI Act, et cetera. We take all of that and we use this auto-formalization process to automatically generate the policy as code. So we're not showing up and saying like, hey, here's this awesome, you know, harness and control plane, like, you know, go write 100,000 YARA rules. Like, we want to govern these things in natural language. So we're using what agents are good at, which is abstracting out the code. And we can use mathematical, like, Lean analysis on the Cedar policy language to verify at scale that, like, there's no vacuous policies, that policies can't conflict, that,
Starting point is 00:26:20 if there's ambiguous policies, we can call that out. And that allows us then to generate bespoke policies for every agent. And that's like, to me, what's so cool about this, which is like, you know, I think of like the Ten Commandments, right? Like for your fleet of agents, you probably want a rule like don't steal, right? But I can't give you a list ahead of time of every way to steal, right? And write like, you know, rules that can prevent that.
Starting point is 00:26:50 Well, this is, I mean, as we've talked about on the show a fair bit, like this is the problem with AI agents is, yeah. I mean, look, people are non-deterministic as well, but people tend to fear consequences and AI agents don't. And you ask them to do something and they're just going to try to get it done. They don't really care how, right? Like, you know, you ask it to edit a wiki entry and it doesn't have creds. It's going to go, like, do vulnerability research so it can pop shell on the wiki to change the wiki entry. Like, this is something that's happened. Exactly. And that's like the regular labs, you know, research and a lot of others that we've seen in the headlines. So, you know, one of the tenets that we have is like principle of least autonomy. It's like, how do I restrict the action space of the agent while making it more capable? But like the Waymo, making sure it only stays on the roads and not go on the sidewalk. So this is really about like trying to, I mean, look, you've got to treat every agent like it's a person
Starting point is 00:27:46 with awesome hacking skills, no worse judgment than a human being and zero fear of consequences for violating company policy. I mean, this is, you know, you're building insider threat software on steroids, I guess. The one advantage that we have is that, you know,
Starting point is 00:28:03 humans still have like privacy. Yeah, you can't hook our brains, right? You can't harness our brains, whereas this is like harnessing the brain of an insider threat. Exactly. So what we can, you know, as I like to call it,
Starting point is 00:28:16 get up in there of the agent in a way that we can't do with humans. And that's the one advantage I think that we have with the agents, like given their capabilities is that because they're software, we can monitor them far more effectively with, you know, without worrying about privacy, you know, or humans being upset with us. Like the agents don't care. At least not yet, right?
Starting point is 00:28:37 Like so. Now look, look, Josh, sorry, we're running out of time, right? So I just want to squeeze one more thing in there, which is like, I guess with Sondera, one of the selling points, I guess, which gets some CSOs, particularly CSOs of large organizations pretty excited, is the ability to actually run policy change simulations or look at, well, if we launch this agent, give it access to these systems, these tools, this data, run a simulation, tell us if there's going to be any sort of problem. So that's a big part of the product, isn't it?
Starting point is 00:29:08 Yes, and thanks for asking, Pat. So a key piece of the harness, right? I've kind of like focused on this runtime piece, which kind of seems like the obvious piece of the harness. But another piece of our harness is what you're getting at, which is simulation. Many of the enterprises that I've been chatting with, like, what they struggle with is like, you know, we don't, I hear this a lot. Like, we don't know, we don't know. I don't know what it means to give, you know, this agent access to Snowflake and it's going to read my emails and do open web searches, like, what could go wrong? And, you know, the state of the art that I'm seeing is like, you know, Pat, you and I get in a room. We look at like this agent and we're like, you know, Pat, can you think of a risk? Can I think of a
Starting point is 00:29:44 risk. Can every lawyer think of a risk? We're just spitballing here. Yeah, it's really hard. So what we've built as part of our harness is we generate automatically what we call the agent card. The agent card is basically like what is this agent capable of? Like who are we onboarding? Is this an intern with photocopy access or is this a CFO who's going to send wire transfers? And we use that agent card. We have an adversarial LLM that perturbs the agent under test through the harness. We're not red teaming the model to get it to say bad things. We're not looking for vulnerabilities. It's just that point that you said before, Pat, like in the action space, what risky and toxic flows can we get this
Starting point is 00:30:28 agent to do? Can we get it to hack an API? Can we get it to exfiltrate data? Can we get it to do something illegal or noncompliant? All that information then moves, helps us with that auto-formalization process, which allows us to create the mitigating policies at scale because how are you going to steal? Well, I don't know, but now that I've simulated, I can see how you would steal. And now I know how to create bespoke policy as code for this specific agent that can prevent what it is in a specific way. And so, you know, after we're, after we've done the simulation, that really helps us understand the risks and then use them in the auto-formalization process so that we can create, you know, highly effective, bespoke policy as code that is constantly being updated
Starting point is 00:31:16 and improved to make sure that we're capturing, you know, all of the agent's potential behavior as it evolves and changes. And that's really a key focus of what the company is working on. All right. Josh Devon, thank you so much for joining us to walk us through what you're doing at Sondera. It's all very interesting stuff. And yeah, we'll be chatting throughout the year. Cheers. Yeah, no, thanks so much, Pat. Thanks for the opportunity. That was Josh Devon, co-founder of Sondera there. Big thanks to him for that. It is time now for a chat with Dylan Ayrey of Trufflehog.
Starting point is 00:31:49 Now, Dylan was on the show when Trufflehog was brand new years ago. It is a secrets discovery product. Very cool stuff, right? So basically, the idea is you set it loose on your code repos, you set it loose on your Slack, you set it loose wherever you need to, and it will go and find API keys that people have put where they shouldn't have put them, or cred pairs where they shouldn't be, you know, certificates where they shouldn't be, all of that sort of stuff. So it's a very, you know, simple premise, but doing that at scale is, you know, harder
Starting point is 00:32:21 than, look, like everything, right? It's harder than it sounds. But now, you know, several years later, they've got to the point with Trufflehog where they're doing, like, secrets validation as well, and they've built a whole bunch of really cool features. One thing that's really funny too is now everyone's using AI to ship code, there's a lot more credentials and a lot more secrets getting shipped in code. So that's an unexpected, frankly, side effect of AI coding agents, although, you know, you would think that eventually people are going to get a handle on that one. But here is Dylan
Starting point is 00:32:52 Ayrey walking us through why people buy Trufflehog. We also talk about the fact that they're not the only show in town doing secrets discovery these days. You know, even GitHub do it with some of their advanced programs and whatever, but Trufflehog obviously goes a lot deeper and does a lot more stuff than the GitHub stuff. Anyway, I'm rambling. Here is Dylan Ayrey talking all about Trufflehog. Enjoy. I think there's three main verticals that AppSec teams will usually buy. There's SAST, there's SCA, and there's secrets. And usually what I tell our customers is like the secrets problem is number one, it's harder than you're expecting. And number two, it's probably the
Starting point is 00:33:29 hardest of the three. It's impactful and it's important, maybe the most important even. But it's, it's going to be hard to go on that journey. And so we just make that as easy as possible. Like, we focus entirely on all of the secrets that your developers are sort of sprawling all over your environment. We create accountability for being able to measure when they get remediated or fixed. And what I mean by that is like we'll measure by testing the key, by doing an API call or doing a cryptographic check on the key to see whether or not it can do something sensitive. And then we'll hold that key accountable until it gets revoked. And that like measurement piece is really important.
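The measurement loop described here (test the key, then keep tracking it until it is revoked) can be pictured with a small sketch. This is only an illustration under stated assumptions, not how Trufflehog is implemented: the `Finding` class, the `recheck` and `still_exposed` helpers, and the `fake_verify` stand-in are all hypothetical names, and no real provider API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical: a verifier returns True while the issuing service still
# accepts the secret. A real one would make a harmless authenticated API
# call or a cryptographic check; this sketch only uses a local stand-in.
Verifier = Callable[[str], bool]


@dataclass
class Finding:
    key_type: str                     # illustrative label, e.g. "example_token"
    secret: str
    status: str = "unverified"        # becomes "live" or "revoked" after a check
    history: List[str] = field(default_factory=list)


def recheck(finding: Finding, verify: Verifier) -> Finding:
    """Re-test the key and record the result of this check."""
    finding.status = "live" if verify(finding.secret) else "revoked"
    finding.history.append(finding.status)
    return finding


def still_exposed(findings: List[Finding]) -> List[Finding]:
    """Findings stay open for as long as the key is verified live."""
    return [f for f in findings if f.status == "live"]


# Stand-in for a provider check: a secret is live until we mark it revoked.
revoked_secrets = set()


def fake_verify(secret: str) -> bool:
    return secret not in revoked_secrets


findings = [Finding("example_token", "tok-aaa"), Finding("example_token", "tok-bbb")]

# First pass establishes the baseline of live, exposed keys.
for f in findings:
    recheck(f, fake_verify)
baseline = len(still_exposed(findings))

# Someone revokes one key; the next pass measures the reduction.
revoked_secrets.add("tok-aaa")
for f in findings:
    recheck(f, fake_verify)
remaining = len(still_exposed(findings))
```

The point of the design is that a finding is closed by a failed verification, not by someone saying it was fixed, which is what makes a baseline and a reduction figure measurable at all.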
Starting point is 00:34:07 Like, you know, an older way of doing this, when this kind of just fell into SAST: when a SAST product would highlight a key, it had no idea if that key was live and had no idea if it got revoked. And so you don't even have a way to measure or to baseline what your exposure was. And so we rolled that out by building, you know, 800 different integrations for all these different key types to be able to measure which ones are live and which ones aren't. And, you know, some of our best customers that have been with us for two, three years might have, you know, over those two, three years reduced the number of live exposed keys by 70%. You know, like nobody's getting this down to zero. It's a really, really hard problem. One of the reasons why is sometimes the person who manufactures the key
Starting point is 00:34:48 is completely different than the person who leaks the key out. And what I mean by that is like you may have a developer who creates a key in GitHub, like a personal access token, and shares it with their team. Five years later, their team member accidentally posts that environment variable file in GitHub. That person who posts it probably has absolutely no idea who created it. And the person who created it is the only person in the universe that can log into their GitHub account and actually get it revoked. And so we have tools to be able to trace back who that original manufacturer was. We also have tools to be able to figure out what that key can do, like how much access it has, which repositories it has access to, whether or not it has read
Starting point is 00:35:26 or write access. And all these are just kind of needed to get even to that point of revoking 70% of those keys. And hopefully within those 70%, you're getting the ones that matter the most, the ones that have access to customer data or indirect access to customer data. Sorry, I just want to jump in there just quickly. You know, you've mentioned a couple of times being able to find these sort of secrets in like GitHub repos and whatnot. GitHub does its own secret scanning.
Starting point is 00:35:52 I mean, I'm guessing, though, that they don't do anything to do with like their sort of secrets life cycle stuff that you're describing. So I want to get your thoughts on that. And also, where else are people actually mostly scanning for these secrets? Because when we first spoke, you know, about this years ago, it's like, oh, well, you know, you can plug it into your Slack. You can do it in here. You can search, you know, you can search through shared directories on, you know, file
Starting point is 00:36:14 shares and whatever. But now you're years into this. I'm guessing there's going to be a few key use cases. And I wondered if you could, yeah, tell us what they are. Yeah, happy to. You know, on the GitHub side, we actually have a surprising number of customers that will pay for GitHub Advanced Security secrets and also pay for Truffle. And the entire reason why is the one main value add you get from GitHub Advanced Security is their push protection. Like the thing in their platform that stops the secret from hitting the platform in the first place, they don't open that up to other vendors.
Starting point is 00:36:47 They keep it for themselves. I think that's a little anti-competitive that they don't allow the best secret scanners in there to fairly compete. But a consequence of that is everything else that I listed out, like being able to measure which permissions the keys have. Most of the keys that they support don't have liveness checks. Of the ones that do, there's a long list of problems with their liveness checks, where it still depends on a developer saying, hey, I fixed it, and it's not really nicely lined up with a second liveness check that checks to see whether or not the key got revoked. And so for the long tail of everything else, they'll still use Truffle, right? And because of their methodology of detecting keys, it tends to lead to a lot of false positives. And because of that, they built like a bypass method to push protection that every developer has. So every developer can push like the override, no, please let me leak the secret button. And so after that happens, the security team then needs to know, well, which of these are actually live and which have access to customer data. And that's where they'll use Truffle to kind of fill that gap specifically within the GitHub platform. So even within GitHub, you know, it's, if you use GitHub Advanced Security,
Starting point is 00:37:56 it's most, like the most value that you get from it is the push protection that they don't open up to vendors, but sort of everything else, the measuring to see which keys are live and which ones aren't, the permissions, all the other pieces. It's just night and day better on the, on the Trufflehog side of the house. And then to your question, if you are like managing keys, like, let's say your responsibility is to make sure an Amazon key does not leak out, you shouldn't care how it's exposed to 2,000 people. Like if it's exposed in Slack or it's exposed in Jira or it's exposed in GitHub, like in any scenario, it's still exposed to 2,000 people. And so we kind of help consolidate that single pane of view. If the same key is leaked three different
Starting point is 00:38:38 places, you know, we'll create one finding for it with the three different places and we'll close the finding out as soon as the key gets revoked. And so we kind of just help with like consolidating the, like, secrets program into, like, that one pane of view. Um, to answer your question, where are the, where do they... Yeah, where do they tend to leak out, though? Like, where is the most important place to be doing this measurement? Because, I mean, I hear what you're saying, which is like you've got to look everywhere because, like, it doesn't matter where it's exposed. If it's exposed, it's exposed. But, like, where do you get the most value, you know, from the integrations that you've built? Right, there's got to be like a top three, surely. The number one is definitely code.
Starting point is 00:39:13 So your Git platform or SVN in some cases or others, by far, number one is code. It probably accounts for 60 to 70%. The next is the Atlassian suite, your Jira and your Confluence. Yeah, tickets, man, of course. That probably accounts for 15%. And then you have your chat platform, your Teams and your Slack. And that probably accounts for another 10%. And then you have a long tail of providers like Postman.
Starting point is 00:39:47 And then there's like another echelon as well of places where keys leak, but the RBAC is a little bit more locked down. Those would be like your logging pipelines. Like everybody has keys in their logging pipelines. But also every organization doesn't open up their logs to the whole organization because they assume they have keys in their logging pipelines. And so like depending on keys, secret keys, credit card numbers, PII, health information. And like, you know, it's like I think everybody's used to the idea that logs are not, you know, are best kept in limited view. 100%. Yeah, exactly. And so we can scan logs to some degree, depending on the throughput of the log and the type of the log. For customers that choose to crack open that Pandora's box, it's usually a bloodbath.
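The consolidation described a little earlier, one finding per key no matter how many sources it leaked into, could be sketched roughly as follows. This is an assumption-laden illustration rather than the product's actual data model: the `fingerprint` and `consolidate` functions, the source names, and the sample locations are all made up for the example.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Leak = Tuple[str, str, str]  # (secret, source, location), all illustrative


def fingerprint(secret: str) -> str:
    """Identify a secret by a hash so the raw value need not be the lookup key."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()[:16]


def consolidate(leaks: Iterable[Leak]) -> Dict[str, List[Tuple[str, str]]]:
    """Group every place a secret appears under a single finding."""
    findings: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for secret, source, location in leaks:
        findings[fingerprint(secret)].append((source, location))
    return dict(findings)


leaks = [
    ("example-key-1", "git", "org/service:.env"),
    ("example-key-1", "jira", "TICKET-123 comment"),
    ("example-key-1", "slack", "#general message"),
    ("example-key-2", "git", "org/tools:config.yaml"),
]

findings = consolidate(leaks)
key_one_locations = findings[fingerprint("example-key-1")]
```

Here the first key yields one finding with three locations, so revoking it once would close out all three exposures together.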
Starting point is 00:40:28 But also, like, what matters is how many people the keys are exposed to, a public Slack channel where it's the entire company or a logging pipeline where it, you know, it might be a few hundred need-to-know people. Different levels. Yeah, there's like, there's like the, we should put this on the list of stuff that we should rotate sometime soon. And then there's the, oh my God, stop everything. And we need to get on this immediately, right? I'm guessing. And do you do, I mean, do you actually do some form of triage in the product that tells you like how critical it is, like how exposed a credential is? There's one part of that that we can do and one part that we need our customers to fill in. And what I mean by that is we can tell you, hey, look, it's got read and write access to this
Starting point is 00:41:06 bucket. And we, you know, we can flag the read and write as, like, that's impactful. But knowing whether or not that bucket is sensitive or not, that's a piece that really the customer has to step in and for themselves set their own filter and priority. So, you know, a bucket that just has access to a bunch of Linux images read and write might not matter as much.
Starting point is 00:41:27 Maybe write would, but, you know, read obviously, whatever. But, you know, the bucket that has all the customer data in it was, you know, like a stop the presses moment. You can't always tell from the name of the bucket. So sometimes, you know, we kind of need the customer to set their own sense of priority. When it comes to the resource, when it comes to the permissions, we can kind of leverage our subject matter expertise to help out. Now, I don't know if you remember this, but sometime back in sort of mid-March,
Starting point is 00:41:50 we had this incident where Qihoo 360 leaked a private key, like a wildcard, the private key for their wildcard cert for their, like, OpenClaw subdomain, right? And this was hilarious. I did think of you when it happened. But I think the reason it was hilarious: it leaked into the installer for their, like, AI assistant that they were shipping out to people. And it was pretty clear, like, that they would have used AI to create that installer as well. So I'm guessing that AI, one thing that I find really interesting about sort of AI sweeping through sort of IT environments at the moment is just how much they are driving more demand for some pretty fundamental security stuff, whether that's endpoint controls, network controls or like secrets discovery. Are you finding that buyers are more concerned with doing secrets discovery now that they've got AI shipping code and like probably not doing a very good job of like not shipping secrets? I have a lot to say about this.
Starting point is 00:42:50 So just on the AI side of things, Cursor is a customer of ours. And I've asked them, like, how much of Cursor writes Cursor? And they thought it was a dumb question. They're like, of course 100% of it is written by, you know, like it's writing itself, right? And like, you know, not every company is like that, but still, it's the fastest growing company like in the history of mankind, right? Like it's, they doubled from a billion to two billion within a few months or something like that.
Starting point is 00:43:17 And, you know, they were at a few hundred million the year before. And so I think, like, to answer your question, like, it's, to answer the exact question that you asked. And I also want to say, like, why we're seeing the models do some of this behavior. So I'll come back to that. But to answer the exact question that you asked, I genuinely believe there are some executives, not security executives, but some CEOs
Starting point is 00:43:43 that are so hell-bent on getting their organizations to adopt AI, they are sidelining security. And they're saying, like, look, we need to pick up these agentic workflows. It will make us 100 times faster. Like, skip the security review. Like, skip the person saying... We'll figure that out later.
Starting point is 00:43:57 100%. This is what always happens at times of great innovation. Exactly. And like I kind of get the formula they're doing in their head. They're saying, hey, look, if the user is logged into Amazon, and Cursor, when you run it locally, assumes the privilege of the user, and it can talk directly to Amazon, it can directly pull the logs from the thing it just deployed. It can directly write and deploy its own Terraform. Like, it can move 100 times faster. And so they're probably thinking to themselves, maybe there's a 5% chance that this thing deletes the production database.
Starting point is 00:44:27 But it's also going to move 100 times faster, and so I'll take that risk. I would imagine that's, I'm not saying, like, I agree with that calculus, but that's probably the calculus that those CEOs are making. On the security side of the house, I think people are freaking out because I think they're saying to themselves, look, when people use Cursor, it assumes the privilege of the user. And that means, like, this, like, long-running prompt and context window. I've caught it, like, me telling it, hey, like, I will do the deploy, like, I'll run the command or whatever. I've had it go off and write some code for me. And then instead of coming back to me for me to run the command, it starts pillaging through my home directory to find the secret to do the deploy itself.
Starting point is 00:45:09 And who knows what it's going to do after it's used that credential? It's probably going to stash it somewhere to use it later. Exactly. Or, you know, like it hard codes it and commits it to GitHub itself. You know, once it's in the context window, it's in the context window for it to forever reference, you know, later. Like, it's just, it's scary. And I think security professionals are kind of freaking out about that, but also they are running into the CEO that is saying, like, get security out of the way for now. So look, I think the short, the short answer there is, yes, this is making Trufflehog more important.
Starting point is 00:45:42 My final question, too, because we are running out of time, is who is the buyer for this, right? Because this started off very much for the sort of companies that did a lot of development. And maybe it would be the dev team that are like, hey, maybe we should, you know, or, you know, someone who's working in dev security, right? Like, DevSec was the buyer. Is that still the case all these years later? It's kind of always been the application security team. And so application security is usually responsible for code security. I would argue this is more of an identity and an IAM issue, but for legacy and other reasons, application security kind of owns it. Well, that's why I asked, because it seems like, you know, probably it should have a little bit of interest outside of the AppSec team, if I'm, you know,
Starting point is 00:46:22 if I'm honest. You're not wrong. I think just because historically this was pulled out of SAST, SAST used to own it. It's just kind of like for legacy reasons, it's kind of stuck in the AppSec world. All right, well, we're going to have to wrap it up there, but Truffle Security,
Starting point is 00:46:36 it's the good stuff. I remember too, Dylan, when we spoke about this years ago when you were just getting started, I was actually skeptical about whether or not there was enough in this
Starting point is 00:46:44 to turn it into a, you know, full on business. And I think you're just about to close or have closed your series B round. So I'm delighted to be wrong. And I'm stoked to see you doing so well. So thanks a lot for joining me
Starting point is 00:46:57 to walk us through the latest with Truffle Security. It's been great. Thank you so much, Pat. That was Dylan Ayrey there from Trufflehog. Fantastic to have him back on the show. And as I said at the outro there, I'm very happy to be wrong. You know, I did not think that, you know, you would need an entire company just to do secrets tracking.
Starting point is 00:47:16 And I was absolutely wrong about that because now when I look at where Trufflehog is, what it's doing, it's absolutely something people need. So, yeah, absolutely did not call it, did not see that one coming. But well done to him and the whole Trufflehog team. That is it for the Snake Oilers episode today. I do hope you enjoyed hearing the pitches from those three vendors. There are links in the show notes for this podcast. So if you need to find them, head over to risky.biz. But that is all from me today. Thank you very much. And I'll catch you soon.
