Risky Business - Snake Oilers: Pangea, Cosive and Sysdig

Episode Date: April 17, 2025

In this edition of Snake Oilers three vendors pitch host Patrick Gray on their tech:

Pangea: Guardrails and security for AI agents and applications (https://pangea.cloud) Worried about your AI apps going rogue, being mean to your customers or even disclosing sensitive information? Pangea exists to address these risks. Fascinating stuff.

Cosive: A threat intelligence company that can host your MISP server in AWS. CloudMISP! (https://www.cosive.com/capabilities/cloud-misp) Are you running a MISP server on some old hardware under a desk in your SOC? There's a better way! Cosive can run it for you on AWS so you can just use it instead of wrestling with maintaining it. They also do some CTI consulting to help you get better use out of MISP.

Sysdig: A Linux runtime security platform (https://sysdig.com/) The modern Windows network is an all-singing, all-dancing, perfectly orchestrated, EDR-protected ballet. The modern Linux production environment… isn't. Find out how Sysdig can help you get some visibility and control over your Linux fleet.

This episode is also available on YouTube.

Transcript
Hey everyone and welcome to another edition of Snake Oilers, the podcast that we do here at Risky Business where vendors come onto the show to pitch you their products. My name is Patrick Gray. Everyone you hear in a Snake Oilers edition paid to be here. This is a sponsored podcast and we're going to be talking to three different vendors today, all about what they do. First up we've got Pangea, that's P-A-N-G-E-A, and what they make is a product that is designed to put some security controls and guardrails around AI applications, which is a big problem at the moment when you've got enterprises building hundreds of AI apps and you
know, especially when they're customer facing. You've got prompt injection problems, you've got people tricking them into saying silly things or tricking them into offering products for a dollar, that sort of thing. So we'll be talking with Oliver Friedrichs, who is the co-founder and CEO of Pangea, in just a moment. Then we're gonna hear from Cosive, which is an Australian company. I actually know the guy who founded Cosive, Chris Horsley. He's been kicking around Australian InfoSec for a very long time.
When I first met Chris, he was actually working at the Japanese CERT. So there you go, but he's got a long history here in Australia, and his business is a threat intelligence shop that does threat intelligence consulting. But they're now offering a product that's proving to be pretty popular, which is a hosted MISP server. So MISP is an open source threat intelligence platform, and you know, people tend to spin it up, put it on a box under someone's desk and sort of forget they own it. It's not maintained properly.
They're not getting the best value out of it. So Cosive, Chris's company, has done the work to figure out how to make it play nice on AWS, and they offer CloudMISP as a product. And they also have some consulting around that to help people use it properly. So that's our second pitch. And our final Snake Oiler this week is Sysdig, and Sysdig makes a runtime Linux security product that's pretty popular. Alex Lawrence is the Director of Cloud
Security Strategy at Sysdig and he's joining us this week to just sort of pitch the product generally for those who aren't aware of it, and also talk about how they're using AI in their product to make it better. So that is a fun one as well. But we're gonna kick it off now with Oliver Friedrichs over at Pangea. Pangea is a startup that's been around for a few years now.
And what it does is put guardrails around AI applications, which is something that everybody kind of needs at this point, especially when they're, you know, doing things with customer facing AI agents. So I'll just drop you straight into the pitch now with Oliver Friedrichs from Pangea, enjoy. Pangea really builds the industry's broadest set of guardrails to secure AI.
You know, as enterprises increasingly deploy and build AI applications, some of the companies we talked to are building 900 GenAI applications. How do you secure that for your customers, employees, partners, and so on? It's crucial that you protect against the latest threats for GenAI. Those are typically measured by the OWASP, the Open Worldwide Application Security Project. They've classified the top 10 threats. That's kind of the center of gravity for AI security today. We protect against eight of the top 10. So we help you build, deliver, and ship secure AI apps fast. Okay. So we're talking about this OWASP list. I've skimmed it previously, but it covers stuff like you would expect, like things like prompt injection, right?
Starting point is 00:03:47 Correct. Yeah. That's really number one for a reason because that's the main thing that people are concerned about. Prompt injection is essentially manipulating the application or model into doing something that goes against its basic instructions. Like what the developer or administrator told the app to do via the system prompt, prompt injection allows you to manipulate the model in a way that it evades that.
Starting point is 00:04:10 So for example, if you were told to be a pleasant support agent, and I use prompt injection, I could teach you how to be a violent, horrible support person, for example, using profanity. You would not want that in enterprise architecture, enterprise deployment, right? So that's number one is prompt injection. So we provide a prompt injection detection service to be able to prevent, detect, and avoid prompt injection
with over 99% accuracy. But that is a hard problem to solve, right? And that's one of the reasons I was really interested to get you guys onto this podcast, is to talk about that. Because it's not like you can just have a banned list of words, right? You've got to be able to somehow infer the intent of the prompt. And the only way I could think to do that is by using another LLM. So I'm really curious to see how you tackle that problem.
Starting point is 00:05:01 And of course, we're not going to just get bogged down into talking 12 minutes about prompt injection, but I did want to ask you that. Yeah. I mean, look, this is a challenging problem, right? There's no 100% solution today. My background is in the anti-malware space. I worked at McAfee in the late 90s, Symantec in the early 2000s. And this is eerily similar to that space. We're talking about words now instead of bytes. Back then we had to create detection logic, using machine learning in many cases to detect new strains of malware.
Starting point is 00:05:31 We're doing the same thing now to detect prompt injection. The interesting thing is every single day there's a new attack. We've classified over 170 different methods of prompt injection so far and we're building a very robust taxonomy with a group of PhD level researchers that work here that are focused on this problem. And it's fascinating to see how many different ways that you can manipulate large language models into doing things that you wouldn't expect them to do. For example, did you know they could talk in Morse code or in Caesar cipher, right? You can almost instruct them to do these radical,
interesting things. So how do you prevent that? So to your point, we need to actually leverage large language models and GenAI to detect prompt injection, because that's the only way to actually determine whether the output from a model matches the system intent and the system prompt that was issued originally. So we're seeing these attacks evolve almost daily, where we need to respond and retrain our models almost every single day and issue updates to our customers to protect against the latest evolving prompt injection attacks. So it's definitely an interesting space and evolving very, very quickly. Well, I mean, that's the thing, right? If you're offering a product in this space,
it doesn't need to be 100% accurate just yet, because currently the alternative is to use nothing, and that ain't good, right? So I'm guessing, what's bringing in a lot of your early sales? Because I understand you're a fairly new company. What's bringing a lot of the sales in would be people who just need to put something there
to prevent trivial prompt injection from doing weird things to their agents. Yeah, a great example: one of our customers, Grand Canyon Education, they built a chatbot for their students and their teachers, to be able to provide support to that community. And they wanted to prevent PII and confidential information from leaking out of their chatbot, right? So that's where they use Pangea's redact service to be able to prevent that data leakage. So that's a very low-hanging fruit. Okay, so that's a different use case, which is more around model output rather than model input, so
I'm guessing that would be a big one as well, right? Which is to just have that guardrail on there which says, if you start seeing this model coughing up people's social security numbers, maybe get it to stop, right? Correct. Yeah. And you'd be surprised at how many different tricks you can employ to actually get models to emit data like that: different encoding mechanisms, you know, use the first letter of every word to encode a message, right? Those are all tricks that you can employ with large language models. So being able to detect those attacks, those attacks both on the prompt coming in, but then detecting them, for example,
let's say the actual prompt injection attack succeeds, you still want to capture and detect and block it on the way out, through the output from the model as well. Yeah. So I'm guessing that those are going to be your sort of primary use cases: stopping people from doing, you know, easy-to-do, weird prompt injection attacks, and then monitoring the output. Or if I got that wrong, is there some other killer use case that people are just like, oh, you do that? Sold. You know, I think there's really four categories. The first is prompt injection. Yeah. Right. So that's the one we've been talking about, very complex and evolving very rapidly. The second is malicious content. So you definitely don't want malicious URLs, malicious domain names, malicious IPs or other content being entered in through the prompt, or through a mechanism that we call indirect prompt injection, where it's coming in through data you're using for training or for RAG. Yeah, you give it a URL, say, go scan this URL,
read this page, and the prompt injection's there, right? Correct, or the model could emit it as well. Like if you think about it, these models have been trained on the entirety of the internet and human knowledge, which includes a lot of good information, but also a lot of garbage. So it's encoded in there. How do you prevent that from coming out
Starting point is 00:09:14 in an enterprise use case? For consumer use cases, you may actually want to know everything. But for enterprise commercial use cases, you have to protect yourself. So malicious content is really number two. The third is really the confidential information PII that I mentioned, right? Being able to detect over 50 types of PII and filter that or apply what we call format
Starting point is 00:09:32 preserving encryption to that data so it still looks like a social security number, but it's now encrypted so that only someone with the right access can see it. And then you have other filters, for example, like toxic language, profanity, violence, self-harm, and so on, or even competitive language, right? If you're a car manufacturer, do you want someone asking about your competitor? Right, absolutely not, right? So there's a lot of guardrails that we have to implement
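To make the guardrail idea concrete, here's a rough sketch of how checks like the ones Oliver describes sit in front of and behind a model call. It's illustrative only, not Pangea's API: call_model() is a hypothetical stand-in for whatever LLM you're using, and real products use trained classifiers rather than the toy keyword list and regex shown here.

# A minimal sketch of input/output guardrails around a model call. Illustrative
# only -- not Pangea's API. call_model() is a hypothetical stand-in for whatever
# LLM you're using; real products use trained classifiers, not keyword lists.
import re

INJECTION_MARKERS = ["ignore previous instructions", "disregard your system prompt"]
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def guarded_chat(user_prompt: str, system_prompt: str) -> str:
    # 1. Screen the prompt on the way in.
    lowered = user_prompt.lower()
    if any(marker in lowered for marker in INJECTION_MARKERS):
        return "Sorry, I can't help with that."

    # 2. Call the model with the original system instructions.
    raw_output = call_model(system_prompt=system_prompt, user_prompt=user_prompt)

    # 3. Screen the output on the way back out, even if something slipped past
    #    step 1: here, mask anything that looks like a US Social Security number.
    return SSN_PATTERN.sub("[REDACTED]", raw_output)

The point is simply that the same pipeline gets two chances to catch a problem: once on the prompt coming in, and once on whatever the model emits.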
Yeah, so right at the top you said, you know, enterprises are building, some of them, like 900 models that they're using to do various things. But I'm curious where the uptake is for something like this. I mean, the reason I know that you're a new business is that these types of agents are actually quite new, right? So, you know, who's buying these guardrails for this AI stuff? Because I mean, you're seeing LLMs pop up everywhere, right? But I just wondered if there are particular verticals. I would imagine anywhere they're using AI to interface with the public, that would be number one. Like, anyone who's operating, you know, a decent enough support function through AI. Exactly. That's really number one, right?
Is a support chatbot that offsets the need to have thousands of support agents, right? Either locally or internationally. That's probably the first use case that we're seeing predominantly, right? Where an end user can interface with that chatbot to potentially manipulate it and divert it. So that's probably number one. In other cases, we see internal use cases running over enterprise data as well. So typically what you'll see is that's where RAG is being used, right? Typically sourcing thousands, hundreds of thousands, or even millions of documents, storing those in a
vector database to be able to then combine that enterprise knowledge with the large language model. That's where you start getting real value from enterprise-level data, because these models don't know anything about your company, right? They know everything about the internet or Reddit or Slashdot or other sites, but they don't necessarily know anything about your particular business. So that's where other risks get introduced, based on that data that's being sourced into that RAG pipeline. And that's where people really want to start watching the output of those models, I'm guessing.
Correct. And that actually introduces a secondary risk, which is, you know, if I'm Oliver and I'm in engineering, should I be able to ask questions about finance or HR? Yeah. The answer is obviously no, but if you're dumping everything into a vector database, how do you provide granular authorization at a RAG chunk level, which is basically what documents get broken up into when you source them via a RAG pipeline? How do I honor the actual permissions of those original documents based on the identity of
the user issuing the prompt? So that's where applying authorization at a chunk level over RAG becomes crucial. And that's another service that we offer, authorization, in a way that allows you to map my identity down to the source documents being sourced via RAG and then honor those permissions, so I could only ask questions with my prompt on data that I'm allowed to access.
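As a rough illustration of that chunk-level authorization idea, the filtering step might look something like the sketch below. This is not Pangea's implementation: retrieve_chunks() is a hypothetical vector-search helper, and acl is just a stand-in for wherever the real document permissions live.

# Sketch of permission-aware RAG retrieval (illustrative, not Pangea's code).
# retrieve_chunks() is a hypothetical vector-search helper; acl maps each
# source document ID to the set of users allowed to read it.

def build_context(query: str, user_id: str, acl: dict, max_chunks: int = 5) -> str:
    candidates = retrieve_chunks(query)  # each chunk: {"text": ..., "doc_id": ...}

    allowed = []
    for chunk in candidates:
        # Re-check the source document's permissions at question time, so a
        # revoked share on the original file is honoured immediately.
        if user_id in acl.get(chunk["doc_id"], set()):
            allowed.append(chunk["text"])
        if len(allowed) == max_chunks:
            break

    # Only authorised chunks ever make it into the prompt sent to the model.
    return "\n\n".join(allowed)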
Starting point is 00:12:48 and actually validate those permissions in real time during inference so that you can determine even if the permissions changed, let's say on a Google Drive doc that I could still access that document at the point in time where I ask a question as well. So we've built a lot of infrastructure. You know, while we
sound new, we've actually been building the company for three years. You know, I'd like to say we... Oliver, Oliver, three years is new. It is new. Yeah. In AI world, it's ancient. You're dinosaurs. Yeah, no, I get it. I get it. Now look, one other question I have, right, is that quite often in the wider cybersecurity space, a reason that people will buy a product is because they've experienced an incident, right? So I'm guessing that probably some of the customers who've come and bought your product have had some somewhat hilarious horror stories. I just wondered if you could share a couple with us, because I'm sure they're very funny. Yeah, I mean, look, there's one example, and this is public.
They're not a customer, but a car dealership in California had a chatbot. This is early on in the GenAI world, right? And a customer was interfacing with it, and they tricked it into selling them a car for a dollar, appending a statement saying this is legally binding. So at that point, do you take that to court? Do you take the company to court? They didn't really get the car for a dollar, but those are the types of incidents that can lead to harm here. Never mind the malicious content: a chatbot, again, emitting violent content, language around self-harm and other dangerous language, that becomes a liability issue and a potential legal issue as well.
Yeah, but I guess my question is really, what I'm trying to understand is whether or not people are being prudent and rolling this out before something bad has happened, or whether they're hitting some of these issues first and then looking for solutions. You know, the interesting thing is, this is where it reminds me of the internet in the 90s, where we were building things so fast that security was an afterthought. That's happening again, right?
Starting point is 00:14:55 So I actually see a world where we're gonna make a lot of mistakes before we actually implement guardrails. Now, the fortunate thing is there's a lot of companies that are not new to AI, or at least machine learning in the traditional sense. They've been using ML decision trees and other algorithms for decades in financial services, for example. So they already have a fairly mature process
Starting point is 00:15:16 and compliance model around releasing these type of non-deterministic algorithms, which is fortunate in that industry. But then there's other industries that have never used AI before, right? And that's where we're seeing a lot of interesting development, in particular with agents, right? We've all heard about agents now. That's sort of the future. This is the year of the agent where language models, large language models are being hooked up to code that can now execute tools and run a sequence of commands using chain of thought and planning in advance to know what those tools and what that
Starting point is 00:15:51 pattern should look like. And that's really where you start introducing even more risk because now you have a large language model that's already non-deterministic, trained on potentially risky data, telling you what commands to run, with which parameters to run, and what tools to execute. Oh yeah. What could possibly go wrong? Oh yeah. I mean, this gets, yeah, once you start doing instrumentation stuff, this came up in a conversation
I had recently with Chris Krebs and Alex Stamos, talking about DeepSeek actually, and you know, about how people sort of misunderstand the risks. But like, if you were worried about this model turning on you, the time when you would worry is when you start plumbing it through so that it can instrument various machines and run commands and stuff like that. So yeah, that's gonna be a fun one to talk about in a few years when attackers actually start using it. But Oliver Friedrichs, thank you so much for joining me. That was very interesting actually, and I enjoyed that very much. All the best with it.
Thank you very much. That was Oliver Friedrichs from Pangea there. Now, they're prepared to put their money where their mouth is. They are offering a $10,000 prize as part of their AI escape room challenge. So if you just Google for Pangea, which is P-A-N-G-E-A, and escape room, you will find it. And I think the URL is pangea.cloud. So yeah, that was a fun one. I admit being a bit skeptical going into that interview, but yeah, it was good stuff. Now it's time to speak with Chris Horsley, who runs a company here in Australia called Cosive. And Cosive is a threat intelligence shop. They do threat intelligence consulting and whatnot.
And they've launched a product recently, which is CloudMISP, right? So MISP is, like STIX and TAXII, a threat intelligence platform. It's open source and it's a bit fiddly to use, to maintain and to really get value out of. So what Cosive has done is they are now offering hosted MISP, CloudMISP, along with a bunch of services to help people figure out how to actually get
decent value out of threat intelligence. And you know, just the hosted MISP is turning out to be really popular, and not just in Australia; this is something that they're offering globally. So I'll drop you in here where Chris Horsley explains basically what MISP is and what they do, enjoy. MISP is very popular in both SOCs and threat intelligence teams; it's open source software for sharing threat intelligence.
Starting point is 00:18:22 And then some people get very caught up on that definition of threat intelligence because for some people it's purely sharing indicators of compromise. So we're talking about hashes, URLs, domains, and they absolutely have their place, but MISP can do more than that where we're sending reports,
which might be about threat actors or campaigns or vulnerabilities. There's a lot of capability in the MISP data model. Some people take full advantage of that, others are content with just these streams of IPs and domains that they're using for blocking at firewalls, for doing detection work. So it's very common to take your MISP, integrate it with Splunk, Sentinel, your XSOAR, or your SOAR platform. So these are the very common use cases for it.
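If you want a feel for what that SIEM and firewall integration looks like in practice, here's a minimal sketch using PyMISP, the standard Python client for MISP. The server URL and API key are placeholders, and the filter choices are just examples rather than anything Cosive prescribes.

# Minimal sketch: pull recent, actionable indicators out of MISP with PyMISP.
# URL and API key are placeholders; the filters (types, time window) are examples.
from pymisp import PyMISP

misp = PyMISP("https://misp.example.org", "YOUR_API_KEY", ssl=True)

# Ask for attributes flagged for detection (to_ids) published in the last day.
attributes = misp.search(
    controller="attributes",
    type_attribute=["ip-dst", "domain", "sha256"],
    to_ids=1,
    publish_timestamp="1d",
    pythonify=True,
)

# Hand the values off to whatever consumes them: a firewall blocklist,
# a SIEM lookup table, a SOAR enrichment job, and so on.
for attr in attributes:
    print(attr.type, attr.value)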
So this is an open source platform that's built for handling CTI data; basically, that's what MISP is, right? Okay, got it. Yeah, and then the other key bit of MISP, I think, and why it has achieved a lot of success, is it's got this big network effect. Because if you know someone that's offering you a MISP feed
or is running a MISP server, you can connect your MISP to their MISP. And now you're receiving a stream of what they've got. You can send sightings back to them to say, hey, we got that. There's a mechanism in MISP to say, okay, we got your report. We don't agree that this domain is malicious.
We think this is legitimate infrastructure. I'm gonna amend your report. And then the publisher can go, yeah, actually you're right. I'm going to amend the report and republish. So this idea of the community, this threat sharing community, which could be one-on-one, it could be dozens of organizations, it could be a hub and spoke model. There's different ways to configure these things, but it's a community collaborating on these threats, too, and that's the other key thing for me.
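That feedback loop, sightings flowing back to the publisher, is also scriptable. A small hedged example with PyMISP follows; again, the server details and indicator value are placeholders, not anything from the show.

# Sketch: report a sighting back to the community when one of the shared
# indicators actually fires in your environment (PyMISP; placeholder creds).
from pymisp import MISPSighting, PyMISP

misp = PyMISP("https://misp.example.org", "YOUR_API_KEY", ssl=True)

sighting = MISPSighting()
sighting.value = "evil-domain.example"   # the indicator we observed
sighting.source = "our-soc-ids"          # free-text label for where it was seen

# The publisher's MISP will now show this attribute was sighted, which is the
# confirm-or-dispute signal Chris describes.
misp.add_sighting(sighting)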
Starting point is 00:20:02 Yeah. So you came up with the idea to offer like managed misp instances because you were trying to build out some of these sharing communities and realized pretty quickly that this was not a, I mean, it's an open source platform, right? But it wasn't typically well managed in a lot of places. Was that about the long and short of it? Yeah, exactly. So, so yeah, we were working on a national threat sharing platform. And even when we were offering scripts and the like to help people set this thing up, it was still too hard for most CTI and SOC teams.
So then we really had the idea: what if we take that pain away? People just want to use MISP. They don't want to wrestle with the 12 to 20 MISP releases a year, and quality assuring them and monitoring and backing them up, and working out how to plug it into their WAFs and their firewalls and making allow rules and getting it through network architects. So if we get rid of all of that, people can actually get down to using the platform for what they want, which is plugging it into firewalls and SIEMs and threat sharing and all the rest of it. So yeah, the tooling just solves the technical problem with threat sharing. Typically the problem is more the, you know, what are we sharing? Do we have time to draft these things? So it gets rid of all the engineering problems and just leaves you with the: how do we share, and what are we sharing? Which is enough of an interesting challenge in its own right. Yeah. So we were talking earlier, before we got recording, and you said quite often MISP was just spun up by someone kind of informally in the SOC, on a box under a desk kind of
Starting point is 00:21:34 vibe, right? That's it. And a lot of people who come to us and they're interested in cloud MISP, they already know what MISP is and they know what it's used for. They know they like it. And then they just want to make the pain of maintenance stop because they've started with this, you know, literally it might be running on a laptop beside somebody's regular workstation and they're going, okay, we want to do things, but it needs to run stably now. We need to have like a proper production grade deployment and they just don't have the time or the engineering to do that. Right so I'm guessing this has been around you've been selling this for a little
while? Like, how popular is it, who's buying it? So yeah, we've been doing this for a couple of years now. We get inquiries from all sorts of places: finance, telecoms, resources, education. So there's a lot of MISP sharing communities out there, and there's some big name ones for sure. So you've got places like FS-ISAC would be one, and it's interesting to go and look at their site. So they do, you know, STIX and TAXII
as one threat sharing model, and they do MISP as another threat sharing model. But there's a lot of, how would you say, communities that come out of the woodwork. You know, it might be a collection of universities that, because they're in the same area and have a close relationship, say hey, let's share threats, because we're all facing the same stuff.
Starting point is 00:22:53 So yeah, it's really interesting to see these organic threat sharing communities that are around the place, but not advertised anywhere necessarily. Yeah, right, so I'm guessing you just spin it up in an instance in the cloud, manage the patching, all of that sort of stuff. And that's basically the pitch here. Yeah, that's it. And we sort of re-architected this in a sense so that it uses AWS sort of native features. So we get the advantage of all the HA stuff and the backup stuff and
all the best patterns that AWS gives you. So it's not just like a VM running on any... Yeah, right, right. So you're not just yeeting a Docker container into some Kubernetes cluster somewhere; this is actually properly, you've figured out how to make it work with AWS nicely. Yeah, and we spent quite a bit of time upfront sort of engineering this to run the way we wanted it to
Starting point is 00:23:40 and to take advantage of as much of the AWS sort of feature set as we could. And then it's all the ongoing operations after that. So of course, you know, it's the monitoring, it's the upgrades. The other big part for us is just the support. Not only, you know, is the platform running stably and you know, you hit a bug or whatever it is. Let me guess, let me guess. How do I do this? How do I do this? Yeah, yeah, right. Okay. Walk us through that.? How do I do this? Yeah. Yeah. Right. Okay.
Starting point is 00:24:05 Walk us through that. Like where, where are people struggling with that and like, you know, how do you help them through it? Yeah. So the classic is that MISP is a very powerful platform, but also has a lot of knobs and dials. Yeah. So quite often the team knows like, okay, so what we want to do is we want to take
Starting point is 00:24:19 these Intel reports and send them over this way, but not these Intel reports. They are purely internal. And then there's another set of reports again, that goes off to a different audience, but that should be unidirectional. Like they don't need to push back to us, we're just gonna publish to them. So what are the patterns to do all of these things?
Starting point is 00:24:36 And there are patterns for all of this, but this is where we can say, well, yeah, like here's the menu of options you've got and based on what you're telling us, option two and four, that's what you want to go for here. And we save a lot of time from people doing a proof of concept and sort of feeling their way through all this, because we've done a lot of this before.
Starting point is 00:24:54 Now, look, you're a longtime listener of the show. You know I like to dunk on the CTI people. But I also recognize that it's a fact of life, that people, especially in large organizations, are going to be using a tech like this. They need to be looking for these IOCs if they're popping up in logs and whatever. It's just something you've got to do.
But I guess one of my questions would be, what's the general state of this stuff out there? Who's using it? How common is it? Is it growing? You know, I'm guessing from what you're telling me that it is; if there is a market need for people to actually buy hosted MISP, I'm guessing that it's a growth area. But you know, can you just give us a bit of a rundown on, like, you know, what's going
Starting point is 00:25:38 on out there in CTI land with MISP in particular? Yeah. So I mean, it's really interesting to look at the last 20 years because that's how long I've been doing cybersecurity now, and that was before it was called cybersecurity, and cyber threat intelligence wasn't even really a term of art, so that tells you a lot. And we often call it just data sharing
between national CERTs. And these were CSV files, and everyone had their own bespoke formats, and then you had to write a Perl script or a Bash script to parse that old format. So what's happened in that last 20 years is that we've seen this emergence of standards, and MISP is one and STIX is another, say.
So now we've got at least sort of some commonality. So you can do a bit more plug and play: they're publishing this feed in those formats, I get the right tool, I can ingest it, I can do stuff with it. So we started Cosive in 2015, and a big part of what we started as was being a CTI shop. You know, not publishing CTI; we were much more interested in the tooling and the practice of CTI. And we think in retrospect we were just a few years too early, because in the last almost 10 years we've been running Cosive, you see a lot more organizations know what CTI is. They now have
Starting point is 00:26:44 an idea of what to do with it. I think back in those days, the idea of threat intelligence was so nebulous that like, what is it? For some people it was like, IPs I want to block. And there's a whole debate about pyramids of pain and like, what's the value of just blocking IPs? Is that even threat intelligence? And you can talk about, you know, the context.
I think more and more people are understanding what you do with it. So something else we're really excited about: late last year the CTI-CMM, a capability maturity model for cyber threat intelligence, was released. There have been similar things around for longer with SOCs and security operations centers, but this is one we really like.
Starting point is 00:27:23 For the whole idea of defining, what should I do with threat intelligence? Yeah, we've got three tiers of maturity. So you can start very simply with, I'm gonna pick these domains and these capabilities. It's very good at talking about stakeholder engagement and what are my intelligence products? Because the number one thing we've seen
where the best-intentioned CTI programs go to die is where we buy the tools and we buy the feeds and we buy the analysts, and then we start to answer the question of, okay, so who in our organization is going to do something with this? What do they want, what do they need, what format, what are they gonna do? So it really hammers that point of: before you start any of this, what are we, the intel team, going to offer as services to our organization? So half of that is requirements: what are we tracking? What do we care about as an organization, to be the eyes and the ears of that organization? And then the other half is the intel product. So are we producing, you know, your classic PDF-style reports, or
Starting point is 00:28:21 reports on the Wiki? Are we doing IOC streams to the SOC? Are we helping the hunting team with looking for new techniques informed by what we're getting from some of the best Intel providers? You mean there's a model that says people should figure out what they're trying to do before they do it? This is it. Seems sensible, I've got to be honest. When you say that, you're going like, well, of course, but we can get really, because
my theory is a lot of CTI analysts come from a technology background and they tend to be very technology-first about things. And when I meet CTI analysts who came from maybe a military background or some other background, they kind of understand, what's the point of all this? And then it's like, okay, what tools do we need to accomplish that goal, rather than, hey, I've got this cool TIP or I've got this MISP to play with, and now I better work out what to do with it. So it's all about coming at things in the right order for me. Yeah. And what are the different maturity levels that you get in a model like this? So typically, from memory, the CTI-CMM has three levels,
and it's kind of what you'd expect. So at level one, we have some basic capability, where maybe we're handling, you know, lists of indicators of compromise. It's that basic stuff, and we can put it into our blocklist and into our SIEM. And then all the way going through to, you know, tier three, now we're doing things like generating intel
and we're doing our own research. And this is where we have analysts in our team. I know here in Australia, it's a rarity for a lot of organizations to have. And I don't wanna say it's rare, because there are organizations who have dedicated CTI analysts now. That's been a big change in the last 10 years.
Starting point is 00:29:57 Before, it was a lot of, well, someone in the SOC just likes reading blogs. So they're kind of like the Intel analyst for us. And that is, in some ways, that's the fledgling Intel capability of what's going on out there, threat landscape sort of stuff. And then we build up to like,
Starting point is 00:30:14 okay, now your full-time job is doing that and you've got budget, you've got tools, you've got feeds, we've got some structure, we know what the outcomes are supposed to be and that's how you kind of move through this maturity. Right. And that's something that you're offering as a sort of consulting service along with the CloudMISP stuff and not just in Australia either.
Starting point is 00:30:32 Yeah, that's it. So CloudMISP in particular, we've seen really good international interest for this. And then alongside the operation of the instances themselves, it's helping people get to the point of like, we're getting utility out of this. We're getting value, we're getting use. How do we set this up? What are the outcomes and helping people define, yeah, what do stakeholders want? What are our Intel products? How do we deliver them? So this is all consulting we're providing sort of either completely separate to MISP or alongside it in many cases. So there you go. If you want to spin up some cloud MISP and you don't want to run
it yourself, you can reach out to Cosive and Chris Horsley. Great to see you again. Likewise, Pat. Always good to see you, Chris. Yeah. So as I said at the intro there, I've known Chris a long time from around the traps here in Oz. Thanks for filling us in on what it is you're up to.
Appreciate it. Yeah. Thanks very much, Pat. Much appreciated. That was Chris Horsley from Cosive there, and you can find them at cosive.com, that's C-O-S-I-V-E dot com. And yeah, good stuff. And it was great to see Chris as well, because we've been bumping into each other at conferences for something like 20 years. It is time for our final Snake Oiler today and we're speaking with Alex Lawrence, who is the Director of Cloud Security Strategy at Sysdig. I'll confess that this was a
really fun interview. You know, Alex is my type of people, and you know, you'll probably hear what I'm talking about as you listen to this one. So Sysdig make a Linux security agent, I guess. You know, it's a runtime security product for Linux. It's been around quite a while. People seem to really like it. So when they asked if they could come and do a Snake Oilers, yeah, I jumped at it. They've been on once before,
Starting point is 00:32:13 but that was quite a long time ago. So Alex in this interview recaps what Sysdig actually do. And then he talks about some of the fancy stuff they're doing with AI, right? Because everyone's doing fancy stuff with AI now. So I'll drop you in here where Alex begins by explaining what Sysdig actually is, enjoy. Sysdig is a runtime solution for security, right?
Starting point is 00:32:36 We are built in a cloud native way. We are built with Kubernetes and containers and the modern stack in mind. And the whole goal is to look at things in a real time context. So a lot of security tools will be focused on all sorts of things, right? Lots and lots of various different things.
Starting point is 00:32:54 And it's kind of broken up into two major categories, preventative controls, detective controls. Sysdig does preventative stuff. Everybody does preventative stuff. The thing that we're focused on the most though is that detective side of the house. How do you deal with security in real time when things are ephemeral as all get out, things show up and disappear in seconds and things change in seconds and then the environments are gigantic. Right? So how do
you do real time security when that is the environment you're trying to do something with? And that's the problem that we're aimed at solving. Okay, so where do you guys plug into the whole equation? You mentioned Kubernetes earlier. Is it solely for Kubernetes, or does it work in all sorts of different places? Yeah, so all sorts of different places. The little secret I've always said about Sysdig is that anywhere you're running Linux, we have value to bring you. In its infancy, Sysdig was built to interrogate system calls. How do you basically speak the language of this new stack? What's the least common denominator of the way information is traded back and forth? And that is the system call. So if you think about the old days,
Starting point is 00:34:07 what did you instrument? You instrumented your network. You instrumented the packet. You used Wireshark, right? You grabbed every single packet. You looked at all your applications we're doing. You could do really cool observability things. You could do really cool security things,
Starting point is 00:34:19 a la Snort, other tools like that that were out there. But once we shifted to the cloud and we stopped owning our data centers, we no longer owned a switch. So what did you instrument? What do you plug into? You can't hit a span port. You can't replicate all of your packets. You can't interrogate them. You could do some fancy work with port replicas and some other junk, but it was really, really complex. You basically lost that single source of truth that was the packet. So if you think about the cloud, what becomes the new packet? And effectively, it becomes the thing your applications are speaking at. So with a container or a Kubernetes app or a thing running on like a Linux box, an EC2, whatever you
might call it, that's a system call, right? Every single system call is how you're gaining access to resources, allocating memory, going and sending stuff out to a socket. It's all happening at that particular level. Those system calls, just like a packet, they don't lie. So if you can interrogate that system call, you can know every single thing happening on that host, and you can do that in real time. In the cloud, it's like the cloud log. So being able to go look at what your cloud objects are doing, what changed in RDS, who logged in without MFA, what are all those things that are going on. In Kubernetes, it's the Kubernetes audit log. All of these cloud services, they all have a thing that acts like that system call. And that's your single source of truth to do real time security. Right, so I'm guessing that with Sysdig, you know, everyone likes to talk about how they're a platform, right? So we'll just call it a platform for now. So I'm guessing you look at things like, you know, various log sources and whatever, but I'm guessing you also shim in some sort of kernel extension or whatever to collect syscall information? Yeah, these days it's eBPF. Of course, yeah. In the olden days, quote unquote, I've been here six and a half years, it was just a kernel module, right?
But now, with the advent of kernel 4.12 and newer, you've got this eBPF extension that lets you do stuff in a much safer way. So that's the preferred path. We do still have some customers out there who I think are still running RHEL 5, you know, God bless them. And in those cases, we still use kernel modules. But in modern architectures, you know, we're eBPF these days. So I'm guessing how this works is it's essentially
Starting point is 00:36:27 an agent that gets shimmed in automatically in environments where your presence is part of the process of spinning up new kernels. Yeah, exactly. Some people bake it in as part of their image so that when they deploy the host, it's already there automatically. Some people use deployment mechanisms,
Starting point is 00:36:44 be that any insert random DevOps tool of your choice. Some people will do it in, obviously, our sweet spot of Kubernetes. And so then it's just a daemon set that goes and deploys the agent across the nodes. So there's about 1,001 different ways to do it in the modern age. But yeah, effectively, it's sticking an agent on a thing
and then being able to go and see all the stuff coming from that thing. Yeah. And what sort of stuff are you likely to catch with Sysdig, right? Like what sort of attacks, you know, what sort of odd events are likely to get flagged by this, I'm gonna use a word that you're not gonna like, by this agent. I personally don't mind calling it an agent. You know, that's industry nomenclature. I'm glad we can call it an agent. So what sort of stuff is your agent likely to prevent, to catch, to detect? Yeah.
So like Snort, it's kind of whatever your creativity limits you to. And so what I mean by that is it's interrogating system calls, and so that's everything happening on the host. So that could be like a shell being opened up. It could be someone spawning the netcat process. It could be an actor doing a chmod or a chown as a system call, as opposed to running the process, right?
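To make that concrete, here's a rough sketch of the kind of rule logic a syscall-level detector applies. It's illustrative only, not Sysdig's engine or rule syntax; the event dictionary and its fields are hypothetical stand-ins for what an eBPF probe would hand you.

# Illustrative only: the shape of a detection over process-exec events coming
# off a syscall probe. Not Sysdig's engine or rule syntax; 'event' is assumed
# to be a dict built from an execve syscall and its context.

SUSPICIOUS_TOOLS = {"nc", "ncat", "socat"}
SHELLS = {"sh", "bash", "zsh"}

def evaluate(event: dict) -> str | None:
    """Return an alert name if this exec event looks suspicious, else None."""
    proc = event.get("proc_name")
    parent = event.get("parent_name")
    in_container = event.get("in_container", False)

    if proc in SUSPICIOUS_TOOLS:
        return "netcat-like tool launched"
    if proc in SHELLS and in_container:
        # An interactive shell popping up inside a production container is a
        # classic runtime signal.
        return "shell spawned in container"
    if proc in {"chmod", "chown"} and parent in SHELLS:
        return "permission change from an interactive shell"
    return None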
It's anything that's traversing that host asking for resources. And so you can catch all sorts of crazy things, and it's really up to how creative you can get. I've worked with some people who want to log every single time a file is opened or closed, or a process is executed, every time a packet is sent, a socket is touched, like, you know, everything happening on that host. They were producing a ridiculous amount of data, and their SIEM must love them, because
that's gigabytes and gigabytes a day, right? Billed per line of ingest. Someone's popular. Yeah, right. I've got some who want to go a little more abstract, who really care about very specific use cases, right? Like they want to look for container escapes, or they only want to look at it if it's this and that and else this, you know. It gets into some fairly specific nuance of how they want to do the detections, but it really allows them to get very specific on what they're trying to accomplish. Yeah, I mean, does it pain you, you weren't pained when I said it was an agent, does it pain you if I called it kind of like EDR for Linux?
No, I mean honestly, it can be used in that capacity, right? That is an area that I would argue people have ignored. Linux is great and amazing. I've built my career on top of it. But that doesn't mean that attacks don't still happen in that world. The idea that you don't need real time detections on top of Linux is like saying that I've got a Mac, so I'm secure. It's not that it's inherently better or worse. They've got a different design paradigm, so it makes things different, right? And the
Starting point is 00:39:30 threats still exist. You still have to be able to tell when stuff is going on. And arguably, Linux runs the internet. And so it is the target of choice when we start talking very large scale applications and things that we're doing these days. Well, it runs the internet and it also happens to run a lot of coin miners, which I guess has been a big driver of adoption. So what sort of enterprises, what sort of organizations tend to be running Sysdig? And then I want to talk briefly about how the products change, because you've been around for a while, right? And this is always a moving target, running a product like this.
Starting point is 00:40:04 So then I want to hear about what's new with Sysdig. But where does it tend to pop up mostly? I'm guessing it's mostly, anyone who's doing like DevOps style stuff, which is, I guess, not really modern anymore. I was about to refer to it as modern. But then again, I am talking to a guy who keeps referring to snort, so you know what I mean.
Starting point is 00:40:23 Yeah, now it feels modern to me, right? But yeah, no, DevOps is just like, that's status quo these days. It's not a new thing. Yeah. It's just the way it's done. So, but I mean, is that where it sort of tends to pop up is people who are, you know, running their own applications in the cloud and whatnot. I'm guessing that's where it's most popular, right?
Starting point is 00:40:41 Yeah. I mean, the sweet spot for Sysdig is effectively anybody who is doing Kubernetes and containers, right? That is the thing that we do the best out there. I'm not going to say that other folks don't attempt or try or do things, but we do put a lot of effort behind the way we do detections in that world. And particularly, the way you do policy and enrichment
and the way you can kind of handle that is a little bit more mature in the Sysdig model. And then a lot of that, shockingly, tends to be FinServ customers as well. Like, FinServ is pretty darn progressive when it comes to this cloud native era of things. Well, they are, because they're running really important applications in the cloud, you know?
Right. That mobile banking app or that brokerage account, you know, that is Linux in the background, right? Yeah. Well, then what's funny is that a lot of these organizations are running applications in cgroups. And if you go back far enough, you know, BSD jails, things that fence off the processes so that you're not conflicting with other co-running applications, that's all a container is, right? It's just a big giant API ball around cgroups.
Starting point is 00:41:45 And so this notion of containerization really isn't different for them, right? Like they've been doing this for a long time. Now there's just an actual standard they can follow. Yeah, so look, let's talk now about, you know, product evolution because everybody's shimming AI into everything now. I understand you're also doing a bit of this.
Starting point is 00:42:04 Like how do you start to apply AI to something like a, you know, a Linux endpoint agent? Yeah, that's a great question. It's hard. Yeah, cause I'm guessing most of the value you're covering off pretty well, right? On the endpoint agent in terms of just being able to collect that telemetry.
I'm guessing that where you would apply the ML, the AI, all of that magic, you know, pixie dust, is going to be more on the correlation side, when you're actually looking at the information you've collected off all of these endpoints and looking at other logs and trying to draw some insight there. Yeah, no, it definitely is. Like, look, if we think about what is the problem AI is solving, fundamentally AI is addressing the data lake problem. We've got a lot of data, more than we can ever do anything with as human beings. How do we do something effective with that? We use machines, and in this case we use LLMs, we use AI. This portfolio of products in the security world where we exist, it's this Gartner term, CNAPP, Cloud Native Application Protection Platform. It's a mouthful. But it generates data like no other. It's just as difficult to deal with the amount of information CNAPPs produce as other security tools, times 10, times 100. If AI is built to solve the data lake problem, it's actually uniquely positioned to help drive interesting insights into a CNAPP tool. And that's the way we approached it from the beginning. We didn't write a thing that just read our documentation and told you something. We decided to teach it about our API, and teach the LLM how to interface with the way
Starting point is 00:43:49 Sysdig generates and visualizes data. Yeah, we've got an AI integration. It's called Sage, it's your assistant for going through all of your Sysdig information. But you can ask it things like, on this page that I am looking at, what are the top two or three most important events that I see?
And then of those events, what are they related to? What other events may have happened that I'm not seeing that have come from this? And so basically it's helping you sift through that lake of information, and doing it in a very pointed way, in a way that Sysdig understands, right? It's making API calls on your behalf.
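The pattern Alex is describing, an LLM assistant that answers questions by making product API calls rather than just summarising documentation, looks roughly like a tool-calling loop. This is a generic sketch, not Sage's implementation; llm_complete() and query_events() are hypothetical stand-ins for the LLM backend and the product's events API.

# Generic tool-calling sketch, not Sysdig Sage's implementation. llm_complete()
# and query_events() are hypothetical stand-ins for the LLM backend and the
# product's events API.

def answer(question: str, page_context: dict) -> str:
    # 1. Ask the model which API call would answer the question; assume it
    #    replies with a dict of query parameters.
    plan = llm_complete(
        "You can call query_events(filter, time_range). "
        f"Question: {question}. Page context: {page_context}. "
        "Reply with the parameters for the call you want made."
    )

    # 2. Make that API call on the user's behalf, with the user's permissions.
    events = query_events(plan["filter"], plan["time_range"])

    # 3. Let the model turn the raw events into a short, pointed answer.
    return llm_complete(
        f"Question: {question}\nEvents: {events}\n"
        "Summarise the two or three most important findings and what they relate to."
    )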
Starting point is 00:44:31 I mean, that seems like a sensible thing to do. I do wonder though, because it seems like a lot of the sort of SIEM companies and whatnot are making tools to kind of do that on the SIEM end. And I imagine you're already pumping a lot of this telemetry off to the SIEM end. So I guess we sort of are at that point
Starting point is 00:44:47 where we're working out where the AI best plugs in, right? Because I can also foresee a circumstance where you do some sort of LLM processing on alerts to help figure out what to send to the SIEM to be processed by their LLM. It's like a little chain of like robot workers who figure it all out, right? I mean, we are still working all of this out, right?
Starting point is 00:45:05 Oh, we certainly are. And honestly, if the SIEM vendors weren't investigating what they can be doing with an AI agent or an LLM, you know, on their stuff, they're missing the boat, right? Again, AI stuff today is solving data lake problems and the SIEM is exactly that. And what's the uptake been like? Is this in beta or is it already out there? And what's the response been like from customers? Cause you know, you're making a Linux tool, man. You're dealing with like crusty people, right? So what do they make of this newfangled
LLM-enabled Sysdig, you know, old school tool with new school tricks, right? I get them. I grew up as them. Yeah, so it is getting decent adoption, right? Like, I think we've seen 300-some percent growth across our current user base. They were very tentative at first.
Starting point is 00:45:52 What does this mean? Should we be touching this? Should we be using it? We've seen decent uptick in adoption of using the service we've been building out. A lot of it comes down to sifting through events quickly. If you think about the way cloud attacks work today, again, especially on this cloud data infrastructure, being able to sift through data fast is basically your advantage. Yeah.
Starting point is 00:46:17 Being able to ask natural language questions about a data set, I mean, that's always going to be popular, right? Yeah, exactly. about a data set, I mean that's always going to be popular right? Yeah exactly, like I think the statistic from our usage report we put out every single year is containers are basically just getting, their shelf life is getting smaller and smaller and smaller. I think as of the report that just came out this year, like just like a month ago or so, 60% of all containers live less than 60 seconds. So it means you can't just use grep anymore. So we've spent as a society hundreds of billions of dollars
to basically get to where we were with grep 15 years ago, which is a funny old world. We're going to wrap it up there. Alex Lawrence, thank you so much for joining me on the show to walk through, yeah, Sysdig and what you're doing with AI, and just to recap for everybody who might not know what it is you do. Pleasure to chat to you. Yeah, thanks so much. That was Alex Lawrence from Sysdig there. Big thanks to him for that, and big thanks to Sysdig
Starting point is 00:47:16 for being one of our Snake Oilers this time around. That's actually it for this edition. We'll be back with part two of this round of Snake Oilers, which will be three more vendors, we'll be back with that in a couple of weeks, and in the meantime we'll be publishing as usual. So yeah, catch you all soon. Until then, I've been Patrick Gray. Thanks for listening.
