Risky Business - Risky Biz Soap Box: BEC actors embrace LLMs to attack Japan
Episode Date: July 20, 2023

This Soap Box edition of the podcast is sponsored by Proofpoint. Proofpoint offers email security and DLP products and services, and they’re probably best known for being the biggest email security company on the planet. That means they process a LOT of emails in the hopes of throttling the number of malicious emails that organisations have to deal with, whether that’s malware, phishing or BEC. So, with that in mind, what role could large language models play in email security? Now that the initial ChatGPT hype has died off a little, we spoke with Proofpoint’s VP of cybersecurity strategy Ryan Kalember about large language models and how they’re going to help defenders and attackers alike.
Transcript
Hi everyone and welcome to this special Soapbox edition of the Risky Business Podcast.
My name's Patrick Gray.
As most of you know, these Soapbox podcasts are wholly sponsored and that means everyone
you hear in one of these editions of the show paid to be here.
This Soapbox edition is sponsored by Proofpoint, one of the largest cyber security companies
on the planet.
They offer email security
and DLP products and services, and they're probably best known for being, as far as I know,
at least, the biggest email security company on the planet. And that means they process a lot
of emails. That means headers, attachments, and written content in the hopes of cutting back the
amount of malicious emails that organizations have to deal with, whether that's malware, phishing or BEC. So with
that in mind, what role could large language models play in email security?
Now that the initial ChatGPT hype has died off a little, we spoke with
Proofpoint's VP of cybersecurity strategy, Ryan Kalember, about large
language models and how they're going to help defenders and attackers alike.
Here he is now explaining his broad take on what LLMs are good for. Enjoy.
It's great for a limited set of things, many of which were already being done with predecessors
that we called stuff like machine learning. But ultimately, it's not going to be a panacea. It
doesn't fix everybody's problems magically in a way that I think was briefly at least marketed, if not ever backed up by anything substantive.
I mean, it almost feels like one of those mass delusions where some village in Peru, like they'd all see the Virgin Mary or something. You know, it sort of felt a little bit like that for a while, didn't it?
Well, speaking as somebody who lives in Palo Alto, California, I can say I was right in the middle of that particular delusion and not
sure if it was anything atmospheric or in the water supply, but yes, it absolutely felt like
that for a while. And in cybersecurity too, it was fascinating to see everybody start talking
about the same thing at the same time, which is reasonably rare given how broad and diverse the subject matter generally is.
So what are the use cases, though, that we're seeing flesh out, right? Because there is the, you know, there is the now-with-added-AI sticker being slapped on an awful lot of boxes. I mean, I can think of one, you know, sensible use case, which came from one of our sponsors, Corelight, where they just added a, you know, a little button you could click that would throw a prompt at ChatGPT,
which would cause it to explain an alert, right? Which, you know, just in terms of like generating some useful content for people who are sitting in front of their product, it seemed like a pretty
good idea. But, you know, what are some other use cases that you can think of where this thing is already showing some promise?
I mean, there are some straightforward ones that actually have a lot in common with that
Corelight use case.
If you're actually interacting with a human in a way that can be reliably actually passed
off as an authentic interaction by a chatbot, then it's a great way to actually collate
a lot of information and get it in the same place.
And in some cases, it's actually a pretty good explainer.
If you need to take something that would make sense to a cybersecurity professional and
explain it to somebody as if they're in elementary school, it's not actually very bad at doing
that.
You just described my job when I used to write for newspapers, but anyway.
Thankfully, you don't have that job anymore.
So you're not under threat from the AI job replacement revolution.
If anyone's wondering why I started Risky Business, that is why.
There you go.
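(To make that explainer use case concrete, here is a minimal sketch of wiring an alert up to a chat-style model, along the lines of the Corelight button mentioned above. It assumes the OpenAI Python client; the model name and the alert fields are placeholders, not anything from the show.)

```python
# Minimal sketch of the "explain this alert in plain language" idea discussed above.
# Assumes the OpenAI Python client; the model name and alert fields are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def explain_alert(alert: dict) -> str:
    """Ask an LLM to restate a technical alert for a non-expert reader."""
    prompt = (
        "Explain the following security alert to someone with no security "
        "background, in three sentences or fewer, and say what they should do next:\n"
        f"{alert}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example: a hypothetical alert record
print(explain_alert({
    "rule": "new_domain_sender",
    "detail": "Sender domain registered 4 days ago",
    "recipient": "accounts-payable@example.com",
}))
```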
So yes, content aggregation, all that aside.
One of the things that becomes really interesting is that end user facing applications are the
ones that have shown some real promise, like the co-pilots that are out there, what CrowdStrike
is doing with Charlotte.
If it's just a matter of taking information that is available in a relatively straightforward form somewhere else and putting
it in front of somebody, probably can't use a model that was trained on the internet a few
years ago to do that, but you can absolutely have it pull some things together that then
ultimately make sense in a human context when a human is the recipient of that information.
Yeah. So instead of just seeing like an error code, right?
Like I feel like ChatGPT will,
and similar technologies are going to end the era
of just having, you know,
really weird numerical Windows errors.
Oh, totally.
And it might also end the era
of some things that we do right now, right?
So if you want to,
you can place a warning banner on email messages that
says, this is a brand new domain, please be extra careful with this message. Does that mean much to
a cybersecurity professional? Well, yes, we know that we shouldn't trust brand new domains most
of the time. Does that mean much to an average user? I think it's debatable, right? Those are
the sorts of things where we could be explaining it in very different ways rather than relying on a rules-based system that, frankly, has a technical explanation behind it like all those error codes.
Yeah.
This website has only existed since last Tuesday, and usually that's a sign that blah, blah, blah, blah, blah.
Yeah, exactly.
And if you're interfacing with a human, like we have a fairly large security awareness business, I think generative AI is actually going to help us
make things a lot more tailored, a lot more relevant,
particularly because we already have a really good understanding
of which people are getting attacked,
how they're getting attacked,
what we could be delivering if we were not in the way,
what threats they would in theory be receiving in their inbox,
the types of social engineering
that would cause those threats to be effective, as well as their own habits in terms of data handling and a whole
mess of other things that could help us build up models of behavior that ultimately become really
interesting ways to make things less generic, make it much more specific to a particular individual,
as opposed to, hey, everybody gets the same training every however many months.
Yeah. Yeah. Everyone gets forced to watch the video every six months or whatever.
Yeah, exactly. And click through all the quizzes and you maybe get the hard one because you did
well last year, but that's not really tailoring in the same sense. There's a lot of debates,
I think very reasonable ones around simulated attacks, especially simulated phishing attacks,
and whether those are worth it, whether those have a basis in reality and prediction. But I will tell
you one thing, they're a whole lot more effective when they derive from real attacks rather than
fake ones. And so those are the sorts of things that these models make more efficient than if you
were doing these types of things manually.
So look, one of the reasons I wanted to talk to you about this is that Proofpoint is, by revenue, the third largest security company on the planet, right? Like, it is a very, very big company. It also falls into this category... I mean, you're the opposite of most of our sponsors, right? Most of them are startups doing, you know, really cool, interesting niche stuff. Proofpoint is kind of what I'd describe as an infosec utility, right?
Like, you know, you pay a utility to keep the lights on and the water flowing.
And, you know, you pay Proofpoint to keep your email flowing in a state that's reasonable, right?
Which is, even getting that into a reasonable state is a heavy lift.
Obviously, I would not recommend you abandon all of your other controls
and rely exclusively on an email security provider to stop all bad stuff getting through.
But, you know, the point is that Proofpoint is a utility that certainly makes it possible to operate email services and not immediately be destroyed.
So, you know, the number of inboxes and addresses that you are protecting is just absolutely astronomical.
The breadth, in terms of the types of businesses that you're protecting, is also absolutely astronomical. And also the amount of text that you're processing is absolutely astronomical.
Right. So there are a number of reasons why I think that, you know, if there is one company out there that should be investigating this stuff, it is you. So where are you looking, you know, where are you looking to integrate this stuff into what you do? I'd imagine these models could be useful when scanning mail, for example. I'd imagine these models can be useful in, like we're talking about, explaining stuff to users, trying to upskill them, make them smarter. There's just so many ways that you could use it. So where are you looking? How do you even start to prioritize your efforts here?
Yeah, I think our data science team would agree with you that we're
an interesting place to start working on some of these problems because we have human communications and we have billions of messages a day in order to
train these models. And for a very long time, we've worked on getting very large,
well-labeled data sets that we can use for those purposes. It is interesting to pick and choose
between problems that you would solve with an approach like this versus problems you would continue to solve with other things. And there are a few that are maybe
worth starting with that actually AI has not been particularly useful for. I mean, it's still a
control we use for, say, malware detection. But if you train an AI model on a bunch of malware you have seen in email, it's not particularly good. And it's certainly not better than static analysis or sandboxing at catching the next one, so that's not actually a use case that works out well. I think in the early days, obviously, Cylance and others pioneered AI-driven
malware detection. There was some really strong initial promise there. I don't know that anybody's
throwing out their EDRs for an AI solution that would
solely do prevention at this point in time. I mean, I'd imagine a lot of the sort of machine
learning based stuff that you would do would be more around doing things like stripping domains
out of messages and doing some domain analytics and scoring them and all of that sort of stuff.
Just the old boring stuff, that's going to be... ChatGPT is not going to help you there.
Totally, totally. And actually one of the things that is most interesting for us to use
AI to do is to figure out whether there's a good chance that a URL is ever going to be clicked on or not, right? Because we process about 49 billion URLs a day. We can't run them all through all of the detection engines; it would be prohibitively expensive to do so. And actually, the cost part of AI is an interesting part of this equation too. But when we look at likelihood of being clicked and likelihood of being malicious, if that's a OneDrive fake or something, we're
going to scan it a hundred percent of the time because we actually probably didn't need the
fancy AI model to tell us that, but it's just much more likely to be malicious than a whole
set of other things. And yes, the domain analysis becomes a
useful part of that model. And ultimately, when we look at resource allocation between AI use cases,
even with the data set that we have available to us, there are a few things that it actually
lends itself to really well. Social engineering and manipulation of identity is actually a good problem to point AI at.
If I can see in somebody's conversations, well, you've maybe contacted this person before,
they've contacted you before, there's a graph of communications between the two of you,
but no one's ever mentioned a payment. No one's ever mentioned a change of bank account details.
Those are the sorts of things that are very, very identifiable
if you have sort of semantic understanding of the message text.
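(A toy sketch of that idea, purely illustrative: track the mail history between two parties and flag the first time payment or bank-detail language appears in a long-running relationship. The cue list and threshold are assumptions, not Proofpoint's actual features.)

```python
# A toy sketch of the idea described above: flag the first time payment or
# bank-detail language shows up between two parties with an established mail
# history. The cue list and thresholds are illustrative, not Proofpoint's.
from collections import defaultdict

PAYMENT_CUES = ("payment", "invoice", "wire transfer", "bank account", "routing number")

class PairHistory:
    def __init__(self):
        self.message_count = defaultdict(int)   # (sender, recipient) -> prior messages seen
        self.payment_seen = defaultdict(bool)   # (sender, recipient) -> money ever discussed?

    def observe(self, sender: str, recipient: str, body: str) -> bool:
        """Return True if this message is the first money talk in a long-running relationship."""
        pair = (sender, recipient)
        mentions_money = any(cue in body.lower() for cue in PAYMENT_CUES)
        flag = mentions_money and self.message_count[pair] >= 10 and not self.payment_seen[pair]
        self.message_count[pair] += 1
        self.payment_seen[pair] = self.payment_seen[pair] or mentions_money
        return flag

history = PairHistory()
for _ in range(12):  # a benign back-and-forth builds up history
    history.observe("supplier@example.com", "ap@victim.example", "Updated delivery schedule attached")
print(history.observe("supplier@example.com", "ap@victim.example",
                      "Please update our bank account details before the next payment"))  # True
```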
Now, that's interesting.
That's interesting.
So, you know, to look at stuff like, for example,
business email compromise, right?
Yes.
When you can detect like a shift in tone or a shift in intent
that might not be obvious to a human being
but would stand out if being analyzed by one of these tools.
Yeah, exactly.
And we call that behavioral AI.
I don't know if there's a better term for it,
but that's certainly what we've used.
We've had that in production for a couple of years now,
and it's been really good at a lot of the BEC use cases
that don't necessarily rely on third-party compromised accounts,
but rely on impersonation.
Compromised accounts you have to catch in other ways, and the tone shifts become more interesting.
But I mean, you're not using a large language model to do that, are you?
I mean, I'd imagine you're just scoring it with things like keywords and whatever.
I mean, you can certainly start there.
The keywords are helpful because they're cheap, too.
A rules-based engine will always execute faster, and we want to do the expensive detections last and the cheap detections first, just because it's much more efficient that
way. But we are actually now using semantic understanding. We're using a BERT model,
actually, which is what Google developed before Bard. It doesn't have quite as many parameters
as ChatGPT, but it does the job in terms of: tell me the intent behind this email.
And we're getting it down to a tenth of a second in terms of how fast it can run across
a message.
And there are also useful signals that we can incorporate in terms of when to use it.
So when it comes from a known supplier, do that 100% of the time.
When it's a message that has been reported by an end user as suspicious, feed it through the LLM. Like, those are great times to do that, especially when, again, you're not looking for malware, you're not looking for credential phish in the traditional sense, you're just looking for social engineering.
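(As a rough illustration of gating an expensive intent model behind cheap signals, here is a sketch using a generic Hugging Face text-classification pipeline. The model named is a stand-in; Proofpoint's BERT model and its exact triggers aren't public, so the signals here are assumptions.)

```python
# Rough sketch of gating an expensive intent model behind cheap signals, along the
# lines described above: known-supplier or user-reported mail gets the deep look.
# The Hugging Face model named here is a generic stand-in, not Proofpoint's BERT model.
from transformers import pipeline

intent_classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # placeholder classifier
)

def should_deep_scan(msg: dict) -> bool:
    # Cheap signals decide whether to spend the model call at all.
    return bool(msg.get("from_known_supplier") or msg.get("user_reported"))

def classify_intent(msg: dict):
    if not should_deep_scan(msg):
        return None  # skip the expensive path for everything else
    return intent_classifier(msg["body"][:512])[0]  # label + score for the message text

print(classify_intent({
    "from_known_supplier": True,
    "body": "We have changed our banking details, please route the next invoice payment accordingly.",
}))
```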
Yeah, no, that's really interesting. I mean, let's talk about the resource thing, though, right? Because that's a really interesting part of this whole conversation. I think, you know, there would be an even split, I'd imagine, among the people listening to this between the people who understand how computationally intensive this stuff is and the people who just didn't think of it, right? And it is monstrously computationally intensive to do this stuff, to the point that NVIDIA's share price has gone loopy, right? It's gone to the moon. Because for them to justify their current share price, I think, like, their manufacturing capacity would just have to boom in all sorts of unrealistic ways. Like, we're at the point where we're going to hit capacity constraints on compute if this stuff gets as widely adopted as the hype train sort of thinks it will, right?
So this stuff is going to get increasingly expensive
and it's going to get rationed.
This idea that you can just LLM all the things is no, it's not going to happen.
It's going to get expensive, more expensive than it is now
as people find more and more productive use cases for it.
So, you know, I mean, you've already talked a bit
about how you have to
ration this capability, but do you anticipate that you're going to have to ration it even more,
I guess is the question. That's a good question. What we're trying to do is actually figure out
how to use it in more places, but not break the cost curve. And your description is absolutely
spot on. That's the main reason that we haven't
actually put it at the beginning of the detection stack. Because it actually really-
Well, forget it. I mean, just forget it. Like at your scale, forget it.
Yeah. No, it's impossible. Because even if you get it down to a 10th of a cent to analyze a message,
if you multiply that by billions, no one's going to pay for that, right? And that's if NVIDIA can even make the chips.
And so, yeah, we are looking for basically cheap ways to knock down the things that we don't have to analyze further.
So some simple things, IP reputation, static analysis, rules-based things that knock that
piece down.
And then when you get to the interesting parts that you've identified in other ways, then
you can do the expensive stuff.
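(A toy version of that layered triage, with every check stubbed out: cheap lookups and rules run first, and only what survives reaches the expensive semantic tier. All function names and signals here are illustrative.)

```python
# A toy version of the layered approach just described: cheap checks knock most
# mail down first, and only what survives gets the expensive analysis. All of the
# functions and signals here are illustrative placeholders.

def ip_reputation_bad(msg: dict) -> bool:
    return msg.get("sender_ip") in {"203.0.113.7"}       # stand-in blocklist lookup

def static_rules_hit(msg: dict) -> bool:
    return "urgent wire transfer" in msg.get("subject", "").lower()  # stand-in rule

def looks_interesting(msg: dict) -> bool:
    return bool(msg.get("from_known_supplier") or msg.get("user_reported"))

def expensive_semantic_verdict(msg: dict) -> str:
    return "needs_review"                                  # stand-in for the LLM/sandbox tier

def triage(msg: dict) -> str:
    if ip_reputation_bad(msg):
        return "block"            # cheapest check first
    if static_rules_hit(msg):
        return "block"            # rules engines run fast, so they go early
    if not looks_interesting(msg):
        return "deliver"          # nothing warrants the expensive tier
    return expensive_semantic_verdict(msg)  # costly analysis runs last, on the fewest messages

print(triage({"sender_ip": "198.51.100.5", "subject": "Invoice", "from_known_supplier": True}))
```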
And this is actually something we have some experience with because this was true for sandboxing too,
right? Sandboxing an attachment is actually more expensive than the current LLM based analysis
that we do. That's the most expensive thing we can do. And we figured out how to do that,
obviously hundreds of millions of times a day in varying ways.
But it is a very, very complicated engineering exercise to just sort of build the harness that can do all of that at scale.
Yeah, I mean, it's interesting, though, when you start really drilling into this as a topic, right?
Whether it's sandboxing or whether it's LLMs, you know, making, the making the decision about where to apply them becomes the
critical thing, right? Because you don't want to not apply it to the thing that's going to
get one of your customers owned. Well, exactly, exactly. And it's one of the things where we're
a little bit in a privileged position because we don't have to operate at wire speed for this type
of detection. And we're not chewing up endpoint resources; I can't imagine how you would possibly do that there. Actually,
you'd have to do it somewhere in the cloud in order to make any of this work efficiently.
So because email's asynchronous by definition, we have at least the luxury of doing more things.
And we'll keep doing the stuff that has worked historically. But even when it comes to what
you're trying to solve for,
gratuitous use of LLMs
is definitely off the table
because to your point,
you can't do that at scale
and have any kind of
a reasonable cost model.
So we have to try
and use those other signals to say,
you know, are we solving
a social engineering problem here?
Is this a BEC thing
rather than, you know,
a big old QBOT campaign?
Because detection engineering is still
undefeated when it comes to that stuff. And so one of the things that becomes interesting is we also have to figure out, on our side: okay, we've got 5,000 people, about 1,400 across detection engineering, threat research and data science. How do you allocate those things?
Those people don't all cost the same also, and their efforts don't scale the same ways.
And so even though actually the data science work
does scale pretty well,
the ongoing cost of running that
is an order of magnitude more than the detection engineering.
So it becomes a really complicated thing
for us to make sense of.
And for those who are evaluating our tech
versus what others have out there,
we're past the point
at which you can do AI hand wavy stuff.
But if you are using an LLM to do something,
it should be able to generate something
that you can actually see.
And so if you're going to,
just using that previous example,
if you're going to condemn an email message as BEC,
it better say supplier impersonation,
change in tone,
and all of the other things that the LLM actually demonstrated, because it cannot be a black box.
It's just simply too expensive to do so. Yeah. I mean, I wonder too, what sort of uses you could
have for an LLM in terms of actually using it in a combined approach with detection engineering. And by that, I don't mean prioritizing.
What I mean is like using it as a tool
to analyze a lot of BEC emails
and try to spit out more generic and cheaper detections.
Like is there, or, you know,
I mean, I obviously don't understand
this technology super well.
You know, am I just talking nonsense there?
Or is that something that works?
You're actually not wrong.
So there's a couple of things that we did that are, I mean, LLM-ish, in that category. So we did train an LLM-type model on macros, right? Like, malicious macros all kind of fit into similar patterns, so it was actually a pretty good thing if you could parse it out of the document.
Actually, Votiro, who do their content disarm and reconstruction, they did something similar and they said exactly the same thing, which is once you can actually view a macro in isolation of the document, just pull it out and train up a model, it's actually extremely reliable.
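(A sketch of that macro approach under stated assumptions: extract the VBA with oletools, then train a small scikit-learn text classifier on the macro source alone. The file paths and labels are placeholders for a real labelled corpus.)

```python
# A sketch of the macro idea above: pull the VBA out of the document, then train a
# small text classifier on the macro source alone. Uses oletools for extraction and
# scikit-learn for the model; the file paths and labels are placeholders.
from oletools.olevba import VBA_Parser
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def extract_macro_source(path: str) -> str:
    """Return all VBA source in a document, viewed in isolation from the document itself."""
    parser = VBA_Parser(path)
    return "\n".join(code for _, _, _, code in parser.extract_macros())

# A labelled corpus of benign (0) and malicious (1) documents would go here.
training_paths = ["benign_report.docm", "malicious_dropper.docm"]   # placeholder paths
labels = [0, 1]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit([extract_macro_source(p) for p in training_paths], labels)

# Score a new document by its extracted macros only.
print(model.predict([extract_macro_source("unknown_attachment.docm")]))
```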
Yeah, absolutely.
And we've done the same thing for a lot of credential phishing URLs.
The cred phish that we work on mostly now is the MFA-capable stuff like EvilProxy, Evilginx2, and all of those adversary-in-the-middle direct proxy ones and adversary-in-the-middle indirect proxy ones. That actually is pretty good if you're using an LLM for it. And we're actually now at the point where the models that we push into production, obviously
they have to exceed their predecessor in effectiveness.
But if you do it right, you can push a new one every week.
And the models do actually get better and better and better when focused on problems
like that.
The last one that I'd throw into the mix there that is, again, it feels relatively straightforward, but it's still
ultimately a pretty good application of this is if you have basically semantic understanding of
the message content and you have stuff from suppliers, that's a tricky problem to solve.
We always sort of tossed our hands up
and say, okay, that, that third party got owned. They can't really do anything about that. It's
coming from exactly where it's supposed to be coming from. It's a reply to a legitimate thread,
all those sorts of things. As it happens, even when you look at kind of the cloud logons,
and then the after-effects that show up in places like email headers, you can actually build some models on that. And that's where you get into the message bodies also, in terms of the shift in tone, shift in...
Yeah.
That part, actually, you have to use more ingredients. It's not a single dimension.
I get it, I get it. It's when you combine an LLM with the other ML-based stuff.
Exactly right. And you ask it... you know, you pull together all of the information that you would stick
in front of a human analyst.
So, am I getting this right?
So you would pull together all of the stuff that you would put in front of a human analyst
that would allow them to make that call.
Exactly.
And you should be able to train an LLM that has access to headers and time of day information
and archive of previous communications between those people.
You should be able to
ask an LLM, is this a BEC email with the correct prompt?
And it should be able to tell you.
It will tell you within a reasonable degree of certainty.
And even if it can't tell you 100%, you should be able to then use that to warn the user.
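(Illustrating that "give the model what you'd give a human analyst" idea: a sketch that packs headers, time of day and a summary of prior communications into one prompt and asks for a verdict. It reuses the OpenAI-style client from the earlier sketch; every field name here is an assumption.)

```python
# Sketch of the "give the model what you'd give a human analyst" idea: headers,
# time of day, and a prior-communication summary packed into one prompt. The client
# call mirrors the earlier example; all field names are illustrative.
import json
from openai import OpenAI

client = OpenAI()

def bec_verdict(headers: dict, sent_hour: int, prior_summary: str, body: str) -> str:
    context = {
        "headers": headers,
        "local_hour_sent": sent_hour,
        "history_between_parties": prior_summary,
        "body": body,
    }
    prompt = (
        "You are assisting an email fraud analyst. Given the context below, say whether "
        "this looks indistinguishable from business email compromise, and why. "
        "Answer with a likelihood and the signals you relied on.\n"
        + json.dumps(context, indent=2)
    )
    out = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content
```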
Well, it could tell you if they need to pick up the phone.
Yeah, yeah, no, absolutely right.
And I think that that's exactly the approach
that has mattered here,
just combining those pieces.
It won't be able to say if something is BEC,
but it will be able to tell you
if something is indistinguishable from BEC.
That's correct.
That's exactly the right way to put it.
And then even,
and this is where I'm going to depress everybody.
So if you look at a brand new Microsoft 365 tenant
that we have never seen before,
the odds that we find a compromised account in the first week are about 50-50 for our enterprise
customer base, which again, it might be happening more frequently than that, but that's depressing,
particularly because this is... your FIDO and passwordless and lots of things
should solve this.
No, I know, man. Like, even something like three or four years ago, I think the stats for Microsoft MFA, like across O365, it was like two percent or something like that, and
now it's like, really? Like, just... what? And Jen Easterly's been on this as well, because it's
not much improved from back then.
And if we look at a lot of those compromises, they're actually from accounts that just don't have MFA on them, and they've been brute forced. Still, in 2023, we are still talking about that.
I think it was only a couple of years ago that, uh, Microsoft turned off legacy protocols that didn't support MFA...
Yes.
...for people who signed up before 2017. Because if you signed up before 2017, you got these, you know... and my dates could be wrong, but I think that's right.
No, you had it.
Yeah. IMAP, POP3, all that stuff.
Yeah, exactly.
So if you signed up before, uh, 2017, all of that stuff would be open, and, you know, unless you disabled it, it would be there even if you never used it. So initially what they did is, in 2017, new tenants wouldn't have that stuff and you'd have to turn it on, but they left all that stuff open.
Exactly. They eventually turned it off, like, what, a year or two ago? And that made a big difference. But God, why did they take so long?
But you can still have an account that doesn't have MFA on it. It's possible to do. And so what we end up seeing, and this is, again, pretty depressing, is that 50% rate of account compromise somewhere, and most of that is still brute forcers,
which is shocking, right?
Conditional access, all this stuff should work.
Then there's reality, right?
And so one of the things that's actually been useful,
it's not generative AI,
but it's definitely what I would call ML,
is basically using the data from the brute forcers
to fingerprint them and identify
that type of activity. And you can get to the point now where it's a bit, it's almost recursive,
right? You can have the old models make new models that are more effective than themselves
when it comes to something like brute forcers, which are the same data repeated over and over
and over again in very, very slightly differing parameters.
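(A small sketch of that fingerprinting idea: failed sign-ins grouped by source IP and user agent cluster very cleanly when the same brute-force pattern repeats. The field names loosely mimic cloud sign-in logs but are simplified placeholders.)

```python
# A small sketch of the brute-forcer fingerprinting idea: the same failure pattern
# repeated over and over with slightly varying parameters clusters very cleanly.
# Field names loosely mimic cloud sign-in records but are simplified placeholders.
from collections import Counter

def brute_force_fingerprints(signins: list[dict], threshold: int = 50) -> list[tuple]:
    """Group failed sign-ins by (source IP, user agent) and flag noisy repeat offenders."""
    failures = Counter(
        (s["ip"], s["user_agent"])
        for s in signins
        if s["status"] == "failure"
    )
    return [(pair, count) for pair, count in failures.items() if count >= threshold]

# Example: one source hammering accounts stands out immediately, and the resulting
# fingerprint can then be matched against activity seen elsewhere.
sample = [{"ip": "203.0.113.9", "user_agent": "legacy-imap-client", "status": "failure"}] * 80
print(brute_force_fingerprints(sample))
```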
So hang on, you're fingerprinting the brute forcers? I mean... yeah, do you see those logins?
Well, yeah, from Graph API.
Right, okay. Okay, from Graph API.
And, uh, that's sort of how we try to connect...
Hang on, hang on. If you can detect brute forcing through the Graph API...
Yeah.
Right. What's my question? Like, I'm sure you can predict my question if you can see that, Ryan.
Yes.
Why is Microsoft not stopping it?
It's the same reason they're not stopping malicious messages in email inboxes, which are, uh, slightly noisier failures, right? It's very, very hard to do this stuff at scale. It's very expensive to do this stuff at scale. And I don't know, uh, why... they're an extremely well-regarded threat detection team, and the intel they have... like, it just plays out differently in real life.
But surely, surely you could make a couple of little engineering changes, right? If you are operating that core service, you could make some engineering changes that would make detecting brute forcing quite easy.
I would hope so. I think they've done much better on certain things, like how conditional access policies work, but it's still 50%.
Man.
And that's...
Microsoft!
I'm shaking my fist for the people who can't see it.
But look, again, you know, this brings us back to something that's been near and dear to my heart for a long time, which is, you know, anyone listening to this: probably the best thing you can do, the most bang for buck that you can get for your enterprise right now in terms of improving its security, is to go out and get FIDO2 keys for your employees, especially if you're using something like O365 or Google Workspace. Go get FIDO2, or at the very least, you know, maybe turn on passkeys or something like that, right? Because
yeah, it's just... it's just...
Well, we're at the point where, in most of our enterprise customers, the number one attack technique they will see is EvilProxy, just by volume, right? Because it's now available.
For those who are not au fait, these are, um, these phish kits that will bypass stuff like, uh...
Yeah, code-based MFA. What it is... it's a reverse proxy, basically, so it's loading the real login page, it's proxying your input back to the real login page. So to your point, auth codes, even with number-matching push notifications, they're somewhat pointless against that. And that's, uh, that's where, you know, authentication cookies get stolen. But really, you know, again, we're a little far afield from the AI topic. But again, yeah, like, things that actually do resemble human communications or that fit into a really, really...
Well, hang on, hang on. The reason, the reason that I took us down that path, right, is to mention something that you alluded to earlier, right? Which is that even if you roll out FIDO2 and all of your inboxes are locked down and it's all wonderful, that doesn't protect you against one of your partners not doing that.
Exactly.
And that's how the BEC baby is made.
Yeah, that's spot on.
So we've actually tried to put together a couple of interesting things on that front
because this is actually a pretty good use of, it's not really generative AI, but it's
good ML because figuring out which domains you communicate with
regularly, those are probably your suppliers and the third parties you depend on to some extent, right? Email communications are a good proxy for business relationships, uh, no pun intended. And that ultimately becomes a way for us to try and figure out, when we talk to this massive enterprise and you ask them how many suppliers they have, and they have no idea... it's some number in the single-digit thousands, sometimes. I talked to a pharma the other day where the number is actually 10,000 suppliers that they have in, you know, a database and procurement system. And we try and figure out, all right, well,
maybe you have a likely compromised account from that third-party supplier communicating with one of your users.
Can I identify that anywhere else, not just at your organization?
Because they're usually going to pop that third party and go after more than one target.
That's one of the few things that actually becomes a useful way to, again, feed these models.
But you can use the model to figure out who the supplier is and you can use the semantic understanding to actually understand whether the message fits
into pre-existing communication patterns. And this actually starts to feel like a useful thing that you could do with generative AI that you couldn't do before. And so that's kind of the light at the end of the tunnel for this stuff, but it still remains really poorly suited for, frankly, most problems.
Yeah, yeah, no, agreed.
I think it's also quite useful for people on the attacking side as well.
I think a lot of these BEC emails are going to be generated by LLMs, right?
So especially the ones in languages that are not widely spoken outside their own countries.
You mentioned this before we started recording, and you've seen this massive uptick in BEC
targeting Japanese organizations in Japanese.
Yes.
So BEC is always a contentious topic in Japan.
If you talk to a lot of Japanese security leaders, Japanese organizations don't have
CISOs at this point for a variety of reasons.
Some of them were very dismissive for a long time of the BEC problem because the attacker was not Japanese.
They didn't speak Japanese.
They didn't understand business customs and how these sorts of things were communicated.
If you talk to a Japanese multinational who was working all over the world, they very much understood BEC.
They were very much a target of it and had been for a really long time.
But now we're seeing that that language and perhaps cultural understanding barrier
is pretty much removed.
That's gone now.
So the moat's gone,
and the messages are definitely increasing in volume.
The schemes are not fundamentally different,
but if you can write a convincing message in Japanese,
I mean, that's an easy thing for any kind of LLM to do.
And that's the other thing that can scale in interesting ways.
What I think we could see but have not seen is, you might have seen we published some
research on Charming Kitten, or as we track them, TA453, where they do what we call multi-persona
phishing, where they're setting up multiple different fake identities,
and they're having a conversation with the real target sort of on copy.
And so you see this whole message thread with multiple personas involved,
and it feels like a real thing where people are talking,
and you just happen to be one of the people on the thread.
You happen to be the target, of course.
That's the sort of thing that chatbot-type functionality
could really automate quite well.
And I... just on this Japan thing, I just think it's actually really clever for, you know, one of these crews, you know, say one of them based in Nigeria...
Yeah.
...to be like, hey, let's go hit Japan, because thanks to this tool we can speak Japanese in a way that passes, you know, muster in emails. That's amazing.
It is.
And also those are not user populations that understand BEC.
They don't understand what they're looking for.
Exactly.
They're not finance departments.
They're starting from zero.
This is like this is unplanted fields.
Virgin territory.
Yeah, exactly.
Virgin territory, 100%.
You know, though, I will just say one thing. Like, over the last couple of weeks... so as much as we talk about, you know, BEC issues and, you know, technological solutions to it and whatever, I don't know if you saw this,
but there's a new regulation in the United Kingdom
where the banks are going to have to actually cover scam losses.
Yes.
For their customers, right?
That is fascinating.
And it's real funny.
Like, we're sitting here talking about large language models.
Let's see what BEC looks like in England in six months
because I have a feeling that funnily
enough, um, they might be able to eradicate a lot of this.
Yeah. I mean, it was always better solved by a finance process than a technical solution. And countries like Japan, who have historically not dealt with much BEC, they're in the crosshairs right now in a way they weren't before.
Yeah. I mean, when you go back to, like, years ago, say, you know, five to ten years ago, when BEC was first sort of kicking off, you know, a lot of the big ones that worked back then, they wouldn't work now. I think large companies at least have got better at, uh, dealing with this.
Yes.
One I keep remembering is the story of Facebook. I don't know if you remember hearing this one on the show.
So for those who don't remember.
Facebook and Google for that same one.
Yeah, yeah, yeah, yeah.
So it was Facebook's large Taiwan-based hardware supplier that supplied the hardware that they use in their data centers.
There was someone who did BEC on Facebook and managed to run away with, well,
they didn't get very far, but they got like
$150 million
in payments. And Alex
Stamos, friend of the show, he was
the CISO at Facebook at the time, and he described
them as the dog who caught the car.
Because the thing is, when it's
$150 million, funnily enough,
you can get the FBI on the phone
and they'll take that one and
even if you're in the Baltics, there's an extradition treaty. Uh, not smart. So the FBI got that one, they got their money back, and they all lived happily ever after. But, you know, it is still just such a big problem for ordinary business, small to medium business, you know, real estate companies, construction firms, whatever,
firms that tend to push around large amounts of money
but are small businesses.
And you would sort of think maybe some of these approaches
are going to help.
I mean, first of all, the banking regulation stuff,
I think, will make a big difference,
but I don't see that happening in America.
But you would hope that approaches like these
might actually really help to stamp out some of this fraud, or at least make it much harder. Well, and it's sort of like ransomware in
that it kind of migrates around the world. You just have somebody else's turn in the barrel
every once in a while. When the heat gets too high in certain places, we've certainly seen that with
Europe lately. We saw it with Latin America not too long ago. BEC is similar. We're seeing an uptick in Latin America
and actually one of the top destinations
for initial transfers of BEC funds
for the first time ever is Mexico.
So it is this sort of rotation that occurs
and I think they will flee the regulatory regimes
that you just talked about.
It would be nice to see that UK template happen elsewhere,
but it's certainly not going to happen overnight.
And because this level of sophistication is low... Just on the Mexico thing, why are they sending it to Mexico? Is it because the groups are based in Mexico or because someone just has a good relationship with a corrupt bank or something?
Probably both.
A little bit of column A, a little bit of column B.
Exactly. But we are seeing just a general uptick in BEC scams in Latin America broadly, because I think it's just less well known there than it is in a lot of other places.
So that's where it's going to help them be a lot more effective in ways that they hadn't been previously.
It's not going to revolutionize phishing.
It still works the same way.
It's not going to revolutionize how people write malware.
We certainly haven't seen that.
It's not even really going to, I think, fundamentally change.
No, sorry.
I'm just going to pull you up on that thing, on the change-the-way-people-write-malware point. I mean, I don't think it's going to fundamentally alter it. I mean, there's going to be useful software development tools to come out of this, and that's going to impact software development writ large, sure. But I also think there will be some useful things in terms of obfuscating code and whatever. Like, I think there will be use cases.
Yeah, it can make it easier to create a new packer or something, for sure.
Exactly, exactly. Yeah, but I mean, packers already exist, right?
Yes, that's exactly it, right? And the annoying part of malware, you know, you're continuing to, uh, basically rotate infrastructure and doing all those sorts of things, putting together the giant spam bots that tend to fire out the bigger campaigns. That's really hard to automate, even with an LLM. So I don't know that that's going to actually be meaningfully affected.
The other thing that we're constantly asked about is, okay, if you have these new phishing messages generated by, you know, generative AI, you know, aren't they going to get past your existing filters? Can't you just, you know, ask, hey, uh, ChatGPT, create an attack that'll get through Proofpoint, that sort of thing, or Microsoft or anybody else?
Interestingly, we can do that too.
So it's entirely possible to actually sort of pre-train the models on outputs of those models, which we have not had to do, just to be clear. But that is one of the interesting ways to approach that.
And I think ultimately is going to be a useful use of Gen AI.
Get there first.
Do what you think they're going to do to you before they're actually able to do it and
then train your own models on it.
So you're going to build a trace buster, buster, buster, buster.
The only thing that is getting in our way right now is that the...
There's an obscure 90s reference that like 10 people got.
But anyway, continue.
It's a quality reference for those of us of a certain age, for sure.
But the thing that is currently the obstacle, and I'll be completely candid on this,
the true positive rate for the tools that we're using to detect whether something was created by AI, the true positive rate right now for us is, I believe, 24 percent. Uh, so...
Well, and it's not good. The other thing, too, that you've got to remember is that I personally know people who are just terrible writers. They're very good at their jobs...
Yes.
...but they're terrible writers. And guess what
they use to send emails with now, you know? Like, they use ChatGPT.
Totally.
They're like, hey, write me an email that tells the person this, and it spits out... you know, look, as someone who's been writing professionally for two decades, I hate the way ChatGPT writes. It's horrible, because it's been trained on how everyone writes, and how most people write is horrible. And it just, you know, it is the most bland, average corporate speak.
Weird blend between corporate speak and forum posts.
Well, yeah.
And like ninth grade essay answers also.
Yeah, exactly.
Exactly.
And it just spits out the email.
They hit send and happily ever after.
So you can't just detect, you know, you can't just say that all text generated by these
tools is going to be somehow malicious when people who might be dyslexic, you know, are using it and love it for that. whether it's for security awareness purposes, whether it's for alerting people when they're in an email chain
that looks like they're falling for a DEC attack,
or even when they put data at risk.
You know, we have prompts that pop up on our DLP
and insider tools
when somebody looks like they're violating a business process
in terms of how they're handling sensitive information.
Being able to interact with something
that can talk to you and can talk back
is very, very useful.
It's like Clippy, but he jumps up and says, hey, tell me what the... you're doing.
And some people want Clippy to be armed, that's for sure. It's a cultural thing, uh, for some people.
It looks like you're trying to exfiltrate our customer list. What the... are you doing?
Yes, please click here for your manager's approval on that. Yeah, those things. And actually, to your point, this is just a random aside. Now that we have fun ML to analyze things and you can look at lots and lots of DLP alerts...
Do you want to guess the top two job functions that exfiltrate data from our customers?
Tell me.
That would be sales.
Number one, by a very, very long distance.
And customer list, mate. The Rolodex is coming with me.
Exactly, exactly. And, uh, well, they sort of take everything with them on the way out the door, and I think they change jobs frequently, so that certainly affects the data. Uh, number two is actually engineers. Uh, people feel like they're entitled to their source code, and that shows up in the data
incredibly clearly.
Look, with both, I can kind of understand it. Like, yeah, you know, it's the thing that you worked on, with the source code. And I kind of think salespeople, the reason they get hired is because of their Rolodex, for God's sakes. You know what I mean? You're hiring them because they've got that big bag of contacts. You can't get salty when they take their contacts with them when they go to the next job. It's why you hired them in the first place. But anyway.
It's a fair point. And I will say, the last thing on the DLP side where you can actually do some interesting things, although this is still not in the generative AI bucket, is now, instead of writing regular expressions or dictionaries in order to try to identify sensitive data, if you've got 10 examples of that document, you can just train a little model.
It's very, very easy to do.
And that's very straightforward.
Unfortunately, like all these other things we're talking about, it's not magic. It's not getting anywhere near 100% fidelity, so you still have all the classic
problems of false positives, and you have to look at other things like the user behavior context
in order to actually alert on things and not drive everyone crazy.
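(A sketch of that "ten examples instead of a regex" idea: vectorise the known-sensitive documents and flag anything close to their centroid. The paths and threshold are illustrative, and, as noted above, behavioural context is still needed to keep false positives manageable.)

```python
# A sketch of the "ten examples instead of a regex" DLP idea above: vectorise the
# known-sensitive documents and flag anything that sits close to their centroid.
# File paths and the threshold are illustrative placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

example_paths = ["contract_v1.txt", "contract_v2.txt", "contract_v3.txt"]  # ~10 known-sensitive docs
examples = [open(p, encoding="utf-8").read() for p in example_paths]

vectorizer = TfidfVectorizer().fit(examples)
centroid = np.asarray(vectorizer.transform(examples).mean(axis=0))

def looks_sensitive(text: str, threshold: float = 0.6) -> bool:
    """True when the text sits close to the centroid of the known-sensitive examples."""
    score = cosine_similarity(vectorizer.transform([text]), centroid)[0, 0]
    return score >= threshold

print(looks_sensitive("Draft supply contract between the parties named below"))
```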
So to sum up, basically, large language models are allowing business email compromise
crews to hit any country in which the language is supported by the chatbot. So that's a massive win
for them. And you're having some success in being able to pick this stuff up on the detection side.
But I think that's the thing, isn't it? I keep describing large language
models as productivity tools. And here we've got an example where it's increasing productivity for
people on the adversarial side, on the malicious side, and on the defender side. So yay, I guess.
Well, and the only people whose jobs have been affected more are those who are either raising money right now or are in cybersecurity marketing. And ultimately, that is having a massive impact
on how they have to do their jobs
and how you have to explain what you do.
And that's the part that I think is actually interesting
in light of this conversation,
how to actually talk about what it does specifically
and its impact on the threat landscape.
It almost seems a little bit prosaic
compared to all the hype
and all of the things that we thought back in November would immediately change. It's just not living up to that reality.
All right, well, Ryan Kalember, thank you very much for that conversation. That was all very interesting stuff.
Uh, yeah, great stuff, man. Really good to talk to you, really good to see you, and we'll chat to you again soon.
Likewise. A pleasure, Pat. Thank you.