CyberWire Daily - Leaking your AWS API keys, on purpose? [Research Saturday]

Starting point is 00:00:00 You're listening to the CyberWire Network, powered by N2K. Hello, everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems, and protecting ourselves in a rapidly evolving cyberspace.

Starting point is 00:01:47 Thanks for joining us. It was the beginning of the COVID-19 pandemic. Every conference and career networking event had been canceled that I could find. And I created a script that would email companies and ask for free conference swag. That's Noah Pack, an intern with the SANS Internet Storm Center. The research we're discussing today is titled, What Happens When You Accidentally Leak Your AWS API Keys. When I started my time in university, I was doing an introduction to computer science class.

Starting point is 00:02:47 And towards the end of that class, the instructor encouraged all the students to pursue a personal project, something they could post publicly on GitHub or learn from. And I saw a video online of a student at a different university who created a Python script to email universities and ask for free swag. So things like a t-shirt or a mug. And he got a great response. A lot of admissions departments sent him things to encourage him to apply there for grad school. I wanted to take that same idea and adapt it a bit. It was the beginning of the COVID-19 pandemic. Every conference and career networking event had been canceled that I could find. And I created a script that would email companies and ask for free conference swag. So I wrote this up in Python. I found a list of 10 companies that I was fond of and I hoped would respond to me with a

Starting point is 00:03:47 t-shirt or keychain of some kind. I added those company names to my script and it worked flawlessly. I sent it out. I checked the email sent folder and saw all 10 messages. But to celebrate my achievement, I posted this code on GitHub. Shortly thereafter, I started receiving multiple login requests to the email address I'd created for the script to use. That was because I hard-coded,

Starting point is 00:04:19 which means that I put in plain text inside of the code, the username and password for that email account for my script to use. Oh, Noah. Noah, Noah. Dear sweet Noah. And I was just a freshman in computer science, so I hadn't learned safe programming practices yet,

Starting point is 00:04:39 but I certainly learned from the situation. There had been no ill consequence, but if my project had been bigger and I was using AWS or another cloud provider and hard-coded credentials, that could have dire financial consequences. Yeah. So the lesson here, I suppose,

Starting point is 00:04:59 is that moments after you put this hard-coded email address and password up on GitHub, I guess automation from other people had searched that out and just started hammering that email address. That's exactly what happened. It happened within minutes. And I saw the same thing in my research when I published canary tokens on GitHub for AWS API keys. They were snatched up and used immediately by both threat actors and security companies that were monitoring for them. Well, let's dig into the research that you did here.
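The hard-coded email credentials described above are the kind of mistake that reading secrets from the environment can avoid. What follows is a minimal sketch, not the script from the research: the function name and the variable names SWAG_EMAIL_USER and SWAG_EMAIL_PASSWORD are hypothetical, chosen only for illustration.

```python
import os


def load_email_credentials(env=os.environ):
    """Read the account name and password from the environment, not the source."""
    # SWAG_EMAIL_USER and SWAG_EMAIL_PASSWORD are hypothetical variable names.
    try:
        username = env["SWAG_EMAIL_USER"]
        password = env["SWAG_EMAIL_PASSWORD"]
    except KeyError as missing:
        # Fail loudly instead of falling back to a value baked into the code.
        raise RuntimeError(f"Set {missing.args[0]} before running the script") from None
    return username, password


if __name__ == "__main__":
    # An explicit mapping stands in for the real environment in this demo.
    demo_env = {
        "SWAG_EMAIL_USER": "swag@example.com",
        "SWAG_EMAIL_PASSWORD": "not-a-real-password",
    }
    user, _ = load_email_credentials(demo_env)
    print(user)
```

With this shape, the source file that gets pushed to GitHub contains only the names of the variables, never their values.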

Starting point is 00:05:39 I mean, this does involve canary tokens. For folks who may not be familiar with that, how do you describe a canary token? Yeah, so a canary token is sort of like a honeypot. At the Internet Storm Center, we use honeypots, most of which are Raspberry Pis, that look like an attractive target sitting on the Internet for a threat actor to go after. The honeypots record what commands the threat actor uses and what files are downloaded.

Starting point is 00:06:07 Canary tokens are similar but on a much smaller scale. They work really well to supplement the honeypots that we use. Canary tokens can be things like an Excel document, a QR code, or AWS API keys. When a threat actor opens the document, scans the QR code, or uses that API key, it sends an email alert to whoever made the Canary token, alerting them and giving them a little bit of information about how it was used. In my use case of AWS API key tokens, it gave me the IP address and user agent that tried to use those credentials. Well, let's dig into the research here. I mean, what exactly did you set up? So to conduct my research, I added some AWS API key canary tokens to a small, moderately popular e-commerce website

Starting point is 00:07:11 that I helped maintain. It gets roughly a thousand visitors a day. It's enough that it'll come up at the top of a Google search, but someone who's not looking for the things that they're selling probably won't find it. So I didn't really expect this to be found very quickly, and it wasn't. So it took a while before someone picked up those keys and tested them. They could have been picked up much earlier than they were actually tested. But when they were tested, the user agent string was pretty interesting. The person that was testing them was using the Boto3 library, and they were using Python on Windows Subsystem for Linux. And their IP address came from ProtonVPN.
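To show how a user agent string can reveal the library, language, and platform mentioned above, here is a rough sketch that splits one into its parts. The sample string is illustrative only and is not the one from the research. It assumes the space-separated "Name/version" layout that some Boto3 releases have used, and other releases format the string differently. Both function names are hypothetical.

```python
def parse_user_agent(user_agent):
    """Split a 'Name/version Name/version ...' string into a dict."""
    components = {}
    for token in user_agent.split():
        name, separator, version = token.partition("/")
        if separator:  # ignore tokens that have no slash
            components[name] = version
    return components


def looks_like_wsl(components):
    """Guess at WSL: its kernels usually report 'microsoft' in the release string."""
    return "microsoft" in components.get("Linux", "").lower()


# Illustrative value only, assumed for this sketch.
sample = "Boto3/1.26.0 Python/3.10.6 Linux/5.15.90.1-microsoft-standard-WSL2 Botocore/1.29.0"
parsed = parse_user_agent(sample)
print(parsed["Boto3"], parsed["Python"], looks_like_wsl(parsed))
```

A user agent is supplied by the client and is easy to spoof, so details recovered this way are a hint about the tooling, not proof of it.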

Starting point is 00:08:02 So because of the anonymity of a VPN service, it's hard to tie this to any other attacks or to figure out who is actually behind testing this. It could be anyone from a threat actor who's looking to abuse this credential to a security researcher that's just scanning websites. So just so I'm clear here, you had a pre-existing website and within this website, you embedded the Canary token, which to the outside world looked like an AWS API key. That's exactly right.

Starting point is 00:08:39 And I tried to make it look as though a developer might have accidentally left it there. And so who do you suppose was going after this? I mean, was this obviously an automated process here? So I would assume that the key was picked up in an automated process, but that it was manually tested. And that if it were a larger website that receives more traffic, one with a much different threat

Starting point is 00:09:12 profile, full automation, or a different threat actor who uses more automation could pick it up much quicker. I see. Well, you didn't stop here. You posted your AWS key elsewhere. Take us through the next step of the process here. That's right. I also added some AWS API key canary tokens to GitHub. Now, I created a GitHub repository that I knew any security researcher who lays their eyes on it would know is a honeypot. It's there to catch people. The repository was named Canaries and it had a readme

Starting point is 00:09:53 that said something like, this is for some research. If you're a bad guy, try these out. If not, please just ignore. Wow. Okay. And after making that repository public, the requests just flooded in. It was much different to when I embedded them in the website. I ended up having to turn off the alerts just to preserve my email inbox. But the first one came from AWS,

Starting point is 00:10:27 the first attempt at using those credentials. And I didn't touch on this in my research, but when you publish AWS API keys on GitHub, almost before you can even refresh the page, AWS will test those keys themselves. And it's because GitHub has secret scanning built in where they send anything that they think might be an API key to AWS to test it.
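A scanner that looks for things that might be an API key can be approximated with a pattern match. The sketch below is a simplified illustration and not GitHub's or any vendor's actual detection logic. It assumes the common case where a long-term AWS access key ID is "AKIA" followed by 16 uppercase letters or digits, and it uses the placeholder key ID from AWS documentation as sample input. The function name is hypothetical.

```python
import re

# Long-term access key IDs: "AKIA" plus 16 uppercase letters or digits.
ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_access_key_ids(text):
    """Return (line_number, key_id) pairs for each candidate key ID in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample_source = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_access_key_ids(sample_source))
```

A match only says the text has the shape of a key ID. Deciding whether the key is live requires testing it, which is the step that triggered the alerts in this research.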

Starting point is 00:10:59 And AWS will take action. If it's a real API key and not a Canary token, they'll send you an email with an urgent subject line. Action required. Your AWS access key is exposed for AWS account, and then it will list your account number. But not even seconds after AWS tested the key, I got a ton of requests from a company called GitGuardian.

Starting point is 00:11:30 Now, they have a service that will scan public and private repositories for your secrets. It's a paid service. And they tested the keys multiple times within the first few minutes to verify them, all from similar IP addresses in Canada.

Starting point is 00:12:24 So they're looking for your business here. They're saying, hey, look what we did.

Starting point is 00:13:12 We found this. Look how quickly we found this thing. And if you use our service, we'll help you protect against this sort of error. Not necessarily, because they don't reach out to you in any way like AWS did, but they certainly did see it right away and test it out to protect their customers. Oh, interesting. Okay. Yeah, that would be a great marketing strategy though. No, clearly I assumed too much, but that's interesting. So at this point, I mean, you're getting hammered. You said you had to turn off your email because everything's flooding in. Right. So this key got a ton of alerts right away.

Starting point is 00:14:00 Almost all of them were from GitGuardian. There was also the request from AWS and a couple from IP addresses that had been seen doing similar things and scanning the internet. And so what was your response then? I mean, you see the degree to which this has triggered all of this activity. As a researcher, what do you do next? Yeah, so my next step was to remove the GitHub repository. I had gotten the results that I wanted from my research. I found out that if you publish your AWS API keys on GitHub, they will be used. If you publish them on your website, they will be used.

Starting point is 00:14:42 They might take a couple of seconds. It might take a couple of days. But we also don't know the difference between when they're picked up and when they're used. You might be able to rewrite your GitHub repository history and erase those API keys. But someone might still have access to them. They might have downloaded your repository

Starting point is 00:15:04 or the source to your website or scanned your website before they use those keys. So the best practice is definitely to rotate them, to remove all permissions from those keys and create new keys with the permissions that your code needs. Yeah, that was going to be my next question. So once you had removed the information from GitHub, were those keys still being activated? Were people still trying to hammer away using

Starting point is 00:15:31 those credentials? They were. It took about an hour after removing the repository before my last alert came in. So you could chalk that up to someone having the repository open or downloading it before looking through it. Perhaps their scanner that they're using to find these leaked secrets has a bit of a delay or a bit of a backlog from other code that's being uploaded. I wonder if they'll ever get hit again. Are there folks out there who will grab this and then say, okay, well, clearly this person realized they had a problem,

Starting point is 00:16:11 but we're going to check again in a month just in case. Oh, I am extremely excited. I will be extremely excited if I see that because that would be so cool. We know that a lot of threat actors like to lie in wait on networks before they execute their attack. So I'm sure a similar thing is possible here. Yeah.

Starting point is 00:16:36 Well, I mean, I think that the lessons here are pretty clear. I mean, how do you sum them up in terms of the things that you've learned? Yeah, so leaking your AWS API keys or any credentials is an extremely big deal. According to Verizon's 2023 Data Breach Investigations Report, 61% of data breaches were due to leaked credentials. And while leaking credentials might seem kind of silly, it seems like a fixable problem. I mean, it's the equivalent to leaving keys to a building

Starting point is 00:17:13 in the parking lot. But it really is harder to stop than you might think. Users reuse usernames and passwords on sites that are breached. Anyone can fall for a social engineering attack. Even experienced developers can accidentally publish credentials. And those are just three of many reasons that this issue exists and why it's so prevalent and why entire companies like Truffle Security and GitGuardian exist to solve this problem. I've seen horror stories from small businesses that had their AWS account hacked, and the attackers racked up bills in excess of $300,000 before the developers could figure out how to rotate those keys and mitigate the problem because they didn't have the incident

Starting point is 00:18:06 response experience and they didn't have tools integrated into their code pipeline to find these secrets and stop them from being published. It's a really good reminder of what I, you know, I suspect there are folks in our audience who are just nodding along and saying, you know, what a basic straightforward thing this is. And yet, as you say, despite that, it does happen to so many people. It happens all the time. There was a cryptocurrency. It was sort of what some people call a meme coin back in, I think, 2022 called Shiba Inu coin. And the developers had a code repository on GitHub where they accidentally leaked their AWS credentials.

Starting point is 00:18:54 Luckily, some security researchers who are fans of the crypto project found them. And unfortunately, they had no way to contact the developers. There was no bug bounty program. There was no security.txt on their website. And those researchers noticed that after a few days, the AWS API credentials were revoked. They stopped working, which means that either they did get a hold of someone at Shiba Inu or the people at Shiba Inu noticed that someone else,

Starting point is 00:19:27 maybe a bad actor, was using those credentials. Right. What's your advice for folks to help mitigate something like this if it does happen? The first advice I ever heard on how to mitigate this issue is actually bad advice. And that would be to rewrite your code repository history on GitHub. That's because things like the Wayback Machine exist, and you don't know if somebody's downloaded that code with the API keys in it. So the better idea is to rotate those keys. You could also do things like looking at your CloudTrail logs in AWS

Starting point is 00:20:04 or set up alerts. At SANS, we like to say that prevention is preferred, detection is a must. So finding out that those keys were accessed is extremely important. Teaching secure coding practices is also probably the best and easiest way to prevent this. This includes avoiding the git command, git add, and a wildcard, because that can very easily add sensitive files to your repository. Name the files that contain sensitive information in your .gitignore and your .npmignore files.
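The advice above about wildcards and ignore files can be backed up with a small check that runs before a commit. This is a sketch under stated assumptions: the pattern list and the function name are hypothetical, and the list of staged paths is hard-coded here, though in practice it could come from the output of `git diff --cached --name-only`.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns; adjust to whatever holds secrets in a given project.
SENSITIVE_PATTERNS = [".env", "*.pem", "*.key", "credentials", "secrets.*"]


def sensitive_paths(paths, patterns=SENSITIVE_PATTERNS):
    """Return the paths whose file name matches any sensitive pattern."""
    flagged = []
    for path in paths:
        # Match on the final path component so nested files are caught too.
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            flagged.append(path)
    return flagged


# Stand-in for the list of files staged for commit.
staged = ["app.py", ".env", "deploy/server.pem", "README.md", ".aws/credentials"]
print(sensitive_paths(staged))
```

A check like this complements an ignore file instead of replacing it, since it catches files that were staged by name as well as by wildcard.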

Starting point is 00:20:43 Those are sort of like the robots.txt of your website, but for Git. And then as a threat hunter, one of the techniques that I really like to use is to take a baseline of something. So on a network, I would take a packet capture and look at all of the traffic for the network, slowly eliminating things that I know aren't bad. And at the end, I'll end up with just the network traffic that could be malicious. And I'll have a bunch of filters that will filter out all the stuff that I know is good. Then I can dig into those things that are bad and do the same exercise again in a month or a week or a quarter and add those same filters and find the new traffic that's bad. That same concept can apply

Starting point is 00:21:32 to any log type, including logs from your cloud provider. So look at those CloudTrail logs, understand what is supposed to be running in your AWS account, who is supposed to be running what, and look for services you don't recognize. There are over 200 AWS services at this point, so it's hard to know them all. But you can at least know what ones you use and everything else you can assume is something that you don't and you can dig into it more. Our thanks to Noah Pack from the SANS Internet Storm Center for joining us. The research is titled What Happens When You Accidentally

Starting point is 00:22:18 Leak Your AWS API Keys? We'll have a link in the show notes.

Starting point is 00:23:04 The CyberWire Research Saturday podcast is a production of N2K Networks. N2K Strategic Workforce Intelligence optimizes the value of your biggest investment, your people. are Jennifer Iben and Brandon Karp. Our executive editor is Peter Kilby, and I'm Dave Bittner. Thanks for listening. We'll see you back here next time.
