CyberWire Daily - Crypto client or cyber trap? [Research Saturday]
Episode Date: January 4, 2025
Karlo Zanki, Reverse Engineer at ReversingLabs, discussing their work on "Malicious PyPI crypto pay package aiocpa implants infostealer code." ReversingLabs' machine learning-based threat hunting system identified a malicious PyPI package, aiocpa, designed to exfiltrate cryptocurrency wallet information. Unlike typical attacks involving typosquatting, the attackers published a seemingly legitimate crypto client tool to build trust before introducing malicious updates. ReversingLabs used its Spectra Assure platform to detect behavioral anomalies and worked with PyPI to remove the package, highlighting the growing need for advanced supply chain security tools to counter increasingly sophisticated threats. The research can be found here: Malicious PyPI crypto pay package aiocpa implants infostealer code
 Transcript
    
You're listening to the CyberWire Network, powered by N2K.

I was concerned about my data being sold by data brokers, so I decided to try DeleteMe. I have to say, DeleteMe is a game changer. Within days of signing up, they started removing my personal information from hundreds of data brokers. I finally have peace of mind knowing my data privacy is protected. DeleteMe's team does all the work for you, with detailed reports so you know exactly

Hello, everyone, and welcome to the CyberWire's Research Saturday.
                                         
                                         I'm Dave Bittner, and this is our weekly conversation with researchers and analysts
                                         
                                         tracking down the threats and vulnerabilities, solving some of the hard problems,
                                         
                                         and protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.
                                         
So in this case, the detection was triggered by a machine learning model, and we have a review procedure for those detections to see which are true positives and determine what type of malware we have in this case.
                                         
That's Karlo Zanki, reverse engineer at ReversingLabs.
                                         
                                         The research we're discussing today is titled
                                         
Malicious PyPI Crypto Pay Package aiocpa Implants Infostealer Code.
                                         
                                         So you get that indication from the automation.
                                         
                                         And what motivated your team to dig deeper into the package?
                                         
                                         Well, in this case, we have a package we have previously seen to have clean versions.
                                         
So usually, when we encounter malware in public package repositories, we encounter new packages.
                                         
                                         So somebody publishes a new package, which is immediately malicious,
                                         
                                         the first or second version.
                                         
                                         In this case, we had a package that was, let's say,
                                         
                                         maintained for some time, for several months,
                                         
                                         since I believe beginning of September or something like that.
                                         
                                         And usually, when you see such a package that has several months of maintenance,
                                         
                                         they are not always true positives.
                                         
    
                                         More often it happens that they are false positives.
                                         
                                         So those are packages that use malicious-looking behaviors
                                         
                                         in a legitimate way.
                                         
But in this case, we took a detailed look and spotted a commonly observed obfuscation pattern of Base64 encoding and zlib compression. So when we encounter something like that, even though you sometimes find something using it for legitimate purposes, most often it is malware.
                                         
    
                                         Well, what was the process for deobfuscating the code,
                                         
                                         and what sort of things did you uncover
                                         
                                         once that obfuscation was stripped away?
                                         
Well, this is not... it's Python, it's a scripting language, so it's fast to write your own code to decode something like this. It's basically several rounds of Base64 decoding, reversing the string, and doing zlib decompression. So you do what the attacker did, just in reverse order. So it's not hard to perform the deobfuscation if you have a little coding experience.
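As a rough illustration of the decoding described here, a minimal Python sketch might look like the following. The layer order (reverse, Base64, zlib), the number of rounds, and the function names are assumptions made for the example, not the actual layout of the aiocpa payload.

```python
import base64
import zlib


def obfuscate(payload: bytes, rounds: int = 3) -> bytes:
    """Hypothetical stand-in for the attacker's side: compress, encode, reverse."""
    data = payload
    for _ in range(rounds):
        data = base64.b64encode(zlib.compress(data))[::-1]
    return data


def deobfuscate(blob: bytes, rounds: int = 3) -> bytes:
    """Undo each layer in the opposite order: un-reverse, Base64-decode, decompress."""
    data = blob
    for _ in range(rounds):
        data = zlib.decompress(base64.b64decode(data[::-1]))
    return data


# Harmless sample text stands in for the hidden code.
sample = b"print('hello from the hidden layer')"
blob = obfuscate(sample)
recovered = deobfuscate(blob)
print(recovered.decode())
```

The recovered bytes are only printed for reading. An analyst would not pass them to exec or eval.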
                                         
And what we observed is that we had malicious code aiming to steal sensitive information relating to crypto trading. So the goal was financial gain. Lately, we most often see threat actors trying to steal cryptocurrencies and secrets related to cryptocurrency trading to quickly achieve financial gain.
                                         
I mean, looking at some of the threat actors' tactics, how was the aiocpa campaign different from some of the more common supply chain attacks? You know, things like typosquatting or impersonation. Were there some differentiators here?
                                         
                                         Yeah.
                                         
Well, as you said, typosquatting is the most common way. You take a popular package and typosquat it, or try to impersonate it but add a suffix to the name, let's say "legacy." I don't know, take something related to Bitcoin, Ethereum, or anything like that, and add some suffix to make it look like a fork of a legitimate project, and hope that someone will use it. And those are the two most common ways of infecting targets.
                                         
    
But in this case, we had a developer creating his own crypto trading tool, likely forking from some other legitimate, previously deployed tool, and waiting for some time, several months in this case, to build up a user base and then publish a malicious version to them.

What was also interesting is that a day or two after the initial version of this package was published to PyPI, the developer behind this package tried to take over another, already existing PyPI package named pay. So they were likely trying to get a better-sounding name. It's probably more likely someone will use a package named pay than A-I-O-C-P-A, or however you spell it.
                                         
                                         Right.
                                         
                                         Well, how was the malicious code
                                         
                                         injected into the PyPI package
                                         
                                         without being added to the GitHub repository?
                                         
    
Well, that's not really hard to do. In most cases, developers don't set up automated publishing to GitHub, and you can separate your publishing process. If you control your publishing process, you can separate it however you wish. So you have PyPI access tokens, you can publish source code to GitHub, and then take your source code, add some malicious code to it, package it in the PyPI package format, and publish that version to PyPI.
                                         
It's not something we have never seen before. Often there are crypto miners in npm packages which didn't have the crypto mining functionality in the GitHub repository source code, because somebody expects that you will take a look at their source code and you won't find it there. And threat actors know that it's harder to analyze compiled packages, in this case PyPI packages, which are in binary format, than to look at source code on GitHub, which is created to make it as comfortable as possible to perform code review.
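One simple way to look for this kind of gap between a published package and its repository is to compare the two file trees by hash. The sketch below assumes the package has already been unpacked and the repository already checked out into local directories. The function names and the tiny demo files are made up for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hashes(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(published: Path, repo: Path) -> dict[str, list[str]]:
    """List files that exist only in the published tree or whose contents differ."""
    pub, src = file_hashes(published), file_hashes(repo)
    return {
        "only_in_published": sorted(set(pub) - set(src)),
        "changed": sorted(k for k in pub.keys() & src.keys() if pub[k] != src[k]),
    }


def demo() -> dict[str, list[str]]:
    """Build two tiny throwaway trees and compare them (file names are made up)."""
    with tempfile.TemporaryDirectory() as tmp:
        repo, published = Path(tmp, "repo"), Path(tmp, "published")
        for root in (repo, published):
            (root / "pkg").mkdir(parents=True)
            (root / "pkg" / "__init__.py").write_text("VERSION = '1.0'\n")
        (repo / "pkg" / "client.py").write_text("def pay():\n    return 'ok'\n")
        (published / "pkg" / "client.py").write_text(
            "import zlib\n\ndef pay():\n    return 'ok'\n"
        )
        (published / "pkg" / "extra.py").write_text("# not in the repository\n")
        return compare_trees(published, repo)


report = demo()
print(report)
```

Published packages often legitimately contain generated files such as metadata, so a difference is a lead to review by hand, not a verdict.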
                                         
                                         Can you help us understand
                                         
                                         how your use of differential analysis
                                         
                                         between those package versions helped reveal the malicious behavior?
                                         
Yeah, well, behavior indicators are our proprietary technology, and they are based on observing a low-level set of behaviors and creating threat hunting rules based on them. So basically, you run a PyPI package through a static analysis engine and it extracts behavior indicators. That's what we call them. They're basically textual descriptions of what the code does. So if some part of the code uses one of the HTTP libraries, it will say, okay, this code is capable of performing, I don't know, HTTP requests. In this case, we had behavior indicators for decoding data with the Base64 algorithm and importing the zlib module, which is used for zlib compression and decompression. And we also had unusually long strings present in the source code.
                                         
So those three indicators on their own aren't necessarily specific, but when you put them together, and on top of that you have expression execution in the code, that's something that you want to look at.
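To make the idea of extracted behaviors concrete, here is a minimal sketch using Python's standard ast module. It is not the ReversingLabs engine. The indicator wording, the length threshold, and the sample snippet are all assumptions chosen for illustration. The source is only parsed, never run.

```python
import ast

LONG_STRING_THRESHOLD = 200  # arbitrary cutoff chosen for this sketch


def extract_indicators(source: str) -> set[str]:
    """Return plain-text descriptions of a few behaviors seen in the source."""
    indicators: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        # Imports of the zlib module (either "import zlib" or "from zlib import ...").
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        if any(name.split(".")[0] == "zlib" for name in modules):
            indicators.add("imports zlib module")

        # Calls such as base64.b64decode(...), exec(...), eval(...).
        if isinstance(node, ast.Call):
            func = node.func
            called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", "")
            if called == "b64decode":
                indicators.add("decodes data with base64")
            elif called in {"exec", "eval"}:
                indicators.add("executes dynamically built code")

        # String or bytes literals far longer than typical source text.
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (str, bytes))
            and len(node.value) > LONG_STRING_THRESHOLD
        ):
            indicators.add("contains unusually long string")
    return indicators


# Made-up sample resembling the pattern described in the interview.
SAMPLE = (
    "import base64, zlib\n"
    "blob = b'" + "A" * 300 + "'\n"
    "exec(zlib.decompress(base64.b64decode(blob[::-1])))\n"
)
found = extract_indicators(SAMPLE)
print(sorted(found))
```

Each indicator is weak on its own. A rule would fire only when several of them appear together in one file.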
                                         
And here, differential behavior analysis is the feature of our tool which enables you to give two package versions to the engine, and it performs the comparison at the file level. So it takes equally named files on equal paths from both packages. So like in GitHub, where you have a version diff on the source code, in this case you have a version diff on the extracted behaviors. So basically, it compares the same files from two different versions of the package and tells you what behaviors have changed in those files.
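A file-level behavior diff of this kind can be sketched in a few lines of Python. The version data below is invented for the example, and the mapping from file path to a set of behavior descriptions is an assumed input format, not the format used by the tool discussed in the interview.

```python
def diff_behaviors(
    old: dict[str, set[str]], new: dict[str, set[str]]
) -> dict[str, dict[str, list[str]]]:
    """For files present at the same path in both versions, report changed behaviors."""
    report: dict[str, dict[str, list[str]]] = {}
    for path in sorted(old.keys() & new.keys()):
        added = new[path] - old[path]
        removed = old[path] - new[path]
        if added or removed:
            report[path] = {"added": sorted(added), "removed": sorted(removed)}
    return report


# Hypothetical behaviors for two versions of the same package.
previous_version = {
    "pkg/client.py": {"performs HTTP requests"},
    "pkg/utils.py": {"reads environment variables"},
}
new_version = {
    "pkg/client.py": {
        "performs HTTP requests",
        "decodes data with base64",
        "imports zlib module",
        "executes dynamically built code",
    },
    "pkg/utils.py": {"reads environment variables"},
}

changes = diff_behaviors(previous_version, new_version)
for path, delta in changes.items():
    print(path, "added:", delta["added"], "removed:", delta["removed"])
```

Only files whose behavior set changed show up in the report, so a reviewer reads a short list of new capabilities and does not have to read a full source diff.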
                                         
You don't have to understand the source code to get a sense of what's happening, of what has been introduced in the new version. So somebody perhaps wouldn't know what the requests library does. This way you have an easy textual explanation, which tells you, okay, this is used to perform HTTP requests. So basically, you compare two versions of a package, and it gives you a way to see what behaviors have been introduced in the new version.
                                         
                                         We'll be right back.
                                         
                                         Do you know the status of your compliance controls right now?
                                         
                                         Like, right now.
                                         
                                         We know that real-time visibility is critical for security,
                                         
                                         but when it comes to our GRC programs,
                                         
                                         we rely on point-in-time checks. But get this. More than 8,000 companies like Atlassian and Quora
                                         
    
                                         have continuous visibility into their controls with Vanta. Here's the gist. Vanta brings automation
                                         
                                         to evidence collection across 30 frameworks, like SOC 2 and ISO 27001.
                                         
They also centralize key workflows like policies, access reviews, and reporting, and help you get security questionnaires done five times faster with AI.
                                         
                                         Now that's a new way to GRC.
                                         
                                         Get $1,000 off Vanta when you go to vanta.com slash cyber.
                                         
    
                                         That's vanta.com slash cyber for $1,000 off.
                                         
                                         And now a message from Black Cloak.
                                         
                                         
                                         Did you know the easiest way for cybercriminals to bypass your company's defenses is by targeting your executives and their families at home?
                                         
                                         Black Cloak's award-winning digital executive protection platform secures their personal devices, home networks, and connected lives.
                                         
                                         Because when executives are compromised at home, your company is at risk.
                                         
                                         In fact, over one-third of new members discover they've already been breached.
                                         
                                         Protect your executives and their families 24-7, 365, with Black Cloak.
                                         
    
Learn more at blackcloak.io.

Well, looking at the potential impact here, I mean, what was the harm that this package could have caused had you all not discovered it?
                                         
Well, basically, it could result in financial loss for the users who installed this package and used it in their projects. So basically, it's theft of cryptocurrency.
                                         
                                         And what sort of insights does this research give everyone
                                         
                                         on some of the risks of relying on open-source repositories
                                         
                                         in the software supply chain?
                                         
    
Well, the risks are emerging on a yearly basis. It's not just the amount of malware present there, but it's also the level of sophistication that we see each new year. A year ago, the main attack vector was typosquatting and, let's say, malicious actors didn't put a lot of effort into developing sophisticated malicious packages, so they just hoped that somebody would hop onto their package and get it installed. As time passes, the attackers get more sophisticated. We have seen two really sophisticated attacks in the last two weeks, targeting cryptocurrency and artificial intelligence, ML tools, Ultralytics and so on, web packages on npm and PyPI, respectively.
                                         
So in those cases, the build environments were compromised, and those types of attacks are even harder to detect. And since they have a big user base, like, I don't know, the Ultralytics package had, let's say, 60 million downloads, I believe. That's a big user base to quickly deploy your malware to. And that's a pretty high risk.
                                         
And in this case, even smaller package projects which quickly grow a user community are also a good vector to infect a large number of users with low effort and almost no cost. Basically, you don't even need to host your own infrastructure. You can use an open source package repository for distribution, and you can use GitHub for data exfiltration, or Dropbox, or any tool that can be used to upload or download files.
                                         
                                         Well, what was the process of
                                         
                                         reporting this package to PyPI and how did they respond?
                                         
                                         Well, the PyPI community and people behind it
                                         
                                         invest a lot of time to improve the security
                                         
                                         of the entire ecosystem.
                                         
                                         And I believe somewhere in January, February,
                                         
    
                                         or March of this year,
                                         
                                         they introduced a reporting feature
                                         
                                         to their package repository.
                                         
                                         So basically you have a button where you can report malware
                                         
                                         and point to the line of code where you found the malware,
                                         
                                         and they put that package into quarantine until they determine
                                         
                                         if the package is truly malicious
                                         
                                         or whether it was a false positive report.
                                         
    
                                         And their response was quick.
                                         
                                         I believe in just a few hours,
                                         
                                         the package was quarantined.
                                         
                                         And a few days later, it was removed.
                                         
                                         And we had really good communication
                                         
                                         with the security team behind that product.
                                         
                                         That's good to hear. What are your recommendations then? I mean, based on
                                         
                                         the research here, what should people do to better protect themselves?
                                         
    
                                         Well, as I said, there are various package repositories. Not all package repositories
                                         
                                         put the same effort into improving the security of the environment they operate in.
                                         
                                         So basically, on most repositories, it's up to you to make sure that something you're installing is not malicious.
                                         
                                         You should double check everything you have in your code base. So basically, security vet
                                         
                                         everything you plan to use
                                         
                                         from open source package repositories
                                         
                                         because even trustworthy packages
                                         
                                         with millions of downloads
                                         
    
                                         can quickly become malicious
                                         
                                         if somebody compromises
                                         
                                         either the deployment account

                                         or the deployment environment.
                                         
                                         So you can't have trust based on good reputation.
                                         
                                         It's not enough anymore.
                                         
                                         Too many compromises have already happened.
                                         
                                         You need to perform security vetting from your side.
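One small, concrete piece of vetting on your own side is pinning: once you have reviewed a specific package file, record its SHA-256 digest and refuse anything that does not match it later. The following is only a minimal sketch of that idea using the Python standard library; the function names `sha256_of` and `matches_pin` are hypothetical and do not come from the episode or from any tool mentioned in it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, pinned_hex: str) -> bool:
    """True only if the file's digest equals the digest recorded at review time."""
    # Hypothetical helper: tolerate surrounding whitespace and uppercase hex in the pin.
    return sha256_of(path) == pinned_hex.strip().lower()
```

A check like this only tells you that a file is the same one you reviewed; it says nothing about whether that file was safe in the first place. pip offers a comparable built-in check through its `--require-hashes` mode.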
                                         
    
                                         If you don't have enough resources or security knowledge,

                                         you should use dedicated tools to perform that security vetting.

                                         ReversingLabs has Spectra Assure for Community,

                                         which is a free repository where you can see and check

                                         if there is anything known to be malicious about some package.

                                         You can go there and check it.
                                         
                                         If you use hundreds or thousands of projects,
                                         
                                         you likely want to use some commercial tool
                                         
                                         or commercial version of a tool to double check
                                         
    
                                         that your dependencies are clear of malicious content
                                         
                                         and also that you are not introducing
                                         
                                         some type of vulnerability into your software project, because,

                                         let's get real,

                                         projects are rarely built from one

                                         or two libraries.

                                         Usually there are dozens or

                                         hundreds of

                                         open source libraries in the dependency

                                         tree, and it's hard for

                                         a development organization

                                         to security vet everything

                                         there is in your open source luggage.
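To get a feel for how large that dependency tree is, you can walk the requirements that installed packages declare. The sketch below is an illustration only, assuming a Python environment: `dependency_closure`, `installed_requires`, and `normalize` are hypothetical names, and the walk deliberately over-approximates by following every declared requirement, including optional extras and platform-specific ones.

```python
import re
from importlib import metadata
from typing import Callable, Iterable, Optional, Set

# Leading project name of a requirement string such as "urllib3<3; python_version >= '3'".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a project name (lowercase, runs of '-', '_', '.' collapsed to '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_requires(name: str) -> Optional[Iterable[str]]:
    """Requirement strings declared by an installed distribution, or None if unavailable."""
    try:
        return metadata.requires(name)
    except metadata.PackageNotFoundError:
        return None


def dependency_closure(
    root: str,
    requires: Callable[[str], Optional[Iterable[str]]] = installed_requires,
) -> Set[str]:
    """Return every package name reachable from root through declared requirements."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        current = normalize(stack.pop())
        if current in seen:
            continue  # already visited; this also guards against dependency cycles
        seen.add(current)
        for requirement in requires(current) or ():
            match = _NAME.match(requirement)
            if match:
                stack.append(match.group(1))
    seen.discard(normalize(root))
    return seen
```

Calling `len(dependency_closure("some-package"))` for a package installed in your environment gives a rough count of how many projects you would have to vet for that single direct dependency.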
                                         
                                         Our thanks to Karlo Zanki from ReversingLabs for joining us.

                                         The research is titled "Malicious PyPI crypto pay package aiocpa implants infostealer code."
                                         
                                         We'll have a link in the show notes.
                                         
    
                                         We'd love to know what you think of this podcast.
                                         
                                         Your feedback ensures we deliver the insights that keep you a step ahead in the rapidly changing world of cybersecurity.
                                         
                                         If you like our show, please share a rating and review in your favorite podcast app. Please also fill out the survey in the show notes
                                         
                                         or send an email to cyberwire@n2k.com. We're privileged that N2K CyberWire is part of the
                                         
                                         daily routine of the most influential leaders and operators in the public and private sector,
                                         
                                         from the Fortune 500 to many of the world's preeminent intelligence and law enforcement [...] by Elliott Peltzman and Tré Hester. Our executive producer is Jennifer Eiben. Our executive editor is Brandon Karpf.
                                         
                                         Simone Petrella is our president.
                                         
                                         Peter Kilpe is our publisher.
                                         
    
                                         And I'm Dave Bittner.
                                         
                                         Thanks for listening.
                                         
                                         We'll see you back here next time.
                                         
