CyberWire Daily - Hiding in plain sight with vibe coding.

Starting point is 00:00:00 You're listening to the CyberWire Network, powered by N2K. And now a word from our sponsor, ThreatLocker. Keeping your system secure shouldn't mean constantly reacting to threats. ThreatLocker helps you take a different approach by giving you full control over what software can run in your environment. If it's not approved, it doesn't run. Simple as that. It's a way to stop ransomware and other attacks before they start without adding extra complexity

Starting point is 00:00:37 to your day. See how Threat Locker can help you lock down your environment at www.threatlocker.com. Hello everyone and welcome to the CyberWires Research Saturday. I'm Dave Bittner and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems and protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us. Pillar Security, we spent the last year and a half spending a lot of time with the emerging

Starting point is 00:01:27 attack vectors that put AI-powered applications at risk. So first of all, we got to learn and get our hands around new attack vectors such as prompt injection, indirect injections, and all sorts of evasion techniques that turn these attacks to be basically invisible to the human eye and most of the security tools out there. That's Ziv Karlener, Pillar Security's co-founder and CTO. The research we're discussing today is titled New Vulnerability in GitHub Copilot and Cursor. How hackers can weaponize code agents. So take that together with the fact that we ourselves are utilizing these amazing coding copilots that on their own are utilizing LLM and its base.

Starting point is 00:02:27 We got us thinking about how the combination of the new attack vectors and the actual, I would say, some of the most popular use cases for the AI-powered applications, which are coding assistants, how this really combines together and sparked our imagination about what can potentially go wrong. Well, at the root of this is what you all refer to as the rules file back door. Can you describe that for us? What exactly are we talking about here? Sure, so maybe one step back. So what are rule files? So think about coding agents this day. You can think about them as another engineer, developer that joined the theme and now helps you

Starting point is 00:03:20 complete a project much quicker. So rule files are basically a way to onboard the coding agent to your project, to your team, to tell it what are the best practices that are being used in a project, what software stack are we using, specific syntax or any guidance and context that is relevant just to the project that we are working on right now. So think about this the first day and in the job for a new developer that joins the team. So that will be the rule files, basically text files

Starting point is 00:03:59 that these coding assistants allow users to define that contain all of the examples and the instructions of how to write code in the best way that suits the project in scope. So these are rule files. The interesting thing when you think about it, and this is basically context, additional context that is being fed into the conversation flow

Starting point is 00:04:28 with the coding agent. And really it's part of the instructions. It's part of the instruction layer, the context layer that is taken into account when the model takes in a request to write new code. This context is added to it before the developer gets back the code suggestions and edits. A rule file backdoor is basically when attackers can embed

Starting point is 00:04:58 malicious instructions in this context that impact any code that is being generated by the coding assistant to create actual backdoors in the generated code. So this is what we shown in example. On its own, it sounds pretty straightforward maybe to protect. But what we uncovered in our research is that, first of all, you have marketplaces. You have now open source marketplaces where rule files are being shared between organizations, which creates a supply chain vector, combined with the fact that you can add hidden instructions.

Starting point is 00:05:40 That's the, I would say, the second risky part here. The some kind of technique that is called hidden Unicode characters, which basically means that when developers look at the rule file, it looks completely legitimate, but it actually contains hidden instructions that only the AI agent understands and acts on. So that's really the, I would say like the perfect scenario where you can hide in plain sight in some of these marketplaces and compromise the underlying developers

Starting point is 00:06:16 that are taking these rule files to improve their projects. Well, can we walk through an example here? I mean, suppose an attacker wants to make use of this. Let's go through the process of how they would go about doing that. For sure. So in our research, we walked through a simple example, a step-by-step example.

Starting point is 00:06:42 So for instance, let's think about an attacker that wants to compromise any Next.js application and how you can do that. So basically the marketplaces for rule files will have directories with basically you can think about it as a directory of every available coding stack. And you can actually commit and add suggestions to these marketplaces and basically hubs of rule files that are being shared between developers.

Starting point is 00:07:23 So let's take the Next.js example. I will go as an attacker to this repo. I will craft a legitimately looking instruction file about Next.js best practices, and I will embed hidden text into this file using the hidden Unicode characters technique. We'll commit this, let's say, to GitHub or with some kind of a web form to this marketplace.

Starting point is 00:07:49 What we also uncovered during the research is that in GitHub itself, it was invisible. Basically, when you commit code that contains these hidden instructions, a developer that is now going to approve this basically addition request is not going to see anything, is not going to get alerted. This is actually something that was solved by GitHub

Starting point is 00:08:15 early this month in one of the vulnerability patches. So now we have this rule file with hidden instructions live on GitHub. And unsuspecting developer that wants to get better results with his coding project when using cursor or GitHub is basically copying this file and adding it to its own project, also sharing it with his team just in order to improve the quality of code for the full team. And now when it's going to, let's say, request an addition of a simple page to his application, the rule file that contains instructions

Starting point is 00:08:56 to add basically malicious JavaScript code to each new HTML file that is being created, it's going to happen only when the agent loads this file, takes in the additional hidden instructions, and generates the additional code on the fly. Now, the interesting thing that we showed on our research paper is basically that in the attack itself, in the malicious instructions, an attacker could also use the agent, I would say intelligence, to its advantage. So what we've shown is that a

Starting point is 00:09:36 developer can then ask, hey, why this code snippet was added to the code that was generated? And the AI agent will say, oh, this is the security best practices of our organization. So the attacker instructions could actually be used not only to inject malicious code, it's also being used to trick the user, kind of social engineer it, to believe that this was the goal in the first place.

Starting point is 00:10:05 So this is utilizing the AI agent intelligence against the end user. This was, I would say, the most interesting finding for us. We'll be right back. We'll be right back. So the hidden instructions using Unicode can also include instructions to mislead someone who's inquiring as to why things are a certain way. Exactly. So I can add on that some of the most I would say popular terms these days is human in the loop. So human in the loop is basically when we're talking about responsibility models and how autonomous agents will be part of the future workforce. So human in the loop is the point in the autonomous processes where an AI agent goes back and asks for approval from the user that tries to achieve some kind of goal. So in this case, most of the coding agents these days, when doing I would, more risky actions like changing, deleting a file or

Starting point is 00:11:27 creating a web request, they will actually stop and ask the user, are you sure you want to complete the next action? This is like the classic human in the loop flow. So one of the things that we've shown here in the blog is basically that if the attack itself is completely hidden to a human are humans really equipped to be in the loop? That's one of the thoughts that got got us more concerned I would say. A lot of the responsibility is moving to the users, but are we actually equipped to deal with this kind of attacks? I mean, it really speaks to that kind of inherent inability to view inside what's really going

Starting point is 00:12:18 on an AI assistant, right? Exactly. And even if you think you're seeing what is going on, the assistants understand, I would say, every language that was ever spoken or written together with hidden Unicode characters, encoded strings like Bay64. For instance, they just understand it as plain English without the need to compute or run any additional processes. So we are kind of not in an even situation between the, I would say, the auditor, which is now basically

Starting point is 00:12:57 every person that needs to observe and kind of decide if an AI agent is allowed or not allowed to do something, and the agents themselves. So that's like, that goes beyond the coding agents, I would say. Well, let's talk about mitigation. What sort of steps can developers take to detect and prevent these sorts of things? Of course. So first of all, I would say as silly as it may sound, sanitation. So think about reducing basically the input options that you have when interacting with the model, even in the language level.

Starting point is 00:13:49 I can actually describe another mitigation that lucky for us as a developer community was actually taken by GitHub based on this research, which is they actually added a new capability in GitHub itself to alert and basically show a warning message whenever there is a hidden instruction or hidden Unicode text that is now part of a text file that is going to be edited. So this is, I would say, kind of a text file that is going to be edited. So this is, I would say, a risk reduction effort

Starting point is 00:14:29 that has been released for every developer that uses GitHub, which is almost everyone. Another part, which is more on the agent builder side, is to take into place different guardrails that can be placed around the models when interacting with them. So for instance, detection of evasion techniques, detection of malicious instructions, jailbreak attempts, and indirect injection attacks, which are part of these new attack vectors that are really becoming more and more relevant with AI-powered applications. There are some great work around uncovering this full attack surface

Starting point is 00:15:16 with OWASP Top 10 for LLMs and MITRE Atlas and other great initiatives that really talk this new risk language and create the right terminology around it. So I would say awareness is the first step as well. What do you suppose this vulnerability reveals about the current state of things? When it comes to AI integration and software development, which I think it's fair to say there's a lot of enthusiasm for it. It's certainly a powerful tool.

Starting point is 00:15:52 And yet we have these things. I mean, is it still early enough days that there's lots to be, these things are important to consider as we go forward. For sure. So, we're still in the early days, but I would say, coming actually myself from experiencing the cloud security space and also in the software supply chain security space, we had, I would say amazing progress with software supply chain security space. We had, I would say, amazing progress with software supply chain security over the last decade with S-bombs becoming a standard and the vulnerability programs. We put a lot of guardrails

Starting point is 00:16:37 inside the CI-CD pipelines and got, I would say, a lot of awareness around it. And on the other hand, we now have this amazing phenomena of, I would call it like the intelligence age, the AI transformation that doesn't leave any, I would say, vertical in the industry or role untouched, but is moving really fast. So there is kind of a challenge here when both the attack vectors are being discovered as we go, but adoption is moving faster than I have ever seen in my career.

Starting point is 00:17:20 So it's a combination, I would say both, and I would say like the security industry in general, you see a lot of awareness, a lot of community efforts to really surface these new emerging threats. Even before we saw, you know, attack vectors being utilized in the wild. I can give an example that one of the accelerators for safer CI-CD pipelines or solar wind

Starting point is 00:17:50 that we are all familiar with. So this really didn't happen yet in the AI security space. I guess it's as always, it's a matter of time and until something becomes more public because we are at a pace of adoption that is only accelerating, I would say. And the opportunities are, I would say, that there are great opportunities these days

Starting point is 00:18:21 for developer teams to move much faster and build even higher quality code if they utilize these tools in the right ways with the right context. But I would say human supervision is still much needed, especially from the right security expertise. And in order to do that, ourselves as a company, we put also one of our main goals is to help increase

Starting point is 00:18:54 awareness with this type of research to really also, I would say put more effort on the responsibility metrics. Who is really responsible for the security issues at hand? Is it on the developers that utilize these two amazing tools? Is it on the tool builders? On the model providers, there is a few different players here that are trying to put these new risks under control and I would say, walking progress. Our thanks to Ziv Karliner from Pillar Security for joining us.

Starting point is 00:19:49 The research is titled New Vulnerability in GitHub Copilot and Cursor. How Hackers Can Weaponize Code Agents. We'll have a link in the show notes. And that's Research Saturday brought to you by N2K Cyberwire. We'd love to hear from you. We're conducting our annual audience survey to learn more about our listeners. We're collecting your insights

Starting point is 00:20:11 through August 31st of this year. There's a link in the show notes. We hope you'll check it out. This episode was produced by Liz Stokes. We're mixed by Elliot Peltsman and Trey Hester. Our executive producer is Jennifer Iben. Peter Kilpey is our publisher, and I'm Dave Bittner. Thanks for listening. We'll see you back here next time. you

CyberWire Daily - Hiding in plain sight with vibe coding.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.