CyberWire Daily - Hiding in plain sight with vibe coding.
Episode Date: June 14, 2025This week, Dave is joined by Ziv Karliner, Pillar Security’s Co-Founder and CTO, sharing details on their work on "New Vulnerability in GitHub Copilot and Cursor: How Hackers Can Weaponi...ze Code Agents." Vibe Coding - where developers use AI assistants like GitHub Copilot and Cursor to generate code almost instantly - has become central to how enterprises build software today. But while it’s turbo-charging development, it’s also introducing new and largely unseen cyber threats. The team at Pillar Security identified a novel attack vector, the "Rules File Backdoor", which allows attackers to manipulate these platforms into generating malicious code. It represents a new class of supply chain attacks that weaponizes AI itself, where the malicious code suggestions blend seamlessly with legitimate ones, bypassing human review and security tools. The research can be found here: New Vulnerability in GitHub Copilot and Cursor: How Hackers Can Weaponize Code Agents Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
Discussion (0)
You're listening to the CyberWire Network, powered by N2K.
And now a word from our sponsor, ThreatLocker.
Keeping your system secure shouldn't mean constantly reacting to threats.
ThreatLocker helps you take a different approach by giving you full control over what software
can run in your environment.
If it's not approved, it doesn't run.
Simple as that.
It's a way to stop ransomware and other attacks before they start without adding extra complexity
to your day.
See how Threat Locker can help you lock down your environment at www.threatlocker.com.
Hello everyone and welcome to the CyberWires Research Saturday.
I'm Dave Bittner and this is our weekly conversation
with researchers and analysts tracking down the threats and vulnerabilities, solving
some of the hard problems and protecting ourselves in a rapidly evolving
cyberspace. Thanks for joining us.
Pillar Security, we spent the last year and a half spending a lot of time with the emerging
attack vectors that put AI-powered applications at risk.
So first of all, we got to learn and get our hands around new attack vectors such as prompt
injection, indirect injections, and all sorts of evasion techniques that turn these attacks to be basically invisible to the human eye and most of the security tools out there.
That's Ziv Karlener, Pillar Security's co-founder and CTO.
The research we're discussing today is titled New Vulnerability in GitHub Copilot and Cursor.
How hackers can weaponize code agents.
So take that together with the fact that we ourselves are utilizing these amazing
coding copilots that on their own are utilizing LLM and its base.
We got us thinking about how the combination of the new attack vectors and the actual,
I would say, some of the most popular use cases for the AI-powered applications,
which are coding assistants, how this really combines
together and sparked our imagination about what can potentially go wrong.
Well, at the root of this is what you all refer to as the rules file back door.
Can you describe that for us? What exactly are we talking about here? Sure, so maybe one step back. So what are rule files? So think about
coding agents this day. You can think about them as
another engineer, developer that joined the theme and now helps you
complete a project much quicker.
So rule files are basically a way to onboard the coding agent to your project, to your team, to tell it what are the best practices that are being used in a project, what
software stack are we using, specific syntax or any guidance and context
that is relevant just to the project
that we are working on right now.
So think about this the first day
and in the job for a new developer that joins the team.
So that will be the rule files, basically text files
that these coding assistants allow users to define that
contain all of the examples and the instructions
of how to write code in the best way
that suits the project in scope.
So these are rule files.
The interesting thing when you think about it,
and this is basically context, additional context
that is being fed into the conversation flow
with the coding agent.
And really it's part of the instructions.
It's part of the instruction layer,
the context layer that is taken into account
when the model takes in a request to write new code.
This context is added to it before the developer
gets back the code suggestions and edits.
A rule file backdoor is basically when attackers can embed
malicious instructions in this context that impact
any code that is being generated by the coding assistant
to create actual backdoors in the generated code.
So this is what we shown in example.
On its own, it sounds pretty straightforward maybe to protect.
But what we uncovered in our research is that, first of all, you have marketplaces.
You have now open source marketplaces where rule files are being shared between organizations,
which creates a supply chain vector, combined with the fact that you can add hidden instructions.
That's the, I would say, the second risky part here. The some kind of technique that is called hidden Unicode characters,
which basically means that when developers look at the rule file,
it looks completely legitimate,
but it actually contains hidden instructions that only the AI agent
understands and acts on. So that's really the, I would say like the perfect scenario
where you can hide in plain sight
in some of these marketplaces
and compromise the underlying developers
that are taking these rule files to improve their projects.
Well, can we walk through an example here?
I mean, suppose an attacker wants to make use of this.
Let's go through the process
of how they would go about doing that.
For sure.
So in our research, we walked through a simple example,
a step-by-step example.
So for instance, let's think about an attacker
that wants to compromise any Next.js application
and how you can do that.
So basically the marketplaces for rule files
will have directories with basically you can think about it as a directory
of every available coding stack.
And you can actually commit and add suggestions to these marketplaces and basically hubs of
rule files that are being shared between developers.
So let's take the Next.js example.
I will go as an attacker to this repo.
I will craft a legitimately looking instruction file
about Next.js best practices,
and I will embed hidden text into this file using
the hidden Unicode characters technique.
We'll commit this, let's say,
to GitHub or with some kind of a web form to this marketplace.
What we also uncovered during the research
is that in GitHub itself, it was invisible.
Basically, when you commit code that
contains these hidden instructions,
a developer that is now going to approve this
basically addition request is not going to see anything,
is not going to get alerted.
This is actually something that was solved by GitHub
early this month in one of the vulnerability patches.
So now we have this rule file with hidden instructions live on GitHub.
And unsuspecting developer that wants to get better results with his coding project when using cursor or GitHub
is basically copying this file and adding it to its own project,
also sharing it with his team just in order to improve the quality of code for the full team.
And now when it's going to, let's say,
request an addition of a simple page to his application,
the rule file that contains instructions
to add basically malicious JavaScript code
to each new HTML file that is being created, it's going to happen only when the agent loads this file,
takes in the additional hidden instructions,
and generates the additional code on the fly.
Now, the interesting thing that we showed on
our research paper is basically that in the attack itself,
in the malicious instructions, an attacker
could also use the agent, I would say intelligence, to its advantage. So what we've shown is that a
developer can then ask, hey, why this code snippet was added to the code that was generated?
And the AI agent will say, oh, this
is the security best practices of our organization.
So the attacker instructions could actually
be used not only to inject malicious code,
it's also being used to trick the user,
kind of social engineer it, to believe that this
was the goal in the first place.
So this is utilizing the AI agent intelligence against the end user.
This was, I would say, the most interesting finding for us.
We'll be right back.
We'll be right back.
So the hidden instructions using Unicode can also include instructions to mislead someone who's inquiring as to why things are a certain way. Exactly. So I can add on that some of the most I would say popular terms these days is human in the loop. So human in the loop is basically when we're talking about responsibility models
and how autonomous agents will be part of the future workforce. So human in the loop is the point in the autonomous processes where an AI agent goes back
and asks for approval from the user that tries to achieve some kind of goal. So in this case,
most of the coding agents these days, when doing I would, more risky actions like changing, deleting a file or
creating a web request, they will actually stop and ask the user, are you
sure you want to complete the next action? This is like the classic human
in the loop flow. So one of the things that we've shown here in the blog
is basically that if the attack itself is completely hidden to a human
are humans really equipped to be in the loop? That's one of the thoughts that got got us more
concerned I would say. A lot of the responsibility is moving to the users, but are we actually equipped to deal with this
kind of attacks?
I mean, it really speaks to that kind of inherent inability to view inside what's really going
on an AI assistant, right?
Exactly. And even if you think you're seeing what is going on, the assistants understand, I
would say, every language that was ever spoken or written together with hidden Unicode characters,
encoded strings like Bay64.
For instance, they just understand it as plain English without the need to compute or run
any additional processes.
So we are kind of not in an even situation between the,
I would say, the auditor, which is now basically
every person that needs to observe and kind of decide if an AI agent is allowed or not
allowed to do something, and the agents themselves.
So that's like, that goes beyond the coding agents, I would say.
Well, let's talk about mitigation.
What sort of steps can developers take to detect and prevent these sorts of things?
Of course. So first of all, I would say as silly as it may sound, sanitation.
So think about reducing basically the input options that you have when interacting with the model,
even in the language level.
I can actually describe another mitigation that lucky for us as
a developer community was actually taken by GitHub based on this research,
which is they actually added a new capability in GitHub itself
to alert and basically show a warning message
whenever there is a hidden instruction or hidden Unicode text
that is now part of a text file that is going to be edited.
So this is, I would say, kind of a text file that is going to be edited.
So this is, I would say, a risk reduction effort
that has been released for every developer that uses GitHub,
which is almost everyone.
Another part, which is more on the agent builder side,
is to take into place different guardrails that can be
placed around the models when interacting with them. So for instance, detection of evasion
techniques, detection of malicious instructions, jailbreak attempts, and indirect injection
attacks, which are part of these new attack vectors that are really becoming more and more relevant with
AI-powered applications. There are some great work around uncovering this full attack surface
with OWASP Top 10 for LLMs and MITRE Atlas and other great initiatives that really talk this new risk language
and create the right terminology around it.
So I would say awareness is the first step as well.
What do you suppose this vulnerability reveals
about the current state of things?
When it comes to AI integration and software development,
which I think it's fair to say there's a lot of enthusiasm for it.
It's certainly a powerful tool.
And yet we have these things.
I mean, is it still early enough days that there's lots to be, these things are important
to consider as we go forward.
For sure.
So, we're still in the early days, but I would say, coming actually myself from
experiencing the cloud security space and also in the software supply chain
security space, we had, I would say amazing progress with software supply chain security space. We had, I would say, amazing progress with software supply chain security over the last decade with S-bombs becoming a
standard and the vulnerability programs. We put a lot of guardrails
inside the CI-CD pipelines and got, I would say, a lot of awareness around it.
And on the other hand, we now have this amazing phenomena
of, I would call it like the intelligence age,
the AI transformation that doesn't leave any,
I would say, vertical in the industry or role untouched,
but is moving really fast.
So there is kind of a challenge here when both the attack vectors are being discovered
as we go, but adoption is moving faster than I have ever seen in my career.
So it's a combination, I would say both, and I would say like the security industry in general,
you see a lot of awareness,
a lot of community efforts to really surface
these new emerging threats.
Even before we saw, you know,
attack vectors being utilized in the wild.
I can give an example that one of the accelerators
for safer CI-CD pipelines or solar wind
that we are all familiar with.
So this really didn't happen yet in the AI security space.
I guess it's as always, it's a matter of time
and until something becomes more public
because we are at a pace of adoption that is only
accelerating, I would say.
And the opportunities are, I would say,
that there are great opportunities these days
for developer teams to move much faster
and build even higher quality code
if they utilize these tools in the right ways
with the right context.
But I would say human supervision is still much needed,
especially from the right security expertise.
And in order to do that, ourselves as a company,
we put also one of our main goals is to help increase
awareness with this type of research to really also,
I would say put more effort on the responsibility metrics.
Who is really responsible for the security issues at hand?
Is it on the developers that utilize these two amazing tools?
Is it on the tool builders?
On the model providers, there is a few different players here that are trying to put these
new risks under control and I would say, walking progress.
Our thanks to Ziv Karliner from Pillar Security for joining us.
The research is titled New Vulnerability in GitHub Copilot and Cursor.
How Hackers Can Weaponize Code Agents.
We'll have a link in the show notes.
And that's Research Saturday brought to you by N2K Cyberwire.
We'd love to hear from you.
We're conducting our annual audience survey
to learn more about our listeners.
We're collecting your insights
through August 31st of this year.
There's a link in the show notes.
We hope you'll check it out.
This episode was produced by Liz Stokes.
We're mixed by Elliot Peltsman and Trey Hester.
Our executive producer is Jennifer Iben.
Peter Kilpey is our publisher,
and I'm Dave Bittner. Thanks for listening. We'll see you back here next time. you
