Everyday AI Podcast – An AI and ChatGPT Podcast - Ep 718: Agent Risk, Security, and AI Sprawl in 2026: Why AI That Acts Changes Everything (Start Here Series Vol 9)
Episode Date: February 20, 2026

The perfect AI storm happened, and no one has noticed yet. 😬
And when everyone finally realizes, they'll be too busy focused on the shiny new AI capabilities to see the straight up agentic nightmare looming right in front of them.
I'll be honest though.
↳ Yeah, there's a dark side to AI agents.
↳ Yeah, it's worse than you think.
↳ Yeah, we're gonna talk about it.

Agent Risk, Security, and AI Sprawl in 2026: Why AI That Acts Changes Everything – An Everyday AI Chat with Jordan Wilson (Start Here Series Volume 9)

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion on LinkedIn: Thoughts on this? Join the convo on LinkedIn and connect with other AI leaders.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
Evolution of AI Agent Risk (2022-2026)
Agentic AI Security Threats Explained
Types of Agent Sprawl & Dark AI
Key Differences: Chatbot vs Agent Risk
Major AI Labs' Risk Response Strategies
Agent Actions, Outputs, and Enterprise Risk
Monday Morning Playbook for AI Security
Agent Skill Marketplace & Supply Chain Risk

Timestamps:
00:00 AI Risks: A Rapid Evolution
06:22 "Applying Buzzwords to Practice"
07:31 AI Evolution: Dumb to Proactive
13:23 "Expanding Blast Radius with Tools"
14:04 "AI Actions: Enterprise Nightmare"
17:16 "Start Here: AI Basics"
22:50 "Agentic Web and Business Value"
24:09 "Advances Driving AI Agent Efficiency"
27:30 AI Labs: Balancing Risk and Research
32:04 "Autonomous AI Model Concerns"
33:43 "Managing Autonomous Agent Risks"
37:08 "Foundations First for Future Success"
41:00 "Understanding Risks and Opportunities"

Keywords: AI risk, agent risk, security risk, AI sprawl, AI agents, agentic AI, AI model actions, business AI risk, corporate AI security, agent risk mental model, OpenAI, Google, Anthropic, Microsoft, governance, audit logs, isolation protocols, agentic capabilities, smart AI models, proactive AI, autonomous agents, agent hallucination, rogue AI agents, prompt injection, malware agents, spyware agents, ransomware agents, agent input risk, tools risk, output

Start Here ▶️
Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com
Also, here's a link to the entire series on a Spotify playlist.
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
There's always been AI risk.
But in the early days of large language models and chatbots,
that risk was like Bill getting something wrong in the blog post
or Deborah putting up a hallucinated stat in the onboarding guide
that was littered with em dashes and delves.
But AI risk today is a legit different ballgame than risk was three and a half years ago.
I mean, heck, AI risk today is unrecognizable from what it was three and a half months ago.
And that's not an exaggeration.
Because after hearing for like five years that we're six months away from real AI agents, well, it finally happened.
And it was actually this perfect storm of multiple events that led to an unexpected business scenario that the corporate world now faces today.
You either get on board with AI agents quickly or get left behind.
But do it too quickly and you could go under.
The risk, the security, and the sprawl are real.
And so we're going to tackle it all today on Everyday AI.
The Start Here series edition.
All right.
Well, welcome to Everyday AI.
My name is Jordan Wilson.
And if you're new here, this thing's for you.
It's your daily live stream podcast and free daily newsletter,
helping everyday business leaders like you and me keep up with all the news,
like a thousand new agents a day.
What do we do?
What do we try?
Well, tune in, I tell you, and help you make the right decisions to grow your company and your career.
So after 700 plus episodes, I realized I couldn't answer the most common question that people had for me.
Like Jordan, you have a lot of episodes.
Where do I start?
Well, that's why I created the Start Here series.
So the Start Here series is the Essential Podcast series to both learn the AI basics and to double down on your knowledge.
So this is Volume 9, and you can go listen to all of the
episodes in our Start Here series if you just go to StartHereSeries.com.
So that will give you free access to our Inner Circle community and it'll put you
right in the Start Here series space.
So you can go listen to all of them, watch them, read about them all in one place, interact
with others who are doing the same.
All right.
So one other thing that you need to do before we get started, my gosh, y'all, you have
to go listen to these.
Episodes 712 and 713.
That is our 2026 AI prediction and roadmap series.
That's like a culmination of a thousand hours of work over the years or over the past year to give you guys the blueprint for 2026.
So make sure you go listen to those.
All right.
If this sounds kind of like our last episode in the start here series, not exactly.
So make sure you go listen to that one if you didn't already.
So that one is more of just the state of AI agents: where they are, what they are, how we should
use them, should we use them. So make sure you go listen to Volume 8 from yesterday. That's episode
717. But today we're here to talk about the other side, the risk, the security, and the sprawl
that's changing everything. So here's what we're going to be covering on today's show, and,
well, why it urgently matters. Because AI models didn't just get smarter. They got hands, right?
The risk model changed when AI moved from generating text, like it was three and a half years ago,
to now taking real actions.
And a lot of times actions we're not aware of.
And that's the scary part.
And an agent connected to your email and calendar or your company's data can act fast, confident, and wrong.
In the same way that AI models hallucinated three years ago, well, they can still hallucinate now.
And this isn't a future concern.
This is happening right now.
So on today's show,
we're going to go over a simple mental model for why agent risk is fundamentally different
from chatbot risk.
We're going to talk about what OpenAI, Google, Anthropic, and Microsoft are actually building
to address this risk.
And I'm going to give you a practical Monday morning playbook that you can start using this week
to address the risk, security, and sprawl.
Sound good?
Yeah.
Sounds good to me.
I'm excited and I wrote all this.
All right.
So let's get a quick little catch up here.
So probably if you're listening to this show, AI is not new to you, right?
But let me just give everyone the briefer here, right?
So essentially from 2022 to mid-2023, early 2024, large language models were largely
text systems, right?
The risk was just limited to misinformation and data leaks and, you know,
using hallucinations and looking foolish.
But that started to change, I'd say, in mid-2025.
So that's when the models started getting, well, exponentially more capable and agentic
by default.
That's the thing that people don't understand, right?
I've actually had two great conversations with two really smart minds, kind of the head
of agents at Cloudflare and then the head of Microsoft Research.
And I learned a lot by talking to them both, you know, before and after the show as well.
But one thing that kind of came through is, well, no one really knows what constitutes an agent and what doesn't.
But I think that we can all agree that even today's large language models, they're agentic by nature, right?
Well, a lot of us are giving them access to all of our data.
And not in all cases do they have write access.
But in many cases, they do.
So these models can think and act on their own and, you know, spin up a virtual environment
and a terminal and access your computer, right?
I keep saying this.
Even right now, I have multiple agents running on my computer, right?
I have Codex and Claude Code going right now.
I'll probably spin up Antigravity later.
I always have agents running, and they have access,
and I don't necessarily know what they're doing in between.
I always go back and look when they're done.
But that's what really changed in mid, you know, 2025.
But here we are in early 2026.
And this is where now we're going to start talking about and actually putting into practice all those buzzwords, right?
The ones people just, you know, started chatting about in like 2024 to, you know, sound super smart.
Like, oh, governance and audit logs and, you know, isolation protocols, right?
Like all those.
Well, okay.
Well, good thing that everyone was talking about it.
Because, you know, we were trying to get our buzzword bingo.
Well, now we actually need it.
Right.
Now it's time to talk about ethics and governance and guardrails when it comes to
agentic capabilities.
I like to put it like this.
Keep it very simple.
2022 and before, I'd say that AI was a dumb stationary brain, but it was a brain, right?
So everyone was blown away.
Like, oh, my gosh, it can think.
Well, it was dumb.
It was stationary.
Didn't move.
Couldn't really think ahead.
Right.
And in 2023, well, it
became a dumb stationary brain with tools.
Right?
That's when, you know, the early version of ChatGPT Plus, you know, GPT-4, it had tools, right?
It could go on the internet as an example.
So that's when it really started to open up what it could do or at least the information
that it could access.
In 2024, well, it was still a stationary brain with tools, but it went from a dumb brain to
a smart brain.
I would say in 2024 was the first time that we actually had smart models because at the
end of 2024, that's when we got reasoning models. And then in 2025, I think we still had smart brains
with tools, but the difference was, instead of it being a stationary brain, it was a proactive brain in
2025. It could go out, at least especially at the end of 2025, the second and third quarter.
It could make moves on its own, right? It could proactively schedule things or just, you know,
start acting over long periods.
And then what brings it to 2026 is, well, now we have that smart, proactive brain with tools and arms, right?
So tools are cool.
But when you have arms, you can actually use them in a real way.
And I think that's where, you know, agents now have teeth when they are autonomous, proactive and smart.
So here's why agent risk, I think, feels different.
because, you know, like I said, with the chatbots, there was always risk, but was it really?
I mean, yeah, worst thing you can do is you get in trouble putting out something hallucinated and you look foolish, right?
Does your company go under?
Probably not.
Right?
Do you expose every single dark secret? Right?
It's not that bad.
I mean, it is, but it's not.
But the new agent layer is just a whole new type of risk that we're not even ready for.
If I'm being honest, if you listen to the show, I say this a lot.
I'm like, this year is going to be scary.
People are not ready.
Businesses are not ready.
And I mean that, right?
That kind of business predicament that I talked about in the opening of the show there,
that is real.
If you don't run to use AI agents this year, you're toast.
You are literally toast.
I don't care if you're a small business or a, you know,
$20 billion revenue business, you're toast, right?
But if you sprint too quickly, you can go under because you could make devastating mistakes
that would be highly improbable to recover from because AI agents can do things much worse than
even the worst human, right?
Have you ever heard stories or maybe you've experienced this, right?
Kind of a rogue employee. It happens too often, right?
Someone's bitter, maybe about getting fired or not getting a promotion, and they do something absolutely
crazy, right? Maybe expose all the company secrets or, you know, release some files, I don't know. Okay, that's a
human. That's a single human, and you can see that human. You have eyes on that human, right? Bill in
IT is watching that human. Agents are different. Agents move a hundred, a thousand times faster than
that one person, but you can't always see agents.
And guess what?
That one disgruntled employee, that's one.
This isn't a video game.
He can't respawn 10 times.
Agents can.
Agents can spawn subagents like that.
And those subagents can spawn like that.
Right.
So think of it in the way a virus might spread across your computer or across the human body.
Right?
And replicate and duplicate.
It's the same thing with AI agents.
A rogue employee can't do that.
And that's why this risk is very different.
It is very real.
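To put rough numbers on that spawning point, here is a minimal Python sketch. It assumes a hypothetical, uniform spawn tree in which every agent creates the same number of subagents for a fixed number of generations. The function name `total_agents` and the values used are illustrative only and do not come from any real agent framework.

```python
def total_agents(fan_out: int, depth: int) -> int:
    """Count every agent in a spawn tree, including the original one.

    fan_out: how many subagents each agent spawns (hypothetical value).
    depth:   how many generations of spawning happen after the original.
    """
    total = 0
    agents_in_generation = 1  # generation 0 is the single original agent
    for _ in range(depth + 1):
        total += agents_in_generation
        agents_in_generation *= fan_out  # each agent spawns fan_out more
    return total


if __name__ == "__main__":
    # One agent, each spawning 10 subagents, over 0 to 3 generations.
    for depth in range(4):
        print(depth, total_agents(fan_out=10, depth=depth))
```

With these assumed numbers, three generations of spawning turn one agent into 1,111, which is the gap between a single rogue employee and a single rogue agent.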
So I think that we spend too much time thinking about the positives and the optimistic side of AI agents, which is great.
Right.
Yes.
Now all of a sudden I have, you know, 320 AI agents, you know, doing all my work for me around the clock.
Oh, that's cool.
Right.
But what about the risk, the security and the sprawl?
And more teams are experimenting now than ever before because no one's got it all figured out.
And that means more sprawl and more exposure, especially if you don't have guardrails up.
So here's essentially the three surfaces where agent risk actually lives.
Number one is the input.
That's kind of that untrusted content that can contain hidden instructions that agents,
you know, treat like real commands.
All right.
That I don't think is going to be as big of a deal, right?
Your inputs, don't get me wrong.
Things can go wrong, right?
People are blindly copying and pasting, so you can copy and paste prompt
injections, and prompt injections are the big thing. And we're going to talk about that here in a second.
But still, inputs, they can be poisoned, right? There's, there's things that you might not know.
And in the same way, a rogue human can create a lot of rogue AI agents that can create a lot of risk and a lot of sprawl that's going to be uncontrollable.
So, you know, inputs, I think you really only have to worry about it. You know, if someone really doesn't know what they're doing or if someone is trying to be malicious, which, again, those things are going to come up.
But think of that one bad employee, what they can now do if they know agentic
AI, right? Yeah. Talk about malware or spyware, right? Ransomware, you know, these
companies that have paid, you know, millions of dollars, tens of millions of dollars for
ransomware. It's going to be way worse with agents.
So the second layer is tools. And I think this is
where it starts to get a little dicey in terms of, well, the capabilities are wild, right? So every
permission and connector that you add expands that blast radius, essentially, when something goes wrong.
Same way that I walked you guys through the, oh, it's a dumb stationary brain. But, hey, once that
brain gets tools, okay, now it can start doing a lot of things. And it's the same thing on the agent
gone wrong side, right? If it was a dumb stationary agent with no tools and no arms,
it's like, all right, well, have fun, buddy, right? You're in a glass case of emotion just shaking
up. When they have tools, that's where things go wrong, right? When they have access to your
terminal, right, to your computer terminal, that's where things can go wrong.
When they can run code on a machine,
that's where things can go very wrong.
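One way to picture the blast radius idea is as a set that only grows as connectors are granted. The sketch below is illustrative: the connector names and the resources attached to them in `CONNECTOR_SCOPES` are made up for the example and do not describe any real product's permission model.

```python
# Hypothetical connectors and the resources each one lets an agent touch.
CONNECTOR_SCOPES = {
    "email": {"inbox:read", "inbox:send"},
    "calendar": {"events:read", "events:write"},
    "terminal": {"files:read", "files:write", "code:execute"},
}


def blast_radius(granted: list[str]) -> set[str]:
    """Return everything an agent could touch given its granted connectors.

    Raises KeyError for a connector name that is not in CONNECTOR_SCOPES.
    """
    reachable: set[str] = set()
    for connector in granted:
        reachable |= CONNECTOR_SCOPES[connector]  # union: the set only grows
    return reachable


if __name__ == "__main__":
    print(sorted(blast_radius(["email"])))
    print(sorted(blast_radius(["email", "calendar", "terminal"])))
```

An agent with no connectors can reach nothing. Each connector added widens what a mistake or a hijacked agent could affect, and in this toy model the terminal connector alone accounts for the most sensitive entries.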
And then last but not least, and this is the big one, this is actions.
And if I had to boil down the biggest change in risk
and why it matters now more than ever, it's outputs to actions, right?
What we had to worry about a couple of years ago from AI in terms of risk was the output.
Now we have to worry about the actions.
But it's actions at scale and actions that we might not even necessarily be able to see,
right? Silent, unintended workflows that you may not be tracing. And really what this comes down to,
well, it's this combination of increased capabilities from agents and moving in the shadows. And
that's an enterprise nightmare. That is literally the formula for an enterprise nightmare. So some stats
here for you, right? So right now, 57% of employees at least admit to using personal AI accounts for work.
It's way more than that.
Let's be honest.
That's just the number that admit.
A third admit to inputting sensitive data into unapproved tools, right?
So your shadow AI use case there.
And here's the thing.
You can't govern what you can't see.
And most organizations can't see their agent footprint at scale.
This is what I call the three types of dark AI.
For the most part, you're not going to find, you know,
the three types of dark AI online. It's something that I've kind of
categorized myself. But I think it's really helpful to get a better glimpse and a better
understanding of the categories of risk. So number one, we all know this, shadow AI, right?
That's just essentially unapproved or unknown AI use, right? If you've been using, you know,
ChatGPT on your personal computer, because Copilot is the AI that's approved,
but you want to use ChatGPT and you copy and paste things over as an example, right?
That's shadow AI, but that's been around.
Everyone knows that.
Right.
What you've maybe heard of, maybe not, it's the next kind of tier.
And that's Agent Sprawl.
But Agent Sprawl is known, right?
So that's essentially when you have approved agents, but you're not sure how to wrangle them or observe them all.
You're like, oh, well, yeah, Bill enabled that, you know, agent to help with finances,
but we're not really sure what it's doing.
Right? We think it's doing good, right? I check the outputs, but I don't really know how it's getting there.
That's the beginning of agent sprawl. But the thing is, agent sprawl grows quickly.
In the same way a snowball at the top of the mountain might come rumbling down a thousand times the
size, that is where we get into then dark agent sprawl. All right, this sounds like a screen
name for AIM back in the 90s.
You guys remember that?
Right.
AIM, you know, Instant Messenger, right?
Dark agent sprawl, 2026.
That's going to be my username on the Everyday AI Inner Circle community.
But essentially, well, dark agent sprawl can be a couple of things.
Because agent sprawl, it's a problem, but like I said, for the most part, that's a known risk. And you're like,
yeah, we have these agents going all over the place. We have no guardrails. We have no traceability,
observability, right? But you know of it. Dark agent sprawl is agents you don't know about.
Those are unapproved agents that are working in or on your company, and you can't observe them
because you don't know about them. So this, well, this can be shadow AI, I mean,
you know, gone agentic, right?
So people plugging in agents that aren't approved and the company doesn't know.
But there's also another kind, right?
And I think we're going to see a lot of this, maybe not in 2026, but in 2027.
That's going to be the equivalent of malware or spyware, but agents, right?
People sending, seeding agents out specifically in the same way that they would for spyware,
malware, ransomware, to make money, right?
To extract, you know, value from businesses.
And that's what's going to happen.
That's the next version of this.
That's not necessarily dark agent sprawl, but there's two sides, right?
Dark agent sprawl can start out innocent enough, right?
Oh my gosh.
You know, our company's not approving, you know, any of my AI agents.
Well, okay, whatever.
I'm going to go ahead and, you know, unleash, you know, Claude Code and get, you know,
50 instances of Claude Code going, and they're all going to spin up their agents.
Well, one person might know, but the rest of the company is in the dark.
And those agents could replicate, duplicate, and you can't observe them.
But the other thing is, well, bad actors, dark agent sprawl.
That's the thing, too.
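The distinction between agent sprawl and dark agent sprawl can be sketched as a simple check of observed agents against a registry of approved ones. Everything here is an assumption made for illustration: the `ApprovedAgent` record, the `has_tracing` flag, and the idea that some log (a network gateway, an API proxy) surfaces agent IDs at all. Truly dark agents may never show up in any log, which is exactly the problem described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedAgent:
    agent_id: str
    has_tracing: bool  # True if the team can actually observe what it does


def classify(seen_agent_ids, registry):
    """Label each observed agent ID using the three buckets from the episode."""
    approved = {agent.agent_id: agent for agent in registry}
    labels = {}
    for agent_id in seen_agent_ids:
        agent = approved.get(agent_id)
        if agent is None:
            labels[agent_id] = "dark agent sprawl"  # unapproved, unknown
        elif not agent.has_tracing:
            labels[agent_id] = "agent sprawl"  # approved but not observable
        else:
            labels[agent_id] = "managed"  # approved and observable
    return labels


if __name__ == "__main__":
    registry = [
        ApprovedAgent("finance-helper", has_tracing=False),
        ApprovedAgent("support-bot", has_tracing=True),
    ]
    print(classify(["finance-helper", "support-bot", "mystery-agent"], registry))
```

The useful part is less the code than the two questions it forces: is this agent on the approved list, and can anyone actually trace what it does?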
So why is this all happening now?
All these risks and security concerns and the sprawl, why now?
Because like I said, we've been six months away for five years, right?
We're six months away from great agents.
But then it almost seems like we didn't get a six-month warning.
They just popped up, right?
Like December 2025, right? Like everyone was winding down for, you know, the holidays, you know,
especially here in the U.S.
And then you go back to work in January.
You're like, what the frick happened, right?
You're like, whoa, whoa, wait, the agents are actually here now? Where's the six month
warning? Why didn't we get word of this in June or July? But there's three reasons why.
And it is literally the perfect storm, not just the ingredients happening, but at the exact right
or wrong time.
All right.
So number one, the number one element that was mixed in the bowl and it's exploding
is the reasoning threshold.
Right.
So improved reasoning from models like, you know, GPT-5.2 from OpenAI, Google's Gemini 3.1, their new model,
very impressive.
And, you know, Opus and Sonnet 4.6 from Anthropic, these are all built to be agent-native.
That's how they build them now.
Right? They're not building models that are great at, you know, reading, writing, and
comprehension first. No, what they're focusing on is the harness and the
tool use. That's what's first, right? Even, you know, Google's, you know, new
model yesterday, everything that they highlighted was tool use, right? And how, you
know, improved tool use and the scaffolding there has, you know, really helped them
improve their outputs. But that's the thing. These models,
their reasoning ability is legit through the roof.
They can plan steps ahead, self-correct errors and move beyond reactive behavior into proactive.
And that's kind of raised agent reliability from around 50% to 90%.
And that's the perfect storm right there.
50% coin flip.
You're not doing that.
Would you hire an employee that has a 50% chance to fail right away?
Probably not, but 90%.
Okay.
That's an A employee.
Step two, computer use improvement.
This seems small.
This is big.
The models have to be insanely smart, right?
They have to be able to think, reason.
They have to have genius level IQ, which is what today's models have.
If you look at offline IQ tests, they're scoring in the genius level, smarter than 99.9% of people, right?
But the computer use is important.
It doesn't matter if it can't go and use a computer because at least for now, that's how
businesses generate value.
I think eventually value will be just generated agentically, right?
The web has gone agentic, right?
Google and Microsoft announced support for essentially an MCP version of the web, where websites
can talk to each other, agents can talk to each other.
So we'll see how, you know, business value is extracted in the future, because a lot of times
it's been through the website stack, the software stack, but, you know, who knows what it'll be
in the future.
But right now, agents can use computers better than humans, right?
So computer use means these models can use a mouse, use a computer.
They can click.
They can use APIs.
They can talk to each other.
But the big thing is, the computer use capability gap has essentially been solved.
So Claude Sonnet 4.6, which came out, well, earlier this week.
So if you're listening to this in six months, right, it came out in mid-February.
So imagine in six months, these computer use models are going to be insanely good.
But it's the first one that scored better than humans.
Right?
So a 72.5% success rate on the OSWorld benchmark, and that's surpassing human performance for the first time and nearly quintupling
the scores of 2024.
That's the thing.
And I think that's one of the reasons why, you know, even though maybe the foundation
was there for AI agents, you know, a year, year and a half ago, they couldn't use the computer.
It was so slow, right?
It was, you know, very, very, very slow computer vision, right?
Every time, if you wanted to click something, it would take three minutes.
Okay.
Now, models can go as fast as humans, or at least, you know, not just clicking and navigating
interfaces, but just a range of computer use tasks.
And then the third thing that has created this perfect agentic storm,
well, it's the context window and memory, right?
So being able to work on a task for a long period of time, right?
I think my record so far, I think I got to about 10 hours overnight the other night on
Codex, which was really fun to do, right? But it kept its memory persistent the whole time,
right? It didn't forget what I told it. You know, when I woke up,
I spent time reading through the chain of thought. And I'm like, oh, sweet. Not only did it
remember everything, but I'm going through, you know, tracing it. I'm, you know, making sure it kind
of stayed within the confines of the instructions I gave it. And it did. But, you know, that's another
reason why it's happening. And y'all, let me tell you, if in the last like three, four
weeks you haven't gone out and used, you know, OpenAI's Codex, if you haven't used,
you know, Claude Code or Claude Cowork from Anthropic, if you haven't used Antigravity from Google,
I'm not saying this to like be that guy. You're going to get left behind, right? And if you're
listening to this show, I don't want you left behind. I'm actually thinking, all right, I don't know
this yet, we might kind of get a little bit of a vibe working focus,
you know, here in the first or second quarter. So make sure, if you're not already in our
Inner Circle community, get in there. Yeah, I'm going to start putting some stuff in there.
Just keep your eyes open for that. All right, but let's keep going a little bit here as we're
going to be wrapping up here in a minute. Let's talk a little bit about how the biggest
AI companies are actually responding right now, because the risk is well known, because it has literally
been an explosion and it is going to get worse. So I want to talk a little bit about kind
of the big picture focus of the big four labs.
So right now, OpenAI is taking more of a human approval approach, right?
So Codex is kind of their command center to review agent decisions.
Anthropic is really going on the defense against prompt injection for browser agents,
you know, by making sure that virtual machines are isolated and have minimal privileges
and going with domain allow lists versus blacklists, right?
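The difference between an allow list and a blacklist is easy to see in a few lines of Python. This is a generic sketch of the two policies, not a description of how Anthropic or any other lab implements them; the domain names in `ALLOWED_DOMAINS` and `BLOCKED_DOMAINS` are placeholders.

```python
from urllib.parse import urlparse

# Placeholder domains for illustration only.
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}
BLOCKED_DOMAINS = {"known-bad.example"}


def host_of(url: str) -> str:
    """Extract the lowercase hostname, or an empty string if there is none."""
    return (urlparse(url).hostname or "").lower()


def allowlist_permits(url: str) -> bool:
    # Default deny: only hosts that were explicitly approved get through.
    return host_of(url) in ALLOWED_DOMAINS


def blacklist_permits(url: str) -> bool:
    # Default allow: anything not already known to be bad gets through.
    return host_of(url) not in BLOCKED_DOMAINS


if __name__ == "__main__":
    for url in [
        "https://example.com/page",
        "https://known-bad.example/x",
        "https://never-seen-before.example/x",
    ]:
        print(url, allowlist_permits(url), blacklist_permits(url))
```

The third URL is the interesting case. A blacklist lets a never-before-seen domain through because nobody has flagged it yet, while an allow list refuses it by default, which is why the allow list is the more conservative choice for an agent that browses on its own.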
So everyone has a little bit of a different
approach here. You know, Google as an example, with their Project Mariner, browser agents run in virtual
machines as a safety measure to isolate them from anything else they can kind of get their hands on.
And then Microsoft, right, there's millions of ways that Microsoft is doing it and millions of
ways that all these companies are doing it. I just kind of picked out, you know, different
aspects to illustrate their different approaches. You know, Microsoft as an example, with their
Copilot Studio, I mean, the governance is everywhere, right? The Sentinel monitoring, the Purview
logging, you know, the Entra Agent ID, right? So Microsoft, it is very enterprise. Google is as well.
I was just giving an example of, you know, how their Project Mariner kind of runs in an isolated
virtual machine. Because essentially, I think from the people I've talked to, at least three of those
four companies, they know risk is part of it, right? Like any lab, right?
They call them frontier or AI labs for a reason.
You run experiments.
And part of it is, well, you know that things need to go wrong.
Right.
So these companies are intentionally trying to get all the risk and all the security nightmares
and all the sprawl they can.
Right.
So then when they send a new model out in the wild, when they're done with the red teaming,
you know, they hope that they have a good idea.
So the labs are working on this.
But here's the reality.
I think there's a pressure
to allocate more resources to the development of models versus research and security, right?
I don't have that on, you know, privileged information.
That's just from, you know, talking to a lot of people and reading a lot and just, I think,
the reality of the world that we live in.
And that's why AI sprawl is going to be hard to miss for most business leaders,
because everything's unclear, right?
The tools, like, do you know?
Think of the AI model you use most.
Do you know the tools it has, right?
Even if you think of, you know, ChatGPT, Gemini, Claude.
Do you know what tools the base models have?
Probably not.
You'd be surprised, right?
Like, I'm one of those weird guys that reads change logs and model cards.
Most people have no clue, right?
And this is much worse than classic shadow IT because agents can act across systems, not just within the confines of folders and files, which are very structured.
It's kind of like how people talk about hallucinations, right?
Oh, you know, they're not a bug.
They're a feature.
What makes agents great is also the built-in nightmare.
So if you want to have the good, you have to recognize the bad
that they can do.
Right.
And that's because that's how they're made.
They are made to go build their own path.
They are made to blaze their own trail.
Right.
So by their very definition, agents aren't necessarily always good at staying within guardrails,
right?
Because they're focused on the destination that has been given to them. At times, right,
it depends on the model, the setup, all that, right?
But it's very common for an agent to hop over guardrails in order to accomplish a task,
not knowing that that's a bad thing, right?
And every new skill that you add, that you give to an agent or a model,
every connector, every workflow, all that does is it expands the attack surface for everyone.
And that's another realization that I think most people are going to struggle to grasp,
right?
Because you think that you're expanding the amount of work that you can get done.
And are you? Sure. But I think you almost have to think of it from, I don't know, like an old school military perspective, right? It's your land, right? And hey, we're going to go discover new land. We're going to go out and, you know, win new deals, right? So we're going to expand our land. I don't know if any of you have ever played, you know, Risk. I used to. I wish I still had time, right? It's one of those games, you know, it's like, oh, okay, this is good to make sure my brain still works. But it's kind of like that, right? Your agents,
without you knowing, though, are acquiring new territories.
And that's great.
And you think, oh, my gosh, look at these new skills and capabilities.
Right.
It's like, oh, it's like having a thousand new employees.
Okay, great.
But your surface area for risk, attacks, and sprawl is also multiplying at the same time as your capability.
But you're not focused on that.
You're focused on, oh, my gosh, now all of a sudden I have a designer.
Now all of a sudden, I have a data analyst.
Okay.
Well, what's that data analyst
doing with your data?
Are you going to check?
Right?
Is it using an open source, right?
This is one thing I was kind of blown away by the other day.
Gave, you know, it wasn't anything bad, right?
Gave a model a very hard task.
This is one of those that ran overnight.
I came back and I realized, oh, it downloaded another large language model locally to do the task.
Right?
It did it locally, right?
So it kept all the data kind of on-prem, on my local machine.
But what happens if in the future other AI models are just going to go use other AI models,
but on the web, right?
They can.
What if they do that without you telling them to, or what if they go use a site that's insecure?
And this is why I think, you know, OpenClaw, as amazing as it is, this is why it's also scary.
Right?
because, you know, these, you know, autonomous open source agents, the open source, if I'm being
honest, and I'm sure, you know, I'm going to catch some flack from this from some OG open source
people, right?
OG open source, I'm not talking about you.
Today's open source, it's taken a weird turn over the past three to six months, right?
It's almost become this, you know, crypto-infused wild, wild West, right?
It's not like traditional open source anymore.
So when we talk about open source AI agents, so much of what's out there is risk, right?
It is.
And it's unknown.
And it can be scary if you don't know exactly what you're getting into.
And I think this is why small agent pilots are becoming huge risks, right?
Because when you build your own agents without centralized visibility or governance, that's where you get in trouble.
So many people are in there using their own agents, and it's for them.
Right. But it can be going and doing work for the whole department, the whole company, or accessing data from the whole company.
But if only one person has eyes on it and there's no central organization, that's where you run into trouble.
So here's the playbook, right? If you're feeling a little scared, you know what? Maybe that was my intent.
I got you all pumped up and, you know, excited with yesterday's show talking about, you know, the new capabilities of agents,
but you got to get it in check, right?
You have to balance that optimism with a little bit of realism.
But here is the Monday morning playbook for how to deal with the risk, the attacks, and the sprawl.
So like what we talked about yesterday, touched on it briefly,
but you need to start with bounded autonomy, right?
When you think of autonomous agents, it's not, you know, zero to a hundred.
It's bounded autonomy.
What that means is you start with suggesting, then propose, then approve, then limited
execution, right?
So many people go to full execution first.
It's baby steps, right?
A human being doesn't go from womb to sprint, right?
You go from womb to, you know, being on your stomach to sitting, to crawling, to walking, right?
Your agents have to go the same way.
Even if you know out of the box that agent can sprint, you're not prepared for that agent to sprint.
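A minimal sketch of that ladder in Python, assuming one reading of the four rungs (suggest, propose, approve, limited execution). The `Autonomy` enum, the `next_level` helper, and the 20-run threshold are hypothetical illustrations, not anything named in the episode.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """Hypothetical rungs of the bounded-autonomy ladder, lowest to highest."""
    SUGGEST = 1            # agent only offers ideas
    PROPOSE = 2            # agent drafts a concrete action plan
    APPROVE = 3            # agent acts only after a human signs off
    LIMITED_EXECUTION = 4  # agent acts alone, inside a narrow scope


def next_level(current: Autonomy, reviewed_runs: int, incidents: int,
               runs_required: int = 20) -> Autonomy:
    """Promote one rung at a time, and only after enough clean, reviewed runs."""
    if incidents > 0:
        # Assumption: any incident sends the agent back to the bottom rung.
        return Autonomy.SUGGEST
    if reviewed_runs >= runs_required and current < Autonomy.LIMITED_EXECUTION:
        return Autonomy(current + 1)
    return current
```

The point of the one-rung-at-a-time rule is the same as the crawling analogy: the agent may be able to sprint, but promotion depends on how many runs the humans have actually reviewed.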
Part of it is for the humans, right?
We think it's more for understanding the agent's abilities.
It's not, right?
This is to make sure that you don't have a bunch of lazy humans in the loop.
And instead, you have proactive expert-driven loops.
So start with least privilege by default.
Read-only first, write access only for narrow, defined tasks, right?
Especially in enterprise environments, you don't want to send out a bunch of read-write general agents.
Start with read-only first, observe, improve your loop, then you go to limited execution,
and then you can finally get it to writing for narrow tasks. And then last in your
Monday morning playbook is you need to require human approvals for irreversible actions like
sends, deletes, purchases, and permission changes. These are all things agents can do.
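Least privilege plus human approval for irreversible actions can be sketched as a small authorization check. This is an illustrative sketch under stated assumptions: the action names, the `AgentPolicy` class, and the `authorize` function are all made up for the example and do not come from any specific agent framework.

```python
from dataclasses import dataclass, field

# Assumption: hypothetical action names standing in for sends, deletes,
# purchases, and permission changes.
IRREVERSIBLE = {"send", "delete", "purchase", "change_permissions"}


@dataclass
class AgentPolicy:
    # Least privilege by default: a new agent can read and nothing else.
    allowed_actions: set = field(default_factory=lambda: {"read"})


def authorize(policy: AgentPolicy, action: str, human_approved: bool = False) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a requested action."""
    if action not in policy.allowed_actions:
        return "deny"
    if action in IRREVERSIBLE and not human_approved:
        return "needs_approval"
    return "allow"
```

For example, a default `AgentPolicy()` gets `"deny"` for `"delete"`, while a policy that was explicitly granted `"send"` still gets `"needs_approval"` until a human signs off.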
Right. Agents can literally do anything, right? And this is going to get even harder to manage when we start to see true agentic commerce, right? And I'm not just saying, oh, you know, my agent is going to go buy something, you know, from the Amazon agent. That's not what I'm saying, right? There's going to be agent bartering. There's going to be agent, you know, liaisons, right? So you have to
work on human approvals now, right? It's almost like you have to understand that agents are
capable of, you know, a 10, and you have to give them a two, because we as humans need to first
learn to adapt our behavior before just giving it to agents. And then you need to build
governance before you scale. And I think that's like just what I was getting at. We get all
excited for what we know it can do.
And they're like, wait, we got to do this boring stuff first?
Absolutely.
Because if you don't do the boring stuff first, if you don't go through the sitting on your tummy,
right, sitting up, crawling, walking, then you can't expect to understand the path of the
sprinting agent.
And every agent run needs a decision trace that you can inspect after the fact.
So you have to be able to log all your tool calls,
capture your decision traces, and monitor for abnormal action patterns.
This is one of the reasons why I said, I think we're going to see in 2026,
it's going to be extremely common to have agent ops teams, right?
In the same vein, you know, DevOps teams are very common, right?
We're going to have agent ops teams for this exact reason.
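Logging tool calls, capturing decision traces, and watching for abnormal action patterns could look roughly like the sketch below. The `RunTrace` schema, the `abnormal_tools` helper, and the "three times the baseline" rule are assumptions chosen for illustration; a real agent ops team would pick its own schema and thresholds.

```python
import json
import time
from collections import Counter


class RunTrace:
    """Append-only record of one agent run (hypothetical schema)."""

    def __init__(self, run_id: str):
        self.run_id = run_id
        self.events = []

    def log_tool_call(self, tool: str, args: dict, reason: str) -> None:
        # 'reason' holds the agent's stated rationale: the decision trace.
        self.events.append({"ts": time.time(), "tool": tool,
                            "args": args, "reason": reason})

    def to_jsonl(self) -> str:
        # One JSON object per line, so the log can be inspected after the fact.
        return "\n".join(json.dumps({"run_id": self.run_id, **e})
                         for e in self.events)


def abnormal_tools(trace: RunTrace, baseline: dict, factor: float = 3.0) -> list:
    """Flag tools called far more than usual, or never seen in the baseline."""
    counts = Counter(e["tool"] for e in trace.events)
    return sorted(tool for tool, n in counts.items()
                  if n > factor * baseline.get(tool, 0))
```

A tool that is missing from `baseline` has an expected count of zero, so a single call is enough to flag it. That covers surprises like an agent quietly downloading another model to finish a task.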
So here's what to watch for the rest of 2026 when it comes to agent risk, security,
and sprawl.
Browser agents are going to become a mainstream risk surface.
I think we know that.
I think we're going to see a major OpenClaw-type.
I'm not saying OpenClaw themselves,
but I think we're going to see a major open source agent crash.
I talked about that in the 2026 AI prediction and roadmap series.
I actually think it's going to come from, right?
There's this new trend over the last week or two
where everyone is just pointing their agents to a website.
When they want their agent to do something,
they're not even saying, all right, let me, the human, understand this.
Let me write it out. Let me apply this to my business logic.
Let me apply this with my guardrails. Nope. They're just pointing their agent.
Hey agent, here's this video. Go watch it and do it. Hey, agent, here's this website with a whole
bunch of cool stuff. Just go do it. You know, there's this whole concept now going around of,
you know, point your agent to a URL. Okay. Have fun with that. Don't do that. Right. Don't.
Right. Especially if that agent has its own dedicated machine. Right. And I'm saying,
if you're tinkering around, right, it's your personal computer.
You're the business owner.
If you want to take that risk, that's fine, right?
Especially if you're tinkering around, that's fine.
But don't do that in an enterprise.
And unfortunately, people are doing that.
That's a bad thing because guess what?
A lot of those sites, quote unquote, oh, you know, someone puts up a huge tutorial.
Okay.
They're putting it up on their personal website.
You know how easy it is to hack, to phish any of these websites?
Okay, you see, let me give you an example.
Something goes viral on Twitter.
It's some, you know, literally, some 20-year-old kid, you know, put a great, you know, guide up on the website.
Cool.
And, you know, everyone's just like, hey, agent, go read that, go read that.
And then you have millions of agents going to just read this, you know, this nice kid put up a cool guide.
Okay, guess what's going to happen?
Someone's going to see that.
They're going to hack that site and they're going to inject malicious code in there that no one's going to see.
That's what's going to happen.
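One way to guard against the "point your agent to a URL" pattern is to let agents fetch only reviewed hosts, and to check that a page has not changed since a human last read it. This is a rough sketch under assumptions: the `example.com` hosts are placeholders, and `may_fetch` and `unchanged_since_review` are hypothetical helper names.

```python
import hashlib
from urllib.parse import urlparse

# Assumption: hosts your team has reviewed; these names are placeholders.
ALLOWED_HOSTS = {"docs.example.com", "wiki.example.com"}


def may_fetch(url: str) -> bool:
    """Allow only HTTPS URLs whose exact host is on the reviewed list."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


def unchanged_since_review(content: bytes, reviewed_sha256: str) -> bool:
    """True if the page still matches the version a human reviewed."""
    return hashlib.sha256(content).hexdigest() == reviewed_sha256
```

The hash check is what addresses the scenario above: if someone hacks the tutorial and injects instructions after your review, the digest no longer matches and the agent should stop and ask.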
All right.
Next, the agent skill marketplace is going
to expand the supply chain risk.
So you inherit what you plug in.
And we've already seen that with some skills and plugins
for agents.
Some of the top ones were found to have malicious code in them.
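"You inherit what you plug in" suggests treating skills like any other dependency: pin what you reviewed and refuse anything that has drifted. The sketch below assumes a hypothetical lockfile that maps each skill name to the SHA-256 of the reviewed version; `verify_skills` is an illustrative helper, not part of any real skill marketplace.

```python
import hashlib


def verify_skills(installed: dict, lockfile: dict) -> dict:
    """Compare installed skill bytes against pinned hashes.

    Returns {skill_name: status}, where status is 'ok', 'modified',
    or 'unpinned' (installed but never reviewed).
    """
    report = {}
    for name, blob in installed.items():
        pinned = lockfile.get(name)
        if pinned is None:
            report[name] = "unpinned"
        elif hashlib.sha256(blob).hexdigest() == pinned:
            report[name] = "ok"
        else:
            report[name] = "modified"
    return report
```

Anything that comes back as `"modified"` or `"unpinned"` would be held back from the agent until someone reviews it.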
We're going to see identity and permissions
becoming a board level compliance requirement,
not a nice to have.
And last but not least, the winners are going to treat agents
like production software, not side experiments.
All right, that's it.
That's a wrap.
Volume 9 of the start here series.
I hope this is helpful and I hope you understand a little bit more the risk, the security, and the sprawl that you need to be aware of, and how having AI that acts, right, AI that has a smart brain, that's proactive,
that has tools and has arms,
changes everything, right?
This perfect storm has happened in the last 30 days. I've said this before:
we've seen more, on both the opportunity and the risk side, in the last 30
days than we've seen in the last probably two years. All right. So I wanted to take a moment
for the start here series to address this. And I hope this helps you not only understand the risk,
but if you want to take advantage of the opportunities, you can't do it without knowing the risk.
So now you do. All right. So I hope this
was helpful. If so, there are already eight others in our Start Here series that you should go check out.
So like I said, please go to starthereseries.com. That is going to give you free access to our
inner circle community. Once you sign up, you're going to be just straight up loving it,
hopefully in our Start Here series space and connecting with now more than a thousand people
in our Inner Circle community. So thank you for tuning in. Hope to
see you back for more Everyday AI. Thanks y'all.
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this
episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic,
visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
