Everyday AI Podcast – An AI and ChatGPT Podcast - Ep 718: Agent Risk, Security, and AI Sprawl in 2026: Why AI That Acts Changes Everything (Start Here Series Vol 9)

Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. There's always been AI risk. But in the early days of large language models and chatbots,

Starting point is 00:00:51 that risk was like Bill getting something wrong in the blog post or Deborah putting up a hallucinated stat in the onboarding guide that was littered with em dashes and delves. But AI risk today is a legit different ballgame than risk was three and a half years ago. I mean, heck, AI risk today is unrecognizable from what it was three and a half months ago. And that's not an exaggeration. Because after hearing for like five years that we're six months away from real AI agents, well, it finally happened. And it was actually this perfect storm of multiple events that led to an unexpected business scenario that the corporate world now faces today.

Starting point is 00:01:34 You either get on board with AI agents quickly or get left behind. But do it too quickly and you could go under. The risk, the security, and the sprawl are real. And so we're going to tackle it all today on Everyday AI. The Start Here series edition. All right. Well, welcome to Everyday AI. My name is Jordan Wilson.

Starting point is 00:01:57 And if you're new here, this thing's for you. It's your daily live stream podcast and free daily newsletter, helping everyday business leaders like you and me keep up with all the news, like a thousand new agents a day. What do we do? What do we try? Well, tune in, I tell you, and help you make the right decisions to grow your company and your career. So after 700 plus episodes, I realized I couldn't answer the most common question that people had for me.

Starting point is 00:02:20 Like, Jordan, you have a lot of episodes. Where do I start? Well, that's why I created the Start Here series. So the Start Here series is the essential podcast series to both learn the AI basics and to double down on your knowledge. So this is Volume 9, and you can go listen to all of the episodes in our Start Here series if you just go to starthereseries.com. So that will give you free access to our Inner Circle community and it'll put you right in the Start Here series space.

Starting point is 00:02:52 So you can go listen to all of them, watch them, read about them all in one place, interact with others who are doing the same. All right. So one other thing that you need to do before we get started, my gosh, y'all, you have to go listen to these. Episode 712 and 713. That is our 2026 AI prediction and roadmap series. That's like a culmination of a thousand hours of work over the years or over the past year to give you guys the blueprint for 2026.

Starting point is 00:03:22 So make sure you go listen to those. All right. If this sounds kind of like our last episode in the Start Here series, not exactly. So make sure you go listen to that one if you didn't already. So that one is more of just the state of AI agents, where they are, what they are, how we should use them, should we use them? So make sure you go listen to Volume 8 from yesterday. That's episode 717. But today we're here to talk about the other side, the risk, the security, and the sprawl. That's changing everything. So here's what we're going to be covering on today's show. And

Starting point is 00:03:56 well, why it urgently matters, because AI models didn't just get smarter. They got hands, right? The risk model changed when AI moved from generating text, like it was three and a half years ago, to now it's taking real actions. And a lot of times actions we're not aware of. And that's the scary part. And an agent connected to your email and calendar or your company's data can act fast, confident, and wrong. In the same way that AI models hallucinated three years ago, well, they can still hallucinate now. And this isn't a future concern.

Starting point is 00:04:31 This is happening right now. So on today's show, we're going to go over a simple mental model for why agent risk is fundamentally different from chatbot risk. We're going to talk about what OpenAI, Google, Anthropic, and Microsoft are actually building to address this risk. And I'm going to give you a practical Monday morning playbook that you can start using this week to address the risk, security, and sprawl.

Starting point is 00:04:55 Sound good? Yeah. Sounds good to me. I'm excited and I wrote all this. All right. So let's get a quick little catch up here. So probably if you're listening to this show, AI is not new to you, right? But let me just give everyone the briefer here, right?

Starting point is 00:05:14 So essentially from 2022 to mid-2023, early 2024, large language models were largely text systems, right? The risk was just limited to misinformation and data leaks and, you know, using hallucinations and looking foolish. But that started to change, I'd say, in mid-2025. So that's when the models started getting, well, exponentially more capable and agentic by default. That's the thing that people don't understand, right?

Starting point is 00:05:45 I've actually had two great conversations with two really smart minds, kind of the head of agents at Cloudflare and then the head of Microsoft Research. And I learned a lot by talking to them, both, you know, before and after the show as well. But one thing that kind of came through is, well, no one really knows what constitutes an agent and what doesn't. But I think that we can all agree that even today's large language models, they're agentic by nature, right? They, well, a lot of us are giving them access to all of our data. And not in all cases do they have write access. But in many cases, they do.

Starting point is 00:06:23 So when these models can think and act on their own and, you know, spin up a virtual environment and a terminal and access your computer, right? I keep saying this. Even right now, I have multiple agents running on my computer, right? I have Codex and Claude Code going right now. I'll probably spin up Antigravity later. I always have agents running, and they have access, and I don't necessarily know what they're doing in between.

Starting point is 00:06:50 I always go back and look when they're done. But that's what's really changed in mid, you know, 2025. But here we are in early 2026. And this is where now we're going to start talking about and actually putting into practice all those buzzwords, right? The ones that people just, you know, started chatting about in like 2024 to, you know, sound super smart. Like, oh, governance and audit logs and, you know, isolation protocols, right? Like all those. Well, okay.

Starting point is 00:07:19 Well, good thing that everyone was talking about it. Because, you know, we were trying to get our buzzword bingo. Well, now we actually need it. Right. Now it's time to talk about ethics and governance and guardrails when it comes to agentic capabilities. I like to put it like this. Keep it very simple.

Starting point is 00:07:46 2022 and before, I'd say that AI was a dumb stationary brain, but it was a brain, right? So everyone was blown away. Like, oh, my gosh, it can think. Well, it was dumb. It was stationary. Didn't move. Couldn't really think ahead. Right.

Starting point is 00:08:01 And in 2023, well, it became a dumb stationary brain with tools. Right? That's when, you know, the early version of ChatGPT Plus, you know, GPT-4, it had tools, right? It could go on the internet as an example. So that's when it really started to open up what it could do or at least the information that it could access. In 2024, well, it was still a stationary brain with tools, but it went from a dumb brain to

Starting point is 00:08:27 a smart brain. I would say in 2024 was the first time that we actually had smart models because at the end of 2024, that's when we got reasoning models. And then in 2025, I think we still had smart brains with tools, but the difference now, instead of it being a stationary brain, it was a proactive brain in 2025. It could go out, at least especially at the end of 2025, the second and third quarter. It can make moves on its own, right? Proactively could schedule things and or just, you know, they can start acting over long periods. And then what brings it to 2026 is, well, now we have that smart, proactive brain with tools and arms, right?

Starting point is 00:09:12 So tools are cool. But when you have arms, you can actually use them in a real way. And I think that's where, you know, agents now have teeth when they are autonomous, proactive and smart. So here's why agent risk, I think, feels different. because, you know, like I said, with the chatbots, there was always risk, but was it really? I mean, yeah, worst thing you can do is you get in trouble putting out something hallucinated and you look foolish, right? Does your company go under? Probably not.

Starting point is 00:09:47 Right? Do you expose every single dark secret, right? It's not that bad. I mean, it is, but it's not. But the new agent layer is just a whole new type of risk that we're not even ready for. If I'm being honest, if you listen to the show, I say this a lot. I'm like, this year is going to be scary. People are not ready.

Starting point is 00:10:08 Businesses are not ready. And I mean that, right? That kind of business predicament that I talked about in the opening of the show there, that is real. If you don't run to use AI agents this year, you're toast. You are literally toast. I don't care if you're a small business or a, you know, $20 billion revenue business, you're toast, right?

Starting point is 00:10:35 But if you sprint too quickly, you can go under because you could make devastating mistakes that would be highly improbable to recover from, because AI agents can do things much worse than even the worst human, right? Have you ever heard stories or maybe you've experienced this, right? Kind of a rogue employee. Happens too often, right? Someone's bitter, maybe about getting fired or not getting a promotion, and they do something absolutely crazy, right? Maybe expose all the company secrets or, you know, release some files, I don't know. Okay. That's a human. That's a single human. And you can see that human. You have eyes on that human, right? Bill in

Starting point is 00:11:19 IT is watching that human. Agents are different. Agents move a hundred, a thousand times faster than that one person, but you can't always see agents. And guess what? That one disgruntled employee, that's one. This isn't a video game. He can't respawn 10 times. Agents can. Agents can spawn subagents like that.

Starting point is 00:11:41 And those subagents can spawn like that. Right. So think of it in the way a virus might spread across your computer or across the human body. Right? And replicate and duplicate. It's the same thing with AI agents. A rogue employee can't do that. And that's why this risk is very different.
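To put rough numbers on the spawning point, here is a back-of-the-envelope sketch. It assumes a simple tree model in which every newly spawned agent spawns a fixed number of subagents exactly once; the function name `total_agents` and the fan-out and round counts are made-up illustration values, not measurements of any real agent system.

```python
def total_agents(fan_out: int, rounds: int) -> int:
    """Count every agent that exists after `rounds` rounds of spawning.

    Assumption (hypothetical model): each agent created in one round
    spawns `fan_out` subagents in the next round, and nothing is shut down.
    """
    total = 1   # the original agent
    newest = 1  # agents created in the most recent round
    for _ in range(rounds):
        newest *= fan_out  # every newest agent spawns `fan_out` children
        total += newest
    return total


if __name__ == "__main__":
    # Illustrative values only: a fan-out of 3 over 0 to 5 rounds.
    for rounds in range(6):
        print(f"rounds={rounds} agents={total_agents(3, rounds)}")
```

With these made-up numbers, five rounds at a fan-out of 3 already yields 364 agents from a single starting agent, which is the kind of growth one person cannot match.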

Starting point is 00:11:58 It is very real. So I think that we spend too much time thinking about the positives and the optimistic side of AI agents, which is great. Right. Yes. Now all of a sudden I have, you know, 320 AI agents, you know, doing all my work for me around the clock. Oh, that's cool. Right. But what about the risk, the security and the sprawl?

Starting point is 00:12:23 And more teams are experimenting now than ever before because no one's got it all figured out. And that means more sprawl and more exposure, especially if you don't have guardrails up. So here's essentially the three surfaces where agent risk actually lives. Number one is the input. That's kind of that untrusted content that can contain hidden instructions that agents, you know, treat like real commands. All right. That I don't think is going to be as big of a deal, right?

Starting point is 00:12:51 Your inputs, don't get me wrong. Things can go wrong, right? People are blindly copying and pasting, so you can copy and paste prompt injections, but prompt injections are the big thing. And we're going to talk about that here in a second. But still, inputs, they can be poisoned, right? There's, there's things that you might not know. And in the same way, a rogue human can create a lot of rogue AI agents that can create a lot of risk and a lot of sprawl that's going to be uncontrollable. So, you know, inputs, I think you really only have to worry about it, you know, if someone really doesn't know what they're doing or if someone is trying to be malicious, which, again, those things are going to come up. But think of that one bad employee, what they can now do if they know agentic

Starting point is 00:13:30 AI, right? Yeah. Talk about malware or spyware, right, ransomware that we, you know, these companies that have paid, you know, millions of dollars, hundreds, tens of millions of dollars for ransomware. It's going to be way worse with agents. So the second layer is tools. And I think this is where it starts to get a little dicey in terms of, well, the capabilities are wild, right? So every permission and connector that you add expands that blast radius essentially when something goes wrong. Same way that I walked you guys through the, oh, it's a dumb stationary brain. But, hey, once that brain gets tools, okay, now it can start doing a lot of things. And the same thing on the agent gone wrong side, right? If it was a dumb stationary agent with no tools and no arms,

Starting point is 00:14:15 it's like, all right, well, have fun, buddy, right? You're in a glass case of emotion just shaking up. When they have tools, that's where things go wrong, right? When they have access to your terminal, right, to your computer terminal, that's where things can go wrong. When they can run code on a machine, that's where things can go very wrong. And then last but not least, and this is the big one, this is actions. And this is the biggest thing that if I had to boil down the biggest change in risk and why it matters now more than ever, it's outputs to actions, right?
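The tools point above, that every permission and connector expands the blast radius, can be sketched as a simple set union. This is a minimal illustration, not how any particular agent platform models permissions; the `Connector` class, the `blast_radius` function, and all connector and permission names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connector:
    """A hypothetical connector and the permissions it grants an agent."""
    name: str
    permissions: frozenset[str]


def blast_radius(connectors: list[Connector]) -> set[str]:
    """Everything the agent could touch if it goes wrong: the union of
    all permissions granted by all attached connectors."""
    reachable: set[str] = set()
    for connector in connectors:
        reachable |= connector.permissions
    return reachable


# Made-up connectors for illustration only.
EMAIL = Connector("email", frozenset({"email:read", "email:send"}))
CALENDAR = Connector("calendar", frozenset({"calendar:read", "calendar:write"}))
TERMINAL = Connector(
    "terminal", frozenset({"shell:execute", "files:read", "files:write"})
)

if __name__ == "__main__":
    attached: list[Connector] = []
    for connector in (EMAIL, CALENDAR, TERMINAL):
        attached.append(connector)
        names = ", ".join(c.name for c in attached)
        print(f"{names}: {len(blast_radius(attached))} reachable permissions")
```

An agent with no connectors has an empty blast radius, which matches the "dumb stationary agent with no tools and no arms" picture; each connector added only ever grows the set.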

Starting point is 00:14:48 What we had to worry about a couple of years ago from AI in terms of risk was the output. Now we have to worry about the actions. But it's actions at scale and actions that we might not even necessarily be able to see. Right? Silent, unintended workflows that you may not be tracing. And really what this comes down to, well, it's this combination of increased capabilities from agents and moving in the shadows. And that's an enterprise nightmare. That is literally the formula for an enterprise nightmare. So some stats here for you, right? So right now 57% of employees at least admit to using personal AI accounts for work. It's way more than that.

Starting point is 00:15:29 Let's be honest. That's just the number that admit. A third admit to inputting sensitive data into unapproved tools, right? So your shadow AI use case there. And here's the thing. You can't govern what you can't see. And most organizations can't see their agent footprint at scale. This is what I call these three types of dark AI.

Starting point is 00:15:56 For the most part, you're not going to find, you know, the three dark types of, or the three types of dark AI online. It's something that I've kind of categorized them in. But I think it's really helpful to get a better glimpse and a better understanding of the categories of risk. So number one, we all know this, shadow AI, right? That's just essentially unapproved or unknown AI use, right? If you've been using, you know, ChatGPT on your personal computer, because Copilot is the AI that's approved, but you want to use ChatGPT and you copy and paste things over as an example, right? That's shadow AI, but that's been around.

Starting point is 00:16:31 Everyone knows that. Right. What you've maybe heard of, maybe not, it's the next kind of tier. And that's agent sprawl. But agent sprawl is known, right? So that's essentially when you have approved agents, but you're not sure how to wrangle them or observe them all. You're like, oh, well, yeah, Bill enabled that, you know, agent to help with finances, but we're not really sure what it's doing.

Starting point is 00:16:58 Right? We think it's doing good, right? I check the outputs, but I don't really know how it's getting there. That's the beginning of agent sprawl. But the thing is, agent sprawl goes quickly. In the same way a snowball at the top of the mountain might come rumbling down a thousand times the size, that is where we get into dark agent sprawl. All right. This sounds like a screen name for AIM back in the 90s. You guys remember that?

Starting point is 00:18:33 Right. AIM, you know, Instant Messenger, right? Dark agent sprawl, 2026. That's going to be my username on the Everyday AI Inner Circle community. But essentially that's where there's, well, dark agent sprawl can be a couple of things.

Starting point is 00:19:03 Because agent sprawl, it's a problem. But like I said, for the most part, that's a known risk. And you're like, yeah, we have these agents going all over the place. We have no guardrails. We have no, uh, traceability, observability, right? But you know of it. Dark agent sprawl is agents you don't know about. Those are unapproved agents that are working in or on your company, and you can't observe them because you don't know about them. So this is, well, this can be shadow AI, I mean, you know, gone agentic, right? So people plugging in agents that aren't approved and the company doesn't know, but there's also another kind, right?

Starting point is 00:19:42 And I think we're going to see a lot of this, maybe not in 2026, but in 2027. That's going to be the equivalent of malware or spyware, but agents, right? People sending, and seeding, agents out specifically in the same way that they would for spyware, malware, ransomware, to make money, right? To extract, you know, value from businesses. And that's what's going to happen. That's the next version of this. That's not necessarily dark agent sprawl, but there's two sides, right?

Starting point is 00:20:14 Dark agent sprawl can start out innocent enough, right? Oh my gosh. You know, our company's not approving, you know, any of my AI agents. Well, okay, whatever. I'm going to go ahead and, you know, unleash, you know, Claude Code and get, you know, 50 instances of Claude Code going up and they're all going to spin up their agents. Well, one person might know, but the rest of the company is in the dark. And those agents could replicate, duplicate, and you can't observe them.
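One way to make the agent sprawl versus dark agent sprawl distinction concrete is to compare an approved-agent registry against whatever agents have actually been discovered running. The sketch below assumes such a registry and discovery feed exist; the function `classify_agents`, the "governed" label, and the agent names are all hypothetical, and the categories simply follow the definitions given above.

```python
def classify_agents(approved: dict[str, bool], discovered: set[str]) -> dict[str, str]:
    """Label each agent by how visible it is to the organization.

    `approved` maps an approved agent's name to whether tracing or
    observability is enabled for it. `discovered` is the set of agent
    names actually seen running (for example, from logs). Both inputs
    are assumptions made for this illustration.
    """
    labels: dict[str, str] = {}
    for name in sorted(discovered | set(approved)):
        if name not in approved:
            # Running, but nobody approved it or knew about it.
            labels[name] = "dark agent sprawl"
        elif not approved[name]:
            # Approved and known, but nobody can see what it is doing.
            labels[name] = "agent sprawl"
        else:
            labels[name] = "governed"
    return labels


if __name__ == "__main__":
    registry = {"finance-bot": False, "support-bot": True}  # made-up names
    seen = {"finance-bot", "support-bot", "mystery-bot"}
    for agent, label in classify_agents(registry, seen).items():
        print(f"{agent}: {label}")
```

The catch, as described above, is that dark agents only show up in this comparison once something has discovered them, which is why visibility has to come before governance.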

Starting point is 00:20:38 But the other thing is, well, bad actors, dark agent sprawl. That's the thing, too. So why is this all happening now? All these risks and security concerns and the sprawl, why now? Because like I said, we've been six months away for five years, right? We're six months away from great agents. But then it almost seems like we didn't get a six-month warning. They just popped up, right?

Starting point is 00:21:01 Like December 2025, right? Like everyone was winding down for, you know, the holidays, you know, especially here in the U.S. And then you go back to work in January. You're like, what the frick happened, right? It's like, whoa, whoa, wait, the agents are actually here now? Where's the six-month warning? Why didn't we get word of this in June or July? But there's three reasons why, and it is literally the perfect storm. Not just the ingredients happening, but at the exact right or wrong time.

Starting point is 00:21:31 All right. So number one, the number one element that was mixed in the bowl and it's exploding is the reasoning threshold. Right. So improved reasoning from models like, you know, GPT-5.2 from OpenAI, Gemini 3.1, their new model, very impressive. And, you know, Opus and Sonnet 4.6 from Anthropic, these are all built to be agent-native. That's how they build them now.

Starting point is 00:22:00 Right? They're not building models that are great at, you know, reading, writing, and comprehension first. No, what they're focusing on is the harness and the tool use. That's what's first, right? Even, you know, Google's, you know, their new model yesterday, everything that they highlighted was tool use, right? And how, you know, improved tool use in the scaffolding there has, you know, really helped them improve their outputs. But that's the thing. These models, their reasoning ability is legit through the roof. They can plan steps ahead, self-correct errors, and move beyond reactive behavior into proactive.

Starting point is 00:22:39 And that's kind of raised agent reliability from around 50% to 90%. And that's the perfect storm right there. 50%, coin flip. You're not doing that. Would you hire an employee that has a 50% chance to fail right away? Probably not, but 90%. Okay. That's an A employee.

Starting point is 00:22:58 Step two, computer use improvement. This seems small. This is big. The models have to be insanely smart, right? They have to be able to think, reason. They have to have genius level IQ, which is what today's models do. If you look at offline IQ tests, they're scoring in the genius level, smarter than 99.9% of people, right? But the computer use is important.

Starting point is 00:23:19 It doesn't matter if it can't go and use a computer because at least for now, that's how businesses generate value. I think eventually value will be just generated agentically, right? The web has gone agentic, right? Google and Microsoft announced support for essentially an MCP version of the web, where websites can talk to each other, agents can talk to each other. So we'll see how, you know, business value is extracted in the future because a lot of times it's been through the website stack, the software stack, but, you know, who knows what it'll be

Starting point is 00:23:48 in the future. But right now, agents can use computers better than humans, right? So computer use means these models can use a mouse cursor. They can click. They can use APIs. They can talk to each other. But the big thing is, the computer use capability gap

Starting point is 00:24:04 has essentially been solved. So Claude Sonnet 4.6, which came out, well, earlier this week. So if you're listening to this in six months, right, it came out in mid-February. So imagine in six months,

Starting point is 00:24:16 these computer use models are going to be insanely good. But it's the first one that scored better than humans. Right? So a 72.5% success rate on the OSWorld benchmark, and that's surpassing human performance for the first time and nearly quintupling

Starting point is 00:24:36 the scores of 2024. That's the thing. And I think that's one of the reasons why, you know, even though maybe the foundation was there for AI agents, you know, a year, year and a half ago, they couldn't use the computer. It was so slow, right? It was, you know, very, very, very slow computer vision, right? Every time, if you wanted to click something, it would take three minutes. Okay.

Starting point is 00:24:57 Now, models can go as fast as humans, or at least, you know, not just clicking and navigating interfaces, but just a range of computer use tasks. And then the third thing that has created this perfect agentic storm. Well, it's the context window and memory, right? So being able to work on a task for a long period of time, right? I think my record so far, I think I got to about 10 hours overnight the other night on Codex, which was really fun to do, right? But it kept its memory persistent the whole time, right? It didn't forget what I told it when I went to sleep. You know, when I woke up,

Starting point is 00:25:37 I spent time reading through the chain of thought. And I'm like, oh, sweet. Not only did it remember everything, but I'm going through, you know, tracing it. I'm, you know, making sure it kind of stayed within the confines of the instructions I gave it. And it did. But, you know, that's another reason why it's happening. And y'all, let me tell you, if you haven't in the last like three, four weeks, if you haven't gone out and used, you know, OpenAI's Codex, if you haven't used, you know, Claude Code or Claude Cowork from Anthropic, if you haven't used Antigravity from Google, I'm not saying this to like be that guy. You're going to get left behind, right? And if you're listening to this show, I don't want you left behind. I'm actually thinking, all right? I don't know

Starting point is 00:26:18 this yet. It might, might kind of get a little bit of a vibe working focus, you know, here in the first or second quarter. So make sure, if you're not already in our Inner Circle community, get in there. Um, yeah, I'm going to start putting some stuff in there. Just keep your eyes open for that. All right. But let's keep going a little bit here, as we're going to be wrapping up here in a minute. Let's talk a little bit about how the biggest AI companies are actually responding right now, because the risk is well known, because it has literally been an explosion, and it is going to get worse. So I want to talk a little bit about kind of the big picture focus of the big four labs.

Starting point is 00:26:57 So right now, OpenAI is taking more of a human approval approach, right? So Codex is kind of their command center to review agent decisions. Anthropic is really going on the defense against prompt injection for browser agents, you know, with making sure that virtual machines are isolated and have minimal privileges and going with domain allowlists versus blacklists, right? So everyone has a little bit of a different approach here. You know, Google as an example, their Project Mariner, browser agents run in virtual machines as a safety measure to isolate them from anything else they can kind of get their hands on.
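The allowlist-versus-blacklist point is worth a tiny illustration. The sketch below is a generic example of the two policies, not Anthropic's or any other vendor's actual implementation; the domain sets and function names are hypothetical. The key difference is what happens to a domain nobody has seen before.

```python
from urllib.parse import urlsplit

# Hypothetical policy data, for illustration only.
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}
BLOCKED_DOMAINS = {"known-bad.example"}


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def allowed_by_allowlist(url: str) -> bool:
    """Default deny: the agent may only visit explicitly approved domains."""
    return host_of(url) in ALLOWED_DOMAINS


def allowed_by_blocklist(url: str) -> bool:
    """Default allow: the agent may visit anything not explicitly banned."""
    host = host_of(url)
    return bool(host) and host not in BLOCKED_DOMAINS


if __name__ == "__main__":
    for url in (
        "https://docs.example.com/guide",
        "https://known-bad.example/login",
        "https://brand-new-site.example/page",  # never seen before
    ):
        print(url, allowed_by_allowlist(url), allowed_by_blocklist(url))
```

A never-before-seen domain passes the blocklist check but fails the allowlist check, which is why a default-deny allowlist is the more conservative choice for an agent that reads untrusted web content.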

Starting point is 00:27:34 And then Microsoft, right, there's millions of ways that Microsoft is doing it and millions of ways that all these companies are doing it. I just kind of picked out, you know, different aspects to illustrate their different approach. You know, Microsoft as an example with their Copilot Studio, I mean, the governance is everywhere, right? The Sentinel monitoring, the Purview logging, you know, the Entra Agent ID, right? So Microsoft, it is very enterprise. Google is as well. I was just giving an example of, you know, how their Project Mariner kind of runs in an isolated virtual machine. Because essentially, I think from the people I've talked to, at least three of those four companies, they know risk is part of it, right? Like there's like any lab, right?

Starting point is 00:28:22 They call them frontier or AI labs for a reason. You run experiments. And part of it is, well, you know that things need to go wrong. Right. So these companies are intentionally trying to get all the risk and all the security nightmares and all the sprawl they can. Right. So then when they set a new model out in the wild, when they're done with the red teaming,

Starting point is 00:28:43 you know, they hope that they have a good idea. So the labs are working on this. But here's the reality. I think there's a pressure. to allocate more resources to the development of models versus research and security, right? I don't have that on, you know, privileged information. That's just from, you know, talking to a lot of people and reading a lot and just the, I think, the reality of the world that we live in.

Starting point is 00:29:12 And that's why AI sprawl is going to be hard to miss for most business leaders. Because everything's unclear, right? The tool, like, do you know? Think of the AI model you use most. Do you know the tools it has, right? Even if you think of, you know, ChatGPT, Gemini, Claude. Do you know what tools the base models have? Probably not.

Starting point is 00:29:38 You'd be surprised, right? Like, I'm one of those weird guys that reads change logs and model cards. Most people have no clue, right? And this is much worse than classic shadow IT because agents can act across systems, not just within the confines of folders and files, which are very structured. The whole kind of kind of like how people talk about hallucinations, right? Oh, you know, they're not a bug. They're not a bug. They're a feature.

Starting point is 00:30:10 What makes agents great is also the built-in nightmare. So if you want to have the good, you have to recognize the bad. that they can do. Right. And that's because that's how they're made. They are made to go build their own path. They are made to blaze their own trail. Right.

Starting point is 00:30:29 So by their very definition, agents aren't necessarily always good at staying within guardrails, right? Because if they think that the destination that has been given to them at times, right, depends on the model to set up all that, right? But it's very common for an agent to hop over guardrails in order to accomplish a task, not knowing that that's a bad thing, right? And every new skill that you add, that you give to an agent or a model, every connector, every workflow, all that does is it expands the attack surface for everyone.

Starting point is 00:31:06 And that's another realization that I think most people are going to struggle to grasp, right? Because you think that you're expanding the amount of work that you can get done. And are you? Sure. But I think you almost have to think of it from like a, I don't know, like an old school like military perspective, right? It's your land, right? And hey, we're going to go discover new land. We're going to go out and, you know, win new deals, right? So we're going to expand our land. I don't know if any of you have ever played like, you know, Risk. I used to. I wish I still had time, right? It's one of those games, you know, it's like, oh, okay, this is good to make sure my brain still works. But it's kind of like that, right? Your agents, without you knowing, though, are acquiring new territories. And that's great. And you think, oh, my gosh, look at these new skills and capabilities. Right.

Starting point is 00:31:55 It's like, oh, it's like having a thousand new employees. Okay, great. But your surface area for risk, attacks, and sprawl is also multiplying at the same time as your capability. But you're not focused on that. You're focused on, oh, my gosh, now all of a sudden I have a designer. Now all of a sudden, I have a data analyst. Okay. Well, what's that data analyst

Starting point is 00:32:17 doing with your data? Are you going to check? Right? Is it using an open source, right? This is one thing I was kind of blown away by the other day. Gave, you know, it wasn't anything bad, right? Gave a model a very hard task. This is one of those that overnight.

Starting point is 00:32:38 I came back and I realized, oh, it downloaded another large language model locally to do the task. Right? It did it locally, right? So I kept all the data kind of on, on-prem, on my local machine. But what happens if in the future other AI models are just going to go use other AI models, but on the web, right? They can. What if they do that without you telling or what if they go use a site that's insecure?

Starting point is 00:33:06 And this is why I think, you know, OpenClaw, as amazing as it is, this is why it's also scary. Right? Because, you know, these, you know, autonomous open source agents, the open source, if I'm being honest, and I'm sure, you know, I'm going to catch some flak for this from some OG open source people, right? OG open source, I'm not talking about you. Today's open source, it's taken a weird turn over the past three to six months, right? It's almost become this, you know, crypto-infused wild, wild West, right?

Starting point is 00:33:41 It's not like traditional open source anymore. So when we talk about open source AI agents, so many of what's, so much of what's out there is risk, right? It is. And it's unknown. And it can be scary if you don't know exactly what you're getting into. And I think this is why small agent pilots are becoming huge risks, right? Because when you build your own agents without centralized visibility or governance, that's where you get in trouble. So many people are in there using their own agents, and it's for them.

Starting point is 00:34:11 Right. But it can be going and doing work for the whole department, the whole company, or accessing data from the whole company. But if only one person has eyes on it and if there's not central organization, that's where you run into something. So here's the playbook, right? If you're feeling a little scared, you know what? Maybe that was my intent. I got you all pumped up and, you know, excited with yesterday's show talking about, you know, the new capabilities of agents, but you've got to get it in check, right? You have to balance that optimism with a little bit of realism. But here is the Monday morning playbook for how to deal with the risk, the attacks, and the sprawl. So like what we talked about yesterday, touched on it briefly,

Starting point is 00:34:55 but you need to start with bounded autonomy, right? When you think of autonomous agents, it's not, you know, zero to a hundred. It's bounded autonomy. What that means is you start with suggest, then propose, then approve, then limited execution, right? So many people go to full execution first. It's baby steps, right? A human being doesn't go from womb to sprint, right?

Starting point is 00:35:23 You go from womb to, you know, being on your stomach to sitting, to crawling, to walking, right? Your agents have to go the same way. Even if you know out of the box that agent can sprint, you're not prepared for that agent to sprint. Part of it is for the humans, right? We think it's more for understanding the agent's abilities. It's not, right? This is to make sure that you don't have a bunch of lazy humans in the loop. And instead, you have proactive expert-driven loops.
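
As a rough illustration of the bounded autonomy ladder described above, here is a minimal Python sketch. The four levels mirror the order given in the episode (suggest, propose, approve, limited execution); the names Autonomy, handle, and promote are hypothetical and not taken from any specific agent framework.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """Rungs of the ladder, lowest to highest (names are illustrative)."""
    SUGGEST = 1            # agent only describes what it would do
    PROPOSE = 2            # agent drafts a concrete action for human review
    APPROVE = 3            # agent acts only after explicit human approval
    LIMITED_EXECUTION = 4  # agent acts alone, within a narrow scope


def handle(level: Autonomy, action: str, approved: bool = False) -> str:
    """Return what happens to `action` at a given autonomy level."""
    if level == Autonomy.SUGGEST:
        return f"suggestion only: {action}"
    if level == Autonomy.PROPOSE:
        return f"proposal queued for review: {action}"
    if level == Autonomy.APPROVE:
        if approved:
            return f"executed: {action}"
        return f"blocked, awaiting approval: {action}"
    return f"executed within limits: {action}"


def promote(level: Autonomy) -> Autonomy:
    """Move up exactly one rung; never jump straight to execution."""
    return Autonomy(min(level + 1, Autonomy.LIMITED_EXECUTION))
```

The point of promote moving only one rung at a time is the "baby steps" idea: the humans reviewing the agent get practice at each level before the next one is unlocked.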

Starting point is 00:35:57 So start with least privilege by default. Read only first, write access only for narrow, defined tasks, right? Especially in enterprise environments, you don't want to send out a bunch of read-write general agents. Start with read only first. Observe, improve your loop, then you go to limited execution, and then you can finally get it to writing for narrow tasks. And then last in your Monday morning playbook is you need to require human approvals for irreversible actions like sends, deletes, purchases, and permission changes. These are all things agents can do. Right. Agents can literally do anything. Agents, right? And this is going to get even harder to manage when we start to see true agentic commerce, right? And I'm not just saying, oh, you know, my agent is going to go buy something, you know, from the Amazon agent. That's not what I'm saying, right? There's going to be agent bartering. There's going to be agent, you know, liaisons, right? So you have to be. You have to.

Starting point is 00:37:10 work on human approvals now, right? It's almost like you have to understand that agents are capable of, you know, a 10, and you have to give them a two, because we as humans need to first learn to adapt our behavior before just giving it to agents. And then you need to build governance before you scale. And I think that's like just what I was getting at. We get all excited for what we know it can do. And they're like, wait, we got to do this boring stuff first? Absolutely. Because if you don't do the boring stuff first, if you don't go through the sitting on your tummy,

Starting point is 00:37:51 right, sitting up, crawling, walking, then you can't expect to understand the path of the sprinting agent. And every agent run needs a decision trace that you can inspect after the fact. So you have to be able to log all your tool calls, capture your decision traces, and monitor for abnormal action patterns. This is one of the reasons why I said, I think we're going to see in 2026, it's going to be extremely common to have agent ops teams, right? In the same reason, or sorry, in the same vein, you know, DevOps teams are very common, right?
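
The playbook items above (read-only by default, human approval for irreversible actions, and a decision trace for every tool call) can be sketched together in a few lines of Python. This is only a sketch under stated assumptions: ToolGate, the tool names, and the action names are hypothetical, and the irreversible set simply reuses the examples from the episode (sends, deletes, purchases, permission changes).

```python
import json
import time

# Assumption: illustrative action names based on the episode's examples.
IRREVERSIBLE = {"send", "delete", "purchase", "change_permissions"}


class ToolGate:
    """Wraps tool calls with a read-only default, approvals, and a trace."""

    def __init__(self, writable_tools=()):
        self.writable_tools = set(writable_tools)  # narrow, explicit grants
        self.trace = []  # one record per attempted tool call

    def call(self, tool, action, approved_by=None):
        if action == "read":
            decision = "allowed"
        elif tool not in self.writable_tools:
            decision = "denied: tool is read-only"
        elif action in IRREVERSIBLE and approved_by is None:
            decision = "denied: needs human approval"
        else:
            decision = "allowed"
        # Log every attempt, including denials, so the run can be
        # inspected after the fact.
        self.trace.append({
            "ts": time.time(),
            "tool": tool,
            "action": action,
            "approved_by": approved_by,
            "decision": decision,
        })
        return decision

    def dump_trace(self):
        """Serialize the trace as JSON lines for later inspection."""
        return "\n".join(json.dumps(record) for record in self.trace)
```

For example, with gate = ToolGate(writable_tools={"crm"}), a read on any tool is allowed, a "send" on an ungranted "email" tool is denied as read-only, and a "delete" on "crm" is denied until approved_by names a human reviewer. Denied calls in the trace are one simple signal an agent ops team could monitor for abnormal action patterns.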

Starting point is 00:38:23 We're going to have agent ops teams for this exact reason. So here's what to watch for the rest of 2026 when it comes to agent risk, security, and sprawl. Browser agents are going to become a mainstream risk surface. I think we know that. I think we're going to see a major OpenClaw-type. I'm not saying OpenClaw themselves, but I think we're going to see a major open source agent crash.

Starting point is 00:38:45 I talked about that in the 2026 AI prediction and roadmap series. I actually think it's going to come from, right? There's this new trend over the last week or two where everyone is just pointing their agents to a website. When they want their agent to do something, they're not even saying, all right, let me, the human, understand this. Let me write it out. Let me apply this to my business logic. Let me apply this with my guardrails. Nope. They're just pointing their agent.

Starting point is 00:39:14 Hey agent, here's this video. Go watch it and do it. Hey, agent, here's this website with a whole bunch of cool stuff. Just go do it. You know, there's this whole concept now going around of, you know, point your agent to a URL. Okay. Have fun with that. Don't do that. Right. Don't. Right. Especially if that agent has its own dedicated machine. Right. And I'm saying, if you're tinkering around, right, it's your personal computer. You're the business owner. If you want to take that risk, that's fine, right? Especially if you're tinkering around, that's fine.

Starting point is 00:39:45 But don't do that in an enterprise. And unfortunately, people are doing that. That's a bad thing because guess what? A lot of those sites, quote unquote, oh, you know, someone puts up a huge tutorial. Okay. They're putting it up on their personal website. You know how easy it is to hack, to phish any of these websites? Okay, you see, let me give you an example.

Starting point is 00:40:08 Something goes viral on Twitter. It's some, you know, literally, some 20-year-old kid, you know, put a great, you know, guide up on the website. Cool. And, you know, everyone's just like, hey, agent, go read that, go read that. And then you have millions of agents going to just read this, you know, this nice kid, put up a cool guide. Okay, guess what's going to happen? Someone's going to see that. They're going to hack that site and they're going to inject malicious code in there that no one's going to see.

Starting point is 00:40:32 That's what's going to happen. All right. Next, the agent skill marketplace is going to expand the supply chain risk. So you inherit what you plug in. And we've already seen that with some skills and plugins for agents. Some of the top ones were found to have malicious code in them.
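
Two of the risks above, agents pointed at arbitrary URLs and skills pulled in from a marketplace, lend themselves to simple pre-checks. The Python sketch below is one possible approach, not an established standard: the host docs.example.com and the function names are hypothetical, and the checks use only the standard library (urllib.parse.urlparse and hashlib.sha256).

```python
import hashlib
from urllib.parse import urlparse

# Assumption: a hypothetical set of hosts your team has actually reviewed.
ALLOWED_HOSTS = {"docs.example.com"}


def url_is_allowed(url: str) -> bool:
    """Allow only HTTPS URLs whose host is on the reviewed allowlist."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


def skill_matches_pin(skill_bytes: bytes, pinned_sha256: str) -> bool:
    """Check a downloaded skill against the hash recorded at review time."""
    return hashlib.sha256(skill_bytes).hexdigest() == pinned_sha256
```

An allowlist does not help if an allowed site is itself compromised, so fetched content should still be treated as untrusted input rather than as instructions. Hash pinning only proves the skill has not changed since a human reviewed it; it says nothing about whether that reviewed version was safe.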

Starting point is 00:40:48 We're going to see identity and permissions becoming a board level compliance requirement, not a nice to have. And last but not least, the winners are going to treat agents like production software, not side experiments. All right, that's it. That's a wrap. Volume 9 of the start here series.

Starting point is 00:41:05 I hope this is helpful and I hope you understand a little bit more the risk, the security, and the sprawl that you need to be aware of and how this having AI that acts, having AI, right, that has a smart brain, it's proactive. It has tools and it has arms. That changes everything, right? This perfect storm that has happened in the last 30 days we've seen, I've said this point before. We've seen more, both on the opportunity and the risk side, we've seen more in the last 30 days than we've seen in the last probably two years. All right. So I wanted to take a moment for the start here series to address this. And I hope this helps you not only understand the risk, but if you want to take advantage of the opportunities, you can't do it without knowing the risk.

Starting point is 00:42:03 So now you do. All right. So I hope this was helpful. If so, there's already eight others in our Start Here series that you should go check out. So like I said, please go to starthereseries.com. That is going to give you free access to our Inner Circle community. Once you sign up, you're going to be just straight up loving it, hopefully in our Start Here series space and connecting with now more than a thousand people in our Inner Circle community. So thank you for tuning in. Hope to see you back for more Everyday AI. Thanks y'all.

Starting point is 00:42:52 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
