Everyday AI Podcast – An AI and ChatGPT Podcast - Ep 745: From Chatbots to Super Agents: The 11 AI Tool Categories Explained (Start Here Series Ep 16)

Episode Date: March 31, 2026

Most people are either using too many AI tools or not enough. The real problem? Not understanding the categories and capabilities of the main AI tools that matter. And with constant updates, it's pretty much impossible to get a decent lay of the AI land. We're changing that with this episode: From Chatbots to Super Agents: The 11 AI Tool Categories Explained -- An Everyday AI Chat With Jordan Wilson

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion on LinkedIn: Thoughts on this? Join the convo on LinkedIn and connect with other AI leaders.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
11 Essential AI Tool Categories Overview
Text Reasoning Assistants Explained
Multimodal AI Operating Systems Breakdown
AI Search and Deep Research Tools
Voice and Speech AI Tech Trends
AI Image Generation Diffusion Models
AI Video Generation & World Models
AI Music Generation Tools and Players
Design and Visual Content Automation
Vibe Coding App Builders Functionality
AI Coding Copilots & Local Agents
Autonomous AI Agents and Browsers
Building a Competitive Personal AI Stack

Timestamps:
00:00 Overview of 11 AI tool categories
06:40 AI tools and their uses
07:54 Old school AI systems
11:23 Being patient with AI tools
16:15 How AI models find answers
17:22 Future of browsing with AI
23:49 How AI generates images
26:50 How video generation works
29:01 Creative AI and music generation
33:48 AI tools and natural language
34:41 Explaining vibe coding and key players
38:35 Understanding autonomous AI agents
42:58 AI platforms and top players
45:02 Choosing the right AI tools
48:04 Choosing the right AI tools

Keywords: AI tool categories, chatbots, super agents, text reasoning assistants, multimodal AI platforms, AI search, deep research, voice and speech AI, speech to text, text to speech, image generation, diffusion models, video generation, Sora, wor

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Start Here ▶️
Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com. Also, here's a link to the entire series on a Spotify playlist.

Transcript
Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. You shouldn't use as many AI tools as I do.
Starting point is 00:00:48 It's actually a recipe for disaster. And I think that shiny object AI syndrome is one of the biggest problems in today's enterprise landscape. But although I don't think most business leaders should be using 20 plus AI tools or systems every week, I do think it's imperative to understand the landscape. And obviously it's ever-changing. Because in just about any field, sector, or modality, there's probably a juggernaut unicorn AI company
Starting point is 00:01:19 that has created a top-notch AI tool that's redefining work in that given space. So on today's show, I'm going to give you a lay of the AI land. Because I've been there. I've wasted thousands of hours. So you can just learn from me as I tell you what types of tools are good for what reason. So on today's show, in volume 16 of the Start Here series, we're going to walk from chatbots to AI agents and everything in between and quickly and simply lay out the 11 different AI parent categories that tens of thousands of AI tools mostly all fall under.
Starting point is 00:02:00 Because yes, there's a lot more to AI than chatbots and agents. So let's just go ahead and connect all the different dots and categories in between. All right. Let's get into it. If you're new here, this is Everyday AI and our Start Here series. So here's the big picture of what you need to know for today. Well, there's hundreds of AI tools literally launching every single week and most people feel overwhelmed. Well, I do too.
Starting point is 00:02:27 I try to keep up with it. And I don't recommend you try to keep up with the hundreds or thousands of tools in the AI space that are released every week because most of them are, well, kind of garbage and don't hold a lot of utility. But almost every single one of those tools falls under, you know, one of these 11 parent categories that we've created for you today. And there's actually very few exceptions. And yes, some of these categories are extremely broad, but I did that for a reason. And I think once you understand these 11 categories, you can kind of stop chasing tools and start building a stack that makes sense for you. Because like I said, even within all of these individual categories, the functionality is
Starting point is 00:03:08 changing constantly. So, you know, you might say, oh, there's 50 categories, but they're changing so much and adding so many new features and functions. Well, I think it's easier to just understand less than a dozen of these categories. So this episode is your map to every type of AI tool that exists right now. So on today's show, stick with me for the next 20-ish minutes. And you're going to learn why you're probably only using one or two of these AI categories and why you're probably falling behind because of it. And again, you shouldn't use tools from all of them. You're also going to learn the single framework that makes sense of every AI tool from
Starting point is 00:03:48 ChatGPT to Cursor to Suno and how to build a personal AI stack that matches your job before your competitors find out. All right. Welcome to our Start Here series. This is the Everyday AI essential podcast series to both learn the AI basics and to double down on your AI knowledge. I created this series, well, because I didn't have a good answer when so many people said there's like 700 plus episodes, where do I start?
Starting point is 00:04:16 Well, you start here with the Start Here series. If you are picking this up midway through, that's okay, but I highly recommend you listen to all of these episodes in order. They're short-ish. They average like 29 minutes or something like that. And you can go to StartHereSeries.com. That's going to give you free access to our Inner Circle community, and you can go check out our Start Here series channel in there where you can go listen,
Starting point is 00:04:42 read and just connect with other people who are going through this series in order. So if you have any questions, make sure to hit me up there after you join the Inner Circle community. And if you missed our last Start Here series episode, we talked about how everything is fake and how your company can leverage human expertise properly and fight AI workslop. That one was a good one. It's actually a fun one to do. All right, but let's get into the roadmap here and the 11 categories explained. Here they are right away. Number one, text reasoning assistants. Number two, multimodal AI platforms.
Starting point is 00:05:17 Number three, AI search and research. Number four, voice and speech AI. Number five, image generation. Number six, video generation. Number seven, music generation. Number eight, design and visual content. Number nine, vibe coding app builders. Number 10, AI coding tools.
Starting point is 00:05:34 And number 11, AI agents. All right. And let me just tell you this. I took a little bit of liberty and played around with how to categorize all these different tools, right? Because as an example, something like Gemini, well, it fits in basically all of these. Right. And in theory, we could just have combined number one, two and three into large language models. But there are literally thousands of actually pretty good tools out there, right? Even though I think that 99 percent of them are pretty garbage. There's literally hundreds of thousands of AI tools, so there's actually thousands of
Starting point is 00:06:12 great ones that fit into these 11 different categories. And so I did kind of take a little bit of leeway. And maybe this will make sense as we unpack this a little bit more. But you might be wondering, well, what's the difference between a text reasoning assistant and a multimodal AI platform? So yes, in the beginning, you might just be saying, well, isn't that just ChatGPT, or can't Gemini do, you know, nine out of these 11 categories fairly well? Well, yes, but as you'll see as we unpack this a little bit more,
Starting point is 00:06:41 there are some unique tools that only fit in maybe one of these categories. And I think as we start to think about how you can practically apply these different categories and kind of create that, you know, essential kind of tool category stack. It's important to understand the differences. Yes, there's crossover. But I think that we need to kind of properly go through this. So don't forget. AI is not new. All right. And I know if you're a long time listener, you're like, okay, Jordan, I get it.
Starting point is 00:07:12 Right. But I'm trying to keep this very basic. And I think this is going to be one of those episodes that a lot of people listen to. And I'm going to point people toward because, yes, there's, there's always an AI tool that can do something. Right. And even in these 11 different categories, all of these also have sector specific tools that really shine. Right. So as an example, there's, you know, even image generation.
Starting point is 00:07:36 There's image generation tools that are essentially just wrappers of, you know, Google's Nano Banana or ChatGPT's image tool for different sectors, right? So also keep that in mind as you're using these different tools. A lot of them are just based off of, well, one of the core platforms from one of the big players. Not all of them, but a lot of them are. So also keep that in mind. But also keep in mind, AI is not new, right? I want to give a very, very quick kind of recap of the last 60 years.
Starting point is 00:08:08 This is going to be the fastest recap, right? But AI has been around since the 50s. And the first AI chatbot was actually ELIZA in the 60s. And kind of the big boom of AI was actually in the 80s, right? A lot of people think it started with ChatGPT. No. You had expert systems in the 80s. But essentially, these were very rigid, if-then-else logic, pieces of old-school AI algorithms, and they weren't really meant for general use cases, right? A lot of them as an
Starting point is 00:08:39 example, you know, you could look at old-school artificial intelligence in banking. And all it was, it was a very complicated rules-based decision tree, right? When maybe an inquiry came into one of those old computer systems that were, you know, the size of a school bus and literally only did one thing very slowly, all they did was kind of traverse this path of this very rigid, if-then-else logic. And usually if there's one thing wrong, right, one extra comma, one extra space, the entire thing broke.
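To make the "rigid if-then-else" idea concrete, here is a minimal Python sketch of a rules-based decision tree for a banking inquiry. Everything in it is an illustrative assumption rather than anything described in the episode: the `TYPE,AMOUNT,SCORE` input format, the function names `parse_inquiry` and `decide`, and the score and amount thresholds are all hypothetical. It only shows how a fixed rule path handles expected input and how one extra comma or space makes the whole thing fail.

```python
def parse_inquiry(line: str) -> dict:
    """Parse a fixed-format inquiry: exactly 'TYPE,AMOUNT,SCORE', no spaces.

    The format is a made-up example; real legacy systems used their own layouts.
    """
    fields = line.split(",")
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    kind, amount, score = fields
    if kind not in ("LOAN", "CARD"):
        raise ValueError(f"unknown inquiry type: {kind!r}")
    # str.isdigit() rejects stray spaces, signs, and empty fields.
    if not amount.isdigit() or not score.isdigit():
        raise ValueError("amount and score must be plain digits")
    return {"kind": kind, "amount": int(amount), "score": int(score)}


def decide(inquiry: dict) -> str:
    """Walk a hard-coded decision tree. Thresholds are hypothetical."""
    if inquiry["kind"] == "LOAN":
        if inquiry["score"] >= 700:
            if inquiry["amount"] <= 50000:
                return "APPROVE"
            return "REFER"
        return "DECLINE"
    # Only other allowed type is CARD.
    if inquiry["score"] >= 650:
        return "APPROVE"
    return "DECLINE"


# One clean input, one with an extra comma, one with an extra space.
for raw in ("LOAN,20000,720", "LOAN,20000,,720", "LOAN, 20000,720"):
    try:
        print(repr(raw), "->", decide(parse_inquiry(raw)))
    except ValueError as err:
        print(repr(raw), "-> system error:", err)
```

The contrast with generative models is that nothing here generalizes: any input the rule author did not anticipate is an error, not an approximate answer.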
Starting point is 00:09:12 And that's obviously very different from today's generative AI, right? And a lot of that happened because of a very famous paper, essentially, from Google called Attention Is All You Need and their 2017 Transformer architecture. And that really unlocked, well, AI that's not as
Starting point is 00:09:33 rigid. And that's where, you know, ChatGPT kind of exploded onto the scene, which is generative AI. So, you know, when we had for decades AI that was very rigid, now it's moved into generative reasoning and intuition. And I think that ChatGPT really helped launch this new category. Yes, there were other generative AI tools before ChatGPT, you know, maybe by a year or two. But I do think that it was the emergence of ChatGPT that popularized AI tools, right? It brought in a lot of funding that really helped pave a lot of these different categories and all the different tool, kind of parent umbrellas that we're going to be going over today. So let's talk about category one.
Starting point is 00:10:17 That's just text reasoning assistants. This is you type a query into ChatGPT, into, you know, Anthropic's Claude, into Gemini. You type in text, you get text out. And well, who is this for? It's for everyone. This is kind of the gateway drug of AI, right? But it's important to know by default.
Starting point is 00:10:40 Everything is reasoning now. So if you maybe were a heavier user of AI in 2022, 2023, and you're like, this thing stinks. I'm never using it again. And you're just waking up now today. Yeah, the models are a lot better. They are smarter than most human experts when it comes to creating economically viable work.
Starting point is 00:11:00 But they do reason by default. They think by default. Right. So some of the models are slower. And also, right, we've gone over this multiple times in the Start Here series, especially when you are using tools in category one, which is where a lot of people spend time. Do not sacrifice using the right freaking tool for speed, right? Don't use like an instant version or a non-thinking version of any of these tools, right?
Starting point is 00:11:27 We literally have. Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant, now live in the Adobe Firefly app, the All In One Creative AI Studio. Powered by Adobe's creative agent, Firefly AI assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the assistant. The assistant orchestrates multi-step workflows, drawing on 60 plus pro-grade tools, across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom Express, and more to help bring your ideas to life. You can also get started with creative skills,
Starting point is 00:12:18 a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait retouching, and creating social variations. Every step the assistant takes is visible so you can refine, redirect, or take over at any time. You stay in the driver's seat as the creative director. Adobe Firefly AI Assistant, now in public beta. See it today at firefly.adobe.com. Insanely helpful technology that can create outputs, deliverables, artifacts that are indistinguishable from human experts. Yet so many people aren't getting that just because they're impatient, right? And they don't want to maybe wait three, four, five minutes for a response. And instead, well, they'll just say, I'm just going to get this fast one, or I'm going to use this
Starting point is 00:13:13 model that doesn't think or doesn't take as long. Number one, stop doing that, right? There's a reason why the people who are getting the most out of these models are, you know, usually have multiple screens and, you know, multiple platforms because you have to wait. You know, you have to wait. You have to be patient. You have to have a good workflow. This is why I usually recommend, yes, pick your AI operating system of choice, but have backups,
Starting point is 00:13:37 right? Or be working on multiple projects at once. You do have to be a little bit, you know, okay at multitasking and bouncing between different projects because you should be taking your time when working in category one, which is our text reasoning assistants. All right, category two, this is multimodal AI platforms. And for the most part, it's the same quote-unquote tools in category one
Starting point is 00:14:01 as they are in category two. And I think it's actually important to differentiate these use cases because I would say the overwhelming majority of the billions of people that use generative AI every single week are using these same tools, but they're only using them for, for the most part, category one, text reasoning. They're not using them as a multimodal AI platform. All right. So what that means is, well, multimodal in, multimodal out. All right. So the main players here are the same as the main players in number one.
Starting point is 00:14:33 So that's ChatGPT, Gemini, Claude, also Microsoft Copilot. And they really are expanding into multimodal operating systems. So who is this category for? Well, anyone that wants one AI platform across multiple modalities. So I think Google is by far the leader in this space. And as we go on, you'll probably see Google tools mentioned more than any of the others because they are absolutely dominating in multimodal AI. But what does it mean for a multimodal AI platform?
Starting point is 00:15:07 Let's say Google, right? A lot of people don't know in Google's AI Studio, which is a kind of more customizable or, you know, developer-friendly version of Google Gemini. You can upload videos, right? And Google Gemini 3.1 Pro doesn't just, you know, transcribe videos. It can see them, right?
Starting point is 00:15:27 Yes, it's very token intensive. You know, you might burn through a little bit of your budget, but it can actually see and understand videos. So that's the difference, and why I think it's important to have two different categories, for a literal multimodal input. And a lot of people still, you know, don't take advantage of that in ChatGPT, Claude, Copilot, right? That you can, you know, upload screenshots. You can, you know, output, you know, depending on which system, right? You can output code.
Starting point is 00:16:00 You can output graphics. You can output images. In some, you can output video. So it is important, even though a lot of people spend most of their time in category one, those, the three or four big boys in the room, they play in category one and two, which also leads me to, yes, I know I said we could in theory combine categories one through three because category three is AI search and deep research. And you have a lot of the same players, but they are kind of different products, right?
Starting point is 00:16:30 They have their kind of dedicated, you know, features or modes. So in ChatGPT, you have deep research. In Gemini, you have deep research. In Claude, you have kind of the extended reasoning version of this to take longer and do deeper research. And there is also a research tab inside Claude. So it is kind of a different platform. But then you have a completely new set of players in here as well. One of the ones that most people are probably most familiar with is Perplexity.
Starting point is 00:17:04 You also have, you know, NotebookLM, which I think technically fits in this category as well, although, you know, there's so many different categories where you could place NotebookLM. You have tools like, you know, You.com. And then you have literally hundreds of great industry or sector specific researchers as well. So who uses this? Well, I think your average everyday professional that just needs to verify facts, but also researchers, analysts, strategists. And like I said, some of the more, you know, especially on the academic research side, legal research side, some of those tools are probably going to be a little bit more familiar for people working in those sectors.
Starting point is 00:17:44 So I'm not going to confuse everyone with all those names, right? The Harveys of the legal world. You know, there's so many on the academic side. But essentially, with AI, you need to understand, and I think people always skip over this, the source of your answer is extremely important. So when you talk about large language models having AI search and deep research, it's actually not quite a novel concept. They've been doing this for multiple
Starting point is 00:18:12 years, but you have to realize there's really three different main sources for where a large language model gives you its answers, right? Number one is its own internal training data. So as an example, if you're using an offline, you know, model that you've downloaded and you're not connected to the internet, which a lot of people now are doing as we're moving into, you know, more powerful computers, more capable models, things like OpenClaw, where people are, you know, downloading open source models and running them locally. Those models are running off of usually very old training data. So you have old training data. You have data that you can upload, right, both individually in chats, via connectors and integrations, you know, via projects,
Starting point is 00:19:00 GPTs, etc. So there's training data, data that you can upload or connect via, you know, different mechanisms within these large language models. And then there's the ability to browse the web. So you really have to have a handle and an understanding and make sure you're always checking the summarized chain of thought to see and ensure that they're looking at the right types of websites and they're not making things up or hallucinating anything. And I think this matters because I think humans aren't really going to be using the internet a lot in the future. That might sound weird. I find myself personally using the internet less and less every day. And I'm spending more and more
Starting point is 00:19:39 time in these different platforms, right? Like your Perplexity, NotebookLM, you know, Gemini's deep research, ChatGPT deep research, et cetera. And I do think that as agents become more prevalent, it's just going to be agents really browsing the web. That's why we have now all these agentic protocols that are making websites more accessible and, well, the data in the websites more readily available to agents, right? I could see a day where, you know, obviously we're already starting to see it, you know, even from a technical kind of SEO or AIEO or GEO, whatever you want to call the new SEO for AI, right, where there's multiple versions of a website, maybe there's one version that's for humans and one version that's for AIs. And probably the version for AIs is ultimately going to become
Starting point is 00:20:27 way more important for most businesses that rely on discoverability online. All right. Category four, another big one here. And it's an emerging one. And I think a trending one in 2026. And this is voice and speech AI. So this is both speech to text.
Starting point is 00:20:46 It is text to speech. So yeah, this is, if you didn't know, yeah, there's an AI for that. And they're good. Right.
Starting point is 00:20:55 I've even been building a secret little project in this category, and even the open source models, right? So not just the frontier models that you're paying for. But this I think is going to be really important moving forward. Not saying that people aren't going to be able to type or that everyone wants a text to speech voice narrating something, but you know, I usually see where I'm spending more and more of my time and I'm kind of forecasting that out and saying, okay, I think a lot more people are going to be using
Starting point is 00:21:27 this because like I said, I use hundreds of AI tools every year and I've used thousands of them in the three plus years I've been doing Everyday AI. So, you know, I fancy myself the average knowledge worker, right? I'm not overly technical. I'm fairly, you know, technical. So I'm kind of in the middle, right? So I think things that at least for me personally stick and make sense, I think that they will make sense for many other people. And I think that's why voice has really been a trending category. So some of the main players here: ElevenLabs. Mistral actually has a great new model that they announced, which is Voxtral. Fireflies, you know, for, you know, AI meetings. You know, you have a lot of those within Copilot, within Google Gemini. They have their own
Starting point is 00:22:14 meeting assistant that transcribes meetings. You have Granola, you have Otter, right? There's so many big players, both on the text to speech and on the speech to text side. So who uses these? Well, anything from podcasters like myself, right, video creators, global teams, corporate training, accessibility. There's so many different use cases. But one of the reasons why I think this category is going to become increasingly more and more important is there is a literal knowledge layoff coming, right?
Starting point is 00:22:45 So first, as we're unfortunately seeing, companies are laying off hundreds or thousands of workers because of AI at some of the bigger tech companies, and I do think that's going to unfortunately sprinkle into kind of corporate America throughout the second half of 2026 and into 2027. So I think companies are losing a lot of their institutional knowledge when they're laying people off, right? They're doing it.
Starting point is 00:23:10 Public companies are doing it because it helps their bottom line, right? And they can become leaner. And a lot of people don't know what the future AI jobs are going to look like. So they're like, well, we might as well get rid of the old jobs now, save up money. And so when we need those new AI jobs, we can go hire people and, well, we'll be leaner in the process. But you also have the silver tsunami, which I talk about a lot, right? You have more and more people who are retiring. And a lot of these more senior workers maybe have decades of institutional knowledge, subject matter expertise in their head. And I think that's another reason
Starting point is 00:23:46 why some of these platforms are becoming more and more important, right? Documenting this institutional knowledge, right? Doing Zoom meetings and just talking to senior people, right, that maybe are leaving in a couple of years, you know, trying to understand their thought process. And then using that as, well, your company's internal IP. I think these types of tools are extremely important because now it's also very common for, you know, it's starting more in the coding space, but it's more and more common for people to just be talking instead of typing, right? As large language models become better at understanding natural language, even for me,
Starting point is 00:24:23 I find myself, I'm a decent typist, right? I'm in front of the computer 10, 12 hours every single day. Large language models are very good now, I think, at understanding natural voice. And you can speak about four times as fast as you can type for the most part. Right. So this is another reason why this category is important. And then on the flip side, right, text to speech. I know this one is maybe more personal to me.
Starting point is 00:24:50 I hate reading. I hate reading huge lines of verbose AI generated text, right? Because that's where I spend the majority of my day, right? So everything I read for the most part is AI generated and extremely long, even with custom instructions and, you know, certain settings, right? So I think, you know, being able to use these text to speech tools, or, you know, they're baked into a lot of these, you know, ChatGPT and Gemini as an example. You can just click the read aloud button. It's a big category in and of itself. All right. Number five, AI image generation. So how this works, it's a little different.
Starting point is 00:25:26 So for the most part, most of the things that we've talked about so far in the first categories were all based off some of those original, you know, transformer models, right? The original research from Google, you know, you can say it's next token prediction, but like on steroids. AI image generation is a little bit different. So these are something called diffusion models. So essentially they start with random noise and they've been trained on a large data set and then it kind of refines it into coherent images. So you could kind of think of it like, you know, I don't know if you have that little petri dish of milk and you, you know, drop some black ink in it and it eventually diffuses into a photo, right? That's kind of what it does behind the scenes. I actually, like, I really enjoyed the older versions of like, you know, OpenAI's DALL-E when it was like really slow because you could actually kind of see, right, in some of the earlier Midjourney versions, right, when the images were really slow.
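The "start with random noise, then refine" loop described above can be sketched as a toy in plain Python. This is a deliberately simplified illustration, not how any production image model works: a real diffusion model uses a trained neural network to predict the noise in the image at each step, while this sketch cheats by using a known `target` as a stand-in for that learned prediction. The names `toy_denoise` and `error`, the four-pixel "image", and the step schedule are all assumptions made for the example.

```python
import random


def error(image, target):
    """Total absolute difference between the current image and the target."""
    return sum(abs(x - t) for x, t in zip(image, target))


def toy_denoise(target, steps=50, seed=0):
    """Start from pure noise and remove a little 'predicted noise' per step.

    Assumption: `target` stands in for what a trained model has learned;
    a real model never sees the answer, it predicts the noise instead.
    """
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # pure random noise
    history = [list(image)]                        # snapshot of every step
    for step in range(steps):
        remaining = steps - step
        # Stand-in for the network's output: how far each pixel is from clean.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove a fraction of the noise; the last step removes what is left.
        image = [x - n / remaining for x, n in zip(image, predicted_noise)]
        history.append(list(image))
    return image, history


target = [0.0, 0.5, 1.0, 0.5]  # a tiny 4-pixel "image"
final, history = toy_denoise(target)
for step in (0, 10, 25, 50):
    print(f"step {step:2d}: error = {error(history[step], target):.4f}")
```

Printing the error at a few steps mirrors what the slower early image generators made visible: the picture sharpening gradually instead of appearing all at once.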
Starting point is 00:26:23 You could understand the diffusion process because you could see it and watch it, right? One image might take a couple of minutes and it turned out absolutely terrible, right? Like back in the DALL-E 2 days. But then you could actually understand and see, okay, I kind of understand what this model is trying to do. All right. So some of the main players here, Midjourney was one of the OGs, even though I don't think that they're any longer like a top three name, but you have to kind of tip the cap to them. Then you have Flux, Ideogram.
Starting point is 00:26:51 And then as always, you have the big players, right, which is GPT Image 1.5 from OpenAI, Nano Banana Pro 2. And then also Microsoft has recently announced some pretty decent, right, some top five-esque AI image models. So who uses these? Well, I think anyone can, right? I think especially with using something like Nano Banana Pro that can make slides. All right, we're going to kind of get into that category,
Starting point is 00:27:21 a dedicated category, right? But you know, things like GPT Image 1.5 inside of ChatGPT, Nano Banana Pro inside of Google Gemini, being able to make infographics, things like that. So yes, I think people think that these are only for designers, marketers, content creators, but I think they can really be for anyone, right? Because we also have to understand, right? I say this as someone, I'll say mid-career, right? I think the younger generation doesn't really care about text, right?
Starting point is 00:27:51 They care about videos. And that's also the base for creating AI videos: both AI image generation, but also just more interactive graphics and just better visual elements. I think businesses should be starting to experiment, not just on social media, but on their website and on traditional marketing materials as well that are just, you know, blobs of thousands of words of text. This is a great kind of tool category to start exploring if you haven't already, because I think standalone image tools are losing ground, right, as now we have these AI image generators that are really sweeping the field. All right, category six, this is AI video generation.
Starting point is 00:28:35 So here's how it works. Well, it's kind of like the image generation, but then like times 60 for all the different, you know, frames per second, or 30 frames per second. So it's slightly similar technology, as they have these diffusion models, but they are also then extended into the time dimension. And then there's denoising frame sequences. The other thing with good video models is they are extremely expensive to run. So as an example, Sora, depending on when you're listening to this,
Starting point is 00:29:08 but so technically it was just last week, OpenAI shut down Sora. One of the reasons why, well, video generators are extremely expensive because they also understand the world, right? I do think that video generation and world model generation are going to start to blend, kind of like what we've seen out of Runway. Because the video generators, good ones, they also have to understand gravity. They have to understand, you know, shadows, reflections, you know, sun, right? Like they have to understand things.
Starting point is 00:29:42 You know, character consistency, right? Like if it's a video generation of, you know, someone walking toward a pencil on the ground, but from the pencil's point of view, right, the object is going to change shape, dimension, all those things in real time. So the video generation, and I think the gap that was closed in 2025, because at the beginning of 2025, AI video was not good at all, right? And then fast forward to the end of 2025, and AI video is actually, to the average eye at the end of 2025, indistinguishable for the most part, if it's done well.
Starting point is 00:30:21 And I think where we're at now in, you know, 2026 is even better, right? Now it's even people like me, right? I have a background in videography. It's getting harder for me to even realize. So some of the main players here, you have Google Veo 3.1. You have Runway Gen-4.5. You have Kling 3.0. You have Seedance, right?
Starting point is 00:30:39 Seedance has gone mega viral recently because they're kind of making Hollywood videos and kind of got in trouble. You have Pika 2.5. So who uses these? Obviously, if you are working in or around video, social media, I think these are great tools to use. And I think that video is one of the hottest and biggest growing categories of all of them.
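As a rough illustration of the "image generation, but times 60" point in the video category above, the sketch below counts how many frames a clip contains and how many raw pixel values that represents compared with a single image. The clip length and the 1280x720 resolution are assumptions picked for the example, not figures from the episode, and real video models typically work on compressed representations rather than raw pixels, so treat this as back-of-the-envelope arithmetic only.

```python
def frame_count(seconds: float, fps: int) -> int:
    """Number of frames in a clip of the given length."""
    return round(seconds * fps)


def values_to_denoise(frames: int, height: int, width: int, channels: int = 3) -> int:
    """Raw pixel values in a clip (a single image is the frames=1 case)."""
    return frames * height * width * channels


# Hypothetical example: a 10-second clip at 1280x720 (assumed numbers).
single_image = values_to_denoise(1, 720, 1280)
for fps in (30, 60):
    frames = frame_count(10, fps)
    clip = values_to_denoise(frames, 720, 1280)
    print(f"{fps} fps: {frames} frames, {clip // single_image}x the values of one image")
```

Even before accounting for keeping motion, lighting, and characters consistent between frames, the amount of data grows with every extra second, which is one plausible reason these models are costly to run.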
Starting point is 00:31:00 All right. Moving into the next category, a category that's actually not super competitive. And I'd say it's kind of one of the last defensible creative AI frontiers right now. And that is category seven, music generation. So this works a little bit differently than some of the other models. And I won't even get into the technical side because it technically uses like image generation to visualize what music looks like. But for the front-end user, I'm not too worried about how it technically works under the hood for this one. But you can essentially in text describe the genre, the mood, the lyrics, and then you can receive a
Starting point is 00:31:42 fully mixed song in seconds. So some of the main players here, well, you have Suno v5, which just came out. Yes, I know, depending on when you're listening to this, I don't know, maybe we're on Suno v7 by now. But right now, at the time of this recording, Suno v5.5. You have Udio, you have ElevenLabs now. So ElevenLabs was originally a big text-to-speech player.
Starting point is 00:32:03 Now they're in the AI music space as well. Google, yeah, like I said, Google is in almost every category, because they're a leader in almost every single category. Their new Lyria 3 Pro, actually really good, being able to create three-minute songs, and you don't even need a separate subscription. So there's a lot. So, you know, who uses this?
Starting point is 00:32:22 Again, content creators, anyone that's looking for, you know, music to accompany a video, anyone needing royalty-free audio. And I think that, again, it's actually a shocking amount of money that these companies are making. I think the last I saw that Suno was at something like $300 million in revenue. So yeah, these companies, even if you haven't heard of them, they are really big. They are really good. And the cool thing, right, because I've always been kind of impressed. I've used Suno since the very early days.
Starting point is 00:32:54 And they're the biggest player in this space by far. We had their CEO on the show, you know, many years ago. I say many years ago, it feels like 10, but it was like two and a half. They've been the platform that's been on fire. But you can literally describe anything. You can pull out, you know, just the guitar. And you can say, hey, just change the guitar on this, but keep everything else the same. Now you can experiment with your own voice, right?
Starting point is 00:33:21 So there's so many cool things that you can do. So, you know, maybe as you're understanding these categories a little bit more, maybe there's some more use cases for your company that you maybe didn't think of before. All right. Let's get to category eight. And that is design and visual content. This, I know, I could have technically broken this one down into like three or four separate categories.
Starting point is 00:33:41 But instead, I decided to keep it a big parent category because I do think that, as an example, with NotebookLM and, you know, we're going to see a lot of this in Google with Nano Banana, the lines on this are really going to start to blur. Like as an example, if you look at Nano Banana, yes, originally it was, okay, it's for photos and infographics, right? But then all of a sudden it's doing complete slide decks. And then it's the base for video slideshows, right? But so this category, it is AI that combines content generation, layout design, and imagery
Starting point is 00:34:20 into finished artifacts, whatever that artifact may be. So some of the big players here were, you know, kind of earlier in the game. And they're doing things like decks, right? So obviously, I think you now have Microsoft Copilot, which is pretty good at creating decks inside PowerPoint. Actually, I was surprised how good the new task feature in Microsoft Copilot was at creating decks. So even outside of the tools that we talked about, you also have to understand.
Starting point is 00:34:53 I'm not mentioning ChatGPT and Claude for each of these because they can all create slides. Right. But I think this is a little bit different because Gamma, you know, Gamma is one of the big players in this space. You know, very unique tool that can create, you know, a 20-page slide deck, can create, you know, websites. It can create anything visually. And I think it does a really good job.
Starting point is 00:35:17 Similarly, you know, Canva's AI Magic Suite falls in that category, Beautiful.ai. But then you have a whole other kind of genre of, you know, AI design and visual content, which would be AI avatars or digital twins, right? So you can upload your, you know, they're actually fairly easy to make, easier than they were like two or three years ago, with tools like Synthesia, HeyGen, etc. Then you also have other platforms that are, you know,
Starting point is 00:35:46 visual content such as Google Stitch. Right. So it helps you and, you know, Figma AI, right? There's so many, so many tools that could fall under here.
Starting point is 00:35:56 But, you know, there's just tools that you can type in a sentence and it will create an entire app layout. And then you can export that to, well, something in category nine. All right.
Starting point is 00:36:07 As we move on to category nine, and that is vibe coding app builders. So here's how it works. Again, natural language. Are you seeing the trend here, y'all? For the most part, natural language is a very important skill set, right? Because for the most part, the starting point for all of our categories is just natural language. So with vibe coding, right, if you slept through 2025 and missed vibe coding, I think, being the word of the year for like dictionary.com or something like that, you know, you describe an
Starting point is 00:36:37 app in plain English and it can get deployed with a live URL. So again, between categories nine and ten, the, you know, vibe coding app builders and the AI coding copilots, there's a lot of crossover, similar capabilities, right? Google AI Studio could be in either. I think the main players that are specifically in the vibe coding app builders would be like Lovable, Replit, Bolt, v0 by Vercel, Base44, right? There's a lot of ones that are really just
Starting point is 00:37:07 made to be a complete infrastructure for an entire app, right? And I'm going to talk a little bit about why I even separated vibe coding app builders from AI coding copilots because, yes, they do have a lot of similarities, but they're definitely different. Right. So I'd say people using this are more of the non-technical people. So, you know, non-technical founders, entrepreneurs, marketers who are building tools or anyone with an app idea, right?
Starting point is 00:37:36 They are great platforms. I think earlier on in the, you know, 2024, early 2025, I wasn't a big fan of these tools, to be honest, right? But now they're much better, right? Now that they have complete, you know, front end, back end, auth, user management. Whereas before, you know, you could make apps and they were kind of disposable, right? Very similar to what you could make inside ChatGPT, Claude, Gemini.
Starting point is 00:38:03 Right? But now they have turned into kind of fully functioning tools. So I think this is one of the most heavily funded categories of 2026. And it's rewriting who builds software and how it's used. Yeah, like literally, if you look at the US stock market, traditional software has gotten squashed by category nine and category 10. So great segue to category 10, which is our AI coding copilots and agents.
Starting point is 00:38:32 So here's how it works. Well, AI can understand your entire code base. It can write code, test it, and ship new features. But it is not just coding. It is also any work that can be done on your desktop, right? Because there's this magical thing that, you know, most people, unless you're a dork like me, I've been playing around with different terminal tools for 20 years, right? But computers themselves are controlled by something called a terminal, right? So you can run any command on a computer, including accessing local files on your desktop, creating files, running code, all of those things.
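To make the terminal point above a bit more concrete, here is a minimal sketch of the three things mentioned: creating a file, accessing local files, and running code. It is only an illustration under stated assumptions. The helper names `run` and `demo` and the file `hello.py` are made up for this example, everything happens in a throwaway temporary folder, and it says nothing about how any particular coding agent is actually built.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run(command: list[str], cwd: Path) -> str:
    """Run one command in a working directory and return its stdout."""
    result = subprocess.run(
        command, cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def demo() -> tuple[list[str], str]:
    """Create a file, list the folder, and run the file, all in a temp folder."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        # Creating files: write a tiny script to disk.
        script = workdir / "hello.py"
        script.write_text('print("hello from a local file")\n')
        # Accessing local files: see what is in the folder.
        files = sorted(p.name for p in workdir.iterdir())
        # Running code: execute the script with the current Python interpreter.
        output = run([sys.executable, str(script)], cwd=workdir)
        return files, output


if __name__ == "__main__":
    files, output = demo()
    print(files, output)
```

The same three moves are what give a terminal-based tool reach beyond a chat window: anything that can be expressed as a command can, in principle, be scripted.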
Starting point is 00:39:09 So that's why you kind of have a little difference in the kind of vibe coding tools that are ultimately about like, okay, you're creating a piece of software, where AI coding is a little different. They can create software. And yeah, you can vibe code in all of these tools. I do all the time, because the main players here: you have Cursor, Claude Code, GitHub Copilot, Windsurf, Codex from OpenAI, Antigravity from Google, right?
Starting point is 00:39:34 But I think that we're starting to see these main players also dip their toes into non-technical knowledge work as well and just becoming agents in and of themselves. So who uses this? Well, professional software developers, engineering teams at every level. But like I said, I think with Claude Code and Claude Cowork, which I think I have on the next one here, and I think Codex as well, I think more and more. And I know you've heard me say this a lot. And if you are not a technical person, if you're not a software engineer, if you're not a dev, you don't fancy yourself any of those things.
Starting point is 00:40:12 You need to be using Claude Code. You need to be using Claude Cowork. You need to be using Codex. You should probably start using Antigravity, right? Because as we start talking about the future agentic layer, which right now is, well, computer-using agents, but I think that's not the ultimate or the end layer. But I think you really have to understand these tools, right? And we're going to get to kind of my takeaway advice on that here in a minute.
Starting point is 00:40:36 But every single coding tool right now is racing toward fully autonomous agent-driven development, which is why I think if you understand AI tools, right, yes, there's great agentic capabilities that can happen in front-end AI chatbots. But you don't have every single program, every single file, every single app that you use. You can't connect those, right? A lot of those maybe live on your local machine, which is why I think it's important to pay attention to the AI coding copilots and agents category. It is important. And then last, but definitely not least.
Starting point is 00:41:14 And yeah, this one could literally be seven different categories. I've actually broken down AI agents into the seven different types of agents. But for ease of today's episode and understanding all the different AI software, not just agentic software, we're going to say category 11 is autonomous AI agents and agentic browsers. So this is how it works. Well, you state a goal and the AI plans the steps, it uses real tools, and it completes the work. So there's a lot of main players, all right?
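The "state a goal, plan the steps, use tools, complete the work" loop described above can be sketched in a few lines. This is a toy under stated assumptions: the planner is hard-coded rather than a model, the two tools (`search_notes`, `draft_summary`) and the note text are made up for illustration, and none of this reflects how the products named in this episode are implemented.

```python
from typing import Callable


# Hypothetical "tools": plain functions the agent is allowed to call.
def search_notes(query: str) -> str:
    notes = {"team offsite": "The offsite is planned for the second week of May."}
    return notes.get(query, "no notes found")


def draft_summary(text: str) -> str:
    return f"Summary: {text}"


TOOLS: dict[str, Callable[[str], str]] = {
    "search_notes": search_notes,
    "draft_summary": draft_summary,
}


def plan(goal: str) -> list[str]:
    """Stand-in planner. A real agent would ask a model to produce the steps."""
    return ["search_notes", "draft_summary"]


def run_agent(goal: str) -> str:
    """Plan the steps, call each tool in order, and return the final result."""
    result = goal
    for tool_name in plan(goal):
        result = TOOLS[tool_name](result)  # each step's output feeds the next step
    return result


print(run_agent("team offsite"))
```

The part that differs across real products is what sits behind `plan` and `TOOLS`: a browser, a desktop, a code sandbox, or a set of connected apps.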
Starting point is 00:41:43 And we already mentioned some that you might think were in this, like Replit, you know, maybe Claude Code. But I'm saying for the most part, these are more general agents or non-technical, non-dev type agents. So players like Manus, Genspark, obviously OpenClaw definitely fits in this. ChatGPT's Atlas browser, Perplexity's Comet browser.
Starting point is 00:42:07 Their Perplexity Computer, Personal Computer, Claude Cowork, especially their new computer use that can click, use a mouse, can open different files, different programs on your computer,
Starting point is 00:42:20 which makes it really unique. You have Microsoft's new Cowork. You know, you have Claude Dispatch. There's so many, right? And I think I would even probably throw code into this platform as well, or in this category. But this is, I'd say, for more technically proficient knowledge workers, right, who are looking to delegate as many of their time-consuming mundane tasks as possible.
Starting point is 00:42:47 All right. And well, I think the main reason why this matters is the shift is happening right now, right? Where AI is no longer just something that you ask a question to and it works reactively, right? Now agents are working proactively on schedules, right? Like if you're an old school, you know, technical person, right? You've heard about, you know, cron jobs and, you know, RPA, robotic process automation, right? Now that's what AI agents are doing, right?
Starting point is 00:43:16 They're running computers on schedules and they can do anything that us humans can do if you're using the right tool at the right time for the right purposes. All right. So those are the 11 categories. But I want to leave you with this. I want to leave you with some advice. And we talked about, you know, why if you're using only one of the categories, you're probably failing and, you know, a framework to make sense of every tool and how to kind of build the right AI stack. Right. So here's some pieces of advice as we wrap up. So Gemini is an absolute beast. Right. We've actually seen in the recent weeks
Starting point is 00:43:54 OpenAI intentionally stepping back and pulling some of their modalities off the table. Right. So as an example, OpenAI, you know, kind of killed Sora, a very popular, you know, AI video generator. And I wouldn't be surprised if OpenAI maybe slows down progress on some of their others to really just start focusing on more profitable sectors within the enterprise and just focusing more on their core models. But Gemini is an absolute beast. It does, I'm pretty sure, as I'm looking at all my different categories here. Yeah, it does every single one. I'm just double checking. Yeah, image generation, AI video generation, AI music generation, design and visual. Yeah, it does everything in here. Right. So you can't talk about all these different categories of AI and not absolutely say that Google is dominating this space, right? Anthropic, great models. Claude Cowork, Claude Code, some of the most widely used tools over the last three to six months. But Gemini is crushing it across the board in terms of not just multimodality,
Starting point is 00:45:09 but in terms of having a very powerful tool in every single category. All right. The other thing: ChatGPT is winning the users race, right? Because we did an earlier show on who's winning the AI race. But I think it's important to know as well as you're choosing, oh, you know, what should I be using for AI images? Should I be using, you know, ChatGPT or something else? Well, like we said, you know, OpenAI seems to be cutting back and focusing on some of their services,
Starting point is 00:45:37 but there's a lot of other great tools that can lead the way. But still, ChatGPT is absolutely dominating users across the board. And I think understanding the players, even at a base level like I just went over today, is extremely important because I think that you have to understand where the platform does enough versus where the specialist wins, right? Because what can happen is, well, you can start drowning in tools because you can look at all these tools and say, oh, wow, I have all these capabilities now that I didn't have before.
Starting point is 00:46:12 And depending on what your job is, maybe that's part of your job, right? But I think some of the best things that the average everyday knowledge worker can do is say no to certain AI tools that would give them new capabilities that they didn't have before, unless that is a nagging requirement, right? And it's hard to stay away from that, right? But I think you have to stick to where your specialty lies and what category of tools are going to help you expedite that. So here's some advice.
Starting point is 00:46:47 You don't, when putting together your ideal, you know, tech stack of the right categories. You don't need every single, you know, oh, I'm going to pick one tool from all these categories. Don't do that like I started with, right? Like I started the show by saying, you shouldn't use as many AI tools as I do. You shouldn't be using for the most part, right, unless you're a one person company, solopreneur, entrepreneur, you know, maybe you're a jack of all trades marketer that has to do way too much. But for the most part, you shouldn't be using one tool from all 11 categories. You know, maybe if you want to learn the basics, you know, set aside a weekend, you know, spend an hour in each category, sure. But you
Starting point is 00:47:25 should just be choosing two to three categories that most align with your daily work. But I will tell you this. You need to be extremely proficient in, well, technically categories one, two, and three, because they're all large language model related. And you at least need to have one tool that you are very good at in category 10, which is the coding copilot, and category 11, which is the general super agents. Because when I talk about this shift from humans doing the work to agents doing the work.
Starting point is 00:47:57 I think there's going to be a period of probably, I don't know, 18 to 36 months, where if you can commit to those categories or your department or your company, you are going to have an unfair head start on everyone else. I think in 18 to 36 months, these things are going to become the de facto way to work. But we have superhuman AI capabilities that are here now. And it's only a matter of time until that just, well, everyone learns to work that way. But they're here. They're live.
Starting point is 00:48:31 They don't require, you know, 30 pounds of duct tape. They're there. All right. So, you know, as an example, stick to the tool categories that make the most sense. So, you know, a marketer might pair AI search with design tools. An entrepreneur might, you know, pair vibe coding with agents, right? Really stick to the lanes that align with your expertise and what your outputs need to be. But I think the real divide this year is going to be between those who build a real AI stack
Starting point is 00:49:00 versus those who just use one tool casually versus those who get distracted by shiny object AI syndrome. So you can either be the one that just sticks in one category, which is probably not enough. You could be the distracted one, the learner that sticks in every single category, but you're not excelling in any, or you've got to find that middle spot, that sweet spot, right? Knowing the three to four categories that I mentioned that I think are essential for any knowledge worker and then finding the right tool or maybe one or two additional categories in there that really align with your day-to-day work.
Starting point is 00:49:39 And understanding the full map is where your advantage comes in because I think most people are just using one AI category and completely missing the others that are around there. If you're kind of new-ish to AI, after listening to this, your mind might be completely blown. You're like, oh, I didn't know that Google had a tool that could do this, or I didn't know, you know, Suno could create all this music with your voice. I didn't know Gamma could, you know, take a blog post URL and create a deck out of it with your own image assets. Yes, right, they can do all these things. But you have to ignore shiny object AI syndrome because I think, for the most part, almost any AI product in the world can fit under one of those 11 parent
Starting point is 00:50:22 umbrellas. So pick your categories wisely, the ones that you need to compete in, choose your AI operating system or your AI tool of choice, and try as much as you can to ignore the others because you need to pick the categories that are going to move your work forward starting today. All right, I hope this is helpful. I know this is kind of a longer one, but I think it was an important show that we had to do because I get these questions all the time. What tools should I be using? Oh my gosh. Like what are AI's capabilities? Well, there you go. Obviously, you know, when it comes to, you know, April, May, June of 2026, I think capabilities are going to change. Maybe these categories are going to change. But right now, if you're listening to
Starting point is 00:51:01 this at least in the first half of 2026, you have a great advantage. You understand all of these AI tools, where they fit in the landscape. Now it's up to you to do something about it. I hope this was helpful. If so, please, if you're listening on the podcast, please make sure to subscribe to the show, leave us a rating if this was helpful. And then make sure you go to StartHereSeries.com. That is going to give you free access to our inner circle community. It's literally not listed anywhere. You can't find this anywhere on the internet except, well, unless I email it to you, but
Starting point is 00:51:32 otherwise by going to StartHereSeries.com. Make sure to go join and check out all of the different episodes in the series that you can go listen to or watch for free now. So thank you for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thanks y'all.
Starting point is 00:52:05 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating.
Starting point is 00:52:39 It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
