Everyday AI Podcast – An AI and ChatGPT Podcast - EP 506: How Distributed Computing is Unlocking Affordable AI at Scale

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. When ChatGPT first came out, no one was talking about compute, right?

Starting point is 00:00:52 But over the last few years, as generative AI and large language models have become more prevalent, the concept of GPUs and compute has become almost like, you know, dinner-time conversation, at least if you're, you know, crowding around the dinner table with a bunch of dorks like myself, right? But I think even more so in the last few months, you know, as we've seen closed, or sorry, as we've seen open source models really close the gap with proprietary and closed models, I think this concept of compute is even more important, because now, all of a sudden, you have a lot of, you know, probably millions of companies throughout the world, medium-sized companies that maybe weren't concerned or, you know,

Starting point is 00:01:39 weren't really paying attention to having their own compute, maybe two years ago. Now, all of a sudden, it might be a big priority because of the new possibilities that very capable large language models, and even smaller and open source models, all these capabilities they're giving to so many people. So that's one of the things we're going to be talking about today

Starting point is 00:01:59 and also how distributed computing is unlocking affordable AI at scale. All right, I'm excited for this conversation. Hope you are too. What's going on, y'all? My name's Jordan Wilson, and this is Everyday AI. So this is your daily live stream podcast and free daily newsletter, helping us all not just keep up with what's happening in the world of AI,

Starting point is 00:02:19 but how we can use it to get ahead to grow our companies and our careers. If that's exactly what you're doing, you're exactly in the right place. It starts here. This is where we learn from industry experts. We catch up with trends. But then the way you leverage this all is by going on our website. So go to youreverydayai.com. So there you'll sign up for our free daily newsletter.

Starting point is 00:02:41 We will be recapping the main points of today's conversation, as well as keeping you up to date with all of the other important AI news that matters for you to be the smartest person in AI at your company. All right. So enough chit chat, y'all. I'm excited for today's conversation. If you came in here to hear the AI news, technically we got a pre-recorded one debuting it live.

Starting point is 00:03:01 So we are going to have that AI news in the newsletter. So make sure you go check that out. All right, cool. I'm excited to chat a little bit about computing and how it's changing and making AI affordable at scale. So please help me welcome to the show. We have Tom Curry, the CEO and co-founder of Distribute AI. Tom, thank you so much for joining the Everyday AI show. Thanks for having me. Appreciate it.

Starting point is 00:03:26 Yeah, cool. So before we get into this conversation, which, hey, for you compute dorks, this is right up your alley. But for everyone else, Tom, tell us, what does Distribute AI do? Yeah, so we're a distributed AI app layer. What that really means is we're basically going around and capturing compute. It could be your computer. It could be anyone's computer around the world. And we're basically leveraging that to create more affordable options for consumers, you know, businesses, things like that, mid-level businesses. And really, the goal is actually to create kind of a more open and accessible AI ecosystem. We want a lot

Starting point is 00:03:59 more people to be able to contribute, be able to leverage kind of the resources that we aggregate. It's a pretty cool product. Cool. So, you know, give us, give us an example. So, you know, kind of even in my hypothetical I just talked about, let's say there's, there's a medium-sized business, right? And maybe they haven't been big in the data game. Maybe they don't have their own servers and, you know, they're trying to figure it out. So what is kind of that problem that you all solve? Yeah, so it's a two-sided solution. It's a great example, right? You go to a business and they have, say, a bunch of computers sitting around in their offices. At night, they can connect into our network very quickly. We have a very quick, one-click program to install.

Starting point is 00:04:37 They can run that at night and provide compute to the network. And then when they wake up the next day and they want to leverage some of the AI models that we run, they can quickly tap into our APIs and basically get access to all those models that we run on the network. So kind of two-sided, right? You can provide on one side and you can also use it on the other side. Very cool. All right. So let's get caught up a little bit with, you know, current day because, like I talked about, right, I don't think, you know, compute and GPUs were at the top of, you know, most people's minds, you know, especially when, you know, the GPT technology came out in 2020, let alone in, you know, late 2022 when ChatGPT was released. So why is compute now just like one of the leading, I mean, we're talking about national security. We're talking about 100

Starting point is 00:05:24 billion dollar infrastructure projects. Like, why is compute now this huge term when it comes to just the U.S. economy at large? Yeah, totally. So, I mean, five years ago, if you go back, right, gaming was the biggest use case for GPUs. Nowadays, it's all AI. Right. That's why there's huge demand for it. These models are getting bigger. In some cases, they're also getting smaller. Chain of thought uses a ton of different tokens. So although the models are smaller, they still use a ton of resources. The reality is that silicon, as it stands today, and one of our team members actually works on chips a little bit, we're basically reaching the peak capacity of what we can do with chips, right? We're definitely stretching thin the current

Starting point is 00:06:03 technology that we have for chips. So although the models keep getting better, bigger, larger, with more compute demand, the reality is that the technology is just not able to keep up. We're about 10 years out, give or take, from actually having basically a new technology for chips. Sure. And, and, you know, as we talk about current demand today, right? You know, you always see all these, you know, jokes online, you know, people are like, you know, will work for compute, right? And the big, the big tech companies, you know, OpenAI, right? When, like, whenever they roll out a new feature, you know, a lot of times they're like, hey, our GPUs are melting. We're going to have to pause new user

Starting point is 00:06:44 signups. You know, why is it that even the biggest tech companies can't keep up with this demand? Yeah, I mean, it's a crazy system. Anthropic has the same issue, right? Claude tokens are still kind of limited to this degree. We're getting to the point where you're basically stretching the power grid, you're stretching every resource that we have in the world to run these different models.

Starting point is 00:07:08 At the end of the day, you know, OpenAI, I think they use primarily Nvidia for their data centers. But once again, Nvidia has demand all over the world for these chips. So they can't allocate all of their resources only to OpenAI. So OpenAI has a certain threshold that they rent and use. But the reality is there's just too much demand.

Starting point is 00:07:27 You're talking about millions and millions of requests. And the requests, for example, like image generation, these aren't like one-second returns, right? You're talking about 10, 20 seconds to actually return these. And video models are even worse. You're talking about minutes potentially, even on H100s or H200s. So the reality is, like I said, our compute

Starting point is 00:07:46 power cannot possibly keep up with demand. And we don't have enough of the latest-gen chips. So, you know, one thing, and you know, you kind of mentioned it, I think at the same time, we're seeing models become exponentially smaller and more powerful, right? Like as an example, OpenAI's GPT-4o mini, yet then you have these monster models like GPT-4.5, right, which is reportedly like five to 10 times larger than GPT-4, which was, I think, like a two trillion parameter model. So walk us through like this, like the whole concept of models both getting, you know, technically smaller and more efficient, yet models also at the same time getting bigger.

Starting point is 00:08:35 And then how does that impact, right, the industry as a whole, because it seems like it's hard to keep up with. Yeah. On one end, it kind of reminds me of like cell phones back in the day, right, where we would progressively get them smaller and then eventually we had a new feature and they'd get bigger and then kind of get smaller again. The reality is that a year ago, with larger models, we were basically just throwing a million different data points into these models, which made the model much larger, and they were relatively good.

Starting point is 00:09:02 But the reality is that no one wants to run a 70 billion, you know, 700 billion parameter model, right? So we've gotten them smaller. Now they're kind of working with the intricacies of how we're actually running these models. So chain of thought basically enables you to give a better prompt, right? It basically takes a human prompt, turns it into something the system can read better, and then gives you a better output. And it also might run through a bunch of tokens to give you a

Starting point is 00:09:27 better output. So chain of thought is a really cool way to basically reduce the model size. But the reality is that although we're cutting the model size so we can put it on a smaller chip, you're still using a million tokens, which doesn't really actually help our compute issues. It's kind of a, it's kind of backwards. It's like, yeah, it's, it is interesting, right? So yeah, you know, even now we have these, you know, newer hybrid models in Claude 3.7 Sonnet and Gemini 2.5 Pro. And, you know, you use them and they seem relatively fast. And, you know, if you don't know any better, you might say, okay, this seems sufficient.

Starting point is 00:10:02 But then if you look at the chain of thought or if you click like show thinking, you're like, my gosh, it just spit out 10,000 words to tell me, you know, what's the capital of Illinois or something like that, right? So, you know, as models get smaller, you know, this is something I'm always interested in. You know, might we see a future where, you know, those hybrid models or the, you know, reasoning models, will they eventually become more efficient? Or is that always going to be something, you know, kind of like on one side, models get smaller, but they're getting smarter. And so they're going to have to just think more regardless. Yeah, to your question, I think that we'll get to the point where they're highly efficient. I mean, realistically, the gains we've made with even DeepSeek are just incredible, right?
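
Editor's sketch: the trade-off described here (a smaller model that emits far more reasoning tokens) can be made concrete with a back-of-the-envelope estimate. The snippet below assumes the common rule of thumb that a dense transformer spends roughly 2 FLOPs per parameter per generated token. The parameter and token counts are illustrative values chosen for the example, not measurements of any model named in the conversation, and the estimate ignores attention over long contexts, batching, and hardware efficiency.

```python
# Rough, illustrative estimate only. Assumes ~2 FLOPs per parameter per
# generated token for a dense transformer (a common rule of thumb).
# The model sizes and token counts are made-up example values.

def generation_flops(params: float, tokens: int) -> float:
    """Approximate FLOPs needed to generate `tokens` tokens."""
    return 2.0 * params * tokens

# A small model that "thinks" for 10,000 tokens before answering.
small_reasoning = generation_flops(params=7e9, tokens=10_000)

# A model ten times larger that answers directly in 200 tokens.
large_direct = generation_flops(params=70e9, tokens=200)

ratio = small_reasoning / large_direct

print(f"small model, long reasoning: {small_reasoning:.2e} FLOPs")
print(f"large model, short answer:   {large_direct:.2e} FLOPs")
print(f"ratio: {ratio:.1f}x")
```

Under these assumed numbers, the smaller model ends up doing about five times the arithmetic per answer, which is the sense in which shrinking the model does not automatically shrink compute demand.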

Starting point is 00:10:49 Even their 7 billion parameter model, which is relatively small, you can run on most consumer-grade chips. It's extremely good. The prompting is great. It obviously has a pretty good knowledge base. And once you really combine that with the ability to surf the internet and actually get more answers and use more data, that's where I think we'll get to, I wouldn't call it AGI, but we're very close to that, where basically you're adding in real-time data with the ability to kind of reason a lot more. So I do think we'll get there. I think the progress that we've made, although it seems like it's been forever since kind of the first models came out, the progress was insane and extremely quick. Yeah, I'm confident. Yeah. And you know, speaking of DeepSeek, I know it's been, you know,

Starting point is 00:11:31 all the rage to talk about DeepSeek over the last, you know, the last couple of months. But, I mean, I think you also have to call out Google, right, with their Gemma 3 model, which I believe is a 27 billion parameter, you know, greatly outperformed DeepSeek V3, which is, I think, 600 plus billion parameter, at least when it comes to Elo scores, and it's not even close, right? So what does this say about the future, right? I know I kind of named, you know, two open models there. You know, they're getting even, even the open, right, everyone's like, oh, DeepSeek is, you know, changing the industry. Well, I'm like, yo, look at Gemma 3 from Google. It is 5% the size and way more powerful when it comes to human preference, right? So what does this even mean for the future of edge

Starting point is 00:12:18 computing? And how does edge computing impact, you know, compute need or, you know, GPU demand? Yeah, well, when we started this business, the reality was that although we wanted to convince ourselves that open source models were good, we were biased, right? Open source models were relatively bad. You know, OpenAI was extremely dominant at that time. It was, it was like, you couldn't even believe that anyone would ever catch up to OpenAI. Nowadays, we're probably running at like a one to two month lag between parity of private source, you know, and open source models, which is really interesting. And when you tie that in with the idea of kind of data privacy and things like that, I think there is a huge argument for basically edge compute taking

Starting point is 00:13:02 over a lot of the smaller daily tasks and then reserving some of the more private models and things like that and the larger models for things that might be a little bit more deep, like research and things like that. But a lot of things that you do on a daily basis that AI can actually improve, I think you can run purely on edge compute and basically have your house and your couple computers and things like that, maybe your laptop or iPad, basically turn into this little tiny data center that allows you to run whatever model you want to run at that time. We're not really far away.
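
Editor's sketch: whether a model can run on an edge device comes down, to a first approximation, to whether its weights fit in memory. The snippet below is a minimal estimate that multiplies parameter count by bytes per parameter. The precision table, the 16 GB device, and the 80% headroom factor are assumptions made for illustration, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Weights-only memory estimate. Assumed values: the precision table,
# the 16 GB device, and the 0.8 headroom factor are illustrative.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params: float, precision: str) -> float:
    """Approximate memory needed to hold the model weights, in GB."""
    return params * BYTES_PER_PARAM[precision] / 1e9

def fits_on_device(params: float, precision: str,
                   device_memory_gb: float, headroom: float = 0.8) -> bool:
    """True if the weights fit within a fraction of device memory."""
    return weight_memory_gb(params, precision) <= device_memory_gb * headroom

for params, label in [(7e9, "7B"), (70e9, "70B")]:
    for precision in ("fp16", "int4"):
        size = weight_memory_gb(params, precision)
        ok = fits_on_device(params, precision, device_memory_gb=16)
        print(f"{label} @ {precision}: {size:.1f} GB, fits in 16 GB: {ok}")
```

With these assumptions, a 7 billion parameter model at 4-bit precision needs about 3.5 GB for its weights, which is why models of that size are plausible on consumer hardware, while a 70 billion parameter model is not without more memory or more devices.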

Starting point is 00:13:31 The reality is you can do that today, right? We could probably enable that in a week. The only problem is teaching people to basically use that and set it up, right? It takes time for people to learn how to, oh, install your own model and start running. So it's more of like the, uh, the UX of it more than anything. Yeah. You know, and that, you know, I always think, right, I always think, uh, with these models, uh,

Starting point is 00:13:55 becoming smaller, more capable, uh, you know, is, is, will most things be edge in the future, right? Like, you know, I even saw the, you know, Nvidia DGX, right, formerly called Digits. You know, I did the math on that. I'm like, that would have cost, five years ago, I think like $70,000. It wasn't even capable to do it anyways, right? Like, are we going to have the average, you know,

Starting point is 00:14:24 smartphone in five years? Will it be able to run a state-of-the-art large language model? And if so, like, how does that change the whole cloud computing conversation? It will be really interesting. I think you're 100% right. And I think five years might even be a stretch. I think what it will come down to, like I said, is privacy. If people are really worried about their privacy,

Starting point is 00:14:44 then I think that people will push for edge compute to be running and you'll be able to run your own model that only uses access to your own data on your phone, device, whatever it is, right? If people don't care about that as much, it might take a little bit longer just because people won't build that. But I really do think that there are some teams that are building in that angle where essentially you're going to have your little database of information about yourself and your life and your wife and whatever else.

Starting point is 00:15:08 And essentially, you'll be able to run all that stuff without ever touching any centralized model for obvious reasons, privacy reasons, things like that. We already give so much data to big tech, right? I think we're good on giving any more and sharing any more intimate details about our lives. It'll be a good thing if we can do that. Yeah. And, you know, even as we start looking, you know, at this race, which, you know, if you looked at it two years ago, you know, I don't know if anyone, even the, the staunchest, you know,

Starting point is 00:15:37 open source believers would, would believe that we're at the point that we are now. But, you know, between whatever we're going to see from Meta in their next Llama model, I've already talked, you know, we've already talked about DeepSeek and, you know, Gemma as well. And, you know, OpenAI also has recently said that they're going to be releasing an open model. Supposedly. Yeah, yeah, yeah. We'll see what happens. We don't buy any of that. Yeah, I remember the GPT-2 open fiasco, right? But regardless, I mean, what happens when and if open models are more powerful than closed and proprietary models? So number one, what happens from, you know, kind of a, you know, GPU and compute perspective?

Starting point is 00:16:24 But then how does that change, you know, the business leader's mindset as well? Yeah. So at that point, once things become commoditized, right, and the models are essentially all on the same level, give or take a little bit of variation, the reality is that compute becomes the last denominator of basically being able to offer those models at the cheapest cost, right? So at that point, it basically comes down to a race to the bottom in terms of who can get the cheapest compute and offer it to people with the best selection of models. And UX and UI, all that kind of stuff, right, marketing, things like that. Assuming that that does happen, the question then comes down to what happens in solely private circumstances, right?

Starting point is 00:17:04 My personal view on it is that there is probably a world where essentially OpenAI and Anthropic eventually burn so much money, and they lose money every day already, that they don't get to the point where they're looking to get to. And essentially, they have to just either change business models or run out of money, right? I think that's probably a little bit of a contentious point. But the reality is that right now we're running models that are very close to as good as what they have. And it's like, at what point is the marginal gain not worth it, right? When H100s become a lot cheaper, we'll be able to run some of the biggest models very quickly and easily.

Starting point is 00:17:45 And the access will just be so good that it might not matter. The problem is that, I mean, I personally, I've always believed in private source. I do believe that there's great use cases for it. And the reality is, whether you love Sam Altman or hate Sam Altman, he's pushed things forward a lot, right? He's been really productive for the entire environment. So you don't want them to go bankrupt, but I think they might just have to figure out a way to appeal to consumers or businesses in a different way as opposed to just general models, which is what I do like, I think, in what they talked about with Siri and things like that. They'll probably figure out a way over time. So speaking of affordable AI, and you just brought up as well, you know, companies like OpenAI and Anthropic, right?

Starting point is 00:18:33 Their burning of cash is well documented. You know, but I mean, does this at a certain point, if large language models become commoditized because of open source models, is it just more of the kind of the application layer that becomes the thing, you know, these companies' real differentiator, right? Because aside from, you know, OpenAI's $200 a month, you know, Pro subscription,

Starting point is 00:18:59 it's like, okay, which they also said they're losing money on. Like, aside from that, you know, how else are these big companies that so many people rely on going to continue to exist five, 10 years after their, you know, $40 billion of funding, you know, might run out, if they're not profitable at some point? We've been saying this about Uber for how many years now, though, to be fair. These companies can exist a long time without being profitable. But I think the reality is that the one thing that the centralized type of providers

Starting point is 00:19:32 offer, like OpenAI, is that they're able to work with a lot of data that would be very sensitive, primarily like health data and things like that. So I'm sure there's a lot of very good business use cases that they can provide to very large enterprise consumers, or not on the same business. And I don't really know what those are outside of like health and things like that, that data that's very private, you know, the government contracts and things like that.

Starting point is 00:19:56 Those models are super useful for that. But it will be tough. I mean, it would really, I mean, I feel like we're almost there already, to be honest with you. Like I said,

Starting point is 00:20:07 I don't think we're that far away from the point where people are like, why don't I just cancel OpenAI and use Llama? Like, let me go cancel and use Gemma. You know, all these different models that are out there. There's so many good ones at this point.

Starting point is 00:20:22 But it might be more integrations. It might be more, like I said, UI, UX. It might be the fact that at the end of the day, we use iPhones every day and Androids, and maybe they just put a true monopoly on being able to use them, you know. So we'll see. Yeah, it's interesting. So, you know, we've covered a lot in today's conversation, Tom, on this concept of distributed computing

Starting point is 00:20:45 and, you know, how it's, you know, the race between, you know, open source AI and closed AI is really changing, you know, the compute landscape and just the AI landscape as a whole. But, you know, as we wrap up today's show, what's the one most important or the best piece of advice that you have for business leaders when it comes to making decisions, right, about how they are using AI at scale? Yeah, that's a great question. I think the best advice, the thing that we've learned the most from our personal business that I can provide, is that the landscape changes so fast. The last thing you want to do is lock yourself into one specific provider or model. Don't allocate too many resources and sell the house on one specific setup, because the next week something comes out and totally breaks everything before it, right? So make sure you're open.

Starting point is 00:21:37 Make sure you're flexible on what you're using and how you're using it, and be ready for someone to come out and completely break the mold and change the direction of everything. It's such a fast-paced environment. It's really hard to keep up. And, you know, I think we're just kind of still scratching the surface of where AI will actually go. All right. Exciting conversation that I think a lot of people are going to find valuable.
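
Editor's sketch: the advice above about not locking into one provider or model is often implemented by keeping a thin layer between application code and whichever model is currently in use. The snippet below is a minimal illustration of that idea. `ModelRouter`, `hosted_stub`, and `local_stub` are hypothetical names invented for this example; the stubs stand in for real API clients or local runtimes and do not call any actual service.

```python
from typing import Callable, Dict, Optional

# A "provider" is any function that maps a prompt to a completion.
Provider = Callable[[str], str]

class ModelRouter:
    """Hypothetical thin layer so callers never depend on one provider."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}
        self._active: Optional[str] = None

    def register(self, name: str, provider: Provider) -> None:
        self._providers[name] = provider
        if self._active is None:
            self._active = name  # first registered provider is the default

    def use(self, name: str) -> None:
        if name not in self._providers:
            raise KeyError(f"unknown provider: {name}")
        self._active = name

    def complete(self, prompt: str) -> str:
        if self._active is None:
            raise RuntimeError("no provider registered")
        return self._providers[self._active](prompt)

# Stand-in providers; real ones would wrap an API client or a local model.
def hosted_stub(prompt: str) -> str:
    return f"[hosted] {prompt}"

def local_stub(prompt: str) -> str:
    return f"[local] {prompt}"

router = ModelRouter()
router.register("hosted", hosted_stub)
router.register("local", local_stub)

first = router.complete("Summarize this memo.")
router.use("local")  # switching providers is a one-line change
second = router.complete("Summarize this memo.")

print(first)
print(second)
```

Because application code only ever calls `router.complete`, swapping in a newly released model means registering one more function rather than rewriting every call site.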

Starting point is 00:22:00 So Tom, thank you so much for sharing your time and coming on the Everyday AI Show. We appreciate it. Thank you so much for having us. We really appreciate it. All right. And hey, as a reminder, y'all, if you missed something in there, you know, a lot of big terms we're tossing around and getting a little geeky on the GPU side. Don't worry, we're going to be recapping it all in our free daily newsletter.

Starting point is 00:22:22 So if you want to know more about what we just talked about, make sure you go to youreverydayai.com. Sign up for the free daily newsletter. Thank you for tuning in. We hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all. Thank you.

Starting point is 00:23:09 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
