The AI Daily Brief: Artificial Intelligence News and Analysis - Mobile LLMs? The Battle to Get AI on Our Phones

Starting point is 00:00:00 Today on The AI Breakdown, we're looking at Stability's latest model and what it says about the future of LLMs on our smartphones. Before that on the brief, how you can get access to DALL-E 3 right now. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our YouTube channel, our newsletter, and our Discord. Welcome back to The AI Breakdown Brief. All the AI headline news you need in around five minutes. We kick off today with very exciting news for those of us who have been just pressing refresh on our ChatGPT Plus subscriptions, hoping to see DALL-E 3 arrive. Well, now not only us, but basically anyone with a Microsoft account can use DALL-E 3 directly via Bing.

Starting point is 00:00:46 To get access to DALL-E 3, just go to bing.com/images/create, and from there, it will ask you to either sign in or to create a new account. Now, each week, free users get 100 boosts that increase the speed with which images are created, but they can also create more images after that at a slower pace. As you've probably seen, one of the things that makes DALL-E 3 exciting for people who have been using tools like Stable Diffusion and Midjourney is that it seems to handle text a lot better. It's not perfect, but it's certainly a huge upgrade from what we've had so far, at least among the main services that most people use. Now, of course, given that people are getting access to DALL-E 3 now, there is a huge amount of discussion

Starting point is 00:01:25 on Twitter slash X around how it compares to Midjourney. Dreaming Tulpa, I think, sums up a lot of what I'm seeing, which has to do with DALL-E 3's main performance differential being how well it can actually interpret the natural language prompts that people write. In other words, I haven't seen a lot of people say that DALL-E 3 is distinctly better than Midjourney when it comes to the quality of imagery, but instead that the natural language prompting

Starting point is 00:01:52 allows people to get much closer to what they were actually imagining than they can using Midjourney prompting. Indeed, in many ways, Midjourney feels like an act of prompt engineering, while DALL-E 3 promises just full-on natural language inputs and the ability to refine once again with natural language. Tulpa says, I'm super impressed with how DALL-E interpreted the prompt below. While Midjourney's outputs are beautiful, it's nowhere near what I was looking for. Midjourney recently said they're going to improve upon this and oh boy, I hope they do. This feels like the first time MJ got a serious competitor. TechHalla did something interesting where they tested 10 different prompts, Midjourney versus

Starting point is 00:02:27 DALL-E 3, and rated them on a scale using accuracy, aesthetics, detail, consistency, and believability. TechHalla sums up, while I still believe Midjourney is slightly better overall, DALL-E 3 is very close. In fact, in terms of believability and accuracy, it's even ahead. Mad Pencil responds and says MJ did mostly better on aesthetics and details, but DALL-E stays faithful to the prompts. Now, moving on to a little bit of Apple AI news. In a recent discussion with UK press, Tim Cook said that the company was not only not planning on layoffs in the country, but that they would be increasing the size of their artificial

Starting point is 00:03:01 intelligence team in the United Kingdom. Now, there really weren't more details than that. Cook just responded to a question around AI saying, we're hiring in that area, yes, and so I do expect investment to increase. This is obviously a boon to a country whose prime minister has said very clearly that he wants them to be a leader in both the regulation and the development of artificial intelligence. Now, we have just come off of September, and in many ways it felt at the beginning, like a relatively quiet extension of the summer, only to, of course, over the last week and a half or so,

Starting point is 00:03:31 really pop off with the announcement of a huge slate of products. However, looking back, one of the things that was clear was just how much venture capital activity there was over the last month. Chief AI officer on Twitter sums up the top 10 AI startup funding rounds from last month, including that big $4 billion investment from Amazon into Anthropic, which for the sake of completeness isn't exactly a $4 billion investment right up front, but up to $4 billion being invested over time. There's also a $500 million investment into Databricks. Now, Databricks also made headlines for a deeper partnership with Microsoft

Starting point is 00:04:04 that some saw as Microsoft hedging their bets relative to their relationship with OpenAI. Other big investment rounds include a $223 million investment into Helsing, which is an AI company focused on the defense industry that is backed by Spotify's Daniel Ek, $200 million to Imbue, $125 million to Enfabrica, $110 million to d-Matrix, $100 million to Writer, $100 million to Inceptive, $100 million to Pryon, and another $85 million to Pixis, which is AI for marketers. Nine nine-figure rounds in a single month shows just how much activity there is in this space. Now, of course, the other big thing that happened in September was the culmination of

Starting point is 00:04:41 the writers' strike. The discourse around the compromise that they reached on AI has been really fascinating to me, as so far it seems a lot more like a Rorschach test in terms of how people interpret it than a clear win or loss for either AI or for the writers themselves. But whether it represents a win or a loss, another piece of news from this morning shows just how much AI is going to be a part of celebrity and pop culture in general going forward. Over the weekend, Tom Hanks posted a picture of himself from a video and said, Beware, there's a video out there promoting some dental plan with an AI version of me. I have nothing to do with it. Now, Hanks is interesting because he is very clearly not some visceral AI Luddite or someone who dismisses the excitement around the technology.

Starting point is 00:05:21 On a podcast recently, Hanks said, anyone can now recreate themselves at any age they are by way of AI or deepfake technology. I could be hit by a bus tomorrow and that's it, but performances can go on and on and on. Outside the understanding of AI and deepfakes, there'll be nothing to tell you that it's not me and me alone. That's certainly an artistic challenge, but it's also a legal one. He added, without a doubt, people will be able to tell that's an AI, but the question is, will they care? There are some people that won't care, that won't make that delineation. Finally today, another story at the intersection of artificial intelligence and health. ScienceAlert.com writes, AI identifies brain signals associated with recovering from depression.

Starting point is 00:05:56 The piece writes it could soon be possible to measure changes in depression levels like we can measure blood pressure or heart rate. So the piece is about a recent study, and effectively what that study was trying to show is that while currently all we have to go on when it comes to understanding levels of depression is patients self-reporting their mood, that's problematic because so many things can affect one's mood. A stressful thing that happened in the morning can impact how someone's day was, just as much as any sort of actual underlying depression issues. Given that, they write, scientists in the U.S. used a combination of electrode implants and AI analysis to try to pinpoint changes in brain activity patterns triggered by deep brain

Starting point is 00:06:32 stimulation. The result was that the team of researchers, which included people from the Georgia Institute of Technology, the Emory University School of Medicine, and the Icahn School of Medicine at Mount Sinai, did end up identifying a brain signal that could be used as a biomarker linked to recovery from depression. So far, it seems to be more than 90% accurate in its feedback. Now, one of the things that makes the study most exciting is that each of these types of studies builds on itself, given that there's now a new data set for future AI to be trained on. As the piece writes, the AI was trained using images of the participants' brains at the

Starting point is 00:07:03 start and end of the process, giving it the opportunity to spot neurological differences that the human eye might miss. One of the patients responded well to treatment for four months before relapsing, for example, and the recovery signal disappeared a month before the relapse. Now that the AI has been trained, it can be used in further studies like this, giving researchers a much better set of data than they get with self-reporting alone. One of the things that I see starting to shake out is that as politicians are talking about weighing the risks of AI with the opportunities, the area that is clearest to people around those opportunities seems to be in the health sphere. The more studies we get like this, the more likely it is people fight to continue to be able to leverage those benefits

Starting point is 00:07:39 in the medical field, even as policy attempts to put guardrails around other types of AI in order to prevent future bad outcomes. In any case, that is going to do it for today's AI breakdown brief. Next up, the main AI breakdown. Hello friends, quickly before we get into the main episode, I wanted to tell you about one opportunity. It is the beginning of October, and that means we are refreshing the very limited number of personalized AI consulting sessions that we have for this month. These are short, high-impact consulting sessions where I will get into what you are trying to learn, how you are trying to apply AI to your business or life, and do my best to get you up and running with resources to take your efforts to the next level. If that's something that's interesting

Starting point is 00:08:20 to you, like I said, I make available an extremely limited number of slots for this per month. So shoot me a note at NLW at breakdown.network, and I will do my best to get you on the calendar. With that, let's listen to the rest of the episode. Welcome back to the AI breakdown. One of the most anticipated evolutions of the artificial intelligence space is the move to be able to run large language models on mobile devices such as smartphones. Now, by and large right now, models are too big to be able to run locally without serious performance degradation, but that hasn't stopped people speculating about local on-device type of LLMs being the future of the AI space.

Starting point is 00:08:56 All year, we've had articles about this evolution in the LLM space. Back in July, The Information wrote about it in a piece called Small Devices Could Soon Handle Large Language Models. The specific prompt for that piece was an announcement from an AI startup called OctoML, but it articulated some of the benefits as well. They write, running large language models on the edge could alleviate some of the exorbitant cloud

Starting point is 00:09:17 computing costs facing AI companies by taking advantage of computing power sitting idly on their customers' laptops and devices. That would also benefit cloud providers, which have to ration access to server hardware for their own internal teams. However, as they point out, historically, AI researchers have struggled to run sophisticated AI algorithms like LLMs on the edge, since those models have to share computational resources and memory space with other important functions like, well, actually being able to use your phone, and are usually more compute-hungry than the voice recognition or computer vision models already running on devices.

Starting point is 00:09:48 Now, the piece also points out that edge AI has other benefits, such as being able to run without an internet connection, which might be as banal a value proposition as being able to use it on an airplane without Wi-Fi, or as serious as having a medical device assistant during a high-risk surgery. They also discuss the benefits of latency. They write, in the case of edge AI, processing data locally means there's no need to transmit data over the internet to a remote cloud server and back, speeding up the process. Finally, there is the benefit of privacy. Simply put, AI models work better when they have access to more customized information about the person using them. Especially when you're dealing with cloud services, that presents a risk,

Starting point is 00:10:23 whereas if a model could run on device, users might have more confidence that their data wouldn't be leaving that device. Now, this seems to be part of the barrier that has held Apple back, for instance, from going deeper into the LLM and generative AI space. In an article a couple weeks ago about how Apple had increased its training spending to millions of dollars per day, The Information once again pointed out this problem. Quote, questions linger over how Apple can incorporate LLMs into its products. The company's leaders prefer running software on devices, which improves privacy and performance,

Starting point is 00:10:54 as opposed to on cloud servers. So far, that just hasn't been feasible. And yet there has been a lot of discussion lately that that may be a limited-time challenge. In August, ZDNet wrote a piece called, Could you soon be running AI tasks right on your smartphone? MediaTek says yes. They write, Today the Taiwan-based semiconductor company announced that it is working with Meta to port the social giant's Llama 2 LLM,

Starting point is 00:11:16 in combination with the company's latest generation APUs and NeuroPilot software development program, to run generative AI tasks on devices without relying on external processing. Still, they point out that even Llama 2's small data set of 7 billion parameters represents a size of around 13 gigabytes, which, as they put it, is, quote, outside the practical capabilities of today's smartphones. And that's what made Stability AI's announcement a couple days ago all the more interesting. Yam Peleg tweets, Stability AI just casually dropped a 3 billion parameter model, trained on 4 trillion tokens, that outperforms most 7 billion models, and a 20 billion model. Now, very quickly, people started to

Starting point is 00:11:52 put this in the context of this question of smartphones running LLMs. Daniel Samanez quote-tweeted Emad Mostaque announcing the new StableLM Alpha model, adding, likely it can run on iPhones and Pixel phones. Indeed, Emad confirmed that in a later conversation. AI content creator Igor Pogany wrote, can't wait until we can run LLMs like ChatGPT locally. Will make many, including me, way more comfortable with putting in sensitive info like finances and health data. Plus, you'd have your AI buddy with you even when phone service isn't, like on a long hiking trip or out at sea. So many potential use cases. Should be happening within a year according to Emad Mostaque of Stability AI. Emad jumped into the comments and said that the new StableLM Alpha, quote,

Starting point is 00:12:31 runs on a normal smartphone and we have much better coming. He also added in a separate tweet, only a short matter of time before an open 3B parameter model overtakes GPT-3.5 in my opinion. Then you can have swarms of them as experts as they run on your phone. Now clearly this shows the direction that Stability AI is heading with this smaller, more performant model. Even as some are trying to think about how to soup up hardware capacity to run these models, others, like Stability apparently, are trying to shrink the models sufficiently that they can be used on today's phones to accomplish actual useful things. Now, of course, as I just intimated, people aren't approaching this just from the smallifying LLM side. They're also thinking about it from a hardware perspective.
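
To make the size argument concrete, here is a rough back-of-the-envelope sketch of why parameter count matters so much for phones. It is not from the episode: the function name `weight_footprint_gib` is hypothetical, and the bytes-per-parameter figures (2 for 16-bit weights, 1 for 8-bit, 0.5 for 4-bit quantization) are common assumptions rather than anything the sources state. It counts model weights only and ignores activations and other runtime memory.

```python
def weight_footprint_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


# Assumed storage precisions (illustrative, not taken from the episode).
PRECISIONS = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}

# Parameter counts mentioned in the episode: a 7B model and a 3B model.
MODELS = {"7B": 7e9, "3B": 3e9}

if __name__ == "__main__":
    for model_name, params in MODELS.items():
        for precision_name, bytes_per_param in PRECISIONS.items():
            size = weight_footprint_gib(params, bytes_per_param)
            print(f"{model_name} at {precision_name}: {size:.2f} GiB")
```

Under these assumptions, a 7 billion parameter model stored at 16 bits per weight comes out at about 13 GiB, in line with the figure quoted from ZDNet, while a 3 billion parameter model at 4 bits per weight needs under 1.5 GiB, which is much closer to what a phone can spare.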

Starting point is 00:13:13 Dave Lee tweeted, I just successfully convinced ChatGPT that OpenAI should make their own LLM phone. An interesting question. Should OpenAI make a phone? In almost all cases, making a phone right now to compete against Android and Apple is akin to committing suicide. There's practically no chance of a new platform gaining much traction due to the dominance of the existing mega platforms of Android and iOS. However, the advent of GPT-4-level LLMs like ChatGPT presents a unique challenge and opportunity. Android and Apple will likely make their LLMs the default AI interface for their mobile devices, especially as LLMs take up more and more of the user time on mobile devices. Especially for Google, this is existential as they are dependent

Starting point is 00:13:49 on search engine revenue, and as AI replaces much of search, it is imperative that Google equips Android devices with a Google LLM that is comparable to ChatGPT and GPT-4. Dave basically concludes that the only way for OpenAI to fight this tide is to release a new phone. He writes, it could be centered around their LLM and be a completely new experience. They could give apps prioritized access to their APIs. It would be a new ecosystem and OS. Now, of course, Dave didn't just write this up. He asked GPT-4 if it agreed. ChatGPT came back with some pros, including full integration, a unique user experience, competitive advantage, data, and ecosystem control, but then cons, including market saturation, how resource intensive it is

Starting point is 00:14:28 to design hardware, the risk of failure, the distracted focus. And then it goes on and on with Dave continuing to try to argue to ChatGPT that it should focus on doing its own thing rather than just partnering, ultimately leading ChatGPT to say, yes, OpenAI should consider making its own phone and operating system to fully realize the potential of its LLM technology in shaping the future of human-computer interaction. Now, of course, a lot of what we talked about was exactly this, that OpenAI CEO Sam Altman and Apple's famous former designer Jony Ive have been in conversations around building what they call the, quote, iPhone of artificial intelligence, although it appears from sources with information that the actual form factor of this device isn't clear, just that it's

Starting point is 00:15:06 an AI-native, from-the-ground-up reimagining of a personal computing device for the AI era. Now, importantly, this is more than a few idle conversations over dinners in San Francisco, as they've also been discussing a potential billion dollar investment from SoftBank to get the venture started. And yet there are other efforts in this space as well that aren't strictly confined to just a phone. Another big announcement from last week was the Meta AI-integrated Ray-Bans that were announced at Meta's Connect event. This is, of course, an inherently mobile use case for artificial intelligence, given that you're wearing these things as you're walking around. And the type of information you're going to be asking is things like,

Starting point is 00:15:42 what am I seeing in front of me? How do I fix the problem of the appliance that I'm currently looking at, et cetera? There are also startups that are coming after the AI hardware device space. Probably the most notable of those is Humane, which had that very well-received demo at TED earlier this year that involved, among other things, a live voice translation where the speaker who was presenting had a statement that was translated into another language in his own voice as folks watched. Now, the people at Humane are mostly ex-Apple folks, and Sam Altman has been one of their biggest funders, suggesting some amount of continuity to this excitement and interest in a different type of approach to AI hardware. Currently, Humane is scheduled to unveil more details in a couple of weeks.

Starting point is 00:16:22 Interestingly, though, there is another argument that some are making: that although it doesn't have all of the improved performance and privacy that would come with a truly edge AI that lived on device, in many ways ChatGPT with vision represents a first step towards this AI phone world. Sunny Mukerjee tweeted, Microsoft missed an opportunity here because if Windows Phone was still around, they could have integrated ChatGPT into it. Apple has the phone and the OS but no LLM yet, and Microsoft has the LLM but no phone. ChatGPT can now see, hear, and speak. Now, at the end of last week, I shared some of the examples of how people who have early

Starting point is 00:16:59 access to ChatGPT with vision are using it, and the ability to take visual input from the world around you certainly does give off a sense of where things are headed and how an AI-native phone or device might be a game changer. The example that you've got on your screen right now is from McKay Wrigley, who took a picture of his team's whiteboarding session, fed it into GPT with vision, and then had it write some actual working code. So summing up, it seems like there is a clear trajectory and trend toward exploring the way that hardware, both existing modalities of hardware as well as new attempts and new form factors of hardware, can transform how people integrate artificial intelligence into their daily lives. And of course, as much as they haven't made a big move

Starting point is 00:17:38 into this space yet, everyone continues to wait to see what Apple will do. As Robert Scoble pointed out in July, even if OpenAI introduced a phone tomorrow, how will the world switch? It won't. Apple knows this. It has the only store in many cities where people buy new things from. So all in all, it is going to be a very exciting time to see what exactly companies do in this AI hardware space. I think the only thing that is for sure is that we're going to see more, not fewer, attempts towards it. And I will, of course, keep you updated as we learn more. Thanks as always for listening or watching. Until next time, peace.
