The AI Daily Brief: Artificial Intelligence News and Analysis - Mobile LLMs? The Battle to Get AI on Our Phones
Episode Date: October 2, 2023
Stability AI has released a new 3B model that outperforms some 7B and 20B models, and previews the forthcoming battle to get local LLMs on our mobile devices. Before that on the Brief: DALL-E 3 is available now for free for Bing users, with 100 speed boosts per week; Apple appears to be hiring for AI roles in the United Kingdom.
TAKE OUR SURVEY ON EDUCATIONAL AND LEARNING RESOURCE CONTENT: https://bit.ly/aibreakdownsurvey
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're looking at Stability's latest model and what it says about the future of LLMs on our smartphones.
Before that on the brief, how you can get access to DALL-E 3 right now.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to breakdown.network for more information about our YouTube channel, our newsletter, and our Discord.
Welcome back to the AI Breakdown Brief.
All the AI headline news you need in around five minutes.
We kick off today with very exciting news for those of us who have been just pressing refresh on our ChatGPT Plus subscriptions, hoping to see DALL-E 3 arrive.
Well, now not only us, but basically anyone with a Microsoft account can use DALL-E 3 directly via Bing.
To get access to DALL-E 3, just go to bing.com/images/create, and from there, it will ask you to either sign in or create a new account.
Now, each week, free users get 100 boosts that increase the speed with which images are created,
and they can still create more images after that at a slower pace.
As you've probably seen, one of the things that makes DALL-E 3 exciting for people who have
been using tools like Stable Diffusion and Midjourney is that it seems to handle text a lot better.
It's not perfect, but it's certainly a huge upgrade from what we've had so far, at least among
the main services that most people use.
Now, of course, given that people are getting access to DALL-E 3 now, there is a huge amount of discussion
on Twitter/X around how it compares to Midjourney.
Dreaming Tulpa, I think, sums up a lot of what I'm seeing,
which is that DALL-E 3's main performance differential
is how well it can actually interpret the natural language prompts
that people give it.
In other words, I haven't seen a lot of people say that DALL-E 3
is distinctly better than Midjourney when it comes to the quality of imagery,
but instead that the natural language prompting
allows people to get much closer to what they were actually imagining than they can using
Midjourney prompting. Indeed, in many ways, Midjourney feels like an act of prompt engineering,
while DALL-E 3 promises just full-on natural language inputs and the ability to refine, once again,
with natural language. Tulpa says, I'm super impressed with how DALL-E interpreted the prompt below.
While Midjourney's outputs are beautiful, it's nowhere near what I was looking for.
Midjourney recently said they're going to improve upon this and oh boy, I hope they do.
This feels like the first time MJ got a serious competitor.
TechHalla did something interesting where they tested 10 different prompts, Midjourney versus
DALL-E 3, and rated them on a scale using accuracy, aesthetics, detail, consistency, and believability.
TechHalla sums up, while I still believe Midjourney is slightly better overall, DALL-E 3 is very close.
In fact, in terms of believability and accuracy, it's even ahead.
Mad Pencil responds and says MJ did mostly better on aesthetics and details,
but DALL-E stays faithful to the prompts.
Now, moving on to a little bit of Apple AI news.
In a recent discussion with UK press, Tim Cook said that the company was not only not planning
on layoffs in the country, but that they would be increasing the size of their artificial
intelligence team in the United Kingdom.
Now, there really weren't more details than that.
Cook just responded to a question around AI saying, we're hiring in that area, yes, and
so I do expect investment to increase.
This is obviously a boon to a country whose prime minister has said very clearly that he
wants them to be a leader in both the regulation and the development of artificial intelligence.
Now, we have just come off of September, and in many ways it felt, at the beginning, like a relatively
quiet extension of the summer, only to, of course, over the last week and a half or so,
really pop off with the announcement of a huge slate of products. However, looking back, one of the
things that was clear was just how much venture capital activity there was over the last month.
Chief AI Officer on Twitter sums up the top 10 AI startup funding rounds from last month,
including that big $4 billion investment from Amazon into Anthropic,
which for the sake of completeness isn't exactly a $4 billion investment right up front,
but up to $4 billion being invested over time.
There's also a $500 million investment into Databricks.
Now, Databricks also made headlines for a deeper partnership with Microsoft
that some saw as Microsoft hedging their bets relative to their relationship with OpenAI.
Other big investment rounds include a
$223 million investment into Helsing, which is an AI company focused on the defense industry that is
backed by Spotify's Daniel Ek, $200 million to Imbue, $125 million to Enfabrica, $110 million to
d-Matrix, $100 million to Writer, $100 million to Inceptive, $100 million to Pryon, and another
$85 million to Pixis, which is AI for marketers.
Nine nine-figure rounds in a single month shows just how much activity there is in this space.
Now, of course, the other big thing that happened in September was the culmination of
the writers' strike. The discourse around the compromise that they reached on AI has been really
fascinating to me, as so far it seems a lot more like a Rorschach test in terms of how people interpret
it than a clear win or loss for either AI or for the writers themselves. But whether it
represents a win or a loss, another piece of news from this morning shows just how much AI is going to be
a part of celebrity and pop culture in general going forward. Over the weekend, Tom Hanks posted a picture
of himself from a video and said, Beware, there's a video out there promoting some dental plan
with an AI version of me. I have nothing to do with it. Now, Hanks is interesting because he is very
clearly not some visceral AI Luddite or someone who dismisses the excitement around the technology.
On a podcast recently, Hanks said, anyone can now recreate themselves at any age they are by way
of AI or deepfake technology. I could be hit by a bus tomorrow and that's it, but performances
can go on and on and on. Outside the understanding of AI and deepfakes, there'll be nothing to tell
you that it's not me and me alone. That's certainly an artistic challenge, but it's also a legal one.
He added, without a doubt, people will be able to tell that's an AI, but the question is, will they care?
There are some people that won't care, that won't make that delineation.
Finally today, another story at the intersection of artificial intelligence and health.
ScienceAlert.com writes, AI identifies brain signals associated with recovering from depression.
The piece writes it could soon be possible to measure changes in depression levels like we can measure blood pressure or heart rate.
So the piece is about a recent study, and effectively what that study was trying to show is that while currently all we have to go
on when it comes to understanding levels of depression is patients self-reporting their mood,
that's problematic because so many things can affect one's mood.
A stressful thing that happened in the morning can impact how someone's day was, just as
much as any sort of actual underlying depression issues.
Given that, they write, scientists in the U.S. used a combination of electrode implants
and AI analysis to try to pinpoint changes in brain activity patterns triggered by deep brain
stimulation.
The result was that the team of researchers, which included people from the Georgia Institute
of Technology, the Emory University School of
Medicine, and the Icahn School of Medicine at Mount Sinai, did end up identifying a brain signal
that could be used as a biomarker linked to recovery from depression. So far, it seems to be more than 90%
accurate in its feedback. Now, one of the things that makes the study most exciting is that each
of these types of studies builds on itself, given that there's now a new data set for future AI
to be trained on. As the piece writes, the AI was trained using images of the participants' brains at the
start and end of the process, giving it the opportunity to spot neurological differences that the
human eye might miss. One of the patients responded well to treatment for four months before relapsing,
for example, and the recovery signal disappeared a month before the relapse. Now that the AI has been
trained, it can be used in further studies like this, giving researchers a much better set of data
than they get with self-reporting alone. One of the things that I see starting to shake out is that
as politicians are talking about weighing the risks of AI with the opportunities, the area that is
clearest to people around those opportunities seems to be in the health sphere. The more studies we get
like this, the more likely it is people fight to continue to be able to leverage those benefits
in the medical field, even as policy attempts to put guardrails around other types of AI in order
to prevent future bad outcomes. In any case, that is going to do it for today's AI Breakdown Brief.
Next up, the main AI Breakdown. Hello friends, quickly before we get into the main episode,
I wanted to tell you about one opportunity. It is the beginning of October, and that means we are
refreshing the very limited number of personalized AI consulting sessions that we have for this month.
These are short, high-impact consulting sessions where I will get into what you are trying to learn,
how you are trying to apply AI to your business or life, and do my best to get you up and running
with resources to take your efforts to the next level. If that's something that's interesting
to you, like I said, I make available an extremely limited number of slots for this per month.
So shoot me a note at nlw@breakdown.network, and I will do my best to get you on the calendar.
With that, let's listen to the rest of the episode.
Welcome back to the AI Breakdown.
One of the most anticipated evolutions of the artificial intelligence space
is the move to be able to run large language models on mobile devices such as smartphones.
Now, by and large right now, models are too big to be able to run locally without serious performance degradation,
but that hasn't stopped people speculating about local, on-device LLMs being the future of the AI space.
All year, we've had articles about this evolution in the LLM space.
Back in July, The Information wrote about it in a piece called Small Devices Could Soon
Handle Large Language Models.
The specific prompt for that piece was an announcement from an AI startup called OctoML,
but it articulated some of the benefits as well.
They write,
running large language models on the edge could alleviate some of the exorbitant cloud
computing costs facing AI companies by taking advantage of computing power sitting idly
on their customers' laptops and devices.
That would also benefit cloud providers, which have to ration access to server hardware
for their own internal teams.
However, as they point out, historically, AI researchers have struggled to run sophisticated AI algorithms
like LLMs on the edge, since those models have to share computational resources and memory space
with other important functions like, well, actually being able to use your phone, and are usually
more compute-hungry than the voice recognition or computer vision models already running on devices.
Now, the piece also points out that edge AI has other benefits, such as being able to run without
an internet connection, which might be as banal a value proposition as being able to use it on an airplane
without Wi-Fi, or as serious as having a medical device assistant during a high-risk surgery.
They also discuss the benefits of latency. They write, in the case of edge AI, processing data
locally means there's no need to transmit data over the internet to a remote cloud server
and back, speeding up the process. Finally, there is the benefit of privacy. Simply put,
AI models work better when they have access to more customized information about the person
using them. Especially when you're dealing with cloud services, that presents a risk,
whereas if a model could run on device, users might have more confidence that their data
wouldn't be leaving that device.
Now, this seems to be part of the barrier that has held Apple back, for instance, from going
deeper into the LLM and generative AI space.
In an article a couple weeks ago about how Apple had increased its training spending to
millions of dollars per day, The Information once again pointed out this problem.
Quote, questions linger over how Apple can incorporate LLMs into its products.
The company's leaders prefer running software on devices, which improves privacy and performance,
as opposed to on cloud servers.
So far, that just hasn't been feasible.
And yet there has been a lot of discussion lately that that may be a limited-time challenge.
In August, ZDNet wrote a piece called,
Could you soon be running AI tasks right on your smartphone? MediaTek says yes.
They write,
Today the Taiwan-based semiconductor company announced that it is working with Meta to port
the social giant's Llama 2 LLM,
in combination with the company's latest generation APUs and NeuroPilot Software Development Program,
to run generative AI tasks on devices without relying on external processing.
Still, they point out that even Llama 2's
small data set of 7 billion parameters represents a size of around 13 gigabytes, which, as they put it,
is, quote, outside the practical capabilities of today's smartphones. And that's what made
Stability AI's announcement a couple days ago all the more interesting. Yam Peleg tweets,
Stability AI just casually dropped a 3 billion parameter model, trained on 4 trillion tokens,
outperforms most 7 billion models, and a 20 billion model. Now, very quickly, people started to
put this in the context of this question of smartphones running LLMs. Daniel Samanez
quote-tweeted Emad Mostaque announcing the new StableLM Alpha model, adding, likely it can run on iPhones and
Pixel phones. Indeed, Emad confirmed that in a later conversation, after AI content creator Igor Pogany
wrote, can't wait until we can run LLMs like ChatGPT locally. Will make many, including me,
way more comfortable with putting sensitive info like finances and health data. Plus, you'd have
your AI buddy with you even when phone service isn't, like on a long hiking trip or out at sea.
So many potential use cases. Should be happening within a year according to Emad Mostaque of
Stability AI. Emad jumped into the comments and said that the new StableLM Alpha, quote,
runs on a normal smartphone and we have much better coming. He also added in a separate tweet,
only a short matter of time before an open 3B parameter model overtakes GPT-3.5 in my opinion.
Then you can have swarms of them as experts as they run on your phone. Now clearly this shows
the direction that Stability AI is heading with this smaller, more performant model. Even as some are
trying to think about how to soup up hardware capacity to run these models, others, like Stability
apparently, are trying to shrink the models sufficiently that they can be used on today's phones
to accomplish actual useful things. Now, of course, as I just intimated, people aren't approaching
this just from the smallifying LLM side. They're also thinking about it from a hardware perspective.
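For a rough sense of the sizes mentioned above, here is a back-of-the-envelope sketch. It assumes a model's footprint is roughly its parameter count times the bytes used per parameter, and counts weights only. The 16-, 8-, and 4-bit precisions are illustrative assumptions, not figures from the episode, and the function name is made up for this example.

```python
# Back-of-the-envelope weight sizes for LLMs at different numeric precisions.
# Assumption: size ~= parameter count x bytes per parameter (weights only;
# runtime memory for activations and caches is ignored).

GIB = 2**30  # bytes in one gibibyte


def weights_size_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate storage for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / GIB


if __name__ == "__main__":
    models = {"7B model": 7e9, "3B model": 3e9}
    for name, params in models.items():
        for bits in (16, 8, 4):  # illustrative precisions, not from the episode
            size = weights_size_gib(params, bits)
            print(f"{name} at {bits}-bit: {size:.1f} GiB")
```

Under these assumptions, a 7 billion parameter model at 16 bits comes out to about 13 GiB, in line with the figure ZDNet cited. A 3 billion parameter model at 4 bits would be under 1.5 GiB, which is the kind of gap that makes a smaller model more plausible on a phone.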
Dave Lee tweeted, I just successfully convinced ChatGPT that OpenAI should make their own
LLM phone. An interesting question. Should OpenAI make a phone? In almost all
cases, making a phone right now to compete against Android and Apple is akin to committing suicide.
There's practically no chance of a new platform gaining much traction due to the dominance
of the existing mega platforms of Android and iOS. However, the advent of GPT-4-level LLMs like
ChatGPT presents a unique challenge and opportunity. Android and Apple will likely make their
LLMs the default AI interface for their mobile devices, especially as LLMs take up more and more
of the user time on mobile devices. Especially for Google, this is existential as they are dependent
on search engine revenue, and as AI replaces much of search, it is imperative that Google
equips Android devices with a Google LLM that is comparable to ChatGPT and GPT-4.
Dave basically concludes that the only way for OpenAI to fight this tide is to release a new
phone. He writes, it could be centered around their LLM and be a completely new experience.
They could give apps prioritized access to their APIs. It would be a new ecosystem and OS.
Now, of course, Dave didn't just write this up. He asked GPT-4 if it agreed. ChatGPT came back with
some pros, including full integration, a unique user experience, competitive advantage, data,
and ecosystem control, but then cons, including market saturation, how resource intensive it is
to design hardware, the risk of failure, and the distracted focus. And then it goes on and on with Dave
continuing to try to argue to ChatGPT that it should focus on doing its own thing rather than just
partnering, ultimately leading ChatGPT to say, yes, OpenAI should consider making its own phone
and operating system to fully realize the potential of its LLM technology in shaping the future
of human-computer interaction. Now, of course, a lot of what we talked about was exactly this,
that OpenAI CEO Sam Altman and Apple's famous former designer Jony Ive have been in conversations
around building what they call the, quote, iPhone of artificial intelligence, although it appears
from sources with information that the actual form factor of this device isn't clear, just that it's
an AI-native, from-the-ground-up reimagining of a personal computing device for the AI era.
Now, importantly, this is more than a few idle conversations over dinners in San Francisco,
as they've also been discussing a potential billion dollar investment from SoftBank to get the venture started.
And yet there are other efforts in this space as well that aren't strictly confined to just a phone.
Another big announcement from last week was the Meta AI-integrated Ray-Bans that were announced at Meta's Connect event.
This is, of course, an inherently mobile use case for artificial intelligence,
given that you're wearing these things as you're walking around.
And the type of information you're going to be asking is things like,
what am I seeing in front of me? How do I fix the problem of the appliance that I'm currently
looking at, et cetera? There are also startups that are coming after the AI hardware device space.
Probably the most notable of those is Humane, which had that very well-received demo at TED earlier this
year that involved, among other things, a live voice translation where the speaker who was presenting
had a statement that was translated into another language in his own voice as folks watched.
Now, the people at Humane are mostly ex-Apple folks, and Sam Altman has been one of their biggest
funders, suggesting some amount of continuity to this excitement and interest in a different type of
approach to AI hardware. Currently, Humane is scheduled to unveil more details in a couple of weeks.
Interestingly, though, there is another argument that some are making: that although it doesn't
have all of the capacities of improved performance and privacy that would come with a truly
edge AI that lived on device, in many ways ChatGPT with vision represents a first step towards
this AI phone world.
Sunny Mukerjee tweeted, Microsoft missed an opportunity here because if Windows Phone was still
around, they could have integrated ChatGPT into it. Apple has the phone and the OS but no LLM yet,
and Microsoft has the LLM but no phone. ChatGPT can now see, hear, and speak.
Now, at the end of last week, I shared some of the examples of how people who have early
access to ChatGPT with vision are using it, and the ability to take visual input from the world
around you certainly does give off a sense of where things are headed and how an AI native phone or
device might be a game changer. The example that you've got on your screen right now is from
McKay Wrigley, who took a picture of his team's whiteboarding session, fed it into GPT with vision,
and then had it write some actual working code. So summing up, it seems like there is a clear
trajectory and trend to exploring the way that hardware, both existing modalities of hardware,
as well as new attempts and new form factors of hardware, can transform how people integrate
artificial intelligence into their daily lives. And of course, as much as they haven't made a big move
into this space yet, everyone continues to wait to see what Apple will do. As Robert
Scoble pointed out in July, even if OpenAI introduced a phone tomorrow, how will the
world switch? It won't. Apple knows this. It has the only store in many cities where people
buy new things from. So all in all, it is going to be a very exciting time to see what exactly
companies do in this AI hardware space. I think the only thing that is for sure is that we're
going to see more, not fewer, attempts at it. And I will, of course, keep you updated as we
learn more. Thanks as always for listening or watching. Until next time, peace.
