The AI Daily Brief: Artificial Intelligence News and Analysis - The Most Important AI Stories Last Week
Episode Date: April 2, 2024
On today's episode, NLW catches up on the big stories over the last week, including Grok 1.5, OpenAI courting Hollywood, and much much more. Be the first to learn about our new AI education platform:... https://besuper.ai/
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're looking at the most important AI stories from the past week.
The AI Breakdown is a daily podcast and video about the most important news and stories in AI.
Go to Breakdown.network for more information about our YouTube, our Discord, and our newsletter.
Welcome back to the AI Breakdown.
As you guys know, I have been traveling for about a week and then was taken out by some post-travel sickness.
And so what I wanted to do today is something a little bit different.
Instead of our normal brief followed by main episode, we're just going to go through the biggest stories from the last
week or so that we didn't have a chance to cover on the AI Breakdown.
For those of you who have been paying attention and get your news from other sources as well,
I apologize for the repeats.
But for those of you who maybe dipped out for a minute, this should give you a pretty
good sense of everything that happened over the last, call it 10 days or so.
We kick off with the promised update from Elon Musk's Grok.
Last week, Elon announced Grok 1.5.
And according to people who had seen it, it represents a pretty significant upgrade over
Grok 1.0.
Of course, a couple weeks previous to this announcement, a version of
Grok 1 had been released as open source, and so that gave people a sense of what was going on
under the hood with Grok. In this version, the team behind Grok points to improved reasoning
and problem-solving capabilities as the biggest advances.
They write, one of the most notable improvements in Grok 1.5 is its performance in coding
and math-related tasks.
In our tests, Grok 1.5 achieved a 50.6% score on the MATH benchmark and a 90% score
on the GSM8K benchmark, two math benchmarks covering a wide range of grade school to high
school competition problems.
Additionally, it scored 74.1% on the HumanEval benchmark, which evaluates code generation and problem-solving abilities.
Now, by way of comparison, Grok 1.5 is scoring on the MMLU right around what Mistral Large is scoring,
but it's still a little bit behind Gemini 1.5 Pro, GPT-4, and Claude 3 Opus.
The context window has also increased significantly, up to 128K tokens with Grok 1.5, with the Grok team writing,
In the needle-in-a-haystack evaluation, Grok 1.5 demonstrated powerful retrieval capabilities for embedded text
within contexts of up to 128K tokens in length, achieving perfect retrieval results.
Elon also added two details: first, that Grok should be available on X this week,
and second, that Grok 2, which is currently in training, quote, should exceed current AI on all metrics.
So some more interesting things to be watching out for.
That said, it wasn't really Grok 1.5 that was the most buzzed-about LLM last week,
but in fact, DBRX from Databricks.
Last week, Databricks wrote,
Today, we are excited to introduce DBRX, an open general-purpose LLM created by Databricks.
Across a range of standard benchmarks, DBRX sets a new state-of-the-art for established open
LLMs. Moreover, it provides the open community and enterprises building their own LLMs with
capabilities that were previously limited to closed model APIs.
According to our measurements, it surpasses GPT-3.5 and is competitive with Gemini 1.0
Pro. It's an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming.
Now, the reason that people were so excited about this was its open release.
Dylan Patel from SemiAnalysis writes,
Databricks' DBRX model is amazing, generally great, but crushes code.
Ali Ghodsi from Databricks writes,
today we released an open-source model DBRX that beats all previous open-source models
on the standard benchmarks.
The model itself is a mixture of experts.
That's roughly twice the brains, 132 billion parameters,
but half the cost, 36 billion, of Llama 2 70B,
making it both smart and cheap.
Since only 36 billion expert parameters are used live,
it's close to twice the speed of Llama 2 70B.
We're excited to build custom versions of this for organizations that have proprietary data.
Another big piece of news from the open source AI world was a big leadership shakeup at
Stability AI.
Emad Mostaque announced that he was stepping down from his role as CEO and from his position on
the board of directors to pursue something that he's calling decentralized AI.
As the company goes about looking for a replacement CEO, Stability's COO and CTO are jumping
in as interim co-CEOs.
In an announcement post, Emad said,
I am proud two years after bringing on our first developer
to have led Stability to hundreds of millions of downloads
and the best models across modalities.
I believe strongly in Stability AI's mission
and feel the company is in capable hands.
It is now time to ensure AI remains open and decentralized.
He also retweeted the post on Twitter saying,
Not going to beat centralized AI with more centralized AI.
All in on decentralized AI.
Lots more soon.
He followed it up with a little bit more explanation, writing,
As my notifications are RIP, some notes.
One, my shares have majority of vote at Stability AI.
Two, they have full board control.
The concentration of power in AI is bad for us all.
I decided to step down to fix this at Stability and elsewhere.
We'll be sharing more soon.
Now, while he hasn't given any details yet about what decentralized AI is actually going to look
like, he has promised, much to the chagrin of the crypto crowd, that there will be no
token or coin.
Over in the land of big companies, Apple got some excited buzz going when it announced the
official date of its Worldwide Developers Conference, which is coming on June 10th. Bloomberg writes,
though Apple didn't say what it plans to unveil, people familiar with the matter have said that
the presentation will focus heavily on AI. They continue, Apple is expected to unveil its next major
software updates for the iPhone, iPad, Mac, Vision Pro headset, and smartwatch, and its new AI strategy
will be front and center for the planned iOS 18 upgrade. In announcing the event, Apple marketing
executive Greg Joswiak said, it's going to be Absolutely Incredible, with the A in Absolutely and the I in Incredible
both uppercased, as Bloomberg puts it, a clear nod to AI.
Not to be nudged out of the news cycle by all these other Johnny-come-latelies, OpenAI was very busy
last week with announcements as well. First of all, they shared a post that they called Sora
First Impressions. What seemed to come out of this, and where most of the conversation in the
community centered, was the notion that, with Sora, the company is going after a very different audience.
It sounds like they've been aggressively courting Hollywood and filmmakers, and also that the
cost of production with Sora right now just doesn't really make it viable as a consumer
product. OpenAI also announced that they've been working on a text-to-voice platform called
Voice Engine. TL;DR, they say that this thing is too powerful for them to be comfortable releasing
as it is. They write, OpenAI is committed to developing safe and broadly beneficial AI. Today,
we are sharing preliminary insights and results from a small-scale preview of a model called
Voice Engine, which uses text input and a single 15-second audio sample to generate natural-sounding
speech that closely resembles the original speaker. It is notable that a small model with a
single 15-second sample can create emotive and realistic voices. Basically, their blog post talks
first about how Voice Engine works, with some examples, and second about the challenges of, as they
call it, synthetic voice misuse. They write, we hope to start a dialogue on the responsible deployment
of synthetic voices and how society can adapt to these new capabilities. Based on these conversations
and the results of these small-scale tests, we will make a more informed decision about whether and how
to deploy this technology at scale. They also wrote, we recognize that generating speech that
resembles people's voices has serious risks, which are especially top of mind in an election year.
We are engaging with U.S. and international partners from across government, media, entertainment,
education, civil society, and beyond, to ensure we are incorporating their feedback as we build.
One way that they're thinking about how to do this more safely, they write,
we believe that any broad deployment of synthetic voice technology should be accompanied by voice
authentication experiences that verify that the original speaker is knowingly adding their voice to the service
and a no-go voice list that detects and prevents the creation of voices that are too similar to
prominent figures. There are a couple really interesting things about this. One, of course, is just the
ongoing question of safety and ethics that relates to any new advancement in AI. But second,
there is this constant question, something that I talked about with the Latent Space guys last
week, about whether specialized or generalist models will win the day. The more that OpenAI keeps
releasing, or at least showing off, models that seem to just kick the slats out of the
specialist models in their category, the more evidence there is that these big
generalist models will be the ultimate winners, although obviously it's still a little bit too early to
tell. Another bit of OpenAI news was a report that Microsoft and OpenAI are collaborating
on a $100 billion data center. This came from a report from The Information, and the details were
that the two companies are planning a data center project that, again, could cost as much as $100 billion,
would be set to launch in 2028, and would include an AI supercomputer that they're calling Stargate.
Writes Reuters, The Information reported that Microsoft would likely finance the project, which is expected
to be 100 times more costly than some of the biggest existing data centers.
The proposed U.S.-based supercomputer would be the biggest in a series that companies are looking to build over the next six years.
Meanwhile, on the other end of the computing spectrum, we also got information from Intel that Microsoft's Copilot AI will soon be running locally on PCs without having to touch the cloud.
Writes Tom's Hardware, we've previously reported on industry rumors that Microsoft's Copilot
AI service would soon run locally on PCs instead of in the cloud, and that Microsoft would
impose a requirement of 40 TOPS of performance on the neural processing unit, but we had been
unable to get an on-the-record verification of those rumors. That changed today at Intel's AI Summit
in Taipei, where Intel executives in a Q&A session with Tom's Hardware said that Copilot elements
will soon run locally on PCs. Now, if you've been watching the Apple AI news at all, you'll
know that bringing these sorts of capabilities on device and out of the cloud is something that they
are also hugely focused on, and so interesting to see that it might become part of just the
table stakes for AI computing going forward. Staying on the big company theme, Amazon announced
the second tranche of its funding for Anthropic, this time investing $2.75 billion for a total
investment of $4 billion, and reinforcing the close tie-ups between the two companies.
Finally, to make quick mention of what's going on outside of the tech world itself, the White
House last week announced a new policy through which every federal agency is being called on to
appoint a chief AI officer. According to this new policy, any agency that has not yet appointed a chief
AI officer must do so within the next 60 days. And as Ars Technica writes, if an official that has
already been appointed to that position doesn't have the necessary authority to coordinate AI
use within the agency, they must be granted that additional authority or have a new AI chief be named.
I mentioned it before, but one of the things that I think is extremely notable about the government's relationship with AI is that even as regulatory policy takes its time to work through the system, federal agencies, the military establishment, basically all of the mechanisms of government are not waiting around. They are figuring out how to integrate AI right now and doing it post-haste.
Now, where all of these federal agencies are going to get all of this talent and expertise remains to be seen, but it's certainly something worth watching.
For now, guys, that is going to do it for today's AI Breakdown.
Until next time, peace.
