The AI Daily Brief: Artificial Intelligence News and Analysis - The Most Important AI Stories Last Week

Starting point is 00:00:00 Today on the AI Breakdown, we're looking at the most important AI stories from the past week. The AI Breakdown is a daily podcast and video about the most important news and stories in AI. Go to Breakdown.network for more information about our YouTube, our Discord, and our newsletter. Welcome back to the AI Breakdown. As you guys know, I have been traveling for about a week and then was taken out by some post-travel sickness. And so what I wanted to do today is something a little bit different. Instead of our normal brief followed by main episode, we're just going to go through the biggest stories from the last week or so that we didn't have a chance to cover on the AI Breakdown.

Starting point is 00:00:43 For those of you who have been paying attention and get your news from other sources as well, I apologize for the repeats. But for those of you who maybe dipped out for a minute, this should give you a pretty good sense of everything that happened over the last, call it 10 days or so. We kick off with the promised update from Elon Musk's Grok. Last week, Elon announced Grok 1.5. And according to people who had seen it, it represents a pretty significant upgrade over Grok 1.0.

Starting point is 00:01:07 Of course, a couple weeks previous to this announcement, a version of Grok 1 had been released as open source, and so that gave people a sense of what was going on under the hood with Grok. In this version, the team at Grok points to improved reasoning and problem-solving capabilities as the biggest advances. They write, one of the most notable improvements in Grok 1.5 is its performance in coding and math-related tasks. In our tests, Grok 1.5 achieved a 50.6% score on the MATH benchmark and a 90% score on the GSM8K benchmark, two math benchmarks covering a wide range of grade school to high

Starting point is 00:01:36 school competition problems. Additionally, it scored 74.1% on the HumanEval benchmark, which evaluates code generation and problem-solving abilities. Now, by way of comparison, Grok 1.5 is scoring on the MMLU right around what Mistral Large is scoring, but it's still a little bit behind Gemini 1.5 Pro, GPT-4, and Claude 3 Opus. The context window has also increased significantly, up to 128K tokens with Grok 1.5, with Grok writing, in the needle in a haystack evaluation, Grok 1.5 demonstrated powerful retrieval capabilities for embedded text within contexts of up to 128K tokens in length, achieving perfect retrieval results. Elon also added the details, firstly, that Grok should be available on X this week,

Starting point is 00:02:16 and second, that Grok 2, which is currently in training, quote, should exceed current AI on all metrics. So some more interesting things to be watching out for. That said, it wasn't really Grok 1.5 that was the most buzzed about LLM last week, but in fact, DBRX from Databricks. Last week, Databricks wrote, Today, we are excited to introduce DBRX, an open general-purpose LLM created by Databricks. Across a range of standard benchmarks, DBRX sets a new state-of-the-art for established open LLMs. Moreover, it provides the open community and enterprises building their own LLMs with

Starting point is 00:02:48 capabilities that were previously limited to closed model APIs. According to our measurements, it surpasses GPT-3.5 and is competitive with Gemini 1.0 Pro. It's an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming. Now, the reason that people were so excited about this was its open release. Dylan Patel from SemiAnalysis writes, Databricks' DBRX model is amazing, generally great, but crushes code. Ali Ghodsi from Databricks writes, today we released an open-source model, DBRX, that beats all previous open-source models

Starting point is 00:03:17 on the standard benchmarks. The model itself is a mixture of experts. That's roughly twice the brains, 132 billion parameters, but half the cost, 36 billion, of Llama 2 70B, making it both smart and cheap. Since only 36 billion expert parameters are used live, it's close to twice the speed of Llama 2 70B. We're excited to build custom versions of this for organizations that have proprietary data.

Starting point is 00:03:37 Another big piece of news from the open source AI world was a big leadership shakeup at Stability AI. Emad Mostaque announced that he was stepping down from his role as CEO and from his position on the board of directors to pursue something that he's calling decentralized AI. As the company goes about looking for a replacement CEO, Stability's COO and CTO are jumping in as interim co-CEOs. In an announcement post, Emad said, I am proud two years after bringing on our first developer

Starting point is 00:04:03 to have led Stability to hundreds of millions of downloads and the best models across modalities. I believe strongly in Stability AI's mission and feel the company is in capable hands. It is now time to ensure AI remains open and decentralized. He also retweeted the post on Twitter saying, not going to beat centralized AI with more centralized AI. All in on decentralized AI.

Starting point is 00:04:22 Lots more soon. He followed it up with a little bit more explanation, writing, as my notifications are RIP, some notes. One, my shares have majority of vote at Stability AI. Two, they have full board control. The concentration of power in AI is bad for us all. I decided to step down to fix this at Stability and elsewhere. We'll be sharing more soon.

Starting point is 00:04:41 Now, while he hasn't given any details yet about what decentralized AI is actually going to look like, he has promised, much to the chagrin of the crypto crowd, that there will be no token or coin. Over in the land of big companies, Apple got some excited buzz going when it announced the official date of its Worldwide Developers Conference, which is coming on June 10th. Bloomberg writes, though Apple didn't say what it plans to unveil, people familiar with the matter have said that the presentation will focus heavily on AI. They continue, Apple is expected to unveil its next major software updates for the iPhone, iPad, Mac, Vision Pro headset, and smartwatch, and its new AI strategy

Starting point is 00:05:15 will be front and center for the planned iOS 18 upgrade. In announcing the event, Apple marketing executive Greg Joswiak said, it's going to be Absolutely Incredible, with Absolutely and Incredible both capitalized, as Bloomberg puts it, a clear nod to AI. Not to be nudged out of the news cycle by all these other Johnny-come-latelies, OpenAI was very busy last week with announcements as well. First of all, they shared a post that they called Sora First Impressions. What seemed to come out of this, and where most of the conversation in the community was, was the notion that, with Sora, the company is going after a very different audience. It sounds like they've been aggressively courting Hollywood and filmmakers, and also that the

Starting point is 00:05:51 cost of production with Sora right now just doesn't really make it viable as a consumer product. OpenAI also announced that they've been working on a text-to-voice platform called Voice Engine. TLDR, they say that this thing is too powerful for them to be comfortable releasing as it is. They write, OpenAI is committed to developing safe and broadly beneficial AI. Today, we are sharing preliminary insights and results from a small-scale preview of a model called Voice Engine, which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. It is notable that a small model with a single 15-second sample can create emotive and realistic voices. Basically, their blog post talks about

Starting point is 00:06:27 first, how Voice Engine works with some examples, but then second, about the challenges of, as they call it, synthetic voice misuse. They write, we hope to start a dialogue on the responsible deployment of synthetic voices and how society can adapt to these new capabilities. Based on these conversations and the results of these small-scale tests, we will make a more informed decision about whether and how to deploy this technology at scale. They also wrote, we recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year. We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society, and beyond, to ensure we are incorporating their feedback as we build.

Starting point is 00:07:03 As for one way that they're thinking about how to do this more safely, they write, we believe that any broad deployment of synthetic voice technology should be accompanied by voice authentication experiences that verify that the original speaker is knowingly adding their voice to the service and a no-go voice list that detects and prevents the creation of voices that are too similar to prominent figures. There are a couple really interesting things about this. One, of course, is just the ongoing question of safety and ethics that relates to any new advancement in AI. But second, there is this constant question, it's something that I talked about with the Latent Space guys last week, about whether specialized or generalist models will win the day. The more that OpenAI keeps

Starting point is 00:07:37 releasing, or at least showing off, these models that seem to just kick the slats out of the specialist models in their category, the more evidence there seems to be that these big generalist models will be the ultimate winners, although obviously it's still a little bit too early to tell. Another bit of OpenAI news was a report that Microsoft and OpenAI are collaborating on a $100 billion data center. This came from a report from The Information, and the details were that the two companies are planning a data center project that, again, could cost as much as $100 billion, would be set to launch in 2028, and would include an AI supercomputer that they're calling Stargate. Writes Reuters, The Information reported that Microsoft would likely finance the project, which is expected

Starting point is 00:08:15 to be 100 times more costly than some of the biggest existing data centers. The proposed U.S.-based supercomputer would be the biggest in a series that companies are looking to build over the next six years. Meanwhile, on the other end of the computing spectrum, we also got information from Intel that Microsoft's Copilot AI will soon be running locally on PCs without having to touch the cloud. Writes Tom's Hardware, we've previously reported on industry rumors that Microsoft's Copilot AI service would soon run locally on PCs instead of in the cloud, and that Microsoft would impose a requirement of 40 TOPS of performance on the neural processing unit, but we had been unable to get an on-the-record verification of those rumors. That changed today at Intel's AI Summit in Taipei, where Intel executives, in a Q&A session with Tom's Hardware, said that Copilot elements

Starting point is 00:08:56 will soon run locally on PCs. Now, if you've been watching the Apple AI news at all, you'll know that bringing these sorts of capabilities on device and out of the cloud is something that they are also hugely focused on, and so it's interesting to see that it might become part of just the table stakes for AI computing going forward. Staying on the big company theme, Amazon announced the second tranche of its funding for Anthropic, this time investing $2.75 billion for a total investment of $4 billion, and reinforcing the close tie-ups between the two companies. Finally, to make quick mention of what's going on outside of the tech world itself, the White House last week announced a new policy through which every federal agency is being called on to

Starting point is 00:09:34 appoint a chief AI officer. According to this new policy, any agency that has not yet appointed a chief AI officer must do so within the next 60 days. And as Ars Technica writes, if an official that has already been appointed to that position doesn't have the necessary authority to coordinate AI use within the agency, they must be granted that additional authority or have a new AI chief be named. I mentioned it before, but one of the things that I think is extremely notable about the government's relationship with AI is that even as regulatory policy takes its time to work through the system, federal agencies, the military establishment, basically all of the mechanisms of government are not waiting around. They are figuring out how to integrate AI right now and doing it posthaste. Now, where all of these federal agencies are going to get all of this talent and expertise remains to be seen, but it's certainly something worth watching. For now, guys, that is going to do it for today's AI Breakdown. Until next time, peace.
