The AI Daily Brief: Artificial Intelligence News and Analysis - No GPT-5 This Year, But Maybe Gemini 2.0

Episode Date: October 29, 2024

OpenAI has clarified that no model codenamed "Orion" or GPT-5 will arrive this year, though Google is targeting December for the release of its own next-gen model, Gemini 2.0. Meanwhile, Google and An...thropic are working on AI models capable of using computer interfaces, signaling rapid advances in agent-like capabilities. AI research circles are watching closely to see if new models deliver revolutionary gains or merely incremental improvements. Concerned about being spied on? Tired of censored responses? AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit ⁠⁠⁠https://venice.ai/nlw⁠⁠⁠ and enter the discount code NLWDAILYBRIEF. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, OpenAI denies rumors that it will release Orion this year, but Google does seem on the verge of launching Gemini 2.0. Before that, in the headlines, Meta has released an open version of Google's NotebookLM. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. One of the buzziest and most exciting pieces of generative AI software,
Starting point is 00:00:33 out there right now is Google's NotebookLM. Specifically, of course, people have been super excited about its Audio Overviews feature, which takes documents of any length and turns them into an 8- to 12-minute podcast-style conversation between two hosts, complete with slightly cringy interactions. Now, NotebookLM has opened up all sorts of new thinking around how people consume information. While nominally it could create some competition in the podcast sphere, where I think it's actually going to play out is that people are just going to get used to summarizing big buckets
Starting point is 00:01:02 of knowledge as a way of starting to understand things. Anytime students are trying to get their heads around a complex topic, they're going to start by dumping the information into NotebookLM and getting that primer out. Now, recently, NotebookLM has added some of the most requested features, including the ability to better guide its outputs, which has done nothing but increase people's excitement around the tool. Well, now Meta has released their own version of the tool called, perhaps unsurprisingly, NotebookLlama. And very clearly, it focuses on the part of NotebookLM that people are really excited about, this ability to create audio summaries in the form of a back-and-forth between generated podcast hosts. In terms of what was actually released,
Starting point is 00:01:44 this definitely follows Meta's approach of releasing semi-finished open-source software that people can start messing around with. This is part of the new way that Meta works. Instead of always releasing polished things, a lot of their value proposition is moving quickly and getting developers tools that they can build on top of. In this case, Meta released the AI workflow for how the process works. The user first prompts an LLM to summarize the documents while retaining context. Then they prompt the LLM to turn it into a podcast transcript. Then they prompt the LLM to make this podcast more dramatic.
Starting point is 00:02:13 Finally, the audio is generated using a combination of Parler and Suno. One of the most interesting pieces of this is that Meta suggests combining different LLMs to cut down on inference costs. Llama 3.2 1B is used to produce the summary, then the company's frontier model Llama 3.1 70B is used to generate the podcast script, and Llama 3.1 8B is used to punch up the script. So sort of an interesting look at the idea of smaller models as part of a broader AI use case or workflow. In many ways, the release is more of a recipe for making a NotebookLM clone than a product itself. And because of that,
Starting point is 00:02:44 it's tech agnostic. Users can take this workflow and apply it to whatever models they have access to. Alas, TechCrunch notes that the results so far are a little lackluster, writing, the results don't sound nearly as good as NotebookLM. In the NotebookLlama samples I've listened to, the voices have a very obviously robotic quality to them, and tend to talk over each other at odd points. At this stage, Meta researchers say that the current limitation is the underlying models themselves. Google is using their proprietary model while Meta is leveraging open-source models which aren't on the cutting edge.
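The workflow described above (summarize, turn into a transcript, dramatize, with a different model at each step) can be sketched as a simple chain of stages. This is a minimal illustration, not Meta's actual code: the names `Stage`, `run_pipeline`, and `make_stub` are hypothetical, the instructions are paraphrased from the episode, and the "models" are stand-in functions so the sketch runs without any model access. The final text-to-speech step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: a "model" is any function that maps a prompt string to generated text.
# In practice each one would wrap a call to whichever LLM you have access to.
TextModel = Callable[[str], str]


@dataclass
class Stage:
    """One step of the workflow: a name, the model to use, and its instruction."""
    name: str
    model: TextModel
    instruction: str


def run_pipeline(document: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next one, in order."""
    text = document
    for stage in stages:
        prompt = f"{stage.instruction}\n\n{text}"
        text = stage.model(prompt)
    return text


def make_stub(label: str) -> TextModel:
    """Hypothetical stand-in for a real model: tags the text it was given."""
    def stub(prompt: str) -> str:
        # Drop the instruction (everything before the first blank line).
        body = prompt.split("\n\n", 1)[1]
        return f"[{label}] {body}"
    return stub


# A small model summarizes, a large one writes the script, a mid-sized one punches it up.
stages = [
    Stage("summarize", make_stub("small"), "Summarize the document while retaining context."),
    Stage("script", make_stub("large"), "Turn this summary into a two-host podcast transcript."),
    Stage("dramatize", make_stub("medium"), "Rewrite the transcript to be more dramatic."),
]

result = run_pipeline("Raw source text.", stages)
print(result)
```

Because each stage only needs a function from prompt to text, swapping in different models is a matter of replacing the stubs, which is the sense in which the recipe is tech agnostic.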
Starting point is 00:03:11 However, they did propose some ways to improve the workflow. They write on their GitHub page, another approach of writing the podcast would be having two agents debate the topic of interest and write the podcast outline. Right now, we use a single model to write the podcast outline. Overall, not nearly as finished as Google's product, but still a fascinating look at how easy it will be to spin up competing services from readily available tools. Next up, the latest big numbers from Perplexity. The company says it's now serving 100 million search queries every week. That is almost double the pace from July.
Starting point is 00:03:41 In the David v. Goliath battle where Google is Goliath, Google Search still serves up around 8 billion queries per day. But Perplexity's niche of AI-powered search is rapidly expanding. In response to an Elon Musk tweet commenting that Google and Microsoft control almost 100% of search volume, Perplexity CEO Aravind Srinivas responded, this will change in the next three years. It's also pretty incredible to see the pace at which the team is executing. They expect to have an AI shopping experience ready for Black Friday for pro subscribers, and early adopters that churned out of the product are noticing a big improvement when they come back. Andrew Gao, a Stanford student, tweeted,
Starting point is 00:04:15 I haven't tweeted about Perplexity mainly because I wasn't that impressed with the technology. It seemed to be just taking my search query, Googling it, and then summarizing the results. I decided to give it another try yesterday since they announced new reasoning. I was really impressed. I think my searches capture the importance of reasoning and research. Old Perplexity was basically a glorified summarizer. New Perplexity can actually do helpful research that saves me time. A query such as the one in the screenshots naturally requires several branching steps.
Starting point is 00:04:40 You can't just Google the query and read the top articles because there is no article about it. Lastly today, one from the world of AI and geopolitics. TSMC has cut supply to a client after they were found to be funneling chips to Huawei. Two weeks ago, news broke that TSMC chips were showing up in Huawei hardware in an apparent breach of export controls. Throttling chip supply to the Chinese firm has been a national security priority in the U.S. since 2020. The news triggered internal investigations and a rumor of a U.S. government probe. It's now revealed that the Huawei supply was traced to a single customer. On Thursday, Taiwan's economic ministry said that they had been informed, but the customer had not been identified. They added, there was already an
Starting point is 00:05:19 interaction and a contractual partnership in place, so it's an old client. The firm said that they provided detailed reports to TSMC in order to prove that they have no links to Huawei. For their part, TSMC released a statement which said, TSMC is a law-abiding company, and we are committed to complying with all applicable rules and regulations, including applicable export controls. We proactively communicated with the U.S. Commerce Department regarding the matter in the report. We are not aware of TSMC being the subject of any investigation at this time. Still, this is being treated extremely seriously by some in Washington. Republican Representative John Moolenaar, Chairman of the House China Select Committee,
Starting point is 00:05:51 said on Wednesday that the reports, quote, represented a catastrophic failure of U.S. export control policy. He called for immediate answers from both the Commerce Department and TSMC on the scope and volume of this disaster. Meanwhile, TSMC's retired founder, Morris Chang, is lamenting the severe challenges to growth embodied in the export controls. At a TSMC event this weekend, he said, TSMC is now truly a turf war all major powers want to secure.
Starting point is 00:06:15 Free trade of semiconductors, particularly the most advanced semiconductors, has died. In such an environment, our challenge lies in how to continue to drive growth. Interesting stuff, but that is going to do it
Starting point is 00:06:24 for today's AI Daily Brief Headlines edition. Next up, the main episode. Today's episode is brought to you by Venice. Venice is a private, uncensored, generative AI app. It accesses open source models to enable text, image, and code generation without the fear of being spied on or having your data exploited.
Starting point is 00:06:42 Discuss anything with Venice without concern about it being monitored, sold, or given to advertisers and governments. Venice is different because your conversations and creations are kept securely within the browser, never stored or accessible by Venice. Unlike other AI apps, Venice won't tell you what's okay to say or not. Venice won't patronize you. It simply provides direct access to machine intelligence. No topics are off limits, no ideas are taboo. With Venice, you're in control of the AI as you should be. Pro subscriptions are available for $49 a year or $8
Starting point is 00:07:10 per month. AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit venice.ai/nlw and enter the discount code NLWDAILYBRIEF. That's NLWDAILYBRIEF, all one word. Today's episode is brought to you by Superintelligent. Every single business workflow and function is being remade and reimagined with artificial intelligence. There is a huge challenge, however, of going from the potential of AI to actually capturing that value, and that gap is what Superintelligent is dedicated to filling. Superintelligent accelerates AI adoption and engagement to help teams actually use AI to increase productivity and drive business value.
Starting point is 00:07:49 An interactive AI use case registry gives your company full visibility into how people are using artificial intelligence right now. Pair that with capabilities building content in the form of tutorials, learning paths, and a use case library, and Superintelligent helps people inside your company show how they're getting value out of AI while providing resources for people to put that inspiration into action. The next three teams that sign up with 100 or more seats are going to get free embedded consulting. That's a process by which our Superintelligent team sits with your organization, figures out the specific use cases that matter most to you, and helps actually ensure
Starting point is 00:08:23 support for adoption of those use cases to drive real value. Go to besuper.ai to learn more about this AI enablement network. And now back to the show. Welcome back to the AI Daily Brief. OpenAI has certainly had no shortage of cool feature launches this year. We finally got our hands on advanced voice mode. We recently got the very exciting o1, the new reasoning model which pushes us in an agentic direction. And yet, I would be lying if I said that everyone wasn't really just focused on their next big frontier model drop, whether it's called GPT-5 or Orion, as it appears to be codenamed. I think there is broadly a sense that a lot of the future of AI is going to be shaped by what OpenAI can do with GPT-5. If this model represents a total sea change and some dramatically advanced
Starting point is 00:09:12 capabilities, it will ignite this space in an incredible way. If, on the other hand, it's only incrementally better, I would expect to see a lot more people coming to argue, as some do, that we may be reaching some sort of plateau with today's architectures. Now, the big labs keep pushing back and saying that we are not reaching those plateaus, which is one of the reasons that people are so eagerly anticipating this drop. For that reason, it was very exciting last week when The Verge and others started reporting that OpenAI planned to release GPT-5 or Orion as early as December. Alas, those hopes were dashed when OpenAI responded to those rumors by saying, quote, we don't have any plans to release a model codenamed Orion this year. We do plan to release a lot of other great
Starting point is 00:09:54 technology. We also got commentary from Sam Altman himself, who posted on The Verge's quote-unquote scoop, fake news out of control. Sam also took some time to tweet with the chattering classes, including Greg, who wrote, Sam, my ChatGPT is pretty woke, could you please fix that? To which Sam Altman wrote, it's really not, but enjoy your Dogecoins. Now, because it's fun to parse every little thing, TechCrunch noted the awkward wording of the correction. Remember, they said, we don't have any plans to release a model codenamed Orion this year. We do plan to release a lot of other great technology. Basically, that doesn't say that we're not releasing a frontier model, just that we're not
Starting point is 00:10:30 releasing a model codenamed Orion. As TechCrunch wrote, OpenAI's statement leaves substantial wiggle room. It could be that the company's next major model isn't, in fact, Orion. Or perhaps OpenAI will release a new model by December, but one less capable than Orion. At this point, it's anyone's guess. And yet with all of that, there are very clearly some trends that are here and are driving a lot of developments in the space right now. One of the biggest stories last week, if not the biggest, was Anthropic's release of their computer use model, which basically allowed Claude to take over a computer using a cursor to do basic tasks on the web. We're now getting news that Google is planning to launch their next frontier model, Gemini 2.0, by the end of the year as well.
Starting point is 00:11:11 Not to let OpenAI steal all the headlines, Google is also targeting a December release for their new model. Not much is known about how the competition will shake out at this point. However, Alex Heath of The Verge wrote, I've heard that the model isn't showing the performance gains the Demis Hassabis-led team had hoped for, though I would still expect some interesting new capabilities. The chatter I'm hearing in AI circles is that this trend is happening across companies developing leading larger models. Heath also noted that Noam Shazeer, the AI researcher Google poached from Character.AI, or really lured back to Google, is working on a separate reasoning model aimed at competing with OpenAI's o1.
Starting point is 00:11:43 Going back to what we were just discussing with GPT-5, this iteration of models is set to be an order of magnitude more expensive to train, so labs are taking a big bet on revolutionary power rather than iterative improvements. The question among AI researchers I've spoken with is whether these ultra-expensive frontier models are all converging in their capabilities and ultimately commoditizing each other. As I've written before, the real value is increasingly flowing to the products these models power and not the models themselves. Still, it's clear that for the foreseeable future, the top AI developers will continue racing
Starting point is 00:12:11 to release even bigger and more expensive models as fast as they can. Even if the performance improvements start leveling off, the competitive pressure to have the leading model isn't going away anytime soon. Now, separately, as I mentioned, The Information reports that Google is preparing to preview a computer use feature codenamed Project Jarvis, after, of course, Tony Stark's digital assistant. The feature would only allow Google's models to access a web browser rather than having control of the entire interface, but still, it's designed to automate everyday web-based tasks, including clicking buttons and entering text. Sources say the feature could automate gathering research, purchasing a product, or booking a flight.
Starting point is 00:12:44 Or really, as I suspect, just show off what could be capabilities in the future, but if it's limited to things like booking a flight, it's probably not all that valuable in the short term. The Information noted that plans to preview the feature in December are subject to change, and that Google is still considering releasing it to a small number of testers to iron out bugs. While this era of allowing AI agents to take over a computer is arriving quickly, the ways various labs are approaching the feature show the tradeoffs between power and safety. Google and Microsoft are taking a very cautious approach, sandboxing the agents to a web browser and limiting their capabilities. Anthropic, meanwhile, has taken explicitly the opposite approach,
Starting point is 00:13:18 launching a version of the feature with relatively few constraints, but that didn't have access to browsing the web. Anthropic noted that the feature is still, quote, cumbersome and error-prone. However, they spun it as the safest way to deploy a feature that will one day be widespread. They said this means we can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks. Anyways, we are still living in the realm of rumor, but it feels like we are on the cusp of the next generation, and it is going to be very exciting to watch how it rolls out. For now, though, that is going to do it for today's AI Daily Brief.
Starting point is 00:13:50 Until next time, peace.
