The AI Daily Brief: Artificial Intelligence News and Analysis - Is the GPT-4.5 "Leak" Real?

Starting point is 00:00:00 Today on the AI breakdown, are the GPT-4.5 leaks real? Before that on the brief, the Pope is calling for an international treaty on AI. The AI breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our YouTube, our newsletter, and our Discord. Welcome back to the AI breakdown brief, all the AI headline news you need in around five minutes. We have a very random global leader-themed AI breakdown brief today, kicking off with a message from the leader of the Catholic Church and one of the more recognizable figures in the world, Pope Francis. Reuters and others report that Pope Francis is calling for a global treaty to regulate AI because of concerns that algorithms could replace human values. Apparently each year, the Pope

Starting point is 00:00:53 sends a message to world leaders and the heads of big international institutions such as the United Nations in advance of the Roman Catholic Church's World Day of Peace, which is celebrated on New Year's Day. This year's message was called Artificial Intelligence and Peace. The note read, The global scale of artificial intelligence makes it clear that, alongside the responsibility of sovereign states to regulate its use internally, international organizations can play a decisive role in reaching multilateral agreements and coordinating their application and enforcement. I urge the global community of nations to work together in order to adopt a binding international treaty that regulates the development and use of artificial intelligence in its

Starting point is 00:01:28 many forms. But the letter went on from there. From Reuters, Francis called for ethical scrutiny of the, quote, aims and interests of AI's owners and developers, warning that some applications of AI, quote, may pose a risk to our survival and endanger our common home, a reference to the earth. He wrote, in an obsessive desire to control everything, we risk losing control over ourselves. In the quest for an absolute freedom, we risk falling into the spiral of a technological dictatorship. He also specifically called out the risk of AI in the weapons industry.

Starting point is 00:01:56 He wrote, research on emerging technologies in the area of so-called lethal autonomous weapons systems, including the weaponization of artificial intelligence, is a cause for grave ethical concern. Autonomous weapon systems can never be morally responsible subjects. The unique human capacity for moral judgment and ethical decision-making could not be left to a machine, he said, adding that, quote, it is imperative to ensure adequate, meaningful, and consistent human oversight of weapon systems. Now, the cardinals that surrounded this Pope made it clear that this is not some Luddite message and that Pope Francis was very interested in new technology and technological advancement. Instead, they said that he was particularly concerned about AI because it is, quote,

Starting point is 00:02:33 perhaps the highest-stakes gamble of our future. So there you have it, another ratcheting up of the rhetoric and conversation around the ethics of artificial intelligence. Now another world leader talking about AI comes in the form of Vladimir Putin. Today Putin appeared at what is apparently an annual news conference where callers from around the country get to ask him questions via video. One of those appeared to be an AI-generated double of Putin and said, hello, I am a student at St. Petersburg State University. I want to ask, is it true you have a lot of doubles? And also, how do you view the dangers that artificial intelligence and neural networks bring into our lives? Mr. President, good afternoon.

Starting point is 00:03:11 I'm a student and I study at the St. Peter Institute. Do you have a lot of twins? And another point, what is your attitude towards the dangers fraught with the neural networks and the artificial intelligence? So he did not introduce himself. This person from St. Petersburg, you can talk like me, use my voice, my pitch, but I figured that only one person could speak like myself and use my voice, and this is going to be me. Reuters reports, quote,

Starting point is 00:03:45 The question prompted a rare hesitation from Putin, already in his fourth hour of taking questions at the marathon event. He said, I see you may resemble me and speak with my voice, but I have thought about it and decided that only one person must be like me and speak with my voice, and that will be me. This is my first double, by the way. Now, of course, the reference there is that there has been much recent speculation that perhaps Putin has one or more body doubles that are covering for him or trying to cover up health problems. So clearly this is a way to discuss artificial intelligence on the one hand, but also have a media moment to bite back a little bit at that narrative. Staying on the theme of

Starting point is 00:04:19 AI doubles for a moment, the New York Times published an interesting piece called Dream of Talking to Vincent van Gogh? AI Tries to Resurrect the Artist. Can doppelgangers of the Dutch painter help museums generate new interest and income? Now what they're talking about specifically is a program called Bonjour Vincent. It's at the Musée d'Orsay in Paris, and basically allows visitors to interact with Van Gogh, asking him questions about his life and death, and having the AI version of him answer, replete, as they put it, with machine learning flubs. So the source material for this was the 900 letters that the artist wrote, as well as a number of early biographies that were written about him. According to the article, the most common question is why Van Gogh killed

Starting point is 00:04:56 himself or some version thereof. Now, obviously this is just a sort of little human interest piece, but I think it certainly reflects a trend that we're likely to see a lot more of in the future, which is that I don't think we'll be able to resist the temptation to bring famous people back from the dead, to interact with them even if it's just in a ghostly sort of way. Priori Incantatem for those Harry Potter fans out there. Also, if you'll permit me one recommendation, if you are interested in Van Gogh, the piece of popular media that I think is most powerful is a snippet from Doctor Who where Van Gogh is brought to a modern gallery, where, in an uncredited cameo, Bill Nighy explains why he

Starting point is 00:05:29 believes that Van Gogh was the greatest living artist. It's a very touching scene. I can't recommend it enough. Now, speaking of YouTube, see that transition there? That's why they pay me the big bucks. Again, another report from the Times found that a pro-China YouTube network has used AI to generate negative opinions of the United States. They say content from at least 30 channels in the network drew 120 million views and 730,000 subscribers since last year. This comes from a report from the Australian Strategic Policy Institute. And by way of example, they write, in a faintly stilted tone, and with slightly awkward grammar, the American-accented voice on YouTube last month ridiculed Washington's handling of the war between Israel and Hamas, claiming that the United

Starting point is 00:06:08 States was unable to play its role as a mediator like China and now finds itself in a position of significant isolation. Basically, this network is using AI avatars and voice generators to create content that is subtly or not so subtly anti-U.S. and pro-China. Now, of course, disinformation efforts are nothing new, but the scale at which they can operate and the quality of material they can create is different because of AI. Said Jacinta Keast, an analyst at the Australian institute, this campaign actually leverages AI, which gives it the ability to create persuasive threat content at scale and at very limited cost compared to previous campaigns we've seen. Apparently, other reports are suggesting that these sorts of propaganda efforts are increasing in the wake

Starting point is 00:06:44 of the Israel-Hamas conflict. Now, since we started with big, heavy things, questions from world leaders, disinformation campaigns, let's end on some exciting technology itself. First of all, the never-still Stability AI team has released Stable Zero123. It's a model for generating 3D objects from a single starting image. The introductory blog post writes, Stable Zero123 generates novel views of an object, demonstrating 3D understanding of the object's appearance from various angles with notably improved quality over Zero1-to-3 or Zero123-XL due to improved training datasets and elevation conditioning. Now, as per usual for Stability AI, this model is being released for non-commercial and research purposes. I definitely think that we are going to see a flowering of animation and moving graphics and

Starting point is 00:07:28 filmmaking as these 3D and animation technologies catch up to the image generation technologies, which really came into their own over the course of this year. One more very hyped thing that is going all around Twitter slash X right now is called Outfit Anyone. It's a model that takes a person and a garment, tries those clothes on the person, and then animates them in action. It can be used for realistic images of people as well as for illustrations, anime, and cartoons. Now, there are a bunch of different uses for this type of technology. One is, of course, the very obvious and probably lucrative virtual home try-on. As many efforts as there have been in that area, it just hasn't really been cracked yet,

Starting point is 00:08:03 but to the extent that retailers could get people to have a much more realistic imagination of what they might look like in clothes, that could significantly decrease the costs associated with refunds and returns. However, there is also just the entertainment and content creation use case. This technology doesn't have to be applied to real people and real outfits. It can also be applied to characters. Again, all part and parcel of what I think is going to be a big trend for 2024, which is the mass expansion of creativity in the digital motion graphics animation and filmmaking space.

Starting point is 00:08:32 However, for now, that is going to do it for today's AI breakdown brief. Next up, the main AI breakdown, and boy, is this a juicy one. Hey, guys, before we get into the main part of the episode, I wanted to mention just briefly that we are now in the midst, we're actually just closing out the first week of the AI breakdown AI education and learning beta, a community of learners where each day I'm dropping in tutorials, case studies, challenges, and a community of people are discussing them, going out and doing those challenges, in other words, learning AI by doing, and getting a chance to ask questions and talk with people who

Starting point is 00:09:04 are experiencing similar problems, taking advantage of similar opportunities, and generally adapting to this new AI-powered world. I'm incredibly encouraged by how it's going so far, and in about a week I'll be opening up registration for next month's second beta test for January. For now, I wanted to let you guys know that that was coming. And if you are interested in getting on the wait list for that, go to bit.ly slash AI beta. You'll see the short write-up that I did of December's beta, plus a link to a form where you can sign up for the wait list. I'd love to have you participate in January. So again, that's bit.ly slash AI beta.

Starting point is 00:09:37 And now, let's get to the main episode. My goodness, the AI rumor mill is in full swing right now. And the things coming out this morning are some of the juiciest that we've seen yet. So, just to give you a sense of what dropped, this screenshot that you're seeing now or hearing about, if you're listening to the podcast, is what looks like a pricing page from OpenAI. You can see a little URL up in the top, OpenAI.com slash pricing slash edit draft, suggesting that it's a draft page. And right at the top, it says GPT 4.5.

Starting point is 00:10:09 It's described as, our most advanced model brings multimodal capabilities across language, audio, vision, video, and 3D, alongside complex reasoning and cross-modal understanding. Then it has pricing for three different models, GPT-4.5, GPT-4.5 64K, and GPT-4.5 Audio and Speech. It also has a separate link that's not shared for a vision and 3D pricing calculator. So this started flying around this morning, and as we'll get into, nobody is quite sure if it's real or not. But let's back up and give a little bit of the context for why this might be happening right now, and that is, of course, the reemergence of Google onto the scene. Last week, Google announced Gemini.

Starting point is 00:10:47 Initially, the excitement was palpable. For the first time, we had something that could actually compete with GPT-4, at least it seemed. Indeed, Gemini's Ultra model had actually outperformed GPT-4 on the MMLU. The problem was that Ultra wasn't actually available and it wasn't going to be available until sometime next year, which meant that what was coming, which wasn't even available that day, much to many developers' chagrin, was Gemini Pro, which was something closer to GPT-3.5 level. For many, after reading the 11 page or whatever it was announcement blog, it felt like a bit of a rug pull, or at least a classic announcement of an announcement kind of thing.

Starting point is 00:11:22 This was contrasted, of course, with Mistral at basically the same time, dropping a torrent link to their newest model without any explanation for around three days after it got out there. Still, when all was said and done, people were excited and are excited that Google is back in the game in a more meaningful way. And yesterday, CEO Sundar Pichai tweeted, today developers can start building with our first version of Gemini Pro through Google AI Studio at ai.google.dev. Developers have a free quota and access to a full range of features, including function calling, embeddings, semantic retrieval, custom knowledge grounding, chat functionality, and more. It supports 38 languages across 180 countries. Gemini Ultra is coming early next year. We're excited to see what you build. The blog post was called It's Time for Developers and Enterprises to Build with Gemini Pro. And so the question was, of course, what would the early

Starting point is 00:12:09 reviews be? Sully Omar writes, some initial impressions of Gemini Pro. Much better at understanding complicated queries than 3.5. Has really good function calling with minimal prompts. It's a little chatty but solid model so far. Now, asking a question that I think many were feeling, Chase MC 67 responded, it's 2023. Why are we comparing anything to GPT-3.5 anymore? That's the model you use when you don't need the output to be smarter than a 5-year-old. Do people actually find that useful? Or is it just to say they have it? Sully responded to that. 3.5 is still useful because it's cheap. You don't need to use state-of-the-art models for every query. It's too expensive. Offshoring dumb models for simple queries is a valid strategy.

Starting point is 00:12:46 Others weren't as impressed. Entrepreneur Bindu Reddy wrote, Gemini Pro is more expensive than GPT-3.5. I'm not sure what the point is, given that they are a GPT-3.5 class model. Not to mention that we and a bunch of other startups have Mixtral MoE, another 3.5 class model available at 2X cheaper than GPT-3.5. LangChain had a more positive assessment. They wrote,

Starting point is 00:13:06 Gemini is natively multimodal, but how does it stack up against GPT-4V? We put Gemini Pro Vision head-to-head with the reigning champ to see how well it could answer questions based on a multimodal slide deck. The results? Gemini seems to be a formidable model, matching GPT-4V's performance when using the same OpenCLIP embedding model and only missing one question when retrieving based on embedded image summaries. So obviously a more positive assessment here. For those without a real dog in the fight, a fairly decent summary of what has changed comes from The Information with their latest piece, How Google Got Back on Its Feet in the AI Race. Early this year, the rise of OpenAI seemed to spell doom for Google, but the tech giant has

Starting point is 00:13:44 quieted the squabbling between its AI researchers and is finally playing offense with its latest AI technology, Gemini. Now, the hard part starts. Now, the gist of the article is effectively that Google has stemmed the bleeding after a period in which people were very surprised to see how far ahead of them OpenAI really got. It's not exactly arguing that they've fully made up that ground, but just that they are actually back in the race now. It also makes it clear that this is a very big deal. The Information writes, Gemini is one of the highest-stakes efforts in the company's 25-year history. As Google enters middle age, its core advertising business continues to turn out huge profits, which have subsidized an array of bets by its parent company Alphabet on new businesses, including

Starting point is 00:14:23 self-driving cars, health insurance, and biotechnology. But none of those decade-old bets has paid off. As a result, investors have increasingly breathed down the necks of Google leaders to cut costs across the 182,000-person company, leading to large-scale layoffs this year that have hurt employee morale. Employees are girding for more layoffs in the new year, though it isn't clear if they will be broad-based or target specific groups. AI is another bet that will require hefty funding from the company to pay for everything from personnel to hardware. Google wants to dispel the perception that it has done little more than milk innovations

Starting point is 00:14:52 from decades ago. It's a really extensive piece well worth a read if you've got an Information subscription. Now, back to this GPT-4.5 leak. There was actually another leak shared once again from one of the more active leakers slash commentators slash predictors around OpenAI, which is of course the Flowers from the Future account. This morning, that account had tweeted what looked like a snippet of an email from inside Google about why they had gone forward with the Gemini API. The snippet reads, rumors circulating within the department, redacted has taken decisive action to address the potential impact of GPT-4.5. In response

Starting point is 00:15:26 to these speculations, a strategic decision has been made to expedite the activation of the Gemini API effective as of today. This proactive measure aims to mitigate any unforeseen consequences and reinforce our position in light of emerging advancements. Now, we had even before this a lot of rumors that 4.5 was on the way. The Jimmy Apples account, which has previously been right on a number of different timelines and feature announcements, wrote, keep an eye out on a potential end of December GPT-4.5 drop. And again, Flowers from the Future last week wrote, there's one big thing and one small OpenAI thing waiting for us.

Starting point is 00:15:57 The small thing is currently delayed due to company dramas, and the big thing seems to be progressing according to plan. December remains exciting, especially next week, which, if you're listening around the time this comes out on the 14th or 15th, means this week. Now, that small thing we got yesterday. It was the re-enablement of ChatGPT Plus subscriptions, which had been turned off recently because of lack of access to compute. Sam Altman tweeted,

Starting point is 00:16:18 thanks for your patience while we found more GPUs. Flowers from the Future confirmed that was the small thing they were talking about. So what did that account think of this GPT-4.5 leak? They shared it and said, I don't know what to make of it. It could be fake, but I'm not sure. No one I know has heard of this draft, which of course means nothing, but be careful lest you fall victim to a quick endorphin rush,

Starting point is 00:16:38 and yes, I know some of you are using my account in the same way. They also separately tweeted, The content of the screenshot seems to be largely correct, but none of my sources can currently verify whether it is a genuine draft. Now, what we did get yesterday in addition to the ChatGPT Plus subscriptions reopening, were a couple real announcements as well. The first was that the OpenAI Startup Fund is launching a new cohort that they're calling Converge 2. The blog post about it calls it the next evolution of our program for transformative AI companies.

Starting point is 00:17:07 They write, The AI Startup Fund was founded on two core beliefs. First, new and powerful AI systems will give rise to a new wave of transformative startups. And next, these new companies will play a central role in making AI a force multiplier for human ingenuity and creativity. We launched Converge in December 2022 to accelerate startups working on the forefront of this evolution, doing our part to help push the boundaries of applied AI in important domains. Today, we're opening applications for Converge 2, the second cohort of our six-week program for exceptional engineers, designers, researchers, and product builders using AI to reimagine the world.

Starting point is 00:17:40 So this is sort of like a mini accelerator program but for AI startups. So they're promising tech talks, office hours, social events, and anyone who's chosen receives a $1 million investment from the OpenAI Startup Fund. Applications are open from now until January 26th. Now, in addition to that, an even bigger piece of real news was that OpenAI had signed a deal with publishing giant Axel Springer. The exact terms of the deal weren't reported, but sources suggest that it was in the tens of millions of dollars over the three-year deal. Now, Axel Springer owns publications like Politico and Business Insider, and through the deal, OpenAI will legitimately get to train their models on all of that content, but will also actually reference that

Starting point is 00:18:22 content more explicitly in ChatGPT answers. So, for example, if the response is using one of those pieces, it will attribute it and give links to the full articles. Now, this is obviously similar to the experience of something like Perplexity, but is not currently how ChatGPT works. Said the Axel Springer CEO, we want to explore the opportunities of AI-empowered journalism to bring quality, societal relevance, and the business model of journalism to the next level. Now, obviously, this stands in contrast to all of the copyright battles being fought around whether companies like OpenAI are within their rights and whether it is truly fair use to train

Starting point is 00:18:56 AI models on scraped content, but certainly voluntary partnerships like this could have an impact on the norms surrounding how AI models and labs work with publishers. All in all, I have to say it feels to me like we are right on the edge of some big announcement from OpenAI, specifically around GPT 4.5. There's just too much swirling and too much smoke for there not to be some fire there. And I think that to the extent that it's not being released, it's for reasons that have nothing to do with whether it's available and that have more to do with OpenAI's consideration of the public narrative and conversation around it.

Starting point is 00:19:28 But of course, we will be watching this very closely, and you know the second they drop something, I will be here to tell you all about it. For now, I appreciate you listening or watching. Until next time, peace.
