The AI Daily Brief: Artificial Intelligence News and Analysis - This Massive New AI Model is 5.7x Bigger than ChatGPT's Dataset

Starting point is 00:00:00 Today on the AI Breakdown, OpenAI proposes a new global regulatory body for artificial intelligence, and before that on the Brief, a new 1 trillion parameter model from Intel, Meta announces a model that can recognize 4,000 languages, and much, much more. The AI Breakdown is a daily podcast and video all about the most important news and discussions in AI. Like, subscribe, and share, and learn more at Breakdown.network. Welcome back to the AI Breakdown Brief, all the headline news and discussions you need in five minutes or less, although today I'm not quite sure we're going to hit that five-minute mark. There is so much going on in the world of AI,

Starting point is 00:00:38 and we start with a new announcement from Intel around their Aurora generative AI model that is being trained on one trillion parameters. Now, interestingly, this comes at a time when many of the other companies are talking about how maybe the endless quest to have more parameters and larger data sets isn't necessarily what they need, but Intel is going for some very specific use cases.

Starting point is 00:01:00 In particular, Intel is thinking about the scientific applications of generative AI. So, for example, systems biology, cancer research, climate science, cosmology, polymer chemistry, and materials, and more. On yesterday's show, we talked about how AI had been used to discover a rare DNA sequence, and that's the type of use case that Intel is really focused on. Now, for a sense of comparison, that one trillion parameter data set would be about 5.7 times larger than the data set that underlies the currently available ChatGPT. And Intel says that Aurora will be trained specifically on general text, code, scientific texts, and structured scientific data from biology, chemistry, materials science, physics, medicine, and other sources. Next up, speaking of really interesting models, Meta has announced their new massively

Starting point is 00:01:42 multilingual speech AI. It is an AI that they say can recognize over 4,000 languages, which is 40 times more than any previously known technology. Meta says that this model could expand speech-to-text and text-to-speech from around 100 languages today to more than 1,100 going forward. In their announcement, they write, many of the world's languages are in danger of disappearing, and the limitations of current speech recognition and generation technology will only accelerate this trend. We want to make it easier for people to access information and use devices in their preferred language, and today

Starting point is 00:02:13 we're announcing a series of AI models that could help them do just that. As the MIT Technology Review points out, there are currently around 7,000 languages in the world, but speech recognition models only cover about 100 of them comprehensively. The new Meta model was trained on audio recordings of the New Testament Bible in thousands and thousands of languages. Now, not everyone is a fan of training models on religious texts because of the potential for bias, but by and large, researchers are excited about the possibilities in this new Meta model. Now, even as we explore these new trillion parameter models, we're still figuring out the best way to actually train AI. A new research paper just came out called LIMA, Less Is More

Starting point is 00:02:51 for Alignment. I asked the new XPers plug-in from ChatGPT to review the paper for me. And what it said is, in the context of LLM training, there are roughly two stages: the unsupervised pre-training from raw text to learn general purpose representations, and then the large-scale instruction tuning and reinforcement learning to better align to end tasks and user preferences. The researchers introduced a 65 billion parameter model and did only an extremely limited amount of fine-tuning. Even with that limited amount of fine-tuning, they found that it was preferred to GPT-4 answers 43% of the time, and that number went up for Bard and other models. Their conclusion is that the vast majority of learning in LLMs comes in the pre-training,

Starting point is 00:03:31 not the fine-tuning portion of the training. Moving on, one of the things we've been discussing a lot lately is what comes after the current crop of LLMs, and a lot of the focus is on multimodal models. Dr. Jim Fan from NVIDIA recently tweeted about this new model called CoDi, Composable Diffusion. He tweets, mapping any mixture of modalities, text, image, video, and audio, to any other mixture. GPT-4 is text and image to text, but we will soon see a barrage of models that are increasingly versatile in the modality input and output. The example given is the combination of a prompt, "teddy bear on a skateboard, 4K," with a raining ambiance sound

Starting point is 00:04:08 that ends up producing an image of a video of a teddy bear in New York City rolling through the streets during a rainstorm. Moving on, AI is capturing the interest not just of us internet denizens, but some of the biggest and most influential people in the world. CNBC is reporting about comments from Bill Gates yesterday, where he said that AI could kill Google search and Amazon as we know them. Gates said that this technology is likely to radically alter human behavior. He said, whoever wins the personal agent, that's the big thing, because you will never go to a search site again. You will never go to a productivity site. You'll never go to Amazon again. He also said that personally, he'd be sad if Microsoft wasn't in the race.

Starting point is 00:04:45 Speaking of Microsoft, Twitter has officially accused Microsoft of misusing its data, which many are seeing as the beginning of a much bigger fight over AI. This is something that Elon Musk had intimated over Twitter, but now there is an official letter from his lawyers as well. From NBC, the letter primarily addresses a seemingly narrow set of alleged infractions by Microsoft in drawing information from Twitter's database of tweets. But the move could foreshadow more serious developments. Musk has previously accused Microsoft and its partner OpenAI in a tweet of, quote,

Starting point is 00:05:13 illegally using Twitter data to develop sophisticated AI systems such as ChatGPT. As we round the corner, an update on some commercially available AI software. Adobe Firefly is now available for all users with an Adobe ID, and, not wanting Tesla Optimus to have all the fun, 1X, a company backed by OpenAI, has released a new set of videos of their robot, Eve, working in the real world. The videos show Eve grasping and picking up objects as well as opening a door. Anyways, guys, that is it for today's AI Breakdown Brief.

Starting point is 00:05:45 Tons and tons going on, and we didn't even get into the AI image of the Pentagon yet, which caused markets to have a hemorrhage yesterday. For that, stick around for the main AI Breakdown that will be coming soon. In the meantime, if you're enjoying the AI Breakdown or the AI Breakdown Brief, please like, subscribe, and share. And I will be back soon for the main AI Breakdown. As a fake photo of an explosion at the Pentagon goes viral and crashes markets, OpenAI calls for a global regulatory coalition for artificial intelligence. Welcome back to the AI Breakdown. Well, there have been a lot of theoretical concerns around the use of AI and deepfake images, but yesterday we got a taste of what it could do in practice.

Starting point is 00:06:29 Basically what happened is that a photo started appearing on Twitter and was shared widely by accounts with millions of followers that showed a big plume of black smoke right next to the Pentagon in the United States, which is, of course, the home of the Department of Defense. At around 10:06 a.m., the Delta 1 account, Walter Bloomberg, on Twitter, shared the photo, and within the next five minutes, the U.S. stock market had fallen by around a quarter of a percent. It was also shared by RT, which is a media company with connections to the Russian government, and it took about 20 minutes for the Arlington Fire Department to say that this was not true, that there is no explosion or incident taking place on or near the Pentagon Reservation. Now, while this ended up being a nothingburger, it shows just how much havoc AI can wreak in the public when it's deployed in any sort of adversarial way. These and other AI concerns were on display

Starting point is 00:07:19 at the recent G7 meeting. Leaders of the Group of Seven Nations called for the development and adoption of technical standards that would keep AI, quote-unquote, trustworthy, and said that governance of AI had not kept pace with technological innovation. They said that while approaches to achieving, quote, the common vision and goal of trustworthy AI may vary, the rules needed to be, quote, in line with our shared democratic values. Now, of course, back here in the United States, discussions of AI in the public sphere are on the rise. Last week, the Senate held its first AI hearing in the post-ChatGPT era, in which Sam Altman, the CEO of OpenAI, was the star witness.

Starting point is 00:07:55 One of the things that many commentators noted during that hearing was that Sam Altman seemed to be quite in favor of pretty involved government regulation. Some commentators did not take this as an example of Altman just having an appropriate understanding of the concerns of AI, but instead viewed it as tantamount to an attempt at regulatory capture. In other words, trying to use the burden of onerous regulations to cement a leadership position and not let new challengers come up through the ranks. I asked Perplexity, which is a chatbot that I have been absolutely loving lately, is OpenAI attempting regulatory capture? And here's how it summed things up. There are

Starting point is 00:08:29 mixed opinions on whether OpenAI is attempting regulatory capture. Some people accuse OpenAI's CEO, Sam Altman, of attempting a ladder pull by using regulatory capture to create a monopoly by preventing smaller companies from competing. Some experts warn of letting corporations write lax rules that benefit them. The concept of regulatory capture is not unique to OpenAI, as almost every large tech company has attempted it at some point. This conversation got loud enough that Sam actually responded to it on Twitter. He tweeted, regulation should take effect above a capability threshold. AGI safety is really important and frontier models should be regulated. Regulatory capture is bad, and we shouldn't mess with models below the threshold. Open source models and small startups are

Starting point is 00:09:10 obviously important. Adding a little bit more meat on the bone of what OpenAI actually thinks the regulatory apparatus should be, today they published a blog post called Governance of Superintelligence. Now, they say, is a good time to start thinking about the governance of superintelligence, future AI systems dramatically more capable than even AGI. OpenAI says it's conceivable that within the next 10 years, AI systems will exceed expert skill level in most domains and carry out as much productive activity as one of today's largest corporations. They say, given the possibility of existential risk, we can't just be reactive. Nuclear energy is a commonly used historical example of a technology with this property. Synthetic biology is another example. We must mitigate the

Starting point is 00:09:51 risks of today's AI technology too, but superintelligence will require special treatment and coordination. So what do they think that should look like? Well, they offer a couple starting points. First, they say we need some degree of coordination among the leading development efforts to ensure that the development of superintelligence occurs in a manner that allows us to both maintain safety and help smooth integration of these systems within societies. This, they say, could involve governments, for example, setting up a rate of growth objective that limits development to a certain rate per year.

Starting point is 00:10:18 Second, they say, and this is really the fleshing out of what Sam was saying at the hearing, we are likely to eventually need something like an IAEA for superintelligence efforts. Any effort above a certain capability (or resources like compute) threshold will need to be subject to an international authority that can inspect systems, require audits,

Starting point is 00:10:36 test for compliance with safety standards, place restrictions on degrees of deployment and levels of security, et cetera. Third, they say we need the technical capability to make a superintelligence safe, although they admit that this is an open research question that they're just not sure of. They also reinforce that there should be a capability threshold

Starting point is 00:10:51 under which this doesn't apply. In the what's not in scope section, they say, we think it's important to allow companies and open source projects to develop models below a significant capability threshold without the kind of regulation we describe here. And of course, they say all of this needs public input. Now, interestingly, Google also published

Starting point is 00:11:09 just a couple of days ago, a policy agenda for responsible AI progress. They say calls for a halt to technological advances are unlikely to be successful or effective and risk missing out on AI's substantial benefits and falling behind those who embrace its potential. Instead, we need broad-based efforts across government, companies, universities, and more to help translate technological breakthroughs into widespread benefits while mitigating risks. Their policy platforms include, one, unlocking opportunity by maximizing AI's economic promise. They say, among other things, that involves, quote, preparing workforces for AI-driven job transition. The second pillar is promoting responsibility while reducing risks of misuse. They also discuss in this section some sort of global

Starting point is 00:11:48 body, although perhaps with less teeth than the one imagined by OpenAI. Leading companies, they say, could come together to form a global forum on AI, building on previous examples like the Global Internet Forum to Counter Terrorism. Number three, they say, enhancing global security while preventing malicious actors from exploiting this technology. Now, frankly, these proposals are a lot more toothless than the ones put forward by OpenAI. Part of that could be that OpenAI is responding specifically to what they saw as critique and what they were trying to describe in last week's hearing, whereas Google might have just been laying out the foundations of what they think is a good starting point. There are, as you might imagine, a lot of opinions about this. Jeffrey Ladish,

Starting point is 00:12:24 who is a frequent commenter on AI safety, writes: First off, it's absolutely wild that this is where we're at. The leading AI company in the world is publicly saying that they want to build superintelligence in the near future. Let that sink in. I really do appreciate their openness in this. We could be living in a world where they plan to do this in secret, and I much prefer the present world. This is absolutely an urgent conversation we need to have as a global community. And also, we should absolutely not build superintelligent AI until we know how to do that safely. We need to spend however much time it takes to make it safe. Going ahead with building it right now, with our current state of knowledge, would be suicide. Building superintelligence risks human extinction. It's the most risky thing we've

Starting point is 00:13:03 ever done. I do think we should build superintelligent AI once we've figured out how to do so safely, something we are very far from knowing, but I shouldn't get to decide that for everyone. I fully agree with OpenAI that we need an IAEA for AI, and we need democratic global oversight to manage risks from superintelligence. The decision if and when to build superintelligent AI should be decided by humanity as a whole, not one company or country. Ian Hogarth, who around five weeks ago wrote an influential piece called We Must Slow Down the Race to God-Like AI, says it is remarkable how much the conversation around AI governance has shifted this year, and notes OpenAI describing a CERN-like project that major labs like Anthropic, DeepMind,

Starting point is 00:13:42 and OpenAI could combine into. Alexandros Marinos, who has been a loud critic of Sam Altman and OpenAI, says, and just like clockwork, OpenAI asks for a global authority to regulate AIs with advanced capabilities. Such an agency, if effective, would freeze the status quo in place. He points to a tweet from Mo Bala, who says, this strategy would create an arms race of AIs that are designed to be deceptive about their capabilities in order to avoid being subject to regulatory oversight. I think all of these points are valid, but the overarching one for me is that we are late to the

Starting point is 00:14:12 conversation. We need to have it and have it fast, and it needs to be held with no delusions about the second-order consequences, good and bad. The reality is, like them or not, OpenAI is the most influential voice on this right now, and so the fact that they are calling for the conversation is a boon for those who think that that conversation needs to happen. It seems like this is something that is heading into a crescendo period right now, and of course, I will continue to cover it as it evolves.

Starting point is 00:14:35 If you're enjoying the AI Breakdown, please like, subscribe, and share. Go check out the podcast and the newsletter. And until next time, guys, peace.
