The AI Daily Brief: Artificial Intelligence News and Analysis - Anthropic Begins to Unlock the Mystery of LLMs

Starting point is 00:00:00 Today on the AI Daily Brief, Anthropic makes a major breakthrough in interpretability. Before that, in the headlines, Nvidia just continues to smash expectations. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, check out the Discord linked in our show notes. Welcome back to the AI Daily Brief Headlines edition, all the AI headline news you need in around five minutes. We kick off today with an earnings report from Nvidia, and surprise, surprise, it just keeps getting bigger. Wall Street Journal writes, NVIDIA delivered a record quarter and signaled that the AI boom is still going strong,

Starting point is 00:00:42 driving its already meteoric stock up above $1,000 a share. Revenue last quarter more than tripled year over year to $26 billion, and net profit was up 7x to $14.88 billion. Both were quarterly records and both beat analyst expectations. The stock market obviously loved this, with share prices up 6% in pre-market trading following the report, and a single share surpassing $1,000. One of the things that's also really interesting, though, is that while currently the big cloud companies like Google, Microsoft, and Amazon account for around 45% of Nvidia's data center revenue, they're clearly trying to move to a world where they're not just selling to data

Starting point is 00:01:17 centers, but also selling directly to companies. At Dell's big annual event this week, for example, Nvidia and Dell talked about how they were trying to create the AI factories of the future, where individual companies had more direct access to this sort of capacity. CEO Jensen Huang said we are poised for our next wave of growth. From Bloomberg, NVIDIA emphasized Wednesday that it wants to sell its technology to a wider market, expanding beyond the giant cloud computing providers known as hyperscalers. Huang said that AI is moving to consumer internet companies, carmakers, biotechnology, and health care customers.

Starting point is 00:01:46 The large-scale deployment of NVIDIA chips by Elon Musk's Tesla is one sign of that expansion. It continues to be a battle for NVIDIA to keep up with demand. Said Huang, nobody has ever manufactured supercomputers at volume. We're doing the best we can. The one other interesting note, as called out by The Verge, was that NVIDIA will now make new AI chips every single year. Said Huang, I can announce that after Blackwell, there's another chip. We're on a one-year rhythm. The Verge points out that until now, Nvidia produced a new architecture roughly once every two years, Ampere in 2020, Hopper in 2022, Blackwell in 2024, but that everything

Starting point is 00:02:18 is getting faster now. Next up in the headlines, you'll remember that recently Microsoft inked a deal with UAE-based G42. Part of why I was so interested in the deal is that it seemed to have been facilitated by the Department of Commerce and reflected geopolitics as much as business considerations. Basically, G42 had been right at the center of the U.S.-China tension, and the U.S. had been putting a ton of pressure on it to pick a side. Well, pick a side it did, and Microsoft's minority investment of $1.5 billion was part of that picking. Still, it's not without complications. Writes Reuters, Microsoft President Brad Smith said the tech company's high-profile deal with the UAE-backed AI firm G42 could eventually involve the transfer of sophisticated chips and tools,

Starting point is 00:02:57 a move that a senior Republican congressman warned could have national security implications. Said Michael McCaul, the Republican chairman of the Foreign Affairs Committee, despite the significant national security implications, Congress still has not received a comprehensive briefing from the executive branch about this agreement. I am concerned the right guardrails are not in place to protect sensitive U.S.-origin technology from Chinese espionage given the CCP's interest in the UAE. To me, this sounds a little bit like, if this really was coming from the Department of Commerce, it was obviously a White House-facilitated deal that Congress just doesn't have visibility into. Anyway, there's tons more details here, but what's interesting to me continues to be just the geopolitical implications of AI and how

Starting point is 00:03:32 quickly they've become an ongoing concern. In the world of M&A, I'm wondering if we're not about to see a bit of a wave of consolidation in the AI space. We've had a couple boom years of funding, and now we're very naturally in the phase where companies are figuring out if there's enough of a there there to raise a next round, or if it makes sense to try to join up with someone bigger. One company seemingly going through that decision-making process is Adept, which was valued by investors at more than a billion last year, and which The Information reports has held talks recently around a possible sale or strategic partnership with large tech companies, most notably Meta. Adept is in the much vaunted AI agent space, and of course that means they're dealing with

Starting point is 00:04:07 not only intense competition from every angle in every direction, but also the fact that AI agents are still at this point highly theoretical. There isn't existing consumer demand to tap into. It's a new behavior in a new category that's being invented on the fly. When it comes to something that challenging, they may decide that it makes sense to do that from within one of the big giants that has the capital to actually pursue it to its full ends. Speaking of big companies, in what will be a surprise to no one, Amazon is apparently planning to give Alexa an AI upgrade, as well as a monthly subscription fee. Notably, this will not be included in Amazon Prime subscriptions. Writes CNBC, Amazon will launch a more conversational version of Alexa later this year, potentially positioning it to better compete with new generative AI-powered chatbots from companies including Google and OpenAI.

Starting point is 00:04:50 Will the end of the year see us watching a souped-up Alexa compete with a souped-up Siri, both of them competing against some brand new OpenAI product? Kind of seems like it. Speaking of an OpenAI product, the Washington Post reports that OpenAI didn't actually copy Scarlett Johansson's voice. They write, When OpenAI issued a casting call last May for a secret project to endow OpenAI's popular ChatGPT with a human voice, the flyer had several requests. The actors should be non-union.

Starting point is 00:05:16 They should sound between 25 and 45 years old, and their voices should be warm, engaging and charismatic. One thing the AI company didn't request, according to interviews with multiple people involved in the process and documents shared by OpenAI in response to questions from the Washington Post, a clone of actress Scarlett Johansson. This of course gets to the conversation that's been happening where Scarlett Johansson released a statement expressing concern that she had been asked by OpenAI to use her voice. And when she said no, there ended up being a voice that kind of sounded

Starting point is 00:05:42 like her. As I mentioned in a previous episode, I think there are multiple things going on here. There is the legal side of this, where at least this reporting from the Washington Post suggests that there might not be a there there. There's also just a broader question of the look and the public's trust or lack thereof in Sam Altman and OpenAI. For now, though, that is going to do it for today's headlines. Stay tuned for the main episode. Today's podcast is brought to you by Plumb. Are you a lean product team trying to rapidly develop and deploy AI features that deliver real value to your users? Plumb empowers you to build complex AI pipelines, transform data, and leverage validated JSON schema to create reliable high-quality AI features, accessible as API endpoints,

Starting point is 00:06:19 all in an intuitive low-code interface. Go from idea to MVP in hours, not days. Get your AI-powered product in front of customers as soon as possible with Plumb. Check out useplumb.com, that's Plumb with a B, for early access to the future of AI app development. Hello, friends, before we get back to the episode, I want to tell you about something special I'm doing on Superintelligent this June. Super is, of course, our platform for AI learning, and I've heard from a lot of you that you really want something for a true AI beginner, someone who's really just getting their feet wet with these tools. So what I'm going to do is put together basically a course that sits on top of and uses Superintelligent tutorials and lessons, but where I hand-guide you through around 10 different lessons and how-tos that I think, once you complete them, will have you ahead of 80% of the other people who are just starting to use AI right now. If you are interested in this learning

Starting point is 00:07:06 experience, go to besuper.ai and sign up using code June. You'll get 25% off your first month, and I'll automatically add you to that AI for beginners group. That's besuper.ai, discount code June. See you there. Welcome back to the AI Daily Brief. One of the remarkable things about LLMs, this technology that has taken the world by storm, that is changing how people work, how people think about work, that is generating entirely new categories of interactions with computers, that has some people thinking that Terminator is going to become real, is that we genuinely don't understand exactly how they work. They just sort of seem to. Indeed, this is part of the reason for some researchers having concerns about the future state of these technologies. To wit,

Starting point is 00:07:50 if we don't understand how they work now, how do we think we're going to control them as they get more powerful? Well, new research from Anthropic may be shedding some light that will help us with that sort of understanding. The New York Times summed this up in a piece called AI's black boxes just got a little less mysterious. Kevin Roose writes, one of the weirder, more unnerving things about today's leading AI systems is that nobody, not even the people who build them, really knows how the systems work. That's because LLMs are not programmed line by line by human engineers as conventional computer programs are. Instead, these systems essentially learn on their own by ingesting vast amounts of data and identifying patterns and relationships in language,

Starting point is 00:08:25 then using that knowledge to predict the next word in a sequence. Again, this is one of the great dividing lines in terms of how people think about AI risk. To some, this lack of understanding is precisely a cause for concern, while for others, perhaps most notably, or at least most loudly, Yann LeCun from Meta, the current approach to LLMs, which just predict the next word in a sequence, is in his mind simply incapable of the types of things that some folks are worried about. Holding aside any of the big long-term existential risk things, there are challenges that come from our lack of understanding right now.

Starting point is 00:08:56 The example the New York Times points out: right now, if a user types which American city has the best food and a chatbot responds Tokyo, there's no way of understanding why the model made that error, or why the next person who asks may receive a different answer. So if you are a company building a chatbot trying to make it better, it's very hard to improve things in any sort of linear or controllable way. Of course, there is also the alignment side of this problem. As the New York Times' Kevin Roose writes, when LLMs do misbehave or go off the rails, nobody can really explain why. From there, the Times talks about the field of research that is trying to figure out how

Starting point is 00:09:28 these models work, which is called mechanistic interpretability. Roose characterizes the work as slow going with progress being incremental. This week, however, Anthropic announced what they're calling a major breakthrough, and here's how Roose sums it up. The researchers looked inside one of Anthropic's AI models, Claude 3 Sonnet, and used a technique known as dictionary learning to uncover patterns in how combinations of neurons, the mathematical units inside the AI model, were activated when Claude was prompted to talk about certain topics. They identified roughly 10 million of these patterns, which they call features.

Starting point is 00:09:56 This research actually started previously. Anthropic in their announcement post writes, In October 2023, we reported success applying dictionary learning to a very small toy language model and found coherent features corresponding to concepts like uppercase text, DNA sequences, surnames in citations, nouns in mathematics, or function arguments in Python code. Now, however, they say, we've successfully extracted millions of features from the middle layer of Claude 3 Sonnet, providing a rough conceptual map of its internal states halfway through its computation. Whereas the features we found in the toy language model were rather superficial, the features we found in Sonnet have a depth, breadth, and abstraction

Starting point is 00:10:31 reflecting Sonnet's advanced capabilities. We see features corresponding to a vast range of entities like cities (San Francisco), atomic elements (lithium), scientific fields (immunology), and programming syntax (function calls). These features are multimodal and multilingual, responding to images of a given entity as well as its name or description in many languages. At this point in the piece, they show the Golden Gate Bridge feature, which activates around images of the Golden Gate Bridge or around text containing the Golden Gate Bridge. Anthropic goes on,

Starting point is 00:10:57 we were able to measure a kind of quote-unquote distance between features based on what neurons appeared in their activation patterns. This allowed us to look for features that are quote-unquote close to each other. Looking near a Golden Gate Bridge feature, we found features for Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo. They continue, this holds at a higher level of conceptual abstraction.
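
Editor's illustration: the idea of a "distance" between features can be sketched in a few lines of Python. This is only a toy under stated assumptions, not Anthropic's method or code: each feature is represented as a short made-up vector of neuron weights, and closeness is measured with cosine similarity, which is one common choice. The feature names, the numbers, and the helper `nearest_features` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical feature directions over four "neurons" (made-up numbers).
features = {
    "golden_gate_bridge": [0.9, 0.1, 0.0, 0.2],
    "alcatraz_island": [0.8, 0.2, 0.1, 0.3],
    "inner_conflict": [0.0, 0.9, 0.4, 0.0],
    "catch_22": [0.1, 0.8, 0.5, 0.1],
}

def nearest_features(name, features, k=1):
    """Return the k other features most similar to the named one."""
    query = features[name]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in features.items()
        if other != name
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]

print(nearest_features("golden_gate_bridge", features))
print(nearest_features("inner_conflict", features))
```

With these made-up vectors, the bridge feature's nearest neighbor is the Alcatraz feature, and the inner-conflict feature's nearest neighbor is the catch-22 feature, mirroring the kind of neighborhoods described in the post.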

Starting point is 00:11:22 Looking near a feature related to the concept of inner conflict, we find features related to relationship breakups, conflicting allegiances, logical inconsistencies, as well as the phrase catch-22. This shows that the internal organization of concepts in the AI model corresponds at least somewhat to our human notions of similarity. Importantly, says Anthropic, they're not just able to identify these features but to manipulate them. Quote, artificially amplifying or suppressing them to see how Claude's response changes. Holding again with the example of the Golden Gate Bridge, they said when initially asked,

Starting point is 00:11:50 what is your physical form? Claude's usual kind of answer is, I have no physical form. I am an AI model. But when amplifying the Golden Gate Bridge feature, Claude responded, I am the Golden Gate Bridge. My physical form is the iconic bridge itself. Quote, altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query, even in situations where it wasn't at all relevant. They continue, the fact that manipulating these features causes corresponding changes to behavior validates that they aren't just correlated with the presence of concepts in input text, but also causally shape the model's behavior. In other words, the features are likely to be a faithful part of how the model internally represents the world and how it uses

Starting point is 00:12:24 these representations in its behavior. Said Chris Olah from Anthropic, who led this team, we're discovering features that may shed light on concerns about bias, safety risks, and autonomy. I'm feeling really excited that we might be able to turn these controversial questions that people argue about into things we can actually have more productive discourse on. An associate professor of computer science at MIT, Jacob Andreas, who reviewed Anthropic's research, called it a hopeful sign that large-scale interpretability might be possible. He said, in the same way that understanding basic things about how people work has helped us cure diseases, understanding how these models work will both let us recognize when things are about to go wrong and let us build better tools for controlling them.
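
Editor's illustration: the amplification experiment described earlier (the Golden Gate Bridge example) can be pictured as nudging a model's internal activation along a feature's direction. The sketch below is a simplified assumption, not Anthropic's procedure: it adds a scaled copy of a hypothetical feature direction to a made-up activation vector and measures how strongly the feature is present before and after. The names `amplify_feature` and `feature_score` and all numbers are invented for illustration.

```python
def feature_score(activation, feature_direction):
    """How strongly the feature is present: dot product with its direction."""
    return sum(a * d for a, d in zip(activation, feature_direction))

def amplify_feature(activation, feature_direction, strength):
    """Return a new activation pushed along the feature direction.

    A positive strength amplifies the feature; a negative one suppresses it.
    """
    return [a + strength * d for a, d in zip(activation, feature_direction)]

# Made-up three-neuron activation and a hypothetical "bridge" direction.
activation = [0.2, 0.5, 0.1]
bridge_direction = [1.0, 0.0, 0.0]

steered = amplify_feature(activation, bridge_direction, strength=10.0)

print(feature_score(activation, bridge_direction))
print(feature_score(steered, bridge_direction))
```

In this toy, only the component along the bridge direction changes, so the feature's score rises sharply while the rest of the activation is untouched. In a real model the altered activation would then flow through the remaining layers and change the generated text.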

Starting point is 00:12:58 So obviously, this doesn't tell us everything about how LLMs work, but it does give us a pretty strong jumping off point to go deeper in terms of this question of interpretability. Science-y and dense, though this may be, I think this is going to be an important part of how we resolve some of these questions of risk and challenges as AI moves forward. The longer we stay in the realm of theoretical debates, the harder it will be to actually put policies in place, whereas the more specific and applied we get, the better able we might be to actually solve some of the challenges. Super interesting stuff, great work from the Anthropic team, but for now, that is going to do it for the AI Daily Brief. Appreciate you listening or

Starting point is 00:13:30 watching as always, and until next time, peace.
