The AI Daily Brief: Artificial Intelligence News and Analysis - Q*: Was This New Advance the Reason for Sam Altman's Firing?

Episode Date: November 27, 2023

More information came out late last week about a model called Q* that showed signs of more advanced reasoning, leading to additional speculation that this was part of the reason for Sam Altman's firing from OpenAI. Also on this episode, Inflection claims their latest model is the second most powerful LLM. ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Breakdown, what is Q* and did it precipitate the firing of Sam Altman? Before that on the brief, Inflection now claims the second most powerful AI model. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our Discord, our YouTube, and our newsletter. Welcome back to the AI Breakdown Brief. All the AI headline news you need in around five minutes. As you well know, the week leading up to the Thanksgiving holiday in the U.S. was completely and utterly dominated by the OpenAI saga.
Starting point is 00:00:40 Sam Altman was fired, and then he wasn't, and then he was, and then he wasn't, and then he wasn't, and then he was, and then he wasn't, and then ultimately he came back, but in a slightly less powerful form. Now, later in this episode, in the main part of the episode, we'll get into the most recent information we've gotten around what might have been behind that firing, but in the meantime, another big AI lab, Inflection, which is, of course, led by Mustafa Suleyman, a co-founder of Google DeepMind, and was also co-founded by Reid Hoffman, a co-founder of LinkedIn, announced that they had finished training Inflection 2, and that according to their claims, it was the second best LLM in the world, following only GPT-4. So on November 22nd,
Starting point is 00:01:16 Inflection posted "Inflection 2: The Next Step Up" on their blog. They write, Today we are proud to announce that we have completed training of Inflection 2, the best model in the world for its compute class, and the second most capable LLM in the world today. Now, of course, for those who aren't familiar with Inflection, Inflection is the company and the model behind Pi, which is a personal AI. As it writes when you first go to pi.ai, my goal is to be useful, friendly, and fun. Ask me for advice, for answers, or let's talk about whatever's on your mind. So basically, this is a company that's making a very different bet around what's going to be important about the future of AI. Or at least, it's not so much that they're betting that things
Starting point is 00:01:53 like coding assistants aren't going to be important. It's just that they think that those social, human-interactive types of elements are also going to be extremely important. And they saw a market gap in people addressing that particular need. Now, when they say that they are the highest ranked in their compute class, they're comparing themselves to Google's PaLM 2 model, because they were trained on 5,000 Nvidia H100 GPUs. They write, designed with serving efficiency in mind, Inflection 2 will soon be powering Pi. Thanks to a transition from A100 to H100 GPUs, as well as our highly optimized inference implementation, we managed to reduce the cost and increase the speed of serving versus Inflection 1, despite Inflection 2 being multiple times larger. Now, in terms
Starting point is 00:02:31 of that claim that they are the second most performant model now, they're using the MMLU benchmark, on which, for reference, GPT-3.5 scores a 70 and GPT-4 scores an 86.4. Meta's Llama 2 70B model comes in at 68.9, so right around GPT-3.5. Inflection 1 was a little bit ahead at 72.7. Grok-1 was just ahead of that at 73. PaLM 2's large model was at 78.3. Claude 2 was at 78.5, and Inflection 2 came in at 79.6, again trailing only GPT-4's 86.4. Now, the other thing to note is that code and mathematical reasoning continued to not be an explicit focus in the training for Inflection 2. That said, even without it being an explicit focus, they saw distinct improvements from Inflection 1. Now, of course, in addition to just this being a feather in the cap for Inflection,
Starting point is 00:03:21 it's generated a lot of conversation around the state of the AI frontier model wars. Professor Ethan Mollick from Wharton writes, Speculation: nobody has publicly beat GPT-4 yet. So if OpenAI keeps shipping, and there is an AI learning curve and no diminishing returns to scale, only Google keeps up. But Claude 2 and now Inflection 2 beat GPT-3.5. Grok is at GPT-3.5 level, and those labs are still training. Now he goes on to clarify, by AI learning curve, I mean it in the organizational and not AI sense. Are there increasing returns to building AIs, either because of a flywheel, i.e. the AI helps you code the next version, or because of experience, there are tricks you need to know
Starting point is 00:03:57 and you learn them as you go? At some point, there are going to be diminishing returns to training due to technical limitations on the scaling law, data limitations, economic limitations, or something else. We just don't know when those hit. Meanwhile, others had a more dismissive and aggressive line on this. Dylan Patel from SemiAnalysis writes, there are now five models better than Google's best. Google is an effing joke. xAI's Grok, Inflection 2, Claude 2, GPT-4, GPT-4 Turbo. Where is Gemini? Well, of course, if you've been following this show, you know that Gemini continues to get delayed, but in the meantime, Google is at least continuing to add capabilities to their Bard model, which is available now.
Starting point is 00:04:34 The latest of those is an integration with YouTube that makes it better able to understand the content in YouTube videos and use that information to handle more complex queries around them. As the Verge writes, the bot's YouTube integration is getting a handy upgrade so it can analyze individual videos to surface specific information for you, like key points or recipe ingredients without ever pressing play. Now, there are a couple interesting points about this. One is just the strategy that we've been noticing a lot and a theme more broadly of the period that we're in of AI integration, in which even as the big labs try to race to ever more
Starting point is 00:05:05 powerful frontier models, practically for consumers, a lot of the more relevant developments are happening all around us every day. Specifically, it's the way that tools that are already released are getting integrated into the workflows and systems that we already use. This is a great example of that, and seemingly Bard's whole strategy is, in many ways, being better and more thoughtfully integrated across the Google suite of tools which people already use. Now, the Verge author did a little test of this, and he found that it worked really well. They tested it on an America's Test Kitchen recipe for an espresso martini and said Bard got all the critical bits right in summing up the video.
Starting point is 00:05:39 The ingredients and measurements are all accurate, and the instructions are correct. It even includes the first step of chilling a martini glass by filling it with ice and water. However, the author also points out an interesting challenge of how generative AI is going to be potentially problematic for creators. Specifically in this case, the author points out that this recipe is never actually included in the show notes, and that to get the recipe in written form, you would have had to go behind a paywall on the America's Test Kitchen website. Now, the reason that they might feel comfortable with the full recipe in the video is that people tend to watch these recipe videos over and over again, so presumably they're getting value from advertising from repeated watching.
Starting point is 00:06:14 The author validates that they have to go back and look at this every single time they need it. The challenge now, as the Verge author writes, is, by having Bard spit out the recipe for me, I've just skipped the step where I press play, watch a pre-roll ad, and see the channel's other recommended videos at the end. That's great for me, but probably less good for the publisher of the video. Less good or not, it is definitely the future that we are headed into. Now, moving to another big tech company and what they might be doing in AI, today kicks off Amazon Web Services' re:Invent conference, which is one of their big annual events. Indeed, it was at this event last year that they had been planning to announce something that they were then calling Amazon Bedrock, which was a foundation model akin to GPT-3.5.
Starting point is 00:06:55 The problem was that the model just wasn't ready, and so at the last minute, they scrapped plans to make that announcement. In retrospect, it was a good thing they did, because as the event was happening on November 30th, 2022, OpenAI released ChatGPT, which of course went entirely viral. When Amazon realized that Bedrock was not even close to up to snuff with what ChatGPT offered, they shelved that product and in fact shifted their strategy around, and Bedrock became the sandbox through which they help enterprise customers customize and deploy a variety of AI models, instead of the original wholly owned model that it was supposed to be. However, in recent weeks we've had rumors that Amazon has still been working on an ambitious frontier model that they codenamed Olympus.
Starting point is 00:07:35 According to Reuters sources, Olympus has 2 trillion parameters, which would make it double the reported one trillion parameters of GPT-4. Now, Reuters is a pretty good publication for this sort of thing, and so people have been assuming that maybe we'll get an announcement about Olympus, or whatever they decide to call it publicly, at this Amazon re:Invent event. The first keynote with AWS CEO Adam Selipsky is tomorrow morning, so of course I will be watching that closely. Meanwhile, a few interesting notes over from the world of geopolitics just to round us out. A group of 18 countries led by the U.S. and Britain have signed a non-binding agreement with principles around how to make
Starting point is 00:08:09 AI, quote, secure by design. Basically, they are trying to create some consensus around how to keep AI out of the hands of rogue actors, and a lot of the recommendations, which again are non-binding, are about things like how to monitor AI systems for abuse, how to protect data from tampering, and what sort of testing there needs to be before models are released to the public. Said the director of the U.S. Cybersecurity and Infrastructure Security Agency, Jen Easterly, this is the first time we have seen an affirmation that these capabilities should not just be about cool features and how quickly we can get them to market or how we can compete to drive down costs. Instead, she said that the guidelines represent, quote, an agreement that the most important
Starting point is 00:08:44 thing that needs to be done at the design phase is security. Now, even as that group of nations was signing that document, Russian leader Vladimir Putin was identifying AI as the latest battleground with the West and something that Russia would be pouring more money into. From The Independent, Putin has claimed that AI models like OpenAI's ChatGPT and Google's Bard chatbots, quote, cancel Russian culture, and that the West holds a dangerous dominance of the technology. He added, our innovation should rest on our traditional values and the wealth and beauty of the Russian language. And again, even as all of that is happening, we're hearing more about the Pentagon's continued work on AI in autonomous weapons as part of the initiative that has been dubbed Replicator. Back in August, Deputy Secretary of Defense
Starting point is 00:09:23 Kathleen Hicks said that Replicator seeks to, quote, galvanize progress in the too-slow shift of U.S. military innovation to leverage platforms that are small, smart, cheap, and many. Overall, I think one of the most fascinating aspects of the AI wars right now is the extent to which they are playing out not just in the big tech labs, but in the halls of the most powerful militaries in the world. And of course, in many ways, those concerns should probably be just as top of mind as any others that we have around AI, given that there are no signs of that slowing down anytime soon. However, for now, that is going to do it for today's AI Breakdown Brief. Next up, the main AI Breakdown. Welcome back to the AI Breakdown. After a few days off and the Thanksgiving
Starting point is 00:10:04 holiday for those of you in the U.S., we're coming back to add a little capstone to the saga of OpenAI and the firing and rehiring of Sam Altman that happened last week. Now, one of the things that was very confusing throughout the entire event was what could possibly have spooked the board so much that they would have made this move. People assumed at first that it must have been something dramatic, either a major breach of ethics in some crazy way that was destined to come out very soon, or some incredible technical advance that just scared the crap out of the board, which they didn't think that Sam was handling well. People went into overdrive trying to figure out what that advance might be.
Starting point is 00:10:41 They were reading the tea leaves of different comments and appearances that Sam had made in the weeks leading up to the event, including one where he himself suggested that there had been some major breakthrough recently that represented a significant advance in what AIs could do. And yet, at the same time, over the first weekend of the whole drama, when former Twitch CEO Emmett Shear was appointed as the new interim CEO, his announcement about taking the position made it clear that the board had not fired Sam because of specific safety disagreements, but more around broader questions of trust and communication. This seemed really strange,
Starting point is 00:11:16 as the safety explanation was the only one that kind of made sense. Now, the idea that it wasn't about safety seemed to be reinforced when chief scientist Ilya Sutskever switched sides once again and said that he was sorry for his role in the firing of Sam Altman, and then even went so far as to sign the letter demanding his return. And ultimately, when all was said and done, we got the new board structure. We got Sam and Greg agreeing to not be on the initial board. We got the replacements in Bret Taylor and Larry Summers, as well as the continuity in Adam D'Angelo.
Starting point is 00:11:46 There was still no good explanation of what had actually happened. In fact, not only was there no good explanation, we had learned that quietly, behind the scenes, that very same Emmett Shear had demanded an explanation and said that if he couldn't get one, he would quit. This is all part of what led OpenAI to bring Sam back. Well, then right around the holiday, on Wednesday night into Thursday, we got several reports that indeed there had been an AI breakthrough, and that it was a big part of the reason for Sam Altman's firing. Reuters wrote, OpenAI researchers warned
Starting point is 00:12:17 board of AI breakthrough ahead of CEO ouster. The piece reads, ahead of OpenAI CEO Sam Altman's four days in exile, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery that they said could threaten humanity, two people familiar with the matter told Reuters. The sources cited the letter as one factor among a larger list of grievances by the board leading to Altman's firing, among which were concerns over commercializing advances before understanding the consequences. Reuters was unable to review a copy of the letter. The staff who wrote the letter did not respond to requests for comment. And then it gets even crazier. Continuing, after being contacted by Reuters, OpenAI, which declined
Starting point is 00:12:55 to comment, acknowledged in an internal message to staffers a project called Q* and a letter to the board before the weekend's events. An OpenAI spokesperson said that the message, sent by longtime executive Mira Murati, alerted staff to certain media stories without commenting on their accuracy. Some at OpenAI believe Q* could be a breakthrough in the startup's search for what's known as AGI. Given vast computing resources, the model was able to solve certain mathematical problems, and though only performing math on the level of grade school students, acing such tests made researchers very optimistic about Q*'s future success. Reuters could not independently verify the capabilities of Q* claimed by
Starting point is 00:13:31 the researchers. The Information had a similar report. They added, the technical breakthrough, spearheaded by OpenAI chief scientist Ilya Sutskever, raised concerns among some of the staff that the company didn't have proper safeguards in place to commercialize such advanced AI models. In the following months, senior OpenAI researchers used the innovation to build systems that could solve basic math problems, a difficult task for existing AI models. Jakub Pachocki and Szymon Sidor, two top researchers, used Sutskever's work to build a model called Q* that was able to solve math problems that it hadn't seen before, an important technical milestone. A demo of the model circulated within OpenAI in recent weeks, and the pace of development alarmed some researchers
Starting point is 00:14:08 focused on AI safety. The Information also took the story further, explaining the different sides in the issue. Basically, they argued that Ilya was concerned about his breakthrough, but that these two researchers who had taken and run with it had different feelings, which is why they exited the company so quickly after Altman was fired. The Information also said that Greg Brockman had been working to integrate the new Q* technique into additional products as well. Now, going back to Reuters, here's how they described where the concern might come from. They write, researchers consider math to be a frontier of generative AI development. Currently, generative AI is good at writing and language translation by statistically predicting
Starting point is 00:14:42 the next word, and answers to the same question can vary widely. But conquering the ability to do math, where there is only one right answer, implies AI would have greater reasoning capabilities resembling human intelligence. This could be applied to novel scientific research, for instance, AI researchers believe. In their letter to the board, researchers flagged AI's prowess and potential danger, the sources said, without specifying the exact safety concerns noted in the letter. Researchers have also flagged work by a, quote, AI scientist team, the existence of which multiple sources confirmed. The group, formed by combining earlier Code Gen
Starting point is 00:15:13 and Math Gen teams, was exploring how to optimize existing AI models to improve their reasoning and eventually perform scientific work. So what it seems like to me, taking a step back, is that there clearly have been technological breakthroughs at OpenAI. I don't think this should be surprising to any of us, unless we were assuming that somehow they were just sitting on their hands, not working on more advanced research ever since GPT-4 came out, which clearly wasn't the case. Now, as it seems, the nature of those updates divided the company on questions of safety and ethical development. Again, this is not particularly surprising. In fact, one could argue that having robust debate inside the company is a potential guardrail on these technologies. Now, at the same
Starting point is 00:15:53 time, what doesn't seem to be the case, and again, this is from a completely uninformed outside observer who's just watching what's available and what's trickled out, is that these technological advances were so breakthrough in and of themselves that they were for sure some sort of inflection point or frontier-crossing moment. What I come back to is the fact that Ilya switched sides at the end of the day. Now, to the extent that these developments were part of the reason that he wanted to help create and then run the Superalignment team that started over the summer, that's an indication that he took it very seriously.
Starting point is 00:16:25 And clearly there is tension internally around how Sam specifically is thinking about the commercialization of these technologies and the speed with which they get deployed. But ultimately, the fact that Ilya came back suggests to me that we are still just on another step on this continuum towards AGI, not, as some had speculated, that we suddenly had AGI sitting in the closet just waiting to be released. Now, adding more intrigue to all of this, after these stories came out in Reuters and The Information, The Verge published a response that actually went back on some of this.
Starting point is 00:16:55 Quote, after the publishing of the Reuters report, which said senior exec Mira Murati told employees that a letter about Q* precipitated the board's actions to fire Sam Altman last week, OpenAI spokesperson Lindsey Held Bolton refuted that notion in a statement shared with The Verge.
Starting point is 00:17:09 Mira told employees what the media reports were about, but she did not comment on the accuracy of the information. Separately, a person familiar with the matter told The Verge that the board never received a letter about such a breakthrough and that the company's research progress didn't play a role in Altman's sudden firing. So here we have The Verge with their own source saying that the board never got that letter and that it wasn't part of the firing, and also a refutation of the idea that, even if that letter did exist, it was what precipitated this dramatic action. Now,
Starting point is 00:17:36 in the meantime, speculation around exactly what Q* is has just been absolutely rampant. On the OpenAI community boards, Rainy 107-577 writes, the cat's out of the bag, Reuters published. Any interpretations? Any knowledge files out there on the subject? Others responded. M4 Calic writes, definitely makes me question Sam's motives and puts the recent drama in a different light. This is moving towards more existential questions faster than anyone imagined, and I'd rather not have Microsoft, Larry Summers, and the ex-CEO of frickin' Salesforce making the calls whether or not something is AGI. Now, on the flip side, QRDL writes, as someone who's done a fair amount of ML and AI research, I can tell you that it is very, very
Starting point is 00:18:15 easy to think you've discovered a breakthrough. There's a great deal of cognitive bias in AI and you have to falsify very aggressively. I am deeply skeptical. It's also worth noting that in the news today, we found out that the $86 billion share sale is back on. I'm sure this quote-unquote breakthrough will get investors quite interested. Now, the implication there, of course, is that in the wake of all of this chaos, the team at OpenAI had been trying to right the ship when it came to the tender sale, where employees were going to be able to cash out some of their stock at an $86 billion valuation. And the skepticism in that community post is, of course, that they have very strong incentives to get investors excited again, which might have led to the confirmation of the leaks of
Starting point is 00:18:51 Q*. Now, others tried to speculate around what Q* might be. Samuel Hammond wrote, I discussed Q-Transformers and Q-learning as one of the more promising areas of AI research. The news that OpenAI's breakthrough involves something called Q* suggests it's related. Q-learning is a class of reinforcement learning and not new. However, there's been recent progress in combining Q-learning with Transformers in LLMs. Tesla uses deep Q-learning for self-driving, for example. There's even speculation that Google's long-awaited Gemini model employs a version of it. Q* refers to the optimal action function. Finding Q* involves training an agent to take actions that maximize its cumulative reward, given its environment.
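For readers following along in the transcript, the Q-learning idea described in the quote can be made concrete with a small sketch. This is only an illustration of textbook tabular Q-learning, not anything reported about OpenAI's Q*. The five-state corridor environment, the function names, and every hyperparameter below are hypothetical choices made for the example.

```python
import random

# Hypothetical toy environment (an assumption for illustration only):
# a corridor of 5 states, start at state 0, reward 1.0 on reaching state 4.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic transition; returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.

    Update rule: Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    The table this converges toward is what the quote calls Q*.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # No bootstrapping from the terminal state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
    print([round(max(row), 3) for row in table])
```

In this toy setting the learned greedy policy should choose "right" in every non-terminal state, and the value of each state should shrink by a factor of gamma per step away from the goal. The agent is never told the rules of the corridor; it only sees states, actions, and rewards, which is what "model-free" means in this context.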
Starting point is 00:19:27 OpenAI has a team working on reasoning and planning, so it was inevitable that they pivot back to reinforcement learning. This could be what spooked the board, as all the scariest Eliezer Yudkowsky-style scenarios involve reinforcement learning in some form or another. Q-learning is a model-free approach to reinforcement learning, as it can work even if the environment is complex and randomly changing, rather than requiring a set of well-defined rules like chess. Q-learning is popular for single-agent games, as by default, it models other agents as simply features in its environment to navigate around rather than as distinct agents with their own internal states. Note, this is also the basic definition of sociopathy. If OpenAI has made major strides in giving their transformer models a Q to optimize for, that would explain
Starting point is 00:20:06 what Sam meant when he said today's GPTs, their quasi-agents, would soon look quaint. Finding Q* is equivalent to having the best possible Markov decision process. In other words, no matter what life throws your way, you always find a way to win. Back on that community forum, Waylon Lab said something similar, writing, Q-learning is an algorithm that helps an agent learn the best actions to take in a given state to maximize a reward. That's it, pretty much. Now, I'm not going to read the whole piece because it's like a thousand words, but Dr. Jim Fan from NVIDIA writes, in my decades spent on AI, I've never seen an algorithm that so many people fantasize about, just from a name, no paper, no stats, no product. So let's reverse engineer the Q*
Starting point is 00:20:43 fantasy. Ada Pi also did their own speculation. Q*, they write, can solve grade school math problems. So what? What's the big deal? Why did researchers send the letter? Why did Ilya freak out? Because, they continue, OpenAI already has a stack where they can predict intelligence based on compute and data. They disclosed this with the GPT-4 release. Q* is a relatively small model, solving grade school math problems reliably and consistently using reasoning. It's an algorithmic breakthrough. And they have the scaling graphs to predict the intelligence increase from data and compute, so they know they can probably solve elite human and beyond human problems. They don't have AGI yet, but they have a near-term roadmap. Now the question is whether to go down
Starting point is 00:21:20 the path or not. Sam's reaction to this was to raise tens of billions of dollars to get a Jony Ive device into people's hands and build data centers everywhere to deploy this tech as soon as possible. Now, very notably, Ada Pi followed this with the "source: I made it up" meme, reinforcing that this is all speculation and just one of the scenarios that could explain all of this. Bindu Reddy writes something similar. The doomsday scenario for GPT-5 in OAI, she writes. So Q* is generalizing and is pretty good at high school level math. Q* is harmless, but GPT-5 will be way more powerful at logical reasoning, problem solving,
Starting point is 00:21:51 and code generation. It's being trained and evaluated on OAI superclusters. Part of the evaluation involves generating and executing code. It could generate a rogue piece of code that could hack into their supercomputer and then begin to take over the internet, some sort of computer virus. While this is far-fetched, it is possible. After all, you are giving a very powerful model access to a very powerful supercomputer. If anything, these are the type of doomsday scenarios you will see with these super LLMs.
Starting point is 00:22:14 The good news is that they might cause some disruption but won't pose anywhere near an x-risk to civilization. Now, there are also even more rumors about an OpenAI model cracking something called AES-192 encryption, although I've seen no indications of any sort of confirmation of that, so I'll stay out of it, and will only add it as a note in this piece to recognize that the whole thing around this OpenAI dust-up, and now the Q* explanation, is that it's causing people to play out, in a bigger way, the scenarios of what happens if AGI is a lot closer than we think. Later in the week, we'll maybe get to Yann LeCun from Meta's response to all of this.
Starting point is 00:22:47 But for now, the story stays very interesting and very relevant, and I will do my best to stay on top of it. Thanks for listening or watching as always, and until next time, peace.
