The AI Daily Brief: Artificial Intelligence News and Analysis - Q*: Was This New Advance the Reason for Sam Altman's Firing?
Episode Date: November 27, 2023
More information came out late last week about a model that showed signs of more advanced reasoning called Q*, leading to additional speculation that this was part of the reason for Sam Altman's firing from OpenAI. Also on this episode, Inflection claims their latest model is the second most powerful LLM.
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, what is Q* and did it precipitate the firing of Sam Altman?
Before that on the brief, Inflection now claims the second most powerful AI model.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to breakdown.network for more information about our Discord, our YouTube, and our newsletter.
Welcome back to the AI Breakdown Brief.
All the AI headline news you need in around five minutes.
As you well know, the week leading up to the Thanksgiving holiday in the U.S.
was completely and utterly dominated by the OpenAI saga.
Sam Altman was fired, and then he wasn't, and then he was, and then he wasn't, and then he wasn't,
and then he was, and then he wasn't, and then ultimately he came back, but in a slightly less powerful form.
Now, later in this episode, in the main part of the episode, we'll get into the most recent
information we've gotten around what might have been behind that firing, but in the meantime,
another big AI lab, Inflection, which is, of course, led by Mustafa Suleyman, the former
CEO of Google DeepMind, and was also co-founded by Reid Hoffman, the
founder of LinkedIn, announced that they had finished training Inflection 2, and that according to
their claims, it was the second best LLM in the world following only GPT-4. So on November 22nd,
Inflection posted Inflection 2: The Next Step Up on their blog. They write,
Today we are proud to announce that we have completed training of Inflection 2, the best model
in the world for its compute class, and the second most capable LLM in the world today.
Now, of course, for those who aren't familiar with Inflection, Inflection is the company and the model
behind Pi, which is a personal AI. As it writes when you first go to pi.ai, my goal is to be
useful, friendly, and fun. Ask me for advice, for answers, or let's talk about whatever's on your mind.
So basically, this is a company that's making a very different bet around what's going to be
important about the future of AI. Or at least, it's not so much that they're betting that things
like coding assistants aren't going to be important. It's just that they think that those social,
human interactive type of elements are also going to be extremely important. And they saw a market
gap in people addressing that particular need. Now, when they say that they are the highest ranked
in their compute class, they're comparing themselves to Google's PaLM 2 model because they were
trained on 5,000 Nvidia H100 GPUs. They write, designed with serving efficiency in mind,
Inflection 2 will soon be powering Pi. Thanks to a transition from A100 to H100 GPUs,
as well as our highly optimized inference implementation, we managed to reduce the cost and increase
the speed of serving versus Inflection 1, despite Inflection 2 being multiple times larger. Now, in terms
of that claim that they are the second most performant model now, they're using the MMLU benchmark,
on which for reference GPT-3.5 scores a 70, and GPT-4 scores an 86.4. Meta's Llama 2 70B model
comes in at 68.9, so right around GPT-3.5. Inflection 1 was a little bit ahead at 72.7.
Grok-1 was just ahead of that at 73. PaLM 2's large model was at 78.3. Claude 2 was at 78.5,
and Inflection 2 came in at 79.6, again, trailing only GPT-4's 86.4.
Now, the other thing to note is that code and mathematical reasoning continued to not be an explicit focus in the training for Inflection 2.
That said, even without it being an explicit focus, they saw distinct improvements from Inflection 1.
Now, of course, in addition to just this being a feather in the cap for Inflection,
it's generated a lot of conversation around the state of the AI frontier model wars.
Professor Ethan Mollick from Wharton writes,
Speculation: Nobody has publicly beat GPT-4 yet. So if OpenAI keeps shipping and there is an AI
learning curve and no diminishing returns to scale, only Google keeps up. But Claude 2 and now
Inflection 2 beat GPT-3.5. Grok is at GPT-3.5 level and those labs are still training.
Now he goes on to clarify, by AI learning curve, I mean it in the organizational and not
AI sense. Are there increasing returns to building AIs either because of a flywheel, i.e. the
AI helps you code the next version, or because of experience, there are tricks you need to know
and you learn them as you go? At some point, there are going to be diminishing returns to training
due to technical limitations on the scaling law, data limitations, economic limitations, or something
else. We just don't know when those hit. Meanwhile, others had a more dismissive and aggressive
line on this. Dylan Patel from SemiAnalysis writes, there are now five models better than Google's
best. Google is an effing joke. xAI's Grok, Inflection 2, Claude 2, GPT-4, GPT-4 Turbo. Where is Gemini?
Well, of course, if you've been following this show, you know that Gemini continues to get delayed,
but in the meantime, Google is at least continuing to add capabilities to their Bard model,
which is available now.
The latest of those is an integration with YouTube that makes it better able to understand
the content in YouTube videos and use that information to handle more complex queries around them.
As The Verge writes, the bot's YouTube integration is getting a handy upgrade so it can
analyze individual videos to surface specific information for you, like key points or recipe
ingredients without ever pressing play.
Now, there are a couple interesting points about this.
One is just the strategy that we've been noticing a lot and a theme more broadly of the period
that we're in of AI integration, in which even as the big labs try to race to ever more
powerful frontier models, practically for consumers, a lot of the more relevant developments
are happening all around us every day. Specifically, it's the way that tools that already are
released are getting integrated into the workflows and systems that we already use. This is a great
example of that, and seemingly Bard's whole strategy is in many ways, being better and more
thoughtfully integrated across the Google suite of tools which people already use.
Now, The Verge author did a little test of this and found that it worked really well.
They tested it on an America's Test Kitchen recipe for an espresso martini and said
Bard got all the critical bits right in summing up the video.
The ingredients and measurements are all accurate, and the instructions are correct.
It even includes the first step of chilling a martini glass by filling it with ice and water.
However, the author also points out an interesting challenge of how generative AI is going to be
potentially problematic for creators. Specifically in this case, the author points out that this recipe
is never actually included in the show notes, and that to get the recipe in written form, you would have
had to go behind a paywall on the America's Test Kitchen website. Now, the reason that they might
feel comfortable with the full recipe in the video is that people tend to watch these recipe videos
over and over again, so presumably they're getting value from advertising from repeated watching.
The author validates that they have to go back and look at this every single time they need it.
The challenge now, as The Verge author writes, is, by having Bard spit out the recipe for me,
I've just skipped the step where I press play, watch a pre-roll ad, and see the channel's other
recommended videos at the end. That's great for me, but probably less good for the publisher
of the video. Less good or not, it is definitely the future that we are headed into.
Now, moving to another big tech company and what they might be doing in AI, today kicks off
Amazon Web Services' re:Invent conference, which is one of their big events of the year.
Indeed, it was at this event last year that they had been planning to announce something that they were then calling Amazon Bedrock, which was a foundation model akin to GPT 3.5.
The problem was that the model just wasn't ready, and so at the last minute, they scrapped plans to make that announcement.
In retrospect, it was a good thing they did because as the event was happening on November 30th, 2022, OpenAI released ChatGPT, which of course went entirely viral.
When Amazon realized that Bedrock was not even close to up to snuff with what ChatGPT offered,
they shelved that product and in fact shifted their strategy around,
and Bedrock became the sandbox through which they help enterprise customers
customize and deploy a variety of AI models instead of the original wholly owned model that it was supposed to be.
However, in recent weeks we've had rumors that Amazon has still been working on an ambitious frontier model
that they codenamed Olympus.
According to Reuters sources, Olympus has 2 trillion parameters,
which would make it double that of the reported one
trillion parameters of GPT-4. Now, Reuters is a pretty good publication for this sort of thing,
and so people have been assuming that maybe we'll get an announcement about Olympus or whatever they
decide to call it publicly at this Amazon re:Invent event. The first keynote with AWS CEO Adam
Selipsky is tomorrow morning, so of course I will be watching that closely. Meanwhile, a few
interesting notes over from the world of geopolitics just to round us out. A group of 18 countries
led by the U.S. and Britain have signed a non-binding agreement with principles around how to make
AI, quote, secure by design. Basically, they are trying to create some consensus around how to keep
AI out of the hands of rogue actors, and a lot of the recommendations, which again, are non-binding,
are about things like how to monitor AI systems for abuse, how to protect data from tampering,
and what sort of testing there needs to be before models are released to the public.
Said the director of the U.S. Cybersecurity and Infrastructure Security Agency, Jen Easterly,
this is the first time we have seen an affirmation that these capabilities should not just be
about cool features and how quickly we can get them to market or how we can compete to drive down
costs. Instead, she said that the guidelines represent, quote, an agreement that the most important
thing that needs to be done at the design phase is security. Now, even as that group of nations was
signing that document, Russian leader Vladimir Putin was identifying AI as the latest battleground with the
West and something that Russia would be pouring more money into. From The Independent, Putin has
claimed that AI models like OpenAI's ChatGPT and Google's Bard chatbot, quote, cancel Russian culture
and that the West holds a dangerous dominance of the technology. He added, our innovation should rest
on our traditional values and the wealth and beauty of the Russian language. And again, even as all of that
is happening, we're hearing more about the Pentagon's continued work on AI in autonomous weapons
as part of the initiative that has been dubbed Replicator. Back in August, Deputy Secretary of Defense
Kathleen Hicks said that Replicator seeks to, quote, galvanize progress in the too slow shift
of U.S. military innovation to leverage platforms that are small, smart, cheap, and many.
Overall, I think one of the most fascinating aspects of the AI wars right now is the extent to which
they are playing out not just in the big tech labs, but in the halls of the most powerful militaries
in the world. And of course, in many ways, those concerns should probably be just as top of mind
as any others that we have around AI, given that there are no signs of that slowing down anytime
soon. However, for now, that is going to do it for today's AI Breakdown Brief. Next up, the
main AI Breakdown.
Welcome back to the AI Breakdown. After a few days off and the Thanksgiving
holiday for those of you in the U.S., we're coming back to add a little capstone to the story of
the saga of OpenAI and the firing and rehiring of Sam Altman that happened last week.
Now, one of the things that was very confusing throughout the entire event was what possibly could
have spooked the board so much that they would have made this move. People assumed at first that
it must have been something dramatic, either a major breach of ethics in some crazy way that was
destined to come out very soon, or some incredible technical advance that just scared the crap
out of the board, which they didn't think that Sam was handling well. People went into overdrive
trying to figure out what that advance might be.
They were reading the tea leaves of different comments and appearances that Sam had made
in the weeks leading up to the event, including one where he himself suggested that there
had been some major breakthrough recently that represented a significant advance in what AIs
could do.
And yet, at the same time, over the first weekend of the whole drama, when former Twitch
CEO Emmett Shear was appointed as the new interim CEO, his announcement about taking the position
made it clear that the board had not fired Sam because of specific safety disagreements,
but more around broader questions of trust and communication. This seemed really strange,
as the safety explanation was the only one that kind of made sense. Now, the idea that it wasn't
about safety seemed to be reinforced when chief scientist Ilya Sutskever switched sides once again
and said that he was sorry for his role in the firing of Sam Altman, and then even went so far as
to sign the letter demanding his return. And ultimately, when all was said and done, and we
got the new board structure,
we got Sam and Greg agreeing to not be on the initial board,
we got the replacements in Bret Taylor and Larry Summers, as well as the continuity in Adam
D'Angelo,
there was still no good explanation of what had actually happened.
In fact, not only was there no good explanation, we had learned that quietly behind the scenes,
that very same Emmett Shear had demanded an explanation and said that if he couldn't get one,
he would quit.
This is all part of what led OpenAI to bring Sam back.
Well, then right around the holiday on Wednesday night
into Thursday, we got several reports that indeed there had been an AI breakthrough, and that was a
big part of the reason for Sam Altman's firing. Reuters wrote, OpenAI researchers warned
board of AI breakthrough ahead of CEO ouster, sources say. The piece reads, ahead of OpenAI CEO Sam Altman's
four days in exile, several staff researchers wrote a letter to the board of directors, warning of a
powerful artificial intelligence discovery that they said could threaten humanity, two people familiar
with the matter told Reuters. The sources cited the letter as one factor
among a larger list of grievances by the board leading to Altman's firing, among which were
concerns over commercializing advances before understanding the consequences. Reuters was unable to review a
copy of the letter. The staff who wrote the letter did not respond to requests for comment.
And then it gets even crazier. Continuing, after being contacted by Reuters, OpenAI, which declined
to comment, acknowledged in an internal message to staffers a project called Q* and a letter to the board
before the weekend's events. An OpenAI spokesperson said that the message, sent by longtime executive
Mira Murati, alerted staff to certain media stories without commenting on their accuracy.
Some at OpenAI believe Q* could be a breakthrough in the startup's search for what's known as
AGI. Given vast computing resources, the model was able to solve certain mathematical problems,
and though only performing math on the level of grade school students,
acing such tests made researchers very optimistic about Q*'s future success.
Reuters could not independently verify the capabilities of Q* claimed by
their researchers. The Information had a similar report. They added, the technical breakthrough spearheaded
by OpenAI chief scientist Ilya Sutskever raised concerns among some of the staff that the company
didn't have proper safeguards in place to commercialize such advanced AI models. In the following
months, senior OpenAI researchers used the innovation to build systems that could solve basic math
problems, a difficult task for existing AI models. Jakub Pachocki and Szymon Sidor, two top
researchers, used Sutskever's work to build a model called Q* that was able to solve math
problems that it hadn't seen before, an important technical milestone. A demo of the model
circulated within OpenAI in recent weeks, and the pace of development alarmed some researchers
focused on AI safety. The Information also took the story farther into explaining the different
sides in the issue. Basically, they argued that Ilya was concerned about his breakthrough,
but that these two researchers who had taken and run with it had different feelings, which is why
they exited the company so quickly after Altman was fired. The Information also said that Greg
Brockman had been working to integrate the new Q* technique into additional products as well.
Now, going back to Reuters, here's how they described where the concern might come from.
They write, researchers consider math to be a frontier of generative AI development.
Currently, generative AI is good at writing and language translation by statistically predicting
the next word, and answers to the same question can vary widely.
But conquering the ability to do math, where there is only one right answer, implies
AI would have greater reasoning capabilities resembling human intelligence.
This could be applied to novel scientific research, for instance, AI researchers believe.
In their letter to the board, researchers flagged
AI's prowess and potential danger, the sources said without specifying the exact safety concerns
noted in the letter. Researchers have also flagged work by a, quote, AI scientist team,
the existence of which multiple sources confirmed. The group, formed by combining earlier Code Gen
and Math Gen teams, was exploring how to optimize existing AI models to improve their reasoning
and eventually perform scientific work. So what it seems like to me taking a step back is that
there clearly have been technological breakthroughs at OpenAI. I don't think this should be
surprising to any of us, unless we were assuming that somehow they were just sitting on their hands,
not working on more advanced research ever since GPT-4 came out, which clearly wasn't the case.
Now, as it seems, the nature of those updates divided the company on questions of safety and ethical
development. Again, this is not particularly surprising. In fact, one could argue that having
robust debate inside the company is a potential guardrail on these technologies. Now, at the same
time, what doesn't seem to be the case, and again, this is from a completely uninformed
outside observer who's just watching what's available and what's trickled out, is that
these technological advances were such a breakthrough in and of themselves
that they were for sure some sort of inflection point or frontier crossing moment.
What I come back to is the fact that Ilya switched sides at the end of the day.
Now, to the extent that these developments were part of the reason that he wanted to help
create and then run the Superalignment team that started over the summer,
that's an indication that he took it very seriously.
And clearly there is tension internally around how Sam specifically is thinking about the commercialization
of these technologies and the speed with which they get deployed.
But ultimately, the fact that Ilya came back suggests to me that we are still just on
another step on this continuum towards AGI, not as some had speculated that we suddenly
had AGI sitting in the closet just waiting to be released.
Now, adding more intrigue to all of this, after these stories came out in Reuters and The
Information, The Verge published a response
that actually went back on some of this.
Quote,
after the publishing of the Reuters report,
which said senior exec Mira Murati told employees
that a letter about Q*
precipitated the board's actions
to fire Sam Altman last week,
OpenAI spokesperson Lindsey Held Bolton
refuted that notion in a statement shared with The Verge.
Mira told employees what the media reports were about,
but she did not comment on the accuracy of the information.
Separately, a person familiar with the matter told The Verge
that the board never received a letter about such a breakthrough
and that the company's research progress didn't play a role in Altman's
sudden firing. So here we have The Verge with their own source saying that the board never
got that letter and that that wasn't part of the firing, and also a refutation of the idea
that even if that letter did exist, it was what precipitated this dramatic action. Now,
in the meantime, speculation around exactly what Q* is has just been absolutely rampant.
On the OpenAI community boards, Rainy 107-577 writes,
The cat's out of the bag. Reuters published. Any interpretations, any knowledge files out there on the
subject? Others responded. M4 Calic writes, definitely makes me question Sam's motives and puts the
recent drama in a different light. This is moving towards more existential questions faster than anyone
imagined, and I'd rather not have Microsoft, Larry Summers, and the ex-CEO of Frickin' Salesforce making
the calls whether or not something is AGI. Now, on the flip side, QRDL writes,
as someone who's done a fair amount of ML and AI research, I can tell you that it is very, very
easy to think you've discovered a breakthrough. There's a great deal of cognitive bias in AI and you have
to falsify very aggressively. I am deeply skeptical. It's also worth noting that in the news today,
we found out that the $86 billion share sale is back on. I'm sure this quote-unquote breakthrough
will get investors quite interested. Now, the implication there, of course, is that in the wake
of all of this chaos, the team at OpenAI had been trying to right the ship when it came to the
tender sale, where employees were going to be able to cash out some of their stock at an $86 billion
valuation. And the skepticism in that community post is, of course, that they have very strong
incentives to get investors excited again, which might have led to the confirmation of the leaks of
Q*. Now, others tried to speculate around what Q* might be. Samuel Hammond wrote,
I discussed Q-Transformers and Q-learning as one of the more promising areas of AI research.
The news that OpenAI's breakthrough involves something called Q* suggests it's related.
Q-learning is a class of reinforcement learning and not new. However, there's been recent progress
in combining Q-learning with Transformers in LLMs. Tesla uses deep Q-learning for self-driving, for example.
There's even speculation that Google's long-awaited Gemini model employs a version of it.
Q* refers to the optimal action-value function.
Finding Q* involves training an agent to take actions that maximize its cumulative reward, given its environment.
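For readers following along in text, the term being quoted here has a standard textbook meaning in reinforcement learning. The block below is that general definition only and says nothing confirmed about OpenAI's system. The symbols are the usual conventions rather than anything from the episode: s is a state, a an action, r_t the reward at step t, γ a discount factor between 0 and 1, π a policy, and s' the next state.

```latex
% Optimal action-value function: best expected discounted return
% obtainable after taking action a in state s.
Q^{*}(s, a) = \max_{\pi} \, \mathbb{E}\left[ \sum_{t \ge 0} \gamma^{t} r_{t} \;\middle|\; s_{0} = s,\ a_{0} = a,\ \pi \right]

% Bellman optimality equation satisfied by Q^{*}.
Q^{*}(s, a) = \mathbb{E}\left[ r + \gamma \max_{a'} Q^{*}(s', a') \;\middle|\; s, a \right]
```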
OpenAI has a team working on reasoning and planning, so it was inevitable that they'd pivot back to reinforcement learning.
This could be what spooked the board, as all the scariest Eliezer Yudkowsky-style scenarios involve reinforcement learning in some form or another.
Q-learning is a model-free approach to reinforcement learning, as it can work even if the environment is complex and randomly changing,
rather than requiring a set of well-defined rules like chess.
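As a rough illustration of what model-free means in this description, here is a minimal Python sketch of tabular Q-learning on a made-up five-cell corridor. Everything in it is a hypothetical toy chosen for illustration (the environment, the `step` and `train` names, the learning rate and discount) and it has no connection to whatever OpenAI's Q* actually is. The point is that the learner only ever sees sampled transitions and rewards, never the rules inside `step`.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (toy assumption)
ACTIONS = (-1, +1)    # action index 0 = step left, index 1 = step right
GAMMA = 0.9           # discount factor
ALPHA = 0.5           # learning rate


def step(state, action_idx):
    """Toy environment. The learner never reads these rules, only the outputs."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # One row per state, one estimated return per action.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Purely random exploration: Q-learning is off-policy, so it can
            # still learn the values of the greedy policy from random behavior.
            action = rng.randrange(len(ACTIONS))
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next-state value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action for every non-goal cell.
greedy_policy = [
    max(range(len(ACTIONS)), key=lambda a: q_table[s][a])
    for s in range(N_STATES - 1)
]
print(greedy_policy)
```

After training, the greedy policy should pick step right (index 1) in every non-goal cell, since that is the shortest route to the only reward in this toy corridor.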
Q-learning is popular for single-agent games, as by default, it models other agents as simply
features in its environment to navigate around rather than as distinct agents with their own
internal states. Note, this is also the basic definition of sociopathy. If OpenAI has made
major strides in giving their transformer models a Q to optimize for, that would explain
what Sam meant when he said today's GPTs, their quasi-agents, would soon look quaint.
Finding Q* is equivalent to having the best possible Markov decision process. In other
words, no matter what life throws your way, you always find a way to win. Back on that community
forum, Waylon Lab said something similar, writing, Q-learning is an algorithm that helps an agent
learn the best actions to take in a given state to maximize a reward. That's it, pretty much. Now,
I'm not going to read the whole piece because it's like a thousand words, but Dr. Jim Fan from
NVIDIA writes, in my decade spent on AI, I've never seen an algorithm that so many people fantasize
about, just from a name, no paper, no stats, no product. So let's reverse engineer the Q*
fantasy. Ada Pi also did their own speculation. Q*, they write, can solve grade school math
problems. So what? What's the big deal? Why did researchers send the letter? Why did Ilya freak out?
Because, they continue, OpenAI already has a stack where they can predict intelligence based on
compute and data. They disclosed this with the GPT-4 release. Q* is a relatively small model,
solving grade school math problems reliably and consistently using reasoning. It's an algorithmic
breakthrough. And they have the scaling graphs to predict the intelligence increase from data
and compute, so they know they can probably solve elite human and beyond human problems.
They don't have AGI yet, but they have a near-term roadmap. Now the question is whether to go down
the path or not. Sam's reaction to this was to raise tens of billions of dollars to get a Jony Ive
device into people's hands and build data centers everywhere to deploy this tech as soon as possible.
Now, very notably, Ada Pi followed this with the source I made it up meme, reinforcing that this
is all speculation and just one of the scenarios that could explain all of this.
Bindu Reddy writes something similar. The doomsday scenario
for GPT-5 in OAI, she writes.
So Q* is generalizing and is pretty good at high school level math.
Q* is harmless, but GPT-5 will be way more powerful at logical reasoning, problem solving,
and code generation.
It's being trained and evaluated on OAI superclusters.
Part of the evaluation involves generating and executing code.
It could generate a rogue piece of code that could hack into their supercomputer and then
begin to take over the internet, some sort of computer virus.
While this is far-fetched, it is possible.
After all, you are giving a very powerful model access to a very powerful supercomputer.
If anything, these are the type of doomsday scenarios you will see with these super LLMs.
The good news is that they might cause some disruption but won't pose anywhere near an
x-risk to civilization.
Now, there are also even more rumors about an OpenAI model cracking something called
AES-192 encryption, although I've seen no indications of any sort of confirmation of that,
so I'll stay out of it, and will only add it as a note in this piece to recognize that the
whole thing around this OpenAI dust-up, and now the Q* explanation, is that it's
causing people to play out in a bigger way the scenarios of what happens if AGI is a lot closer
than we think. Later in the week, we'll maybe get to Yann LeCun from Meta's response to all of this.
But for now, the story stays very interesting and very relevant, and I will do my best to stay
on top of it. Thanks for listening or watching as always, and until next time, peace.
