The AI Daily Brief: Artificial Intelligence News and Analysis - The Latest in AI + Humanoid Robots
Episode Date: August 8, 2024
NLW looks at the Figure 02 launch and the latest competition in humanoid robots. Plus rumors of a new OpenAI model. Concerned about being spied on? Tired of censored responses? AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit https://venice.ai/nlw and enter the discount code NLWDAILYBRIEF. Learn how to use AI with the world's biggest library of fun and useful tutorials: https://besuper.ai/ Use code 'podcast' for 50% off your first month. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, the latest in AI and humanoid robotics.
Before that in the headlines, are we getting a new OpenAI model soon?
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines edition,
all the daily AI news you need in around five minutes.
We kick off today with a fun one, something that has gotten the Twitter AI sphere chattering.
A new model has appeared in the LMSYS Arena that appears to be some new version of GPT-4.
It was first noticed last night and is named Anonymous Chatbot.
Now, you might remember that the last time that we got a new OpenAI model, it appeared
first in this sort of semi-anonymous form on the LMSYS Arena, and so people are already
speculating around what this might represent.
When OpenAI leaker Jimmy Apples asked, what model are you?
The anonymous chatbot responded, I'm based on OpenAI's GPT-4 architecture.
Specifically, you're interacting with a version of GPT-4 that has been fine-tuned for chat-based interactions.
This model is designed to understand and generate human-like text based on the input it receives.
Now, of course, that could be a hallucination.
It's no guarantee that that actually is what this model represents, but that's at least what it's self-reporting.
Speculation is that it might be something called Q-Star.
As the whole Sam Altman controversy was going down last November,
Reuters and others reported that Q-Star was a new, more advanced reasoning model,
and there was lots of speculation that safety concerns around Q-Star were part of what the rift was inside the organization.
Later, of course, all the parties involved would say that it wasn't about a safety issue, although there are still many who just don't believe that.
About a month ago, Reuters once again reported on the latest on Q-Star, although it was now codenamed Strawberry.
The information came from an internal OpenAI document that was seen by Reuters in May.
Quote, the document describes a project that uses strawberry models with the aim of enabling the company's AI to not just generate
answers to queries but to plan ahead enough to navigate the internet autonomously and reliably to
perform what OpenAI terms deep research. Two other Reuters sources had viewed demos of what OpenAI
staffers told them was Q-Star, with the main difference being that it was able to answer science
and math questions outside of the capacity of models like GPT-4. So back to this new model, another
frequent AI leaker, Flowers, said anonymous chatbot was able to solve all my test puzzles on the first
attempt. Another X user, Golden Hawk, said it can finally solve the river puzzle without some crazy and
wrong solution. The river puzzle asks, if a man and a dog are on one side of a river and have a boat that
can fit a human and an animal, how do they get to the other side? This has caused some trouble for existing
LLMs, but anonymous chatbot seemed to have no problem. For example, Llama 3.1 70B has this long
convoluted answer that ends up with the man taking two trips across the river, whereas anonymous
chatbot answered, this puzzle is pretty straightforward since the boat can fit
both the man and the dog at the same time. The man and the dog simply get into the boat together,
and the man rows them across the river. Once they reach the other side, they both get out of the boat.
That's it. Flowers responded, wow, that's great. I just hope they didn't intentionally
overfit it to the usual memes because they knew these were the first things we would test.
The model isn't perfect. When Feltsim asked, a farmer is on one side of a river with a wolf, a goat,
and a cabbage, he can only take one item at a time with him. The wolf will eat the goat if left
alone together and the goat will eat the cabbage if left alone together. How can the farmer transport
the goat across the river without it being eaten? The model does ultimately get the right answer, but in a very
convoluted way. Anyway, all of this is to say that the most important part of this discussion is that there
appears to be another new OpenAI model being tested right now, which, as Andrew Curran points out,
suggests that something new is on the way. Next up, staying on OpenAI for a minute, the company
has announced a much-requested feature for developers called Structured Outputs.
They write, last year at DevDay, we introduced JSON mode, a useful building block for developers
looking to build reliable applications with our models.
While JSON mode improves model reliability for generating valid JSON outputs, it does not
guarantee that the model's response will conform to a particular schema.
Today, we're introducing Structured Outputs in the API,
a new feature designed to ensure model-generated outputs will exactly match JSON Schemas
provided by developers.
They continue, generating structured data from unstructured inputs is one of the core use cases
for AI in today's applications.
Developers use the OpenAI API to build powerful assistants that have the ability to fetch
data and answer questions via function calling, extract structured data for data entry, and build
multi-step agentic workflows that allow LLMs to take action.
Developers have long been working around the limitations of LLMs in this area via open-source
tooling, prompting, and retrying requests repeatedly to ensure that model outputs match the formats
needed to interoperate with their systems.
Structured Outputs solves this problem by constraining OpenAI models to match developer-supplied
schemas and by training our models to better understand complicated schemas.
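To make the difference between JSON mode and Structured Outputs concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not OpenAI's implementation: the calendar-event schema, the sample replies, and the tiny `conforms` checker are all hypothetical, and the `RESPONSE_FORMAT` dictionary reflects our understanding of the request shape described in the announcement, so check the current API documentation before relying on the field names. The sketch makes no network call. It only shows that a reply can be perfectly valid JSON and still miss the schema a developer needs.

```python
import json

# Hypothetical schema for illustration: a calendar event extracted from text.
EVENT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "date", "participants"],
    "additionalProperties": False,
}

# Assumed shape of the request parameter that asks for schema-constrained
# output; field names should be verified against the current API docs.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "calendar_event", "strict": True, "schema": EVENT_SCHEMA},
}

# Only the JSON types used by the schema above are supported here.
_JSON_TYPES = {"string": str, "array": list, "object": dict}


def conforms(value, schema):
    """Tiny checker covering only the schema features used in EVENT_SCHEMA."""
    if not isinstance(value, _JSON_TYPES[schema["type"]]):
        return False
    if schema["type"] == "array":
        return all(conforms(item, schema["items"]) for item in value)
    if schema["type"] == "object":
        props = schema["properties"]
        # Every required key must be present.
        if any(key not in value for key in schema.get("required", [])):
            return False
        # Reject unknown keys when additionalProperties is false.
        if schema.get("additionalProperties") is False and any(k not in props for k in value):
            return False
        return all(conforms(value[k], props[k]) for k in value if k in props)
    return True


# Valid JSON, which is all JSON mode promises, but with the wrong keys.
json_mode_reply = '{"event": "Science fair", "when": "Friday"}'
# A reply that matches the developer-supplied schema.
structured_reply = '{"name": "Science fair", "date": "Friday", "participants": ["Alice", "Bob"]}'

print(conforms(json.loads(json_mode_reply), EVENT_SCHEMA))   # False
print(conforms(json.loads(structured_reply), EVENT_SCHEMA))  # True
```

Before this feature, a check like `conforms` followed by a retry loop is roughly the kind of workaround the announcement describes developers building for themselves.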
Now, this is a feature that developers I've seen are very excited about.
And interestingly, although it is a technology advance,
in some ways it also matches the sort of product and user experience development
that we're seeing happening elsewhere as well.
Consider a product feature designed to improve the experience of using Claude.
It's not some big technological advance.
It just makes it a better tool because it separates the output panel from the instruction
panel in a way that makes it a lot easier to use.
This is not exactly the same, but it's also not totally different.
This basically takes a use case that OpenAI understands and modifies the existing experience just to better match it.
Now, there is some technological advance here, so like I said, it's not exactly the same, but you get it.
We're moving into an era where there is more product consideration for specific use cases for LLMs rather than just a big blinking chatbot window.
Now, almost buried in that announcement, there was also a significant price drop.
Steven Heidel from OpenAI said, oh yeah, nearly forgot to mention.
The new 4o version with structured outputs is 50% cheaper for input, 33% cheaper for output,
and is available immediately, no betas, previews, or waitlists. So this, of course, gets at two recent
critiques of OpenAI. The perpetual question of cost, which, to be fair to them, has been coming
down precipitously for some time now. And second, the fact that recently, they'd become one of those
companies that announces things before they're available, leading to some chagrin among users.
Interestingly, reports also came out today that back in 2017, Intel had a chance to buy a stake
in OpenAI. Reuters writes that over a several-month period in 2017 and
2018, executives at OpenAI and Intel had discussed a variety of different arrangements, including
Intel buying a 15% stake for a billion dollars in cash. They also discussed apparently Intel getting
an additional 15% if it made hardware available for OpenAI at cost. And here we get the problem
with ROI-based thinking. Intel ultimately decided against the deal, partly because then-CEO
Bob Swan did not think generative AI models would make it to market in the near future and thus
repay the chipmaker's investment. Now, I'm no mathematician, but if you look at 30%
of the roughly $80 to $90 billion valuation that OpenAI commands right now, or about $24 to $27 billion,
you'll see the problem of having too short a time horizon when you're making decisions around
technology. But then again, one of the big stories that we keep coming back to
is the tension between public market dynamics and private venture capital-style thinking.
There is an incredible friction there that is leading to lots of weirdness in this space as it
evolves. A couple more today before we get out of here. Audible is testing a new AI-powered search feature.
Audible has a new personal recommendation expert they call Maven, through which a user can use
natural language to enter queries. The example that TechCrunch gives is I'm looking for an
uplifting fiction novel with a female protagonist. For me, it would probably be something like,
I'm looking for a historical thriller that involves the Templars or the Vatican's entity.
One thing to watch for when it comes to figuring out where in the hype cycle we are:
when companies can no longer get news stories just for launching an AI feature, you will know we
have entered a different phase. So far, we're not there yet.
Case in point, Reddit is the latest to add AI-powered summaries as part of search.
During an earnings call on Tuesday, CEO Steve Huffman said that Reddit would begin testing
AI-powered search results to summarize and recommend content.
I am much less interested in the specifics for Reddit and much more interested in this
broader trend of how search is being reorganized.
It's something we've been watching in the context of Perplexity, Google AI Overviews,
and more recently SearchGPT, and it seems to be, at least for the moment, a broader shift.
For now, though, that is going to do it for today's AI Daily Brief Headlines edition.
Next up, the main episode.
Today's episode is brought to you by Venice.
The leading AI companies store your entire conversation history and attach it to your identity forever.
That's every question you ask, every answer you receive, every image you generate, every thought you share with the machine, it's all being spied on.
If you trust all the companies, hackers, and NSA board members that will ever have access to your AI conversations, then rejoice.
For you are well served.
For the rest of us, Venice is an alternative.
Venice is a powerful AI app for text, image, and code generation that respects
you as a sovereign individual and believes privacy and free speech are not only human rights,
but necessary for civilizational advancement. Private, permissionless, and uncensored, you can try it for
free without an account. AI Daily Brief listeners receive a 20% discount on Venice Pro.
Visit venice.ai/nlw and enter the discount code NLWDAILYBRIEF. That's NLWDAILYBRIEF,
all one word.
Today's episode is brought to you by Superintelligent, the platform that
helps teams maximize AI. Super is, of course, the platform that we've been building
that pairs fun, fast, practical video tutorials with step-by-step instructions to get you actually
using AI, and from there unlocks information about hundreds of use cases that show you how people
are actually getting value out of AI right now. Now, we have just launched Super for teams. This is a new
add-on experience that allows teams to share more information about what AI they're using, how it's
working, and how to get more value out of it. Whether your company is 25 or 2,500,
Superintelligent is going to be the best platform for unlocking information about how to get the most out of AI right now.
If you'd like to learn more about the Superintelligent teams offering, go to besuper.ai/partner
and send us a note so that our team can get right back to you.
Once again, that's besuper.ai/partner.
Welcome back to the AI Daily Brief.
Today we are talking about the latest in AI-powered robots. Specifically,
Figure has announced its latest robot, the Figure 02.
Now, for us here at the AI Daily Brief, this is obviously a related but separate area from what
we normally focus on. However, we are likely also living through the last period in which
AI isn't fully integrated with humanoid robots, which could have as much of an impact on
work as anything we talk about here separately. Right now, two of the most discussed products
when it comes to humanoid robots are the Tesla Optimus and the Figure 02. So what is new about
Figure 02? Figure CEO Brett Adcock writes, Figure 02 was a ground-up
redesign to achieve six times the cameras, 50% more battery, fourth generation hands, integrated
wiring, exoskeleton structure, and speech-to-speech reasoning. Now, speech-to-speech reasoning is the
big thing that has been called out here. Adcock writes, Figure 02 is capable of speech-to-speech
conversations with humans, on-board mics and speakers connected to custom AI models trained in
partnership with OpenAI. The default UI to our robot will be speech. This is something we've been
discussing a lot, both on this show, as well as within the Superintelligent community, of how much
the default UI for our interactions with AI in general will be moving to speech and voice.
Adcock shared a graphic that shows, on a very high level, how Figure works.
When someone asks, for example, can I have something to eat?
The Figure 02 takes advantage of an integrated system that includes OpenAI's model,
which allows for common sense reasoning from images,
neural network policies that enable fast, dexterous manipulation,
and a body controller, which together let it say, sure thing, here's an apple,
and then hand someone an apple.
Adcock also writes that Figure 02 has an onboard
vision language model. This enables semantic grounding and fast common sense visual reasoning from
robot cameras. Basically, the VLM is how the robot takes in visual stimuli and interacts with it.
The big thing about the battery is that they're framing it in terms of what it can actually do
from a work perspective. Adcock writes, we hope Figure 02 will be able to achieve upwards of
around 20 hours of useful work per day, which is obviously totally transformative if the Figure
can actually mirror things that are done by humans today. One of the big shifts from the Figure
01, they say, is the exoskeleton structure. Adcock writes, in order to provide structural
stiffness and protect against crash loads, Figure 02 was designed as an exoskeleton structure,
similar to aircraft where the outer skin bears the load. Now, you might also remember that back
in February, Figure raised a $675 million Series B. OpenAI was a partner and investor in that deal,
and that partnership has clearly expanded in the time since. Alongside the announcement,
the company also shared more details of its test with humanoid robots at a BMW
plant in Spartanburg. BMW writes, during a trial run lasting several weeks at BMW Group
Plant Spartanburg, the latest humanoid robot, Figure 02, successfully inserted sheet metal parts
into specific fixtures, which were then assembled as part of the chassis. The robot must be
particularly dexterous to complete this production step. BMW says, using a robot can save employees
from having to perform ergonomically awkward and tiring tasks. The takeaway for many is not so much
that the Figure 02 is completely production ready, but that it is actually at the stage of testing in a
real-world environment. Another big topic of conversation on Twitter, at least, is the competition with
Tesla's Optimus. I personally think that this is a little bit overstated, mostly because I think that
the total addressable market for humanoid robots is going to be so massively immense that, while
yes, there will be a first mover advantage to some extent, there is going to be a lot of market share
to go around. For Tesla investors, though, it might be seen more as a bellwether for how the company
is doing in general. Danish on X writes, Tesla Optimus has serious competition
now. As expected, Tesla is now under significant pressure from a more focused and better-performing
competitor, Figure. I spoke about this last year, including the importance of visionary
leadership and good governance. Well, here is the result. Figure is backed by Nvidia and OpenAI
who are betting that this will be the first real embodied agent. Now, one other interesting note in
the hardware space: it's not exactly robotics but seems somewhat related. OpenAI has made a $60 million
investment into Opal. In fact, the company is leading Opal's Series B.
Opal makes a plug-and-play webcam, which promises quality similar to a DSLR.
It's what you see when I'm on camera in these videos, or, for example, on the videos on Superintelligent.
The Information writes, OpenAI's involvement in the funding round is surprising.
Opal is best known for its $300 professional-grade webcams, not an obvious match for an LLM developer.
But they write, Opal plans to develop other types of devices powered by OpenAI's AI models
while it continues to sell its webcams.
The three-year-old startup envisions developing devices that individuals can use as creative tools
rather than AI-powered friends or companions.
Opal will be working closely with OpenAI researchers
to prototype various device ideas.
It is very clear from all of Altman and OpenAI's moves
that they think that there is going to be an entire new generation
of AI-powered hardware, AI-powered devices, and AI-powered robots,
and it's a future which appears to be materializing rapidly.
For now, though, that is going to do it for today's AI Daily Brief.
Until next time, peace.
