The AI Daily Brief: Artificial Intelligence News and Analysis - The State of AI for Robotics
Episode Date: March 14, 2025
Google just released Gemini Robotics, a powerful new AI built for humanoid robots, taking robotics beyond simple tasks like opening doors to handling complex actions like folding origami or packing groceries without specific training. Companies like Figure AI, NVIDIA, and startups Unitree, Dexterity, and Apptronik are also pushing ahead.
SPECIAL OFFER
To get your ready-to-go agent from https://www.lindy.ai/ email nlw@besuper.ai with the word "LINDY" in the title
Brought to you by:
KPMG - Go to https://kpmg.com/ai to learn more about how KPMG can help you drive value with our AI solutions.
Vanta - Simplify compliance - https://vanta.com/nlw
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, Google's new model for embodied AI.
Before that in the headlines, more information on Google's investment in Anthropic.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, follow the Discord link in our show notes.
We kick off today with a report from the New York Times around Google's relationship with Anthropic.
The headliner statistic was that documents obtained by the New York Times show that Google owns about 14% of Anthropic.
Now, of course, we knew that Google had been
an investor in Anthropic, so that's nothing new. Instead, this just gives a little bit more of a
background picture around one of these very interesting deals that is, quite frankly, novel to the
AI space. OpenAI's deal with Microsoft set the template for this, and the catalyst for it is the fact
that AI needs so much money that the traditional venture capital establishment, which kind of
taps out at a billion or two billion dollars usually, just couldn't keep up with the demand for
tens of billions of dollars of capital. That effectively left the frontier labs with
only one choice: to team up with one of the big tech giants. Part of the reason that the news media
is so interested in this is that it's caught up in the Google antitrust case. You might remember that
back in August, a federal court found that Google had acted as a monopolist in Internet search,
and the Justice Department has made a set of proposals around how to remedy the situation,
including forcing Google to sell any AI products that could possibly compete with search. That puts their
relationship with Anthropic, whose Claude chatbot is used as a form of search by some, squarely in the
crosshairs. Now, Anthropic has argued that Google should not be forced to divest. They said that a forced
divestment would, quote, harm both Anthropic and competition more generally. They said that it would
depress Anthropic's value and hinder its ability to raise capital. Ultimately, this is just another
interesting artifact in what is a fast-changing financial landscape alongside the AI startup scene.
Speaking of the fast-changing AI startup scene, a company that has gotten more attention than just
about any other over the last week or two is, of course, the AI agent startup Manus. Well, that company has now
announced that it's teaming up with Alibaba to be officially able to launch their product in China.
In a statement, they said that they were engaging in strategic cooperation with Alibaba's
Qwen team to, quote, meet the needs of Chinese users.
Basically, the deal right now is that if you are releasing an artificial intelligence product
for the Chinese market, you have to work with a Chinese AI company.
This is why, for example, Apple hasn't released even their basic Apple Intelligence features in the
country because they've been working to finalize that set of partnerships.
Given the excitement around Manus right now, T. P. Huang captured a lot of the sentiment when they wrote,
Alibaba Cloud will need a whole lot more compute.
Speaking of Alibaba, that company has also released a new AI model they're calling R1-Omni,
firmly in the line of great, memorable AI model names, which they claim can read human emotions.
The team published demos that showed the functionality in interpreting video inputs.
In the video, a man in a brown jacket stands in front of a vibrant mural.
His facial expression is complex with wide eyes, slightly open mouth, raised eyebrows, and furrowed
brows, revealing surprise and anger. Speech recognition technology suggests his voice contains words
like "you," "lower your voice," and "freaking out," indicating strong emotions and agitation.
Overall, he displays an emotional state of confusion, anger, and excitement.
While the specific use cases haven't been articulated for this, Bloomberg suggested it could
be a way for Alibaba to keep up with OpenAI's GPT-4.5. On launch, OpenAI had said that their new
model had, quote, a better understanding of what humans mean and interprets subtle cues or implicit
expectations with greater nuance and EQ. Lastly, today, beleaguered Intel has announced a new CEO,
renewing hopes, at least among some, that the struggling company could be revived.
Three months ago, Pat Gelsinger was fired as CEO after a four-year stint. He was installed at the
head of the company in 2021 with a mandate to rationalize the business and turn things around.
By the time he was ousted in December, however, it looked as though the once-great U.S. chipmaker
was going to be sold off for parts. A few months went by with various merger and acquisition rumors.
There were even reports that the Trump administration was pushing a shotgun arrangement with TSMC,
which would take over the chipmaking foundries. The board, though, has now named Lip-Bu Tan as the new
CEO. Tan is a 40-year veteran tech investor and had served on the board since 2022. He resigned from the
board last year, reportedly due to disagreements on how to turn the company around. And when he did
resign, that left the board with a sum total of zero members with any experience in the semiconductor industry.
Now at the helm, Tan will be allowed to put his recovery plan into action. In a statement, he wrote,
Together, we will work hard to restore Intel's position as a world-class products company,
establish ourselves as a world-class foundry, and delight our customers like never before.
Following the appointment, though, news broke that the TSMC takeover plan is still alive.
TSMC has pitched NVIDIA, AMD, and Broadcom on taking shares in a joint venture that would
operate Intel's foundries.
TSMC would take the lead role in operating the business, but would not own more than 50%
of the joint venture.
This would help ameliorate concerns from the Trump administration about a foreign company
owning critical U.S.-based chipmaking facilities.
According to Reuters, Intel board members have backed a deal
and held negotiations with TSMC, while some executives are firmly opposed.
We'll have to see if that goes through, but overall, Wall Street likes the deal,
Wall Street likes the new appointment, with Intel stock up 11% in overnight trading.
That's going to do it, however, for today's AI Daily Brief Headlines edition.
Next up, the main episode.
We talk a lot about agents on this show.
But if you've ever thought to yourself, I don't want to talk about agents
anymore. I just want to actually build and deploy something. I'm really excited to share something
special with you today. We've partnered with Lindy to offer companies that just want to dive into the
deep end of agents a way to get their feet wet, a way to move fast and build something meaningful
without breaking the budget. The first five companies that email me, nlw@besuper.ai,
with Lindy in the title, will have access to work with Lindy to build an actual functional
agent serving their specific needs for under $20,000.
Some of the agents you can build include a customer support agent, maybe automating responses
on your website.
You could build an SDR for generating or qualifying sales leads, or you could build an agent
that's perfectly suited for your internal communications needs, be it note-taking, scheduling,
or something else.
Not only is Lindy structured to integrate with all of the places that you already keep data
and information.
It's also a fully extensible platform, which means as you hire more and
more agent employees and really build out your digital workforce,
Lindy's going to enable those agents to be interoperable
and basically be able to work together in a seamless way.
So again, if you are interested in diving in all the way to agents,
in a matter of weeks, not months, not years,
email me, nlw@besuper.ai, put Lindy in the title,
and let's get your first digital employee online.
Today's episode is brought to you by Vanta.
Trust isn't just earned, it's demanded.
Whether you're a startup founder navigating your
first audit or a seasoned security professional scaling your GRC program, proving your commitment
to security has never been more critical or more complex. That's where Vanta comes in.
Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks
like SOC 2 and ISO 27001. Centralize security workflows, complete questionnaires up to 5x faster,
and proactively manage vendor risk. Vanta can help you start or scale up your security program by
connecting you with auditors and experts to conduct your audit and set up your security program
quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back, so you can
focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory,
who use Vanta to manage risk and prove security in real time. For a limited time, this audience gets
$1,000 off Vanta at vanta.com slash NLW. That's V-A-N-T-A dot com slash NLW for $1,000 off.
Hey listeners, are you tasked with the safe deployment and use of trustworthy AI?
KPMG has a first of its kind AI Risk and Controls Guide,
which provides a structured approach for organizations to begin identifying AI risks
and design controls to mitigate threats.
What makes KPMG's AI Risk and Controls Guide different is that it outlines practical
control considerations to help businesses manage risks and accelerate value.
To learn more, go to www.kpmg.us slash AI Guide.
That's www.kpmg.us slash AI Guide.
Today we're going to do that thing where we take a bit of contemporary news and use that as a lens
to look at a broader set of updates that have happened over the last few weeks.
And as I mentioned, we are talking today about the intersection of AI and robotics.
Now, the specific catalyst for this conversation is that Google has released a new family of
AI models that are specifically designed to drive humanoid robotics, meaning it's a good time to
talk about embodied AI. This is a field that is moving extremely quickly, and a big part of that is
driven by the advances in the AI models that actually power the robotics. It's less than six months
since Elon Musk unveiled Tesla's Optimus robot at the big splashy Robotaxi event. And while those robots
were visually impressive, it came out in the following days that the robots were largely being
controlled remotely from behind the scenes. And as much as that was fodder for the Elon haters,
it also reflected the fact that embodied AI is really hard,
especially when it comes to AI models that work for generalized tasks.
Humanoid robots have so far required specific training for each action,
with the AI models largely helping with edge cases and little deviations.
For example, the Optimus robots could easily mix a drink during the demo,
likely because they were trained to do that.
However, they would have had difficulty if a patron asked to shake their hand
without a human controlling them.
That's the problem that Google DeepMind's new AI model is trying to solve.
Called Gemini Robotics, the new model is built on top of Gemini 2.0, inheriting Gemini's native
multimodal functionality, meaning that the model can process visual, text, and audio inputs.
In their announcement blog post, DeepMind wrote, to be useful and helpful to people, AI models
for robotics need three principal qualities. They have to be general, meaning they're able
to adapt to different situations, they have to be interactive, meaning they can understand
and respond quickly to instructions or changes in their environment, and they have to be dexterous,
meaning they can do the kind of things people generally do with their hands and fingers,
like carefully manipulate objects.
DeepMind has actually built a pair of models to drive different parts of the functionality
required for generalized robotics.
The first is their advanced vision-language-action, or VLA, model,
which is functionally similar to other multimodal LLMs,
but includes physical actions as a new mode of output.
The second is called Gemini Robotics ER, short for embodied reasoning.
The model takes the premise behind reasoning models and applies it to physical environments.
As DeepMind put it, the model has, quote, advanced spatial understanding.
Now, as an interesting note, this is similar to the way that the current generation of AI agents are being designed.
Agent builders typically use a reasoning model for planning and analysis of the situation
and then hand that off to a separate model for execution, meaning that it's not unreasonable to think
of embodied AI as agents with eyes and hands.
DeepMind says the Gemini Robotics model, quote,
leverages Gemini's world understanding to generalize to novel situations and solve a wide
variety of tasks out of the box, including tasks it has never seen before in training.
As the model is built on top of an LLM, it has a general understanding of language inputs and can take instruction in natural language.
One of the demo videos shows a table with a variety of fruit and containers laid out.
The embodied AI receives a voice command and deftly places the banana in the clear container without having any specific training on that task.
Google also demonstrated a big step up in fine motor skills, with the embodied AI able to close a Ziploc bag and even make an origami crane.
The reasoning model, Gemini Robotics ER, is added to help increase the robot's ability to plan for novel
task execution. DeepMind writes, combining spatial reasoning and Gemini's coding abilities,
Gemini Robotics ER can instantiate entirely new capabilities on the fly. For example,
when shown a coffee mug, the model can intuit an appropriate two-finger grasp for picking it up
by the handle and a safe trajectory for approaching it. Functionality from reasoning LLMs
also carries over into the real world, meaning the robots can do things like play tic-tac-toe
or complete a word puzzle using Scrabble tiles. A key breakthrough here is that this system of models
allows robots to move from a narrow range of specific tasks to much more generalized applications.
Keerthana Gopalakrishnan, who works on the embodied AI team at DeepMind, posted,
Gemini Robotics is out and is the most advanced VLA in the world. I'm especially blown away
by the instruction following results. It's the first time where I've personally felt that
building generic embodied intelligence is within reach, like a robot coming to life.
Bloomberg's Mark Gurman pointed out that the implications are for much more than just Google
DeepMind. He said artificial intelligence is going to be at the core of everything,
and really the ultimate hardware expression of AI is robotics, being able to understand how a human acts,
artificially learn from data, and mimic a human. And that's what a robot is.
Now, Google aren't the only ones that have been working on this kind of embodied AI model.
In early February, Figure AI ditched their partnership with OpenAI to use their own models developed in-house.
A few weeks later, we got a look at what these models can do.
The demo video showed a pair of robots working together to pack away a grocery delivery.
The robots had never seen the items before, but were able to reason about where the ketchup
bottle should go in the fridge. If one's trying to make direct one-to-one comparisons, some might think
that this demo wasn't as impressive as Google's demos from this week, with the robots acting much
more slowly, seeming less dexterous, and promising a more limited range of tasks. But on the other hand,
Figure AI has their own humanoid design in production, while Google were demonstrating their software
on hardware sourced from other companies. Still, both companies seem to be working on the same basic
system design of pairing a reasoning model with an execution model. When they dropped the OpenAI deal, Figure AI
CEO Brett Adcock said, we found that to solve embodied AI at scale in the real world, you have to
vertically integrate robot AI. We can't outsource AI for the same reason we can't outsource our
hardware. And Figure AI has begun deploying their robots in real world settings. They have one pilot
program currently underway in the BMW manufacturing plant in South Carolina, and a second undisclosed
contract that the company says could potentially allow them to reach 100,000 robots shipped.
The company indeed showed a video of robots sorting parcels, making many think that the client is one of the
large U.S. shipping companies. These are both commercial clients, but much of the excitement and
appetite, at least from an investor perspective, is what seems to many as the inevitable future of
bringing humanoids into the household setting. Figure AI also seems to have demonstrated that
humanoid companies are past the speculative phase, at least in terms of their valuations.
Last February, during their Series B, the company was valued at a very decent $2.6 billion,
but last month, Bloomberg reported that they are in talks to raise their Series C at a valuation of
$39.5 billion. Of course, we are now also living in the world of DeepSeek and Manus,
and everyone is wondering what's going on in China. It feels like every day on X,
you can see a video of some Chinese-produced robot carrying out some feat of dexterity.
Earlier this month, one company called X-Robot went viral, with an extremely lifelike
female robot with a good voice model behind it. Now, this video that you're watching here
had the sci-fi factor turned all the way up, so who knows how real the product is.
Then again, with what we've seen out of Chinese AI in recent months, I certainly wouldn't count it out.
One Chinese company that is definitely producing real products is Unitree.
They had a huge range of robots and assorted form factors on display at CES in January.
You also might have seen the company's latest viral video showing a Kung Fu robot kicking a stick out of a person's hand.
Now, many of the videos from trade shows still have a human operator in control,
which gets us exactly back to why potentially this Google model is such important news,
as Google may have just demonstrated a path to fill in the blanks where Chinese embodied AI is lacking.
Right now, Unitree is offering these G1 units starting at $16,000,
but you have to think those prices are going to come down precipitously in the years ahead.
Another key player in embodied AI that's worth mentioning in this roundup is Nvidia.
The chipmaker isn't working on robots per se, but they've definitely made some big advancements in the AI used to train them.
In January, Nvidia released their Cosmos World Foundation model.
The generative model can be used to create virtual simulations of real-world
scenarios for robot training. Improvements in world models have been one of the big breakthroughs over
the past few months, with several startups showing off their own versions of the tech in development.
The idea is that a digital twin of a robot can be placed in a simulation, which allows synthetic
training data to be quickly generated. This doesn't help necessarily with the reasoning and
generalization problem that Google is working on, but it does allow for big improvements in
dexterity and specific movement training. The Cosmos reveal in January also came with some very
bullish statements from Nvidia CEO Jensen Huang. He said the ChatGPT moment for general
robotics is just around the corner. He also delivered his keynote address standing in front of a chart
showing the AI sector going exponential. After agentic AI, the wave that we're currently in the middle of,
the chart spiked even higher for physical AI, consisting of self-driving cars and general robotics.
During the speech, Huang said that self-driving cars would likely be the, quote,
first multi-trillion dollar robotics industry. And while at this point, we haven't seen anything that
looks close to a fully capable general purpose humanoid, Huang did mention that he expects
Nvidia's products to power a billion humanoid robots over the coming years. So far, I've hit a lot of
the biggies. But even beyond these companies, VCs are definitely sitting up and paying attention
to the potential inflection point we're hitting with embodied AI. Earlier this week,
Dexterity Inc. raised $95 million at a $1.65 billion valuation to build robots capable of
human-like dexterity. The company's pitch is remarkably similar to the way Google described
their criteria for generalized robotics. CEO Samir Menon described that his robots can touch and
recognize objects, are aware of, and respond appropriately to surroundings, and will move gracefully
and adjust as needed. He added, the combination of those three is what we engineer and what we
believe will drive the future of physical AI. Raviraj Jain, a partner at Lightspeed Venture Partners,
said he was investing more money in the company because he believes we're reaching an inflection
point for physical AI. Also, last month, a startup called Apptronik raised $350 million in Series A funding
at an undisclosed valuation. The company is a spin-out from the University of Texas and has
been working on humanoid robots for over a decade. The round included participation from Google
with DeepMind partnering with the company to provide the AI to drive their robots. In fact,
you could see the Apptronik robots putting Google's embodied AI through its paces in the demo
videos from this week. The raise was vastly more money than the $28 million the company had raised
prior to this round, and CEO Jeff Cardenas commented that the mega round was necessary because his
robots are almost production ready. He said, what 2025 is about for Apptronik and the
humanoid industry is really demonstrating useful work in these applications with these initial
early adopters and customers, and then true commercialization and scaling happening in 2026
and beyond.
Explaining the Google partnership, Cardenas said it made far more sense than creating their own models,
adding, we believe that right now, Google is at the top of the game and building some of the
best models in the world.
So friends, that is a quick update on the state of embodied AI, the intersection of AI and
robotics.
And that is where we will wrap today's episode.
Appreciate you listening as always.
And until next time, peace.
