Tech Brew Ride Home - Wed. 12/11 – Gemini 2.0
Episode Date: December 11, 2024
GM shocked everybody by shutting down the Cruise robotaxi business. ChatGPT is finally on your iPhone once you update it. Could all Apple watches someday have satellite texting? And then, I guess Google wanted to pre-empt Santa Sam, because they released an absolute slew of AI products today.
Links:
GM to refocus autonomous driving development on personal vehicles (GM Investor Relations)
GM Calls It Quits on Mary Barra’s $50 Billion Robotaxi Dream (Bloomberg)
Apple’s Next Ultra Smartwatch Will Be Able to Send Texts Via Satellite (Bloomberg)
Google Rolls Out Faster Gemini AI Model to Power Agents (Bloomberg)
Google’s new Trillium AI chip delivers 4x speed and powers Gemini 2.0 (VentureBeat)
Gemini 2.0, Google’s newest flagship AI, can generate text, images, and speech (TechCrunch)
Google unveils AI coding assistant ‘Jules,’ promising autonomous bug fixes and faster development cycles (VentureBeat)
Google unveils Project Mariner: AI agents to use the web for you (TechCrunch)
Transcript
Welcome to the Techmeme Ride Home for Wednesday, December 11th,
2024. I'm Brian McCullough. Today, GM shocked everybody by shutting down the Cruise
robotaxi business. ChatGPT is finally on your iPhone once you update it. Could all Apple
watches someday have satellite texting? And then, I guess, Google wanted to preempt Santa
Sam because they released an absolute slew of AI products today. Here's what you missed today in
the world of tech. GM says Cruise is exiting the robotaxi business and Cruise and GM's
technical teams will be combined to focus on autonomous tech for upcoming GM vehicles instead.
So then there was only one, and by one I basically mean Waymo, at least in the consumer self-driving
space. First, here's what GM itself said, quote, General Motors plans to realign its
autonomous driving strategy and prioritize development of advanced driver assistance systems
on a path to fully autonomous personal vehicles. GM will build on the progress of Super Cruise,
the company's hands-off, eyes-on driving feature, now offered on more than 20 GM vehicle models
and currently logging over 10 million miles per month. GM intends to combine the majority-owned
Cruise LLC and GM technical teams into a single effort to advance autonomous and assisted driving,
consistent with GM's capital allocation priorities. GM will no longer fund Cruise's
robotaxi development work given the considerable time and resources that would be needed to scale
the business along with an increasingly competitive robotaxi market.
GM is committed to delivering the best driving experiences to our customers in a disciplined and
capital-efficient manner, said Mary Barra, Chair and CEO of GM.
Cruise has been an early innovator in autonomy and the deeper integration of our teams,
paired with GM's strong brands, scale, and manufacturing strength will help advance our vision
for the future of transportation. As the largest U.S. automotive manufacturer,
we're fully committed to autonomous driving and excited to bring GM customers its benefits,
things like enhanced safety, improved traffic flow, increased accessibility,
and reduced driver stress, said Dave Richardson, Senior Vice President of Software and Services Engineering, end quote.
And quoting Bloomberg. It's a big retrenchment for GM and Cruise, which survived a shakeout among autonomous driving companies and restarted operations after one of its cars dragged a pedestrian last year.
The move has significant implications for GM. Chief Executive Officer Mary Barra wanted to transform the automaker into a transportation technology company and double GM's revenue by 2030, in part by
generating $50 billion from Cruise. Without a robotaxi business to bring in fares, that goal looks remote.
GM is pulling back just as Alphabet's Waymo expands into more cities, and Tesla plans to start its
robotaxi business in 2026. Tesla's CEO, Elon Musk, is now one of the most influential voices in
President-elect Donald Trump's circles and has pressed for a federal framework for self-driving cars.
Ending the robotaxi push brings GM closer to its main business. The company will develop the technology to
enhance its core business of building cars, scrapping dreams of mobility as a service.
End quote.
Right.
I kind of don't get the timing of this.
Unless this is merely a dollars and cents calculation, Waymo is clearly hitting scale all
of a sudden.
Pedal to the metal, if you will forgive the pun.
It might take years, but if Waymo can just scale up ride-hailing region by region, like a
burger joint does franchising territory by territory, they seem to be having a bit of a breakthrough
moment, as we've discussed. And as the new administration clearly is motivated to create a framework
for self-driving tech, why, and forgive the pun again, slam on the brakes now?
ChatGPT is now on your iPhone. That's because Apple this morning began rolling out iOS 18.2
and iPadOS 18.2, adding major AI updates, including ChatGPT integration for the first time,
but also Genmoji and Image Playground on iPhone 15 Pro, 16, and 16 Pro.
The rollout extends to macOS 15.2, though these AI features are currently accessible only in
select regions, the U.S., Australia, Canada, New Zealand, South Africa, and the UK, and exclusively
for devices set to English. Users now have access to Image Playground, available both as a
standalone app and within Messages. The tool generates image suggestions based on chat context
or custom prompts, with support for using existing photos as inspiration. While Image Playground
intentionally avoids creating photorealistic human images to prevent misuse, its creations
integrate seamlessly with Freeform, Pages, and Keynote. The innovative Genmoji feature transforms the
emoji experience, letting users craft custom expressions through text descriptions or friend photos.
Apple Intelligence serves up multiple options to choose from, accessed directly through the
emoji keyboard's new Genmoji button. Siri and Writing Tools have also received a
powerful upgrade through ChatGPT integration. Users can now leverage
the chatbot's capabilities for tasks like creating itineraries or workout plans, though a
daily query limit applies. No separate ChatGPT account is required for these features. Beyond that,
users gain expanded AirTag sharing options for both friends and airlines, while News+
subscribers receive daily Sudoku puzzles. The update also reintroduces lock screen volume controls,
a feature previously removed in iOS 16. Users can access these updates through Settings,
General, Software Update on iOS and iPadOS devices, or via System Settings and Software Update for
macOS systems if automatic notifications haven't appeared.
Sources are telling Bloomberg that the 2025 version of the Apple Watch Ultra is slated to get
satellite connectivity for off-the-grid text messages via Globalstar and might finally get
that blood pressure monitoring system Apple has been working on.
But it's the satellite stuff for text messaging that I'm interested in.
You can see why they've been investing in Globalstar.
This is clearly a feature that you can see migrating down the whole Apple Watch line over time.
Like, why upgrade my Apple Watch three years from now?
Because maybe I'll be able to text from wherever.
Quoting Bloomberg.
The satellite capability is slated to come to next year's Apple Watch Ultra,
the company's top-of-the-line model, according to people familiar with the matter.
The technology will let smartwatch users send off-the-grid text messages
via Globalstar's fleet of satellites when they don't have a cellular or Wi-Fi connection.
The other feature, which would monitor whether Apple Watch users have high blood pressure, may arrive as soon as 2025 as well,
said the people who asked not to be identified because the work is confidential.
But it's been delayed before with Apple previously aiming to release the tool last year, end quote.
So for now, this just gives people an extra reason to pay up for the top of the line model.
But remember, just in November, Apple invested $1.5 billion for Globalstar to increase its infrastructure
and took a 20% stake in the company for their trouble.
Apple introduced satellite connectivity with the iPhone 14 in 2022, initially just enabling users to maintain emergency service contact while exploring off-grid locations.
But then the following year, they incorporated roadside assistance providers, and the most recent iteration fully integrated iMessage functionality.
So you can see where this is going: a tick-tock of iterative additions, from a safety-focused feature to, maybe eventually, you can always text people when
you're out with your watch, even without your iPhone. And here comes the slew of Google AI headlines
I promised you. Google released their new flagship model, Gemini 2.0, and announced plans to test it
in search and AI overviews, saying 2.0 makes, quote, it possible to build agents that can think.
Quoting Bloomberg, Gemini 2.0 can generate images and audio across languages and can assist during
Google searches and coding projects, the company said Wednesday. The new capabilities of Gemini
quote, make it possible to build agents that can think, remember, plan and even take action on
your behalf, said Tulsee Doshi, a director of product management at the company, in a briefing with
reporters. Beyond experimental products, Google incorporated more AI into its search engine,
which remains its lifeblood. The company said that this week it would begin testing Gemini
2.0 in search and in AI Overviews, the artificial intelligence-powered summaries displayed at the
top of Google search. That will improve the speed and quality of search results
for increasingly complex questions like advanced math equations.
Google also debuted a new web feature called Deep Research,
which it says will enable Gemini users to use AI to dive into topics with detailed reports.
The feature, billed as an AI-powered research assistant,
will be available Wednesday to users of Gemini Advanced,
Google's paid AI subscription product.
The products featured on Wednesday show how Google's premier AI lab,
Google DeepMind, is playing a more pivotal role in the product development.
The lab is expanding tests of Project Astra,
an AI agent that uses a smartphone camera to process visual input. In an elaborate space evoking a home library with towering bookshelves containing titles on computer programming and travel, Google employees showed how Astra can summarize information on the page. A hidden door nestled in the shelves revealed a small art gallery where the agent reflected on how Norwegian painter Edvard Munch's The Scream captured his own anxiety and the general paranoia of his age, end quote.
Google is also apparently testing AI agents based on Gemini 2.0 that can understand
the rules in video games like Clash of Clans to help players with the game.
But some of these other announcements were interesting enough that I don't want to just fold this
into one big segment.
So, Trillium is Google's new sixth-generation AI chip, which powers Gemini 2.0,
with four times the training performance of its predecessor while using significantly less
energy, which sounds like a big deal.
Quoting VentureBeat.
Trillium's specifications represent significant advances across multiple dimensions.
The chip delivers a 4.7x increase in peak compute performance per chip compared to its predecessor,
while doubling both high bandwidth memory capacity and interchip interconnect bandwidth.
Perhaps most importantly, it achieves a 67% increase in energy efficiency, a crucial metric
as data centers grapple with the enormous power demands of AI training.
The business implications of Trillium extend beyond raw performance metrics.
Google claims the chip provides up to 2.5 times improvement in training performance per dollar
compared to its previous generation, potentially reshaping the economics of AI development.
This cost-efficiency could prove particularly significant for enterprises and startups developing large language models.
AI21 Labs, an early Trillium customer, has already reported significant improvements.
The advancements in scale, speed, and cost efficiency are significant, noted Barak Lenz,
CTO of AI21 Labs in the announcement, end quote.
This is clearly a shot across the bow of Nvidia.
Nvidia chips are still the industry standard,
but Google seems to think they can pick away at Nvidia's lead around the margins with specific types of workloads.
And increased performance and efficiency would certainly turn some heads as well.
Gemini 2.0 Flash is a new variant of Gemini 2.0 that can generate images, audio, and text
and use third-party apps and services. It's available via the Gemini API and developer platforms.
Quoting TechCrunch.
Google claims that 2.0 Flash, which is twice as fast as the company's Gemini 1.5 Pro model
on certain benchmarks per Google's own testing, is, quote, significantly improved in areas like
coding and image analysis. In fact, the company says 2.0 Flash displaces 1.5 Pro as the flagship
Gemini model, thanks to its superior math skills and factuality. As alluded to earlier, 2.0 Flash can
generate and modify images alongside text. The model can also ingest photos and videos as well as
audio recordings to answer questions about them, e.g., what did he say? Audio generation is
2.0 Flash's other key feature, and Google described it as steerable and customizable.
For example, the model can narrate text using one of eight voices optimized for different accents
and languages. You can ask it to talk slower, you can ask it to talk faster, or you can
even ask it to say something like a pirate. A Google spokesperson said, the production version of
2.0 Flash will land in January, but in the meantime, Google is releasing an API, the Multimodal
Live API, to help developers build apps with real-time audio and video streaming functionality, end
quote. Jules is Google's new AI coding assistant that can autonomously fix software bugs and prepare
code changes, built on the new Gemini 2.0 platform. Quoting VentureBeat, Jules integrates directly with
GitHub's workflow system and can analyze complex codebases, implement fixes across multiple
files, and prepare detailed pull requests without constant human supervision. Unlike traditional
coding assistants that merely suggest fixes, Jules operates as an autonomous agent within GitHub's
ecosystem. It analyzes codebases, creates comprehensive repair plans, and executes fixes across
multiple files simultaneously. More importantly, it integrates seamlessly with existing developer workflows.
During a press conference, Jaclyn Konzelmann, Director of Product Management at Google Labs,
emphasized the system's safety features. Developers are in control along the way, she explained.
Jules presents a suggested plan before taking action, and users can monitor its progress writing
code. The system requires explicit approval before merging any changes, maintaining human oversight
of the development process. Software development projects typically run significant risks of
cost overruns, with large IT projects running 45% over budget and delivering 56% less value than
predicted, according to McKinsey. By automating routine bug fixes and maintenance tasks,
Jules could significantly reduce these costs while accelerating development cycles, end quote.
Finally, on the agentic front, but probably eventually on the consumer side,
Project Mariner is a prototype AI agent Google DeepMind built that can control Chrome, the web browser.
It can move the cursor, click buttons, fill out forms, and more.
Quoting TechCrunch.
A Google executive tells TechCrunch that this is part of a, quote, fundamentally new UX paradigm shift,
moving users away from directly interacting with websites and instead interacting with a generative AI system
that does it for you. These shifts could affect millions of businesses from publishers like
TechCrunch to retailers like Walmart, which have historically relied on Google to send real
people to visit and use their websites. In a demo with TechCrunch, Google Labs director
Jaclyn Konzelmann showed how Project Mariner works. After setting up the AI agent with an extension
in Chrome, a chat window pops up to the right of your browser. You can instruct the agent to do
things like, quote, create a shopping cart from a grocery store based on this list. From there, the
AI agent navigated to a grocery store's website, in this case Safeway, and then searched for
and added items to a virtual shopping cart. One thing that's immediately evident is how slow the
agent is. There were about five seconds of delay in between each cursor movement. At times,
the agent stopped its task and reverted back to the chat window asking for clarification
about certain items, like how many carrots. Google's agent cannot check out, as it's not
supposed to fill out credit card numbers or billing information. Project Mariner also won't accept
cookies for users or sign a terms of service agreement. Google says it purposefully doesn't allow the
agent to do these things in order to give users more control, end quote. For now, Project Mariner
maintains the traditional digital ecosystem for now. So websites and online retailers will
continue receiving valuable visitor information again for now. Yet, this shift hints at evolving
user behavioral patterns. As AI agents become more sophisticated, we're likely to see decreasing
direct engagement with websites. The technology's trajectory suggests a future where AI might
bypass traditional web interfaces entirely, which makes you wonder what the web becomes if it's
not a medium and interface for human interaction, but for bot interaction. Forget about a web browser
as the window to the web. Imagine an AI bot interface is all you need.
You never actually go to the web.
Nothing more for you today. Talk to you tomorrow.
