Tech Brew Ride Home - Wed. 12/11 – Gemini 2.0

Starting point is 00:00:00 On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco. Hey, who did this to you? What happened next turned the story into a political firestorm. Reports have identified the victim as Bob Lee, the founder of Cash App. From Bloomberg Podcasts, this is Foundering: The Killing of Bob Lee, beginning April 16. Welcome to the Techmeme Ride Home for Wednesday, December 11th, 2024. I'm Brian McCullough. Today, GM shocked everybody by shutting down the Cruise robotaxi business. ChatGPT is finally on your iPhone once you update it. Could all Apple

Starting point is 00:00:47 watches someday have satellite texting? And then, I guess, Google wanted to preempt Santa Sam because they released an absolute slew of AI products today. Here's what you missed today in the world of tech. GM says Cruise is exiting the robotaxi business, and Cruise and GM's technical teams will be combined to focus on autonomous tech for upcoming GM vehicles instead. So then there was only one, and by one I basically mean Waymo, at least in the consumer self-driving space. First, here's what GM itself said, quote, General Motors plans to realign its autonomous driving strategy and prioritize development of advanced driver assistance systems on a path to fully autonomous personal vehicles. GM will build on the progress of Super Cruise,

Starting point is 00:01:36 the company's hands-off, eyes-on driving feature, now offered on more than 20 GM vehicle models and currently logging over 10 million miles per month. GM intends to combine the majority-owned Cruise LLC and GM technical teams into a single effort to advance autonomous and assisted driving, consistent with GM's capital allocation priorities. GM will no longer fund Cruise's robotaxi development work given the considerable time and resources that would be needed to scale the business, along with an increasingly competitive robotaxi market. GM is committed to delivering the best driving experiences to our customers in a disciplined and capital-efficient manner, said Mary Barra, Chair and CEO of GM.

Starting point is 00:02:16 Cruise has been an early innovator in autonomy, and the deeper integration of our teams, paired with GM's strong brands, scale, and manufacturing strength, will help advance our vision for the future of transportation. As the largest U.S. automotive manufacturer, we're fully committed to autonomous driving and excited to bring GM customers its benefits, things like enhanced safety, improved traffic flow, increased accessibility, and reduced driver stress, said Dave Richardson, Senior Vice President of Software and Services Engineering, end quote. And quoting Bloomberg. It's a big retrenchment for GM and Cruise, which survived a shakeout among autonomous driving companies and restarted operations after one of its cars dragged a pedestrian last year. The move has significant implications for GM. Chief Executive Officer Mary Barra wanted to transform the automaker into a transportation technology company and double GM's revenue by 2030, in part by

Starting point is 00:03:06 generating $50 billion from Cruise. Without a robotaxi business to bring in fares, that goal looks remote. GM is pulling back just as Alphabet's Waymo expands into more cities, and Tesla plans to start its robotaxi business in 2026. Tesla's CEO, Elon Musk, is now one of the most influential voices in President-elect Donald Trump's circles and has pressed for a federal framework for self-driving cars. Ending the robotaxi push brings GM closer to its main business. The company will develop the technology to enhance its core business of building cars, scrapping dreams of mobility as a service. End quote. Right.

Starting point is 00:03:42 I kind of don't get the timing of this, unless this is merely a dollars-and-cents calculation. Waymo is clearly hitting scale all of a sudden. Pedal to the metal, if you will forgive the pun. It might take years, but if Waymo can just scale up ride-hailing region by region, like a burger joint does franchising territory by territory, they seem to be having a bit of a breakthrough moment, as we've discussed. And as the new administration clearly is motivated to create a framework for self-driving tech, why, and forgive the pun again, slam on the brakes now?

Starting point is 00:04:22 ChatGPT is now on your iPhone. That's because Apple this morning began rolling out iOS 18.2 and iPadOS 18.2, adding major AI updates, including ChatGPT integration for the first time, but also Genmoji and Image Playground on iPhone 15 Pro, 16, and 16 Pro. The rollout extends to macOS 15.2, though these AI features are currently accessible only in select regions, the U.S., Australia, Canada, New Zealand, South Africa, and the UK, and exclusively for devices set to English. Users now have access to Image Playground, available both as a standalone app and within Messages. The tool generates image suggestions based on chat context or custom prompts, with support for using existing photos as inspiration, while Image Playground

Starting point is 00:05:08 intentionally avoids creating photorealistic human images to prevent misuse. Its creations integrate seamlessly with Freeform, Pages, and Keynote. The innovative Genmoji feature transforms the emoji experience, letting users craft custom expressions through text descriptions or friend photos. Apple Intelligence serves up multiple options to choose from, accessed directly through the emoji keyboard's new Genmoji button. It is Siri and Writing Tools that have received a powerful upgrade through ChatGPT integration. Users can now leverage the chatbot's capabilities for tasks like creating itineraries or workout plans, though a daily query limit applies. No separate ChatGPT account is required for these features. Beyond that,

Starting point is 00:05:50 users gain expanded AirTag sharing options for both friends and airlines, while News+ subscribers receive daily Sudoku puzzles. The update also reintroduces lock screen volume controls, a feature previously removed in iOS 16. Users can access these updates through Settings, General, Software Update on iOS and iPadOS devices, or via System Settings and Software Update for macOS systems if automatic notifications haven't appeared. Sources are telling Bloomberg that the 2025 version of the Apple Watch Ultra is slated to get satellite connectivity for off-the-grid text messages via Globalstar and might finally get that blood pressure monitoring system Apple has been working on.

Starting point is 00:06:37 But it's the satellite stuff for text messaging that I'm interested in. You can see why they've been investing in Globalstar. This is clearly a feature that you can see migrating down the whole Apple Watch line over time. Like, why upgrade my Apple Watch three years from now? Because maybe I'll be able to text from wherever. Quoting Bloomberg. The satellite capability is slated to come to next year's Apple Watch Ultra, the company's top-of-the-line model, according to people familiar with the matter.

Starting point is 00:07:03 The technology will let smartwatch users send off-the-grid text messages via Globalstar's fleet of satellites when they don't have a cellular or Wi-Fi connection. The other feature, which would monitor whether Apple Watch users have high blood pressure, may arrive as soon as 2025 as well, said the people, who asked not to be identified because the work is confidential. But it's been delayed before, with Apple previously aiming to release the tool last year, end quote. So for now, this just gives people an extra reason to pay up for the top-of-the-line model. But remember, just in November, Apple invested $1.5 billion for Globalstar to increase its infrastructure and took a 20% stake in the company for their trouble.

Starting point is 00:07:39 Apple introduced satellite connectivity with the iPhone 14 in 2022, initially just enabling users to maintain emergency service contact while exploring off-grid locations. But then the following year, they incorporated roadside assistance providers, and the most recent iteration fully integrated iMessage functionality. So you can see where this is going, a tick-tock of iterative additions, from a safety-focused feature, maybe eventually to be just, you can always text people when you're out with your watch, even without your iPhone. And here comes the slew of Google AI headlines I promised you. Google released its new flagship model, Gemini 2.0, and announced plans to test it in Search and AI Overviews, saying 2.0 makes, quote, it possible to build agents that can think. Quoting Bloomberg. Gemini 2.0 can generate images and audio across languages and can assist during Google searches and coding projects, the company said Wednesday. The new capabilities of Gemini

Starting point is 00:08:45 quote, make it possible to build agents that can think, remember, plan and even take action on your behalf, said Tulsee Doshi, a director of product management at the company, in a briefing with reporters. Beyond experimental products, Google incorporated more AI into its search engine, which remains its lifeblood. The company said that this week it would begin testing Gemini 2.0 in Search and in AI Overviews, the artificial intelligence-powered summaries displayed at the top of Google search. That will improve the speed and quality of search results for increasingly complex questions like advanced math equations. Google also debuted a new web feature called Deep Research,

Starting point is 00:09:18 which it says will enable Gemini users to use AI to dive into topics with detailed reports. The feature, billed as an AI-powered research assistant, will be available Wednesday to users of Gemini Advanced, Google's paid AI subscription product. The products featured on Wednesday show how Google's premier AI lab, Google DeepMind, is playing a more pivotal role in product development. The lab is expanding tests of Project Astra, an AI agent that uses a smartphone camera to process visual input. In an elaborate space evoking a home library with towering bookshelves containing titles on computer programming and travel, Google employees showed how Astra can summarize information on the page. A hidden door nestled in the shelves revealed a small art gallery where the agent reflected on how Norwegian painter Edvard Munch's The Scream captured his own anxiety and the general paranoia of his age, end quote. Google is also apparently testing AI agents based on Gemini 2.0 that can understand

Starting point is 00:10:10 rules in video games like Clash of Clans to help players with the game. But some of these other announcements were interesting enough that I don't want to just fold this into one big segment. So, Trillium is Google's new sixth-generation AI chip, which powers Gemini 2.0, with four times the training performance of its predecessor while using significantly less energy, which sounds like a big deal. Quoting VentureBeat. Trillium's specifications represent significant advances across multiple dimensions.

Starting point is 00:10:44 The chip delivers a 4.7x increase in peak compute performance per chip compared to its predecessor, while doubling both high bandwidth memory capacity and inter-chip interconnect bandwidth. Perhaps most importantly, it achieves a 67% increase in energy efficiency, a crucial metric as data centers grapple with the enormous power demands of AI training. The business implications of Trillium extend beyond raw performance metrics. Google claims the chip provides up to 2.5 times improvement in training performance per dollar compared to its previous generation, potentially reshaping the economics of AI development. This cost-efficiency could prove particularly significant for enterprises and startups developing large language models.

Starting point is 00:11:24 AI21 Labs, an early Trillium customer, has already reported significant improvements. The advancements in scale, speed, and cost efficiency are significant, noted Barak Lenz, CTO of AI21 Labs, in the announcement, end quote. This is clearly a shot across the bow of Nvidia. Nvidia chips are still the industry standard, but Google seems to think they can pick away at Nvidia's lead around the margins with specific types of workloads. And increased performance and efficiency would certainly turn some heads as well. Gemini 2.0 Flash is a new variant of Gemini 2.0 for generating images, audio, and text,

Starting point is 00:12:04 and using third-party apps and services, available via the Gemini API and developer platforms. Quoting TechCrunch. Google claims that 2.0 Flash, which is twice as fast as the company's Gemini 1.5 Pro model on certain benchmarks per Google's own testing, is, quote, significantly improved in areas like coding and image analysis. In fact, the company says 2.0 Flash displaces 1.5 Pro as the flagship Gemini model, thanks to its superior math skills and factuality. As alluded to earlier, 2.0 Flash can generate and modify images alongside text. The model can also ingest photos and videos as well as audio recordings to answer questions about them, e.g., what did he say? Audio generation is

Starting point is 00:12:44 2.0 Flash's other key feature, and Google described it as steerable and customizable. For example, the model can narrate text using one of eight voices optimized for different accents and languages. You can ask it to talk slower, you can ask it to talk faster, or you can even ask it to say something like a pirate. A Google spokesperson said the production version of 2.0 Flash will land in January, but in the meantime, Google is releasing an API, the Multimodal Live API, to help developers build apps with real-time audio and video streaming functionality, end quote. Jules is Google's new AI coding assistant that can autonomously fix software bugs and prepare code changes, built on the new Gemini 2.0 platform. Quoting VentureBeat. Jules integrates directly with

Starting point is 00:13:34 GitHub's workflow system and can analyze complex codebases, implement fixes across multiple files, and prepare detailed pull requests without constant human supervision. Unlike traditional coding assistants that merely suggest fixes, Jules operates as an autonomous agent within GitHub's ecosystem. It analyzes codebases, creates comprehensive repair plans, and executes fixes across multiple files simultaneously. More importantly, it integrates seamlessly with existing developer workflows. During a press conference, Jaclyn Konzelmann, Director of Product Management at Google Labs, emphasized the system's safety features. Developers are in control along the way, she explained. Jules presents a suggested plan before taking action, and users can monitor its progress writing

Starting point is 00:14:16 code. The system requires explicit approval before merging any changes, maintaining human oversight of the development process. Software development projects typically run significant risks of cost overruns, with large IT projects running 45% over budget and delivering 56% less value than predicted, according to McKinsey. By automating routine bug fixes and maintenance tasks, Jules could significantly reduce these costs while accelerating development cycles, end quote. Finally, on the agentic front, but probably eventually on the consumer side, Project Mariner is a prototype AI agent Google DeepMind built that can control the Chrome web browser. It can move the cursor, click buttons, fill out forms, and more.

Starting point is 00:15:05 Quoting TechCrunch. A Google executive tells TechCrunch that this is part of a, quote, fundamentally new UX paradigm shift, moving users away from directly interacting with websites and instead interacting with a generative AI system that does it for you. These shifts could affect millions of businesses, from publishers like TechCrunch to retailers like Walmart, which have historically relied on Google to send real people to visit and use their websites. In a demo with TechCrunch, Google Labs director Jaclyn Konzelmann showed how Project Mariner works. After setting up the AI agent with an extension in Chrome, a chat window pops up to the right of your browser. You can instruct the agent to do

Starting point is 00:15:40 things like, quote, create a shopping cart from a grocery store based on this list. From there, the AI agent navigated to a grocery store's website, in this case Safeway, and then searched for and added items to a virtual shopping cart. One thing that's immediately evident is how slow the agent is. There were about five seconds of delay in between each cursor movement. At times, the agent stopped its task and reverted back to the chat window asking for clarification about certain items, like how many carrots. Google's agent cannot check out, as it's not supposed to fill out credit card numbers or billing information. Project Mariner also won't accept cookies for users or sign a terms of service agreement. Google says it purposefully doesn't allow the

Starting point is 00:16:20 agent to do these things in order to give users more control, end quote. For now, Project Mariner maintains the traditional digital ecosystem, so websites and online retailers will continue receiving valuable visitor information, again, for now. Yet this shift hints at evolving user behavioral patterns. As AI agents become more sophisticated, we're likely to see decreasing direct engagement with websites. The technology's trajectory suggests a future where AI might bypass traditional web interfaces entirely, which makes you wonder what the web becomes if it's not a medium and interface for human interaction, but for bot interaction. Forget about a web browser as the window to the web. Imagine an AI bot interface as all you need.

Starting point is 00:17:07 You never actually go to the web. Nothing more for you today. Talk to you tomorrow.
