Tech Brew Ride Home - Fri. 12/08 – Did Google Fake The Gemini Demo?

Episode Date: December 8, 2023

People across the Internet are accusing Google of faking that Gemini AI video demo that everyone was wowed by. Apple seems to be diversifying out of China for manufacturing at pace now. Might the UK’s CMA have an issue with Microsoft’s relationship with OpenAI? And, of course, the Weekend Longreads Suggestions. Sponsors: ShopBeam.com/ride Links: Google’s Gemini Looks Remarkable, But It’s Still Behind OpenAI (Bloomberg) Early impressions of Google’s Gemini aren’t great (TechCrunch) Apple to move key iPad engineering resources to Vietnam (NikkeiAsia) Microsoft, OpenAI Are Facing a Potential Antitrust Probe in UK (Bloomberg) Google launches NotebookLM powered by Gemini Pro, drops waitlist (9to5Google) Weekend Longreads Suggestions: The real research behind the wild rumors about OpenAI’s Q* project (ArsTechnica) AI and Mass Spying (Schneier On Security) The race to 5G is over — now it’s time to pay the bill (The Verge) In the Hall v. Oates legal feud, fans don’t want to play favorites (NBCNews) Learn more about your ad choices. Visit megaphone.fm/adchoices

Transcript
Starting point is 00:00:00 On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco. Hey, who did this to you? What happened next turned the story into a political firestorm. Reports have identified the victim as Bob Lee, the founder of Cash App. From Bloomberg Podcasts, this is Foundering, the Killing of Bob Lee, beginning April 16. Welcome to the Techmeme Ride Home for Friday, December 8th, 2023. I'm Brian McCullough. Today, people across the internet are accusing Google of faking that Gemini AI video demo that everyone was wowed by. Apple seems to be diversifying out of China
Starting point is 00:00:47 for manufacturing at pace now. Might the UK's CMA have an issue with Microsoft's relationship with OpenAI? And of course, the Weekend Longreads Suggestions. Here's what you missed today in the world of tech. So when Google announced Gemini, its new AI product, it had a fancy demo video that was pretty wow. Except it seems that Google's demo of Gemini, especially its multimodal AI capabilities, was, shall we say, edited. Quoting Parmy Olson in Bloomberg. On first viewing, the demo is impressive stuff. The model's ability to track a ball of paper from under a plastic cup or to infer that a dot-to-dot picture was a crab before it is even drawn show glimmers of the reasoning abilities that Google's DeepMind AI lab have cultivated over the years.
Starting point is 00:01:39 That's missing from other AI models, but many of the other capabilities on display are not unique and can be replicated by ChatGPT Plus. Google also admits that the video is edited. For the purposes of this demo, latency has been reduced, and Gemini outputs have been shortened for brevity, it states in its YouTube description. This means the time it took for each response was actually longer than in the video. In reality, the demo wasn't carried out in real time or in voice. When asked about the video by Bloomberg Opinion, a Google spokesperson said it was made by, quote, using still image frames from the footage and prompting via text, and they pointed to a site showing how others could interact with Gemini with photos of their hands or of drawings of other objects.
Starting point is 00:02:24 In other words, the voice in the demo was reading out human-made prompts they'd made to Gemini and showing them still images. That's quite different from what Google seemed to be suggesting, that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world around it, end quote. Olson argues that this approach significantly diverges from the impression Google conveyed, suggesting a seamless real-time voice conversation with Gemini responding to its surroundings. In fairness to Google, companies often modify demo videos to prevent the technical glitches that live demonstrations can bring. Minor adjustments are customary. Nonetheless, Google has faced skepticism regarding the authenticity of its video demonstrations in the past, notably with the Duplex AI voice assistant, which raised doubts due to the absence of ambient noise and overly accommodating employees. The use of pre-recorded
Starting point is 00:03:14 videos showcasing AI models tends to heighten suspicion, as seen when Baidu's Ernie Bot launched, resulting in a stock price drop due to edited videos. Olson contends that in this instance, Google is engaging in, quote, showboating to divert attention from the fact that Gemini still lags behind OpenAI's GPT. Google disputes these claims. When asked about the controversy by The Verge, Google referred them to a post by Oriol Vinyals, vice president of research and deep learning lead at Google DeepMind, who explained how the video was created. Vinyals asserted that all user prompts and outputs in the video were genuine, albeit shortened for brevity. But everyone on the internet is crying foul. That's not the only criticism
Starting point is 00:03:53 Gemini is receiving. Users are also saying that Google's Gemini Pro is loath to comment on potentially controversial news topics, fails to get basic facts right, and struggles with some coding tasks. The entire internet is abuzz with people posting examples of what Gemini can't do that GPT-4 already can. Weirdly, a whole bunch of news today about Apple's continuing efforts to diversify its supply chain away from an over-reliance on China. Nikkei Asia is saying that in a first, Apple is working with China's BYD to move iPad production development to Vietnam, with test production set to begin around mid-February 2024. Quote, engineering verification for test production of an iPad model will start around mid-February next
Starting point is 00:04:44 year, sources told Nikkei Asia. The model will be available in the second half of next year. BYD was also the first Apple supplier to help the U.S. tech titan shift iPad assembly for the first time to Vietnam in 2022, Nikkei Asia reported earlier. This shift of new product introduction resources, or NPI engineering resources, is focused on entry-level models rather than the premium iPad Pro. New product introduction demands substantial resources both from the tech company and its suppliers, such as engineers and investment in lab equipment for testing new features and functions. Most of Apple's NPI is carried out in China in collaboration with engineers in Cupertino to take advantage of the country's decades-long experience in hardware manufacturing.
Starting point is 00:05:22 But geopolitical uncertainties are forcing the company to rethink this approach. Apple also plans to send some NPI processes for the iPhone to India, Nikkei Asia reported earlier, end quote. Yes, on that, the Journal is reporting that Apple and its suppliers aim to make upwards of 50 million iPhones annually in India in two to three years, which would account for a quarter of global iPhone production. Apple's partner Tata alone plans to build one of India's largest iPhone plants, employing 50,000 workers within two years, and expects operations to begin in 12 to 18 months. Quoting the Journal, Apple has gradually boosted its reliance on India in recent years despite challenges including rickety infrastructure and restrictive labor rules that often make
Starting point is 00:06:05 doing business harder than in China. Among other issues, labor unions retain clout even in business-friendly states and are pushing back on an effort by companies to get permission for 12-hour workdays, which Apple suppliers find helpful during crunch periods. Apple and its suppliers, led by Taiwan-based Foxconn, generally believe the initial push into India has gone well and are laying the groundwork for a bigger expansion, say people involved in the supply chain. Apple has also chosen India as its site for a manufacturing stage for lower-end iPhones to be sold in 2025. In this stage, known as new product introduction, Apple's teams work with contractors in translating product blueprints and prototypes into a detailed manufacturing plan. Until now, that work was done only in China.
Starting point is 00:06:46 Combined with plans for expanded production at existing Foxconn plants near Chennai and other existing plants recently bought by Indian conglomerate Tata, these developments signify that Apple intends to have the capacity to make at least 50 million to 60 million iPhones in India annually within two to three years, said people involved in the planning, end quote. Ruh-roh, the UK's CMA is apparently considering whether Microsoft and OpenAI's partnership should be called in for a merger probe and is asking for comments from interested parties. Quoting Bloomberg. The CMA said it will look at whether the balance of power between the two firms has fundamentally shifted to give one side more control over the other.
Starting point is 00:07:32 The move by the UK raises the question of whether antitrust regulators in other regions, namely the European Union and the U.S., will launch similar probes. When asked to comment on the CMA's move, a European Commission spokesperson said the regulator had been, quote, following the situation of control over OpenAI very closely. The U.S. Federal Trade Commission declined to comment. Will Hayter, senior director of the CMA's Digital Markets Unit, told Bloomberg in October that the watchdog was, quote, looking quite closely at those vertical relationships between big tech companies and AI models.
Starting point is 00:08:00 He said at the time that the agency was exploring how effective the competition is at developing and deploying models, end quote. The previously mentioned controversy notwithstanding, there's another launch of a new AI tool from Google, but this one is maybe more interesting or at least a little different. Google has launched the Gemini Pro-powered NotebookLM. It was originally demoed back at I/O in 2023 as an AI-first notebook that pulls information from users' documents. In the U.S., like Bard, it will be powered by Gemini Pro. Quoting 9to5Google. Google has talked to knowledge workers, creators, students, and educators to learn more about the ways they're using and what they're liking about NotebookLM since Early Access opened in July. Users like the auto-generated summaries and suggested follow-up questions that provide a, quote, helpful new way to comprehend difficult text
Starting point is 00:08:56 and synthesize connections between multiple documents. Gemini Pro is specifically used for document understanding and reasoning, but PaLM 2 and other models are being leveraged as well. With today's launch, NotebookLM is adding a dozen new features over the coming weeks. A note board that appears above the chat box with a card-based grid UI will let you save NotebookLM responses, excerpts from your sources, and any notes you've written. You can select multiple notes and prompt NotebookLM to summarize, combine into a single note, create an outline, study guide, or something broader, like turning your notes into an email newsletter, a script outline, a draft of a marketing plan, and more. There's also a shortcut to export to
Starting point is 00:09:36 Google Docs. Meanwhile, NotebookLM will, quote, dynamically suggest actions based on whatever you happen to be doing, end quote. Time for the Weekend Longreads Suggestions. First up, remember during the whole Sam Altman drama, folks were talking about that Q-Star breakthrough that may or may not have spooked the board into firing Altman because they were afraid progress was maybe happening too quickly? Well, Ars Technica has taken a look at the real research behind those rumors of Q-Star. Quote, an important clue here is OpenAI's decision to hire computer scientist Noam Brown earlier this year. Brown earned his PhD at Carnegie Mellon, where he developed the first AI that could play poker at a superhuman level.
Starting point is 00:10:21 Then Brown went to Meta, where he built an AI to play Diplomacy. Success at Diplomacy depends on forming alliances with other players, so a strong Diplomacy AI needs to combine strategic thinking with natural language abilities. This seems like a good background for someone trying to improve the reasoning abilities of large language models. One person who believes Brown is working on Q-Star is Yann LeCun, chief AI scientist at Meta, where Brown worked until earlier this year. It is likely that Q-Star is OpenAI's attempts at planning, LeCun tweeted in November. They pretty much hired Noam Brown to work on that, end quote. A general reasoning algorithm needs the ability to learn on the fly as it explores possible solutions.
Starting point is 00:11:00 When someone is working through a problem on a whiteboard, they do more than just mechanically iterate through possible solutions. Each time a person tries a solution that doesn't work, they learn a little bit about the problem. They improve their mental model of the system they're reasoning about and gain a better intuition about what kind of solution might work. In other words, humans' mental policy network and value network aren't static. The more time we spend on a problem, the better we get at thinking of promising solutions, and the better we get at predicting whether a proposed solution will work. Without this capacity for real-time learning, we'd get lost in the essentially infinite space
Starting point is 00:11:33 of potential reasoning steps. In contrast, most neural networks today maintain a rigid separation between training and inference. Once AlphaGo was trained, its policy and value networks were frozen. They didn't change during a game. That's fine for Go, because Go is simple enough that it's possible to experience a full range of possible game situations during self-play. But the real world is far more complex than a Go board. By definition, someone doing research is trying to solve a problem that hasn't been solved before, so it likely won't closely resemble any of the problems it encountered during training. So a general reasoning algorithm needs a way for insights gained during the reasoning process to inform a model's subsequent decisions as it tries to solve the same problem.
Starting point is 00:12:12 Yet today's large language models maintain state entirely via the context window, and the Tree of Thoughts approach is based on removing information from the context window as a model jumps from one branch to another. One possible solution here is to search using a graph rather than a tree, an approach proposed in this August paper. This could allow a large language model to combine insights gained from multiple branches, end quote. Then, technology security researcher Bruce Schneier has written a piece that got a lot of pickup because he argued that just as the internet enabled mass surveillance, AI will probably enable mass spying, which was once limited by human labor, and it will do so by making troves of data searchable and understandable. Quote, mass surveillance fundamentally changed the
Starting point is 00:12:57 nature of surveillance. Because all the data is saved, mass surveillance allows people to conduct surveillance backward in time, and without even knowing whom specifically you want to target. Tell me where this person was last year. List all the red sedans that drove down this road in the past month. List all of the people who purchased all of the ingredients for a pressure cooker bomb in the past year. Find me all the pairs of phones that were moving toward each other, turned themselves off, and then turned themselves on again an hour later while moving away from each other, a sign of a secret meeting. Similarly, mass spying will change the nature of spying. All the data will be saved. It will all be searchable and understandable in
Starting point is 00:13:31 bulk. Tell me who has talked about a particular topic in the past month and how discussions about that topic have evolved. Person A did something, check if someone told them to do it. Find everyone who is plotting a crime or spreading a rumor or planning to attend a political protest. In the early days of Gmail, Google talked about using people's Gmail content to serve them personalized ads. The company stopped doing it, almost certainly because the keyword data it collected was so poor, and therefore not useful for marketing purposes. That will soon change. Maybe Google won't be the first to spy on its users' conversations, but once others start, they won't be able to resist. Their true customers, their advertisers, will demand it. We could limit this capability. We could
Starting point is 00:14:10 prohibit mass spying. We could pass strong data privacy rules, but we haven't done anything to limit mass surveillance. Why would spying be any different? End quote. Then, remember all the hype around 5G? All these years on, do you feel like it has been transformative even a little bit? Probably not. In The Verge, Allison Johnson argues that the big telcos have backed themselves into a bit of a cul-de-sac with 5G hype. Quote, maybe 5G isn't entirely smoke and mirrors, but by rolling it out with a breakneck race, networks backed themselves into a corner. They took on piles of debt.
Starting point is 00:14:46 On that recent earnings call, a Verizon exec talked about the company's desire to return to a pre-spectrum auction level of debt. In the meantime, there are few returns to show for these investments, not helped by the fact that interest levels are high and smartphone sales are down. The networks thought they had a golden goose with 5G, but so far it's just laying regular old eggs, expensively at that. And while they're waiting for their efforts to bear fruit, they're looking for other ways to boost their bottom line. 5G will improve as time marches on, as it tends to, particularly when the networks have fully deployed standalone 5G, but we can probably stop holding our breath for that killer app and make peace with the fact that technological progress is often slow and boring.
Starting point is 00:15:25 Moving forward, cell tower by cell tower, not by leaps and bounds. In the short term, its greatest effect might be a more consolidated, more expensive wireless broadband market. If it's any consolation, you can take some comfort in the fact that we're probably not going to see commercials for 6G anytime soon, end quote. Finally today, this is not tech, but it turns out Hall and Oates hate each other now, and that just makes me sad. Quoting NBC News, the half-century musical collaboration of soul pop duo Hall and Oates came to a record-scratching halt when it was revealed last month that Daryl Hall, 77, was suing John Oates, 75, in Nashville court. Court filings and a hearing last week shed some light on their legal spat, which revolves around the sale of their joint business venture, but the acrimonious language, including Hall accusing Oates of
Starting point is 00:16:13 committing the, quote, ultimate partnership betrayal, has taken fans by surprise. Now many are wondering if this spells the end of the multi-platinum selling duo, whose catalog of hits includes You Make My Dreams, Maneater, and Rich Girl. Your average fans feel like it's a divorce. They're upset and sad, said Lori Arred, who in 1987 founded the first Hall and Oates fan club in the U.S. that was officially recognized by the pair. You're seeing these two people and you grow up with their music, following them and going to shows, and now it's like, wow, we love both of them. We don't want to take a side, end quote. No bonus episodes for you this weekend. Gonna focus on getting my Christmas shopping done. Talk to you on Monday.
