The AI Daily Brief: Artificial Intelligence News and Analysis - Google's AlphaGeometry Breakthrough

Starting point is 00:00:00 Today on the AI Breakdown, we're looking at Samsung's new AI-powered phone and what it says about the big trends in AI hardware. Before that on the brief, a major research breakthrough from Google around geometry. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown Network for more information about our YouTube, our Discord, and our newsletter. Hello, friends, one quick note before we dive into today's episodes, you will notice that the sound is a little different than normal and potentially not as good, although we tried to use some AI to improve it. Basically, my normal mic somehow stopped registering and I didn't notice that it had switched over to the iMac.

Starting point is 00:00:40 So you're just getting raw Mac audio with a little bit of help from Descript. Apologies in advance, but I'd rather have the content out and a little janky than not out at all. Enjoy, and we will be back to our normal sound quality on tomorrow's episode. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. Today, we are starting with a new research breakthrough from Google DeepMind that has totally transformed the state of the art when it comes to AI solving geometry problems. The New York Times writes, For four years, the computer scientist Trieu Trinh has been consumed with something of a meta-math

Starting point is 00:01:16 problem, how to build an AI model that solves geometry problems from the International Mathematical Olympiad, the annual competition for the world's most mathematically attuned high school students. Last week, Dr. Trinh successfully defended his doctoral dissertation on this topic at New York University. This week, he described the results of his labors in the journal Nature. Named AlphaGeometry, the system solves Olympiad geometry problems at nearly the level of a human gold medalist. While developing the project, Dr. Trinh pitched it to two research scientists at Google,

Starting point is 00:01:41 and they brought him on as a resident from 2021 to 2023. AlphaGeometry joins Google DeepMind's fleet of AI systems, which have become known for tackling grand challenges. In their blog post about this, Google writes, AI systems often struggle with complex problems in geometry and mathematics due to a lack of reasoning skills and training data. AlphaGeometry's system combines the predictive power of a neural language model with a rule-bound deduction engine, which work in tandem to find solutions.

Starting point is 00:02:06 And by developing a method to generate a vast pool of synthetic training data, 100 million unique examples, we can train AlphaGeometry without any human demonstrations, sidestepping the data bottleneck. With AlphaGeometry, we demonstrate AI's growing ability to reason logically, and to discover and verify new knowledge. Solving Olympiad-level geometry problems is an important milestone in developing deep mathematical reasoning on the path towards more advanced and general AI systems. So Thomas Ahle, the head of machine learning at Normal Computing,

Starting point is 00:02:31 tried to sum this up on Twitter slash X in a way that was fairly accessible. You be the judge of how well he did. Thomas writes, My understanding of AlphaGeometry. 1. Translate the problem statement to symbolic form. 2. Try to solve the problem with a symbolic solver. 3. If it didn't work, use a language model to suggest an auxiliary point somewhere such as a midpoint, then go back to number 2.
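The three steps above can be sketched as a small loop. This is only a toy illustration, not AlphaGeometry's actual code: the string-based facts, the `Rule` format, `deduce_closure`, `solve`, and the hard-coded `propose_midpoint` function (standing in for the language model) are all hypothetical names invented for this example.

```python
from typing import Callable, Iterable, Optional

# Hypothetical toy format: a rule says "if all of these facts hold, this fact holds".
Rule = tuple[frozenset[str], str]

def deduce_closure(facts: set[str], rules: Iterable[Rule]) -> set[str]:
    """Forward-chain until no rule adds a new fact (the 'symbolic solver' step)."""
    known = set(facts)
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for needs, gives in rules:
            if needs <= known and gives not in known:
                known.add(gives)
                changed = True
    return known

def solve(
    premises: set[str],
    goal: str,
    rules: list[Rule],
    propose_auxiliary: Callable[[set[str], str], Optional[str]],
    max_rounds: int = 5,
) -> tuple[bool, list[str]]:
    """Alternate deduction (step 2) with auxiliary suggestions (step 3)."""
    facts = set(premises)
    added: list[str] = []
    for _ in range(max_rounds):
        if goal in deduce_closure(facts, rules):
            return True, added
        aux = propose_auxiliary(facts, goal)  # stand-in for the language model
        if aux is None or aux in facts:
            break  # nothing new to try
        facts.add(aux)
        added.append(aux)
    return goal in deduce_closure(facts, rules), added

# Toy problem: an isosceles triangle has equal base angles.
rules = [
    (frozenset({"AB = AC", "M is midpoint of BC"}), "triangle ABM congruent to ACM"),
    (frozenset({"triangle ABM congruent to ACM"}), "angle ABC = angle ACB"),
]

def propose_midpoint(facts: set[str], goal: str) -> Optional[str]:
    """Hard-coded 'model' that only ever suggests one midpoint construction."""
    suggestion = "M is midpoint of BC"
    return None if suggestion in facts else suggestion

solved, added = solve({"AB = AC"}, "angle ABC = angle ACB", rules, propose_midpoint)
print(solved, added)
```

In this toy run the solver is stuck on the first pass, the stand-in model adds the midpoint, and the second pass reaches the goal. The real system works on formal geometry statements and a trained language model rather than strings and a fixed suggestion.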

Starting point is 00:02:51 Try to solve the problem with a symbolic solver. To teach the model to generate good auxiliary points, they create a dataset of 100 million synthetic proofs. This is done by 1. Generating random premises. 2. Finding the closure of all statements that follow from the premises. 3. Sorting them topologically and picking a downstream statement. Then prune to only the necessary premises and steps. If a point is only mentioned in the generated steps and not in the selected premises, it becomes an auxiliary point.
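A rough sketch of that data-generation recipe, under heavy simplifying assumptions: facts are plain strings, each derived fact remembers only the first rule that produced it (a greedy shortcut, not a guaranteed minimal proof), and the auxiliary-point bookkeeping is left out. The names `closure_with_parents`, `trace_back`, and `make_example` are hypothetical and chosen only for this illustration.

```python
import random

# Hypothetical toy format: (facts required, fact derived).
Rule = tuple[frozenset[str], str]

def closure_with_parents(premises: set[str], rules: list[Rule]) -> dict[str, frozenset[str]]:
    """Forward-chain to the closure, recording which facts each derived fact came from."""
    parents: dict[str, frozenset[str]] = {p: frozenset() for p in premises}
    changed = True
    while changed:
        changed = False
        for needs, gives in rules:
            if gives not in parents and all(n in parents for n in needs):
                parents[gives] = needs  # keep the first derivation found
                changed = True
    return parents

def trace_back(statement: str, parents: dict[str, frozenset[str]]) -> tuple[set[str], list[str]]:
    """Prune to the premises and derived steps that the statement depends on."""
    used_premises: set[str] = set()
    steps: list[str] = []
    seen: set[str] = set()
    stack = [statement]
    while stack:
        fact = stack.pop()
        if fact in seen:
            continue
        seen.add(fact)
        if parents[fact]:          # derived fact: keep it as a proof step
            steps.append(fact)
            stack.extend(parents[fact])
        else:                      # no parents: it is one of the premises
            used_premises.add(fact)
    return used_premises, steps

def make_example(candidate_premises: set[str], rules: list[Rule], rng: random.Random):
    """One synthetic problem: random premises, closure, pick a statement, prune."""
    pool = sorted(candidate_premises)
    premises = set(rng.sample(pool, k=min(3, len(pool))))
    parents = closure_with_parents(premises, rules)
    derived = sorted(f for f, ps in parents.items() if ps)
    if not derived:
        return None
    goal = rng.choice(derived)
    needed, steps = trace_back(goal, parents)
    return {"premises": sorted(needed), "goal": goal, "steps": steps}

toy_rules = [
    (frozenset({"a", "b"}), "c"),
    (frozenset({"c"}), "d"),
    (frozenset({"e"}), "f"),
]
toy_parents = closure_with_parents({"a", "b", "e"}, toy_rules)
needed, steps = trace_back("d", toy_parents)
example = make_example({"a", "b", "e"}, toy_rules, random.Random(0))
```

Tracing back from "d" keeps premises "a" and "b" and drops "e", which is the pruning idea in miniature. Repeating `make_example` many times with different random seeds is the toy analogue of building the large synthetic dataset.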

Starting point is 00:03:15 91% of the proofs generated this way don't have any auxiliaries, so they are used only for pre-training, whereas the quote, interesting 9% are used for fine-tuning. Serafim Batzoglou says, AlphaGeometry is brilliant. The method seems super challenging to extend beyond geometry, but still represents a big advance. The most important factor for its success is perhaps the synthetic data generation method for training. Given a geometric diagram, a symbolic deduction engine can deduce all true statements about the diagram that do not involve drawing any new elements like lines, circles, etc.

Starting point is 00:03:44 Synthetic data is generated by 1. generating a random diagram, 2. deducing all true statements about it, 3. given a deduced statement, tracing back for the minimal proof for it and for the associated sub-diagram. This is an NP-hard step but can be solved greedily. 4. Taking this minimal diagram and minimal premises together with a deduced statement as the problem description. 5. Repeating this 100 million times to generate data. Then, an LLM is trained on the above proofs to generate proofs including auxiliary constructions in a diagram. Finally, to solve a problem, the LLM is interleaved with the deduction engine.

Starting point is 00:04:14 The LLM iteratively augments the minimal diagram with another line, middle point, etc. And then the deduction engine deduces all true statements in the augmented diagram. Now, I'm not sure if I actually cleared anything up with those descriptions, but if you just take away one thing, it's this. For some sense of benchmarking, a bronze medalist tends to get 19.3 out of 30 correct. A silver medalist tends to get 22.9 out of 30 correct, and a gold medalist tends to get 25.9 out of 30 correct. The previous state of the art before AlphaGeometry got 10 out of the 30 correct, whereas AlphaGeometry got 25 out of 30 correct. That's just below the gold medalist standard.
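To put those numbers side by side, here is a small comparison that uses only the figures quoted above. The labels and variable names are made up for this illustration, and treating "tends to get" as an average is an assumption.

```python
# Scores out of 30 problems, as quoted in the episode.
TOTAL = 30
scores = {
    "previous state of the art": 10.0,
    "average bronze medalist": 19.3,
    "average silver medalist": 22.9,
    "AlphaGeometry": 25.0,
    "average gold medalist": 25.9,
}

# Rank from lowest to highest and show each score as a share of the 30 problems.
ranking = sorted(scores, key=scores.get)
for name in ranking:
    print(f"{name}: {scores[name]:g}/{TOTAL} ({scores[name] / TOTAL:.0%})")

# How far AlphaGeometry sits below the gold figure, and above the prior system.
gap_to_gold = round(scores["average gold medalist"] - scores["AlphaGeometry"], 1)
improvement = scores["AlphaGeometry"] / scores["previous state of the art"]
print(f"gap to gold: {gap_to_gold} problems; {improvement:g}x the previous system")
```

On these figures AlphaGeometry lands between the silver and gold lines, a little under one problem short of gold, while solving two and a half times as many problems as the earlier system.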

Starting point is 00:04:51 And that's why people are so excited about this shift. Now, like Serafim said, how generalizable this strategy is outside of math is not immediately apparent, but it's still really fascinating work. Next up, you might remember back at the end of last year, the vice president of audio very publicly quit Stability AI over disagreements around the company's approach to copyright and training. Well, now that same executive, Ed Newton-Rex, has started a new nonprofit called Fairly Trained, and the best way to think of it is sort of a fair trade label but for copyright-compliant AI products that officially obtained consent for the data that they use to train AI systems.

Starting point is 00:05:31 So far, nine companies have been certified by Fairly Trained, all of which are fairly small startups. Said Newton-Rex, as long as generative AI companies are saying, yes, this is fair use, we can scrape whatever we want, then I think it will be a battle between the two sides. At the same time, he believes that there is an opportunity to not be so adversarial. Newton-Rex said there's a, quote, mutually beneficial setup here that works for everyone. Of the nine Fairly Trained certified startups so far, eight relate to sound generation, and one is an image generator. A really interesting story out of Japan, a novelist, Rie Kudan, has won one of the most prestigious literary prizes in the country, but has been very public about using ChatGPT as part of her process.

Starting point is 00:06:10 In fact, she said that about 5% of her novel was quoted verbatim from the sentences generated by AI. Now, interestingly, this wasn't just an author casting around for ideas and asking ChatGPT to write their work for them. She was specifically interested in having ChatGPT help mimic the way that, quote, soft and fuzzy words muddle ideas about justice. In other words, she was looking for a specific corpus of knowledge that ChatGPT had access to that would help her bring to life her vision for her creative work. I think that that sort of human AI collaboration is going to be representative of what a lot of the future looks like, something that's not so either or, but is both and. Finally, a little preview of the future. You may have seen this video going around

Starting point is 00:06:53 of new Argentine President Javier Milei's talk at Davos. Now, for those of you who are interested in free market ideas, there's a lot there that you'll probably be interested in, but anyone who's in the AI space will find it interesting that the version going around Twitter is Milei speaking in English, even though he gave the presentation in Spanish. The video was translated by HeyGen, including artificially generating the mouth movements,

Starting point is 00:07:15 to match the language. However, it's done completely in Milei's own voice, so English speakers can hear him speak in his own voice, but in a language that he didn't speak in. Many people are having a revelatory kind of zero to one moment watching this, where it feels like the linguistic barriers that divide us may cease to be a thing in the very, very near future. That, however, is going to do it for the AI Breakdown Brief. Next up, the main AI Breakdown. Welcome back to the AI Breakdown. It is super easy to be very, very skeptical about marketing claims related to AI. And frankly, I think it's probably healthy to be so.

Starting point is 00:07:51 However, as Samsung launches their new S24, the pitch for which is almost entirely focused on artificial intelligence, I think it's worth taking a little bit more seriously in terms of the broad trend lines that it represents and what it says about the AI hardware future we're moving into. So let's talk first about what all is included with the new Samsung phone and why it represents something a little bit different. The Wall Street Journal writes, it's finally here, an AI phone that does more than an iPhone. I can't imagine why Samsung's marketing department didn't go with that slogan, but the newest Galaxy S24 smartphones really are all about artificial intelligence features that go beyond a voice assistant that can execute basic

Starting point is 00:08:34 tasks. And so here I want to pause for a moment and talk about what AI on a phone means currently. Right now, AI on a phone is something like this, the ChatGPT app, where you operate inside the paradigm that has existed on smartphones since the introduction of the iPhone, which is, of course, the apps, and you do AI-related things, but inside those apps. Yes, ChatGPT offers some things that make it feel a little bit different. Their Whisper voice-to-text technology, for example, feels radically faster than things like Siri, but it's still ultimately an app. It's an app version of the web experience that is ChatGPT. Now, of course, there have been some attempts to break out of that a little bit more. For example, if you download Microsoft's SwiftKey keyboard, you can actually

Starting point is 00:09:20 make DALL-E 3 images directly from your keyboard without having to go to the ChatGPT app or the ChatGPT website. It's pretty cool. I actually did a tutorial all about this for the AI Education beta that we've been running. And so what we see here is the first breaking out of that app paradigm at least a little bit to integrate AI more broadly into the phone experience, this time through the keyboard. Now, as the Wall Street Journal intimates, one of the next steps that many people are expecting is voice assistants like Siri being the way that AI finds its way to phones. Goodness knows, Siri could really use an upgrade at this point. But again, Samsung is offering something that's a little bit different than any of those

Starting point is 00:10:01 things that we just discussed. So this is not all coming from Samsung alone. Samsung has been working with Qualcomm, from a chip perspective, as well as Google, to put together what it calls Galaxy AI. So what are some of these AI features? One that's incredibly impressive for now, but will likely be table stakes in the future, is the real-time translation. WSJ writes, need to call someone who only speaks Japanese or one of 12 other supported languages, dial the number, select the language, and start the conversation. You'll see a transcription appear on screen and hear a robotic voice repeat what you said in the other language for the caller. When the recipient responds, you'll see the foreign and translated text. You'll also hear the same

Starting point is 00:10:39 robotic voice. Now, the WSJ reports that this actually worked quite well in their tests, and importantly, part of the reason that it worked well is that the processing was being done on device, not in the cloud. This has, of course, been one of the big barriers for Apple in introducing generative AI features: the concern that they can't do enough without having to rely on the cloud. Sort of like SwiftKey but built in, there are AI keyboard features. Once you've typed your message, you can select from tonal options like professional, casual, and polite, and the AI will rewrite the message in that tone. There's even an Emojify option which adds appropriate emojis.

Starting point is 00:11:13 There's AI in the Notes app. If you've taken a bunch of notes, for example, in a meeting, you can press a single button, and AI will auto-format and summarize all of those notes. The same type of tool is built into the web browser that can do that same sort of summarization. Generative AI photo editing is now built directly into the photo app. The example that the WSJ gives is, so your kid only jumped a foot high, but he'd like it to look like 10 feet,

Starting point is 00:11:36 use the generative edit feature to move him up in the air, and then create a new background where he was originally standing. Now, Samsung does promise that this feature will include a watermark in the bottom left, as well as in the metadata, but at least when it comes to the visual watermark, people are already showing how you can take it off. There is also an anywhere feature called Circle to Search, where you can long press the Home button, which brings up a Google search bar,

Starting point is 00:11:56 from which you can tap or circle something on the screen, and it will automatically search for it. Finally, there's an Android Auto feature that does AI text summarization for safer driving and also suggests replies, so that you don't have to take your eyes off the road. Now, when it comes to whether all of these features will lead consumers to ditch the phones that they have and run out to buy this new phone, the Wall Street Journal is skeptical. Instead, they suggest that this phone, as well as Google's Pixel 8, have, quote, set the stakes for AIifying the smartphone. The Pixel 8 was revealed last October and had some of these more AI-type features that break out of the app paradigm to integrate AI across the experience

Starting point is 00:12:32 that Samsung would then build on even more with its Galaxy S24. Now, on top of this being the beginning of the AI phone era, we're also seeing a lot of claims that we're at the beginning of the AI PC era. Intel pitched that back at their developer conference in September and reinforced that in December with a new set of chips designed specifically for AI inference on device. CES was basically one big AI PC era show. Reuters wrote, CES: PC makers bet on AI to rekindle sales. Now, what this story gets at is the fact that PC makers definitely need a new narrative, and this is ready-made for that, but at the same time,

Starting point is 00:13:07 the reason that it's somewhat resonant and not just totally dismissible as marketing hype, is the fact that AI really is changing how we interface with computers. Right now, we're in a big, massive period of experimentation and exploration, where everyone is throwing a chatbot into all of their experiences, where generative AI is being integrated into everything that we do, and inevitably some of it will stick, while other new user interfaces and experiences won't, but on the other side, it's very likely

Starting point is 00:13:33 that the way that we interface with our computers and our phones does not look the same as it does today. There is a possibility, however, that it won't be the form factors of existing devices that really capture this change in the human computer interface. Ryan Hoover from Product Hunt recently tweeted, Theory, new devices like Humane AI Pin and Rabbit R1 will have an easier time building new

Starting point is 00:13:54 consumer behaviors than incumbent devices like the iPhone. The downside, adoption of new hardware is very slow relative to software. Basically, what Ryan is saying is that when people are coming to your device with a more open mind and not so many expectations based on years of usage as is the case with something like a smartphone, they're more likely to be able to grok a new mode of interaction than they might be just with a new update to their Samsung or iPhone. At the same time, however, an inherently smaller number of people are going to have those conversion experiences because fewer people are going to be willing to try. Still, by way of following up our coverage of Rabbit last week,

Starting point is 00:14:30 one, they've sold a lot of units. They sold out their first 10,000 unit run in a single day and sold out a second run pretty quickly as well. What's more, there's been a lot of positive chatter. Microsoft's CEO, Satya Nadella, being interviewed at Davos, said that the Rabbit R1 presentation was one of the most impressive demos he'd seen since Steve Jobs' iPhone unveiling. Writes Decrypt, Nadella opined on the state of AI and how the technology is changing as new

Starting point is 00:14:53 more powerful models and products appear and suggested that Rabbit portends a future where agent-centric operating systems, not discrete apps, handle all user interactions. Quote, to me, the relationship we all have with computers is going to now be with an agent, which will be on all your computers. That, I think, is going to be the defining category of this next generation. So in other words, instead of doing things with the computer, you tell an agent what you want and it figures out what it needs to do to accomplish that. Certainly if the amount of developer effort is any indication, that is clearly a priority among entrepreneurs and builders right now. The countertake comes from Mark Wilson over at Fast Company, who writes, We're not even close to the BlackBerry of AI yet, let alone the iPhone of

Starting point is 00:15:34 AI. He writes, We're likely years from a unifying vision of AI hardware. Smartphones offer an interesting case study as to why. Wilson leads, Tab's stoic white pendant, Rabbit's nostalgic orange walkie-talkie, Humane's Star Trek lapel pin. While 2022 and 2023 brought us the rise of ChatGPT and other AI software, 2024 is shaping up to be the year of AI hardware. Or, as I've begun to think of it more reasonably, year one of AI hardware. It's the year where everyone is trying to launch the next big thing,

Starting point is 00:16:02 the iPhone of AI as has become the preferred shorthand. But no one is going to launch the iPhone of AI this year, or next year, or even the year after that. I don't mean that to be spitting hot takes. But if we're going to frame our futures inside imperfect metaphors, then allow me to explain. We haven't even created the BlackBerry of AI yet, and until that happens, nobody has any clue what the iPhone of AI could be.

Starting point is 00:16:21 We're not even close. This is not me throwing shade at any of the AI hardware startups we've seen thus far. It's simply acknowledging that we're in a time of immense experimentation. At the moment, untold amounts of money are being transformed into a pile of silicon meets consumer electronic stuff. We're back in the gadget era of the late aughts when a dopamine drip of the shiny and new is glimmering in our eyes. Basically, the short of it is that he just thinks that we're going to have to do a massive amount of more experimentation before we actually figure out what works

Starting point is 00:16:45 and what the unified vision of something that replaces the app paradigm really is. Now, certainly there's going to be no shortage of people working on this. Although Sam Altman denied that he has any sort of formal partnership with Jony Ive, Bloomberg was reporting as recently as the end of last month that Ive's design studio was heavily recruiting Apple designers, including the iPhone design chief, to come work on an AI hardware device. If you are excited about hardware, it's going to be a great time to be alive from here till the foreseeable future, and I'm excited to share what comes up as it does.

Starting point is 00:17:15 That's going to do it for today's AI breakdown. Until next time, peace.
