The AI Daily Brief: Artificial Intelligence News and Analysis - The AI-Robotics Revolution: Robots That Think Like LLMs

Episode Date: July 31, 2023

Google recently shared its RT-2 model which can translate generalized instructions into robotic control. Also on today's episode, researchers use AI to discover ancient antibiotics from Neanderthals and Denisovans; Google releases research on Med-PaLM M, a general multimodal medical model; and AI might be coming to Dungeons & Dragons. Today's Sponsor Netsuite | The leading business management software | Get no interest and no payments for 6 months https://netsuite.com/breakdown ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Breakdown, we're discussing some major advances in AI for health and medicine. Before that, on the Brief, a big discussion of the convergence of robotics and LLMs. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our YouTube, our Discord, and our newsletter. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. Today we kick off with the convergence of, on the one hand, robotics and, on the other hand, artificial intelligence and, more specifically, large language models. Now, one company at the very center of both of these big shifts is Google's DeepMind.
Starting point is 00:00:42 At the end of last week, they introduced something they call RT-2. Robotic Transformer 2, or RT-2, is, they say, a novel vision-language-action, or VLA, model that learns from both web and robotics data and translates this knowledge into generalized instructions for robotic control. The TLDR here, as simply as it can be put, frankly oversimplifying it, is that rather than training robots on first-hand data across every object and environment and situation that they might encounter, RT-2 instead translates all sorts of web and robotics data into more generalized instructions that robots can use to make sense of the world around them. And so just like generalized LLMs can do more than just the specific set of tasks that they were trained for, RT-2 is also increasing those generalized capabilities. DeepMind sums up: RT-2 shows improved generalization capabilities and semantic and visual understanding beyond the robotic data it was exposed to. This includes interpreting new commands and responding to user commands by performing rudimentary
Starting point is 00:01:38 reasoning, such as reasoning about object categories or high-level descriptions. We also show that incorporating chain-of-thought reasoning allows RT-2 to perform multi-stage semantic reasoning, like deciding which object could be used as an improvised hammer (a rock), or which type of drink is best for a tired person (an energy drink). So what are the types of skills that come with RT-2? Well, first, by way of understanding what Google DeepMind actually did, they say they performed a series of qualitative and quantitative experiments on over 6,000 robotic trials. The three categories of skills that they defined were symbol understanding, reasoning, and human recognition. Each task, they say, required understanding
Starting point is 00:02:13 a visual semantic concept, as well as the ability to perform a robotic control to operate upon that concept. So, for example, pick up the bag about to fall off the table. The robot has to understand which bag is about to fall off the table, and then respond to the command to pick it up. Google reports that across all three categories, they observed increased generalization performance and more than a 3x improvement compared to previous baselines. Now, the New York Times wrote about this as well, which is why so many people have been chattering about it. The New York Times piece is called Aided by AI Language Models,
Starting point is 00:02:44 Google's Robots Are Getting Smart. Giving a storyteller's flair to the demonstration, they write, a one-armed robot stood in front of a table. On the table sat three plastic figurines: a lion, a whale, and a dinosaur. An engineer gave the robot an instruction: pick up the extinct animal. The robot whirred for a moment, then its arm extended and its claw opened and descended. It grabbed the dinosaur. As the New York Times points out, until very recently, this demonstration would have been impossible. Robots weren't able to reliably manipulate objects they had never
Starting point is 00:03:11 seen before, and they certainly weren't capable of making the logical leap from extinct animal to plastic dinosaur. However, they say a quiet revolution is underway in robotics. As the Times reports, right now Google isn't planning on selling these RT-2-powered robots or releasing the model more widely. However, they do believe that eventually this type of robot that has a built-in language model is going to find its way into many, many use cases. Now, just to leave on an ominous note, the New York Times concludes their piece, if you're the kind of person who worries about AI going rogue, making robots that can reason, plan, and improvise on the fly probably strikes you as a terrible idea. But at Google, it's the kind of idea researchers are celebrating.
Starting point is 00:03:47 After years in the wilderness, hardware robots are back, and they have their chatbot brains to thank. Next up, a story that is at once nerdily cool, as well as possibly enraging for the anti-tech set. Gizmodo is reporting that Dungeons and Dragons owner Hasbro has teased the idea of integrating, quote, powerful AI-driven game mechanics for D&D. This comes as part of a press release for a partnership announcement between Hasbro and Xplored. That press release says that the partnership would allow Hasbro to, quote, deliver innovative gameplay to our players and fans, limitless digital expansions to physical games, seamless onboarding, and powerful AI-driven game mechanics. When asked by
Starting point is 00:04:22 GamesRadar for more on what AI-driven game mechanics meant, a VP at the company said that AI might be used to do things like, quote, generate experiences that could react to player decisions right away and potentially streamline rules to make it easier on newer players. Gamers taking to Twitter were unsurprisingly upset. Al-Az-Stuart says, almost the first of the month, Joe Hasbro, shall we unleash the stupidity kraken again? Jack Murphy says, kind of sad watching this game become a victim of its own popularity as Hasbro makes every mistake possible as it tries to squeeze blood from a stone. H. H. Hooligan says, let's be honest, technology has made character creation and gameplay easier, but this is nothing more than Hasbro and WotC trying not to pay humans if possible.
Starting point is 00:04:59 And Cora Bueller simply says, stop with this AI crap no one wants or needs. I don't know, man, I'm kind of excited. Let's see what they do. Moving on to our next topic, it seems like every weekend in the Bay Area, there is another big hackathon, often taking place at the AGI House, which bills itself as the machine learning hacker house bringing hackathon life back to Silicon Valley. Jeremiah Owyang was at last weekend's Anthropic Hackathon, and the list of projects is pretty interesting. There was a real-time fact-checking platform for live broadcasts; Dr. Claude, which is effectively a personal AI physician that uses your own data; Immigrantfirst.AI, which helps immigrants figure out what information they need based on the government of whatever jurisdiction they're in and what they have to file;
Starting point is 00:05:39 and the winning project, which was called Claude Scholar. Jeremiah summed it up: finds insights in existing data for molecular scientists. Upload data and PDFs, which you can then use to summarize information, apply for grants, and generate new info such as molecule variants. Finally today, we continue to see policy efforts in the United States around AI. Last week saw the introduction of the CREATE AI Act, which stands for Creating Resources for Every American To Experiment with Artificial Intelligence Act. The CREATE AI Act would establish the NAIRR, the National Artificial Intelligence Research
Starting point is 00:06:10 Resource. And if you're exhausted with these acronyms, basically what this is trying to do is to ensure that it's not just big tech companies that can actually do cutting-edge AI research. The NAIRR is designed to be a, quote, shared national research infrastructure that provides AI researchers and students from diverse backgrounds with greater access to the complex resources, data, and tools needed to develop safe and trustworthy artificial intelligence. Sponsoring Congresswoman Anna Eshoo said, AI offers incredible possibilities for our country, but access to the high-powered computational tools needed to conduct AI research is limited to
Starting point is 00:06:41 only a few large technology companies. By establishing the National Artificial Intelligence Research Resource, my bipartisan CREATE AI Act provides researchers from universities, nonprofits, and government with the powerful tools necessary to develop cutting-edge AI systems that are safe, ethical, transparent, and inclusive. The four primary goals are to, one, spur innovation and advance the development of safe, reliable, and trustworthy AI research and development. Two, improve access to AI resources for researchers and students, including groups typically underrepresented in STEM. Three, improve capacity for AI research in the United States. Four, support the testing, benchmarking, and evaluation of AI systems developed and deployed in the United States.
Starting point is 00:07:17 Now, ultimately, what this represents, I believe, is the fact that AI isn't going to be just one big set of banner legislation. Sure, we might get comprehensive legislation from the likes of Chuck Schumer, who has been working on exactly that, but I think it's far more likely that along the way we see lots and lots of these very specific smaller bills, which are highly focused, have a good chance to pass, and address some aspect of the AI ecosystem that politicians want to get addressed, perhaps on its own terms, without having to argue in the context of a larger, more comprehensive, and thus more controversial bill. In this case, the concern is that the high cost and barriers to entry of developing AI and of advanced AI research could reinforce inequalities that exist already, thus an attempt to expand
Starting point is 00:07:57 access and have this next technology wave not necessarily reflect the biases of the last. Anyways, guys, that is going to do it for today's AI breakdown brief. Thanks as always for listening or watching, and I'll be back soon with the main AI breakdown. Before we get to the main episode, I want to tell you about today's sponsor, NetSuite. I know many of you guys are entrepreneurs, executives, managers, business leaders who are trying to figure out how technology is changing the world and how it can change your business. Given that, I am thrilled to have NetSuite as a sponsor of the AI breakdown. NetSuite gives you the visibility and control you need to make better decisions faster.
Starting point is 00:08:33 It is the software superpower behind so many of the world's most successful companies. And for the first time in NetSuite's 25 years as the number one cloud financial system, you can defer payments of a full NetSuite implementation for six months. That's no payment and no interest for six months, and you can take advantage of this special financing offer today. Listen, NetSuite is number one because they give your business everything you need in real time all in one place to reduce manual processes, boost efficiency, build forecasts,
Starting point is 00:09:01 and increase productivity across every department. I know that if you are listening to the AI breakdown, you understand intuitively and deeply just how much data matters to any modern business. Having all of your information in one place can be the difference between making the right decision and making the wrong one. I think it's awesome that NetSuite has this new offer designed to really make their suite of tools available for all the businesses that need it. So if you have been sizing up NetSuite to make the switch, then you know that this deal
Starting point is 00:09:29 is unprecedented. No interest, no payments, take advantage of this special financing offer at NetSuite.com slash breakdown. Go to NetSuite.com slash breakdown to get the visibility and control you need to weather any storm. That's NetSuite.com slash breakdown. And with that, let's get to the show. Welcome back to the AI Breakdown. Today we are talking about AI advances in health and medicine, including a new multimodal model from Google, as well as research inspired by Jurassic Park. Cesar de la Fuente is a presidential assistant professor at the University of Pennsylvania. Recently, he tweeted,
Starting point is 00:10:05 My lifelong dream has been to use machines to accelerate discoveries in biology. Using AI, we discovered antibiotics in Neanderthals and Denisovans, our closest hominid relatives. Molecular de-extinction opens new avenues for drug discovery. So I first saw this reported in Nature, and the piece was called AI Search of Neanderthal Proteins Resurrects Extinct Antibiotics. So for someone who's interested in both AI and, deeply, in history, this is obviously pretty much the most clickbaity, catnip type of title you could imagine. However, the research is useful for much more than just a good article title.
Starting point is 00:10:39 Nature.com gives the context really simply. They write, Antibiotic development has slowed over the past few decades, and most of the antibiotics prescribed today have been on the market for more than 30 years. Meanwhile, antibiotic-resistant bacteria are on the rise, so a new wave of treatments will soon be needed. I wasn't joking when I said this was inspired by Jurassic Park. De La Fuente said,
Starting point is 00:10:58 we started actually thinking about Jurassic Park. Why not bring back molecules? Now, the specific molecules they were interested in were peptides. Peptides are short protein subunits and often have antimicrobial properties. So what de la Fuente and his researchers did was train an AI model first to recognize human protein peptide sites, and then second, use that model against publicly available protein sequences from our ancient relatives, including the Neanderthals and the Denisovans. From there, they used the properties of previously known antimicrobial peptides to predict which of the new peptides from these ancient human relatives might help kill bacteria. Researchers then tested dozens of the peptides that AI had suggested
Starting point is 00:11:39 to see if they could kill bacteria, first in laboratory dishes, and then, with six of the highest-potential examples, in mice that had been infected with a bacterium that is a common cause of hospital-borne infections among people. The results were mixed. The peptides halted the growth of the bacteria, but none actually killed it. And while the results might not have been a slam dunk, the research team thinks that, first, using these high-promise peptides as a jumping-off point could reduce the time to drug discovery. And second, they think that the AI model they used might be able to be improved to actually have a higher degree of predictive success. Now, if we started in our ancient past, we now zoom forward to our near future.
Starting point is 00:12:15 On Sunday, Lior at AlphaSignal AI tweeted, incredible news, the first generalist medical AI system is out. DeepMind just announced Med-PaLM M, a multimodal generative AI model that understands, one, clinical language, two, imaging, three, genomics. What he's referring to is a new research paper from Google that was called Towards Generalist Biomedical AI. The abstract of the paper reads, medicine is inherently multimodal, with rich data modalities spanning text, imaging, genomics, and more. Generalist biomedical artificial intelligence systems that flexibly encode, integrate, and interpret this data at scale can potentially enable impactful applications ranging from scientific discovery to care delivery. To enable the development of those
Starting point is 00:12:57 models, they say, they introduced a new biomedical benchmark, which they called MultiMedBench. MultiMedBench encompasses 14 different tasks that range from medical question answering to radiology report generation and summarization. And then they introduce something they call Med-PaLM Multimodal, or Med-PaLM M. So basically what this research team is attempting to do is, in the same way that we are now deploying AI models that are generalists, i.e., we can use them for a variety of tasks, not just a very specific task they were trained for, this Google team is trying to do the same thing in the field of medicine. TLDR, the researchers had some really promising results.
Starting point is 00:13:32 The model showed zero-shot generalization on novel medical concepts and tasks, and even emergent zero-shot medical reasoning. Now, zero-shot refers to a type of machine learning where a model is able to correctly make predictions or decisions on data it has not been explicitly trained on. The term zero-shot comes from the fact that the model receives zero examples, or shots, of these specific tasks during training. This is important in the context of this generalist model as it demonstrates a model's
Starting point is 00:13:57 ability to generalize knowledge from known tasks to unknown tasks. Now, one of the big examples they used of the performance of this model in their tests was a radiologist's evaluation of model-generated chest X-ray reports as compared to reports produced by radiologists. In a side-by-side ranking on 246 retrospective chest X-rays, clinicians preferred Med-PaLM M reports over those produced by radiologists in up to around 40.5% of cases. Now, obviously, there's a long way to go, but having two-fifths prefer this generalist model's version rather than a specialist-created version by humans
Starting point is 00:14:30 suggests that there's a lot more promise here. Now, this relationship between AI and medicine is one that is just growing every single day. In July, Dr. Daniela Lamas wrote a guest essay for the New York Times called There's One Hard Question My Fellow Doctors and I Will Need to Answer Soon. In it, she says, The idea of a computer diagnostician has long been compelling. Doctors have tried to make machines that can think like a doctor and diagnose patients for decades,
Starting point is 00:14:54 like a Dr. House-style program that can take in a set of disparate symptoms and suggest a unifying diagnosis. But early models were time-consuming to employ and ultimately not particularly useful in practice. However, she points out, generative AI systems are different. They are not, quote, the same as looking up a set of symptoms on Google. Instead, these programs have the ability to synthesize data and think, much like an expert. To date, she says, we have not integrated generative AI into our work in the intensive care unit. But it seems clear that we inevitably will. Interestingly, Dr. Lamas also points to X-rays, the subject of that Med-PaLM M test, as a likely example where AI might come to the fore sooner rather than later. She writes,
Starting point is 00:15:31 One of the easiest ways to imagine using AI is when it comes to work that requires pattern recognition, such as reading X-rays. Even the best doctor may be less adept than a machine when it comes to recognizing complex patterns without bias. Echoing the concerns of people in other fields, where they worry that the rise of AI will undermine fundamental skills, Dr. Lamas says, there is the real possibility that doctors in training could lean on these programs to do the hard work of generating a diagnosis rather than learn to do it themselves. If you've never sorted through the mess of seemingly unrelated symptoms to arrive at a potential diagnosis, but instead relied on a computer, how do you learn the thought processes required for excellence as a doctor?
Starting point is 00:16:07 Ultimately, Dr. Lamas doesn't have the answer, but in the context of an ever-advancing set of medical AIs, it's one that's going to come up more and more. That is going to do it for today's AI Breakdown. If you're enjoying it, do me a favor and text one person a link to this show or another recent show that you think they would like. The AI Breakdown community is one of thoughtful, intentional people, and I'd love for you to invite someone in. Thanks for listening as always, and until next time, peace.
