Your Undivided Attention - Talking With Animals… Using AI

Episode Date: May 4, 2023

Despite our serious concerns about the pace of deployment of generative artificial intelligence, we are not anti-AI. There are uses that can help us better understand ourselves and the world around us... Your Undivided Attention co-host Aza Raskin is also co-founder of Earth Species Project (ESP), a nonprofit dedicated to using AI to decode non-human communication. ESP is developing this technology both to shift the way that we relate to the rest of nature and to accelerate conservation research.

Significant recent breakthroughs in machine learning have opened ways to both encode human languages and map out patterns of animal communication. The research, while slow and incredibly complex, is very exciting. Picture being able to tell a whale to dive to avoid ship strikes, or to forge cooperation in conservation areas. These advances come with their own complex ethical issues. But understanding non-human languages could transform our relationship with the rest of nature and promote a duty of care for the natural world.

In a time of such deep division, it's comforting to know that hidden underlying languages may potentially unite us. When we study the patterns of the universe, we'll see that humanity isn't at the center of it.

Corrections:
Aza refers to the founding of Earth Species Project (ESP) in 2017. The organization was established in 2018.
When offering examples of self-awareness in animals, Aza mentions lemurs that get high on centipedes. They actually get high on millipedes.

RECOMMENDED MEDIA
Using AI to Listen to All of Earth's Species
An interactive panel discussion hosted at the World Economic Forum in San Francisco on October 25, 2022. Featuring ESP President and Co-founder Aza Raskin; Dr. Karen Bakker, Professor at UBC and Harvard Radcliffe Institute Fellow; and Dr. Ari Friedlaender, Professor at UC Santa Cruz
What a Chatty Monkey May Tell Us About Learning to Talk
The gelada monkey makes a gurgling sound that scientists say is close to human speech
Lemurs May Be Making Medicine Out of Millipedes
Red-fronted lemurs appear to use plants and other animals to treat their afflictions
Fathom on Apple TV+
Two biologists set out on an undertaking as colossal as their subjects – deciphering the complex communication of whales
Earth Species Project Is Hiring a Director of Research
ESP is looking for a thought leader in artificial intelligence with a track record of managing a team of researchers

RECOMMENDED YUA EPISODES
The Three Rules of Humane Tech
The AI Dilemma
Synthetic Humanity: AI & What's At Stake

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Transcript
Starting point is 00:00:00 Hey, everyone, it's Tristan. And today we have a bit of a departure from our regular content on this podcast. As you may know, we've spent the last few months working incredibly hard to deal with the threat of the AI dilemma, which we outlined in a previous episode, and to help pause and redirect the resources of the race-to-deploy runaway AI into something that would constitute moving at the speed that we can get this right. But as I've said before, we're not opposed to AI or artificial intelligence. In fact, Aza built an entire nonprofit that's using artificial intelligence to learn how to communicate with animals.
Starting point is 00:00:34 It's called the Earth Species Project. And in this special episode, Aza's going to tell you all about that work. Okay, over to him. Hey, everyone, it's Aza. And you probably don't recognize that sound. It's the communication of orcas, recorded in Antarctica. Studying animal communication is part of my work with an organization I founded back in 2017
Starting point is 00:01:05 called the Earth Species Project, or ESP. And what we're aiming to do is to decode non-human communication, basically learn how to talk with animals. The goal is to transform the way we communicate with and relate to the rest of nature. In this case, I like to think of AI as a kind of telescope that opens the aperture of the human mind. Why is that so important? Well, if we could draw down all of the carbon out of the atmosphere tomorrow,
Starting point is 00:01:35 that may help ameliorate the climate in the short term, but it doesn't solve the core problem, which is human ego. And I think of Earth Species as one of many different types of projects to try to patch human ego. Now, we know through our work on AI and the exponential growth of this technology that it's going to be in the next 12 to 36 months that we should be able to imitate whale or crow communication with such high fidelity that we will be able to build a synthetic whale or a synthetic crow or a synthetic beluga or a synthetic seal that they won't know is not one of them. So that's the plot twist: we'll be able to have a conversation before we know what we are saying. And with that ability comes a great responsibility. We talked about that in our recent episode about the three rules of technology, and it very much applies here. So a quick reminder about the three rules. Rule number one, when you invent a new technology,
Starting point is 00:02:38 you uncover a new class of responsibilities. Two, if that technology confers power, it'll start a race. And three, if you do not coordinate, that race will end in tragedy. So how does that relate to talking to animals? Well, we need to think about who might gain power by using this technology. So one example might be ecotourism operators, where they will be in a race with other ecotourism operators to attract animals for their clients. So that'll start a race that probably ends in tragedy.
Starting point is 00:03:12 Poachers are going to do this too, and animal agriculture is definitely going to want to use this, so we are going to need to, before we put out this technology and invent it, figure out how to coordinate. ESP is of course developing this technology both to shift the way that we relate to the rest of nature as a species, but also to accelerate conservation research.
Starting point is 00:03:37 We're hoping to be able to listen and learn from species that have culture that has been passed down for 34 million years, to expand the human imagination. We think of this as a hopeful use of AI, because we believe that someday soon, when we cross the language barrier, it could lead to a moment that superpowers a movement.
Starting point is 00:03:58 Specifically, it could lead to unprecedented coordination, the kind we are going to need to avert mass extinction. The true hope, I think, of AI is that it helps us understand ourselves and the world around us better so that we can better care for it. Not too long ago, I did a presentation about Earth Species' work at the World Economic Forum in San Francisco, and we wanted to share that with you here. So, Earth Species Project. It seems sort of crazy to get to say this. We're working to decode
Starting point is 00:04:33 non-human communication. That is to say, can we talk to animals? And we get to work on this now, and it's credible. And the goal is to say, you know, can we unlock communication to transform our relationship with the rest of nature? And I want to start with this audio. Does anyone know, who's not on our team, what animal makes this sound? I heard krill, and whales, which it's not. Also not a beluga, also not a bird. It is seals. That was this guy's mating call. So
Starting point is 00:05:30 if you hear that, and you get excited, now you know where to go. And what I love starting here is, like, the sounds of the natural world are so diverse, but we are mostly unaware of them. Earth Species, the original idea for it,
Starting point is 00:05:48 came in 2013 from an NPR piece about gelada monkeys. And the researcher who's on NPR talking about them said, you know, they have one of the largest vocabularies of any primate except for humans, and they swear that the animals talk about them behind their backs. And at that point of thinking about this, there was no one really working on how you apply machine learning
Starting point is 00:06:10 to decoding an animal language. And then the person that I started Earth Species with and I co-founded it in 2017, and when we started, there were very few people thinking about this. So it's gone from an idea in the mind to something where there's an entire field that's now working on it, which is incredibly exciting. And if there's one concept I want you guys to hold in the back of your minds for this talk, it's that our ability to understand is limited by our ability to perceive. And what does AI do? It is opening the aperture of human imagination and the human senses to let us perceive much more. And in so doing, it'll let us understand a whole bunch more.
Starting point is 00:06:54 So I want to start with just a couple of examples of my favorites for opening the aperture of what we even think is possible. So, University of Tel Aviv, 2019, there's an incredible study on primrose flowers. And they asked this question: do you think a flower can hear the sound of an approaching bee? And so they played different sounds to a primrose flower. They played like traffic noise, bat noise, and pollinator noise.
Starting point is 00:07:19 And only when they played bee noise did the flower respond, and it produced more and sweeter nectar in just a couple of seconds. Right? So the flower hears the bee coming and gets excited. It's like, here, come to me. I think that's just amazing. And actually, they tried the inverse. Same lab. They stressed out tobacco and tomato plants, so dehydrated them, like, cut them, and it turns out they emit sound, and not softly. They'll emit sound in proportion to how much they are stressed, at the volume of human speech. It's just up at 50 or 60 kilohertz, so we can't hear it.
Starting point is 00:07:56 We have plants that can hear and plants that are speaking, that are emitting sound, and we were completely unaware of it until 2019. Like, the world is awash in communication. And I think if we move forward and look back in time, we will be astounded at how static we thought the world was. Just another one, because I can't help it: there's this amazing plant called Boquila trifoliolata. It's a vine, and it does the most amazing thing.
Starting point is 00:08:24 You put it on other plants, and it will mimic their leaves. Pretty amazing. And so biologists, botanists, are like, well, how is it doing this? Well, it's probably detecting the chemical signatures of the other plants, and that's how it's, like, knowing what leaves to make. And so they tried this great experiment in 2020, where they tried growing this vine on artificial plants, and it still was able to mimic the leaves.
Starting point is 00:08:50 And so, honestly, this is a current mystery. The current best thought is that they use ocelli, which is a very fancy way of saying eyes, that they are seeing the plant and changing. So again, we go forward, we look back, and we realize how little we actually knew. We're looking for animal language because we think it's, one, awesome,
Starting point is 00:09:11 and two, a really big lever for maybe changing human culture and driving conservation. But is there a there there? And this is a fascinating study from the University of Hawaii, where they taught dolphins two gestures. And the first gesture was: do something you have not done before.
Starting point is 00:09:28 That is innovate. And think about it, that's a pretty complex thing to be able to communicate. And to do, right? To be able to innovate, you have to remember all the things you've done before that session, understand the concept of negation,
Starting point is 00:09:39 not one of those things, and then to invent something that isn't one of those things. And yet dolphins do it. And then they teach a second gesture: do something together. And they say to the dolphins in a pair, do something you haven't done
Starting point is 00:09:52 before, together. And the dolphins go down, exchange some kind of information, they come up, and they do the same thing they haven't done before, at the same time. And you're like, Occam's razor. It doesn't prove that there's language, but you're like, it's sort of the simplest explanation,
Starting point is 00:10:07 and that leads to the question, okay, maybe there is a there there. How would you go about translating a language without a Rosetta Stone? Well, if you want to understand AI, I think there's like one concept to hold in your mind that's really explanatory. And that is: AI turns semantic relationships into geometric relationships. Okay, I need to break in for just a moment.
Starting point is 00:10:32 What's on the screen here is a three-dimensional image rotating clockwise. It basically looks like a galaxy of thousands of stars, but in fact, it's representing the English language. All right, back to the presentation. This is English. This is the top 10,000 most spoken words in English. It's actually supposed to be in like 300 dimensions. We projected it down to three dimensions because I can't think in 300 dimensions.
Starting point is 00:10:56 Every star in this galaxy is a word. And words that share a semantic relationship share a geometric relationship. So an example of this might be, you know, smelly is to malodorous as book is to tome, because malodorous is sort of the pretentious way of saying smelly. And so if you take that, you do malodorous minus smelly, that gives you pretentiousness as a relationship; you add pretentiousness to clever and you get erudite. It's pretty wild to play with these spaces.
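To make that vector arithmetic concrete, here is a minimal sketch using off-the-shelf GloVe word vectors through the gensim library. This is purely illustrative and is not ESP's code; the model name is one of gensim's standard downloads, and the assumption that all four words are in its vocabulary is exactly that, an assumption.

```python
# Minimal sketch: analogy arithmetic on pre-trained word vectors.
# Assumes gensim is installed and that all four words are in the model's vocabulary.
import gensim.downloader as api

# 100-dimensional GloVe vectors: a stand-in for the "galaxy of words" on screen.
model = api.load("glove-wiki-gigaword-100")

# malodorous - smelly gives a "pretentious register" direction; adding that
# direction to "clever" should land near words with a similar flavour.
for word, score in model.most_similar(positive=["malodorous", "clever"],
                                      negative=["smelly"], topn=5):
    print(f"{word}: {score:.3f}")
```

The point of the sketch is only that the galaxy of words is a real, queryable object: directions in the space carry meaning you can add to other words.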
Starting point is 00:11:23 And so if you think then about, like, how do you end up with a shape that represents a language? If you think about a concept like a dog, well, it has a relationship to friend and to guardian and to man and to cat and to wolf and to fur, and that fixes it in a point in space. And if you sort of solve the massive multidimensional Sudoku puzzle of every concept to every other concept and their relationships, out pops this rigid structure.
Starting point is 00:11:48 And the question then researchers had, and why we started Earth Species in 2017, is: if you have the shape which is German and the shape which is English, they can't possibly be similar shapes, can they? And linguists would say, well, they have a different history, different cosmology, different way of relating to the world, so it should be a different shape. And yet when the machine learners tried it, it turned out that they fit. And it wasn't just English and German, which share a root; it was languages like Japanese and Esperanto and Finnish
Starting point is 00:12:20 and Turkish and Aramaic. And it's not like they all have the same shape, and more distant languages have more unrelated shapes, and yet there's a way that you could align them all on top of each other, and in so doing, dog ends up in the same spot in both. I just think this is so profoundly beautiful, that in a time of such deep division, there is this hidden underlying structure that unites us all.
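A minimal sketch of that "rotate one galaxy onto the other" idea follows. Given embeddings for a seed list of word pairs known to be translations, an orthogonal Procrustes fit finds the single rotation that best aligns the two spaces; nearest-neighbour lookup in the shared space then acts as a dictionary. The toy arrays and dimensions are made up for illustration, and real unsupervised alignment methods, which need no seed dictionary at all, are considerably more involved than this.

```python
# Minimal sketch: align one language's embedding "galaxy" onto another's with a rotation.
# X and Y hold vectors for seed word pairs known to be translations of each other.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
d, n_pairs = 300, 5000                       # embedding dimension, number of seed pairs

X = rng.normal(size=(n_pairs, d))            # source-language vectors (toy stand-ins)
true_rotation = np.linalg.qr(rng.normal(size=(d, d)))[0]
Y = X @ true_rotation                        # target-language vectors: secretly a rotation of X

# Find the orthogonal map W minimizing ||X @ W - Y||_F.
W, _ = orthogonal_procrustes(X, Y)

# After alignment, a word's vector lands next to its translation's vector, so
# nearest-neighbour lookup in the shared space behaves like a dictionary.
print(np.allclose(X @ W, Y, atol=1e-6))      # True for this synthetic example
```

The design choice worth noticing is that the map is constrained to a rotation: it can line the two galaxies up, but it cannot deform either one, which is why the shared rigid structure matters.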
Starting point is 00:12:46 And so our thought was, and actually, you know, this is not the way that, I don't know what the right term is for these things, ultra-modern machine learning now does translation, but this is the core concept, I think, to hold in your head for why this thing works. And our thought was, well, can we apply this then to animal communication? If we build this shape for the way animals communicate, what part fits into the universal human meaning shape? And if it does, then we should be able to do direct translation to the human experience.
Starting point is 00:13:14 And there should be some part where their experience of the world is so different that we can't translate, but we can see that there's something there. And I still don't know which one is going to be more fascinating: the parts that we can directly translate into the human experience, or the parts we have no idea what they are. And those are going to be the things that are outside of the aperture of the human imagination. Whales and dolphins have cultures that have been passed down vocally for 34 million years; humans, only for maximally 300,000 years. Just imagine what they might know.
Starting point is 00:13:45 And why do we think there might be an overlap? Well, just to give two examples. This is the mirror test. I don't know how many of you guys are aware of it, but the idea is you show an animal a mirror. Often you will paint a dot on them. And when they look in the mirror, they see themselves, they see the dot that they couldn't see before,
Starting point is 00:14:02 and they try to get it off. This dolphin is looking at its abs, which I think is a relatively universal human experience when you get to a mirror. But this shows a kind of self-awareness, right? Like, you have to have self-awareness. That's a deep and profound experience that they may well communicate about.
Starting point is 00:14:22 So that part of the shape might be shared. Let me give another example. This is a lemur taking a hit off of a centipede. They do this, and they get high. They go into this trance-like state. They get super happy. It turns out dolphins do the same thing, but with pufferfish. They will inflate a pufferfish in a group
Starting point is 00:14:42 and then pass it around to get high, which is the ultimate puff-puff pass. So elevated states of consciousness, and the seeking of them, is another thing that is shared across a wide variety of species. So that's something where we'd expect some kind of fit. But, okay, how do we go about building this shape? And it turns out it's really hard.
Starting point is 00:15:06 Getting the data is hard. That's why we have such a long list of partnerships. Ari, who's here, will talk about how hard it actually is to go out into the field. This takes blood, sweat, and tears. Turns out whales don't exactly just want to, like, stick around and, like, help you. So, you know, as we started to dive into it,
Starting point is 00:15:21 we realized there were a lot of really, really hard problems we're going to have to solve before we could start asking these kinds of questions. So here's another question. What animal makes this sound? This is the beluga. This is a couple of belugas communicating. And to me, it sounds like an alien modem.
Starting point is 00:15:51 Wouldn't it be nice, though, to know which beluga was speaking? You sort of want to separate them out into their own individual tracks. So actually, one of our first papers was trying to tackle this particular problem. So this is two dogs barking, and we learned how to separate them using AI. Okay, so I've talked about the ability to translate between human languages, but maybe that just works because we all share the same anatomy, physiology. But actually, there's something deeper going on. So I want to talk a little bit about multimodal translation.
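Before the talk turns to multimodal translation, here is a minimal sketch of one common way that separation problem is posed: a network predicts a time-frequency mask per caller, and the masks carve a mixed spectrogram into individual tracks. Everything below, the shapes and the tiny network, is a made-up illustration rather than ESP's published separation model.

```python
# Minimal sketch: mask-based source separation of two overlapping callers.
# A toy network predicts one spectrogram mask per source from the mixture.
import torch
import torch.nn as nn

n_sources, freq_bins, frames = 2, 257, 200      # e.g. a magnitude STFT of the mixed clip

class MaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_sources, kernel_size=3, padding=1),
        )

    def forward(self, mix_mag):                  # (batch, 1, freq, time)
        # Softmax over the source axis: the callers' masks sum to 1 in every bin.
        return torch.softmax(self.net(mix_mag), dim=1)

model = MaskNet()
mix = torch.rand(1, 1, freq_bins, frames)        # stand-in for a real mixed spectrogram
masks = model(mix)                               # (1, n_sources, freq, time)
separated = masks * mix                          # each channel is one caller's estimated track
print(separated.shape)                           # torch.Size([1, 2, 257, 200])

# In practice the network is trained on mixtures whose individual tracks are known
# (often synthetically mixed), then applied to recordings from the wild.
```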
Starting point is 00:16:25 Have you guys seen, like, all of the AI generative art that's been happening recently? How does that work? DALL-E, exactly. Here's how to think about it. So you can build the shape for a language, but you can also build a shape for images, because it's just the relationship between things. You then look over the internet to find all of the image and caption pairs, and that learns to associate language and images.
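A minimal sketch of that image-and-caption pairing idea, in the style of a CLIP contrastive objective: embed both modalities into one space and train so that matching pairs sit close together. The encoders here are placeholder linear layers standing in for real image and text models, so this is an assumption-laden illustration, not the actual system behind any particular generative-art tool.

```python
# Minimal sketch: CLIP-style contrastive alignment of two modalities.
# Placeholder linear encoders stand in for real vision and language models.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed_dim = 128
image_encoder = nn.Linear(2048, embed_dim)   # assumes pre-extracted 2048-d image features
text_encoder = nn.Linear(768, embed_dim)     # assumes pre-extracted 768-d caption features

def contrastive_loss(image_feats, text_feats, temperature=0.07):
    # Normalize so similarity is cosine; matching (image, caption) pairs lie on the diagonal.
    img = F.normalize(image_encoder(image_feats), dim=-1)
    txt = F.normalize(text_encoder(text_feats), dim=-1)
    logits = img @ txt.t() / temperature
    targets = torch.arange(len(img))         # the i-th image belongs with the i-th caption
    # Symmetric cross-entropy pulls true pairs together and pushes the rest apart.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Toy batch of 8 image/caption feature pairs.
loss = contrastive_loss(torch.randn(8, 2048), torch.randn(8, 768))
print(loss.item())
```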
Starting point is 00:16:47 And so, okay, so this is multimodal translation. You can translate between two very different sense modalities. And this makes us believe that this kind of thing can work across species as well. So what kind of data do we work with? This is actually Ari in Antarctica, tagging whales, and you can see that the data that comes off of it is how the animals move, kinematics. You get visuals, so you get video, and you get audio. So you can start to translate between these. We actually were just awarded one of the National Geographic
Starting point is 00:17:20 Explorers Awards. And the project is led by Benjamin Hoffman, who is working on turning all of that physical motion data into meaning. Like, what are their behaviors? How do you categorize it? And the reason why I want to do that, in part, is because then, like, you start doing really interesting things. You say, okay, given this motion, what sound goes with it? So you could imagine saying, we have two elephants coming together. You model that and you say, AI, generate me the audio that is the sound of two elephants coming together.
Starting point is 00:17:52 And that'll give you the affiliative calls, the contact calls, how they say their name. Or you might say, okay, we want to, like, intervene with ship strikes hitting whales. Could it be possible to say to a whale, like, dive? And we would then say, what would you say to have a whale dive? And it will generate the audio for that. Now, before we're like, ooh, we should just go do that, it comes with a lot of really complex ethical issues.
Starting point is 00:18:16 Are we forcing the animal to dive and it's missing food or expending energy that it can't afford? It's just like one of the kinds of things that we might run into there. So, and this is sort of the area that I'm really interested in exploring today, I just want to show one more video,
Starting point is 00:18:33 and this is with another partner of ours, Michelle Fournet. This is from her very recent documentary called Fathom on Apple TV+, about her experiments that we're starting to work with her on. The oldest cultures are not human. They're from the ocean. 40 million years ago, before we walked upright, before we sparked fire, whales evolved to build relationships
Starting point is 00:19:06 in the dark. I'm trying to start a conversation, is the most basic way you can say it. I'm trying to put a speaker in the ocean and talk to a whale and hope it talks back. Starting to play back. If this work is successful, it will be the first experiment where we have engaged in a dialogue with a humpback whale. The punchline is that it works. She's saying hello to the whales, which sounds like
Starting point is 00:19:39 and it also apparently encodes their name, and they respond back. The next question is, can we say something more complex than just recording something and playing it back? So one of our researchers, Jen-Yu, has been working on building language models directly on top of
Starting point is 00:19:56 audio, and so this is an example of that. This is a humpback contact call, that hello, maybe with a name: a real one and then two synthetic ones. So what does this mean? This means, in our domain, at some point in 12, 24, 36 months, we're going to be able to do this for animal vocalizations.
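A minimal sketch of the "language model directly on top of audio" idea: the waveform is first turned into a sequence of discrete tokens (for example by a neural audio codec, a step only gestured at here), and a small Transformer is trained to predict the next token; sampling from it then yields synthetic calls. The vocabulary size, model size, and random tokens below are placeholders rather than ESP's actual setup, and positional embeddings are omitted for brevity.

```python
# Minimal sketch: next-token modelling over discretized audio (e.g. codec tokens).
# A real pipeline would first encode recordings into token IDs with an audio codec,
# and would add positional embeddings; both are omitted here for brevity.
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 1024, 256, 512    # placeholder sizes

class AudioTokenLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                          # tokens: (batch, time)
        # Causal mask: each position may only attend to earlier audio tokens.
        t = tokens.size(1)
        mask = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        h = self.encoder(self.embed(tokens), mask=mask)
        return self.head(h)                              # next-token logits

model = AudioTokenLM()
tokens = torch.randint(0, vocab_size, (2, seq_len))      # stand-in for tokens of real calls
logits = model(tokens)
# Standard next-token objective: predict token t+1 from everything up to token t.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
print(logits.shape, loss.item())
```

Once trained, sampling from such a model token by token and decoding back to audio is what would produce the synthetic calls described above.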
Starting point is 00:20:23 And so just like I can build a chatbot in Chinese, without needing to understand Chinese, that still convinces a Chinese speaker, we will likely, we haven't done it yet, be able to sort of pass the dolphin Turing test or the whale Turing test or the tool-using crow Turing test. And it's really exciting, because that means there is a kind of first contact
Starting point is 00:20:46 or something that's about to happen, but not in the way I think we originally expected, where we decode first and then begin to communicate; but there'll be this really surprising ability to communicate before we understand. And so obviously there are some deep ethical issues here, as well as some really exciting opportunities. So one of our partners, Christian Rutz, who works with tool-using crows, sort of commented on our roadmap and said,
Starting point is 00:21:15 hey, you know, humpback whales, their song, it can go viral. So a song sung off the coast of Australia might get picked up and sung, you know, within a year across much of the population. So if we're not careful, we may have just invented, like, a CRISPR of culture, and messed up or intervened in a 34 million-year-old culture. And so I think now is the right time to start thinking about: when you invent a new technology, you invent a new responsibility. What are the responsibilities for acting with a duty of care for the natural world? And in some ways, that's the whole point of Earth Species in the first place:
Starting point is 00:21:58 how do we shift our relationship with the rest of nature? So I think it's a really exciting time to have this kind of conversation, because I think we think of AI as the invention of modern optics. It's like the telescope, in the sense that before, when we invented the telescope, we looked out at the universe and discovered Earth was not the center. And here, the opportunity is that we get to look out at the patterns of the universe and discover that maybe humanity is not the center of the universe. Thanks for listening to this slightly different episode of Your Undivided Attention. We'll have a link to the Earth Species Project and its work in the show notes.
Starting point is 00:22:47 I also wanted to mention that the Earth Species Project is hiring for our director of research role, specializing in AI. So if that's you or if it's someone you know, please check out the Earth Species Project website. Your Undivided Attention is produced by the Center for Humane Technology, a non-profit organization working to catalyze a humane future. Our senior producer is Julia Scott.
Starting point is 00:23:11 Kirsten McMurray and Sarah McRae are our associate producers. Mia Lobel is our consulting producer. Mixing on this episode by Jeff Sudakin. Original music and sound design by Ryan and Hays Holladay, and a special thanks to the whole Center for Humane Technology team for making this podcast possible. A very special thanks to our generous lead supporters, including the Omidyar Network,
Starting point is 00:23:31 Craig Newmark Philanthropies, and the Evolve Foundation, among many others. You can find show notes, transcripts, and much more at humanetech.com. We also want to hear your questions for us, so send us a voice note or email at askus@humanetech.com, or visit humanetech.com/askus to connect with us there, and we'll answer some of them in an upcoming episode. And if you made it all the way here,
Starting point is 00:23:55 let me give one more thank you to you for giving us your undivided attention.
