The Great Simplification with Nate Hagens - Aza Raskin: "AI, The Shape of Language, and Earth's Species"

Episode Date: June 8, 2022

On this episode, we meet with cofounder of the Earth Species Project, cofounder of the Center for Humane Technology, and cohost of the podcast Your Undivided Attention, Aza Raskin. Raskin gives us a general overview of what artificial intelligence is, how it's about to become more deeply embedded in our lives, and how he and his team plan to use AI as a Rosetta Stone to translate the languages of other species to - hopefully - expand human consciousness, empathy, and awareness of the other beings we share this planet with. About Aza Raskin: Aza is the cofounder of Earth Species Project, an open-source collaborative nonprofit dedicated to decoding animal communication. He is also the cofounder of the Center for Humane Technology and is the cohost for the podcast Your Undivided Attention. Trained as a mathematician and dark matter physicist, he has taken three companies from founding to acquisition, is a co-chairing member of the World Economic Forum's Global AI Council, and helped found Mozilla Labs, in addition to being named FastCompany's Master of Design and listed on Forbes and Inc. magazines' 30-under-30. For Show Notes and Transcript visit: https://www.thegreatsimplification.com/episode/22-aza-raskin

Transcript
Starting point is 00:00:02 You're listening to The Great Simplification with Nate Hagens. That's me. On this show, we try to explore and simplify what's happening with energy, the economy, the environment, and our society. Together with scientists, experts, and leaders, this show is about understanding the bird's eye view of how everything fits together, where we go from here and what we can do about it as a society and as individuals. Greetings, humans around the world tuning in to this show. As frequent listeners know, I believe we live as part of a system. And we have to understand how the parts and the processes of the system fit together in order to understand the roadmap and what to do ahead. I understand a lot about ecology, energy, human behavior, but there are quite a few topics that I know very little about.
Starting point is 00:01:01 And there's probably a lot of topics I don't even know that I don't know about. But one topic increasingly relevant in our lives is machine intelligence and artificial intelligence, AI. It does seem that AI and big data are poised to play a much bigger role in our lives. With us to unpack this today, we're fortunate to have a pretty special human being and one I consider a good friend, Aza Raskin. Aza is the son of Jef Raskin, the developer of the Macintosh computer. Aza is the co-founder of the Center for Humane Technology and also the co-founder of the Earth
Starting point is 00:01:43 Species Project. Today, Aza is going to give us a general overview of what artificial intelligence is, how it is about to become much more embedded in our lives, and specifically how he and his team plan to use AI as a sort of a Rosetta Stone to translate the languages of other species to perhaps, hopefully, expand human consciousness, empathy, and awareness of the 10 million other species we share this planet with. Thank you all for joining me on this journey of learning and informing what's possible. I present Aza Raskin.
Starting point is 00:02:26 Jambo, habari, Aza, how are you? Hey, Nate, I am doing fantastically. How are you? I am good. I am good. Spring is finally here. I've never spoken Swahili to you, but you mentioned to me when we were last together that you spent time with the Pygmies, I believe, in Congo, so you probably do speak a little bit of Swahili. Well, but mind you, so I did spend a little time with the Bayaka tribes when I was in the Congo rainforest, studying forest elephants. There, like, most people speak French, but there's a trade language whose name I don't
Starting point is 00:03:19 remember that the Bayaka speak, but they have their own language, very independent from Swahili. And interestingly enough, a lot of their culture centers around music. So I was sort of surprised to discover that when they are flirting, there's a song for that. There was a moment that we were driving in a truck. There are only three in that area of the Congo. As we passed this group of Bayaka, all 17 of them piled into the back with me. And as soon as we started moving, they started singing, which, you know, eventually I joined in with a poor rendition, but it's very polyrhythmic, polyphonic, and they kept singing all the way up until the time they jumped out of
Starting point is 00:04:03 the car. So it's not just language. It's the full spectrum of human experience that they communicate with. And meanwhile, in the developed United States and inner cities, we have replaced all of our tribal tapestry of rich human experiences with monetary markers and technology. And oh, to just sing and experience nature like that. I mean, there is a certain part of me that really longs for that. And it makes me happy to hear your stories. Is that the same one as when you were chased by an elephant? That's a different sort of experience. All right, my friend. So you are a colorful
Starting point is 00:04:46 human being. And so I've worn a colorful shirt for our conversation today. There's a lot that I would like to unpack. What I really want to get to is your work, your new initiative called the Earth Species Project, which is applying artificial intelligence to understand the language of other species, but before we get there, we have a winding road. For people who don't know you, maybe you could explain why there is a daily activity in most people's lives that you were responsible for and your current feelings about that. This fits into the camp of you will always be defined by the one worst thing that you ever did. So if you cast your minds back, 2006, this was an era when MapQuest was just ending,
Starting point is 00:05:48 where you click a little button to move to see the next part of the map. Google Maps had come out. You could start to scroll around. And I was working on a lot of technologies back then. This was just before I joined Mozilla, helped found Mozilla Labs, added geolocation to the web, did a lot of the very first prototypes for what a browser on mobile would even look like. And one of the things I was thinking about was, well, why are those more, let me say that again, why are there those little buttons at the bottom of search pages that say more? Why do I have to click on them? And isn't there already a meaning to scrolling, which means I have yet to find the thing I'm looking for?
Starting point is 00:06:36 And so I had this very simple idea of using at that point a very new technology, something called Ajax, which let you load new content without refreshing a webpage, so that as you scrolled and got to the bottom of a page, it just automatically loaded more content. And that was the invention of the infinite scroll. What I think is interesting about that is that as a designer, I was doing my job, I think, very well. Every time as a designer, you ask the user to do something they do not care about, make a decision they don't need to think about, you have failed. And so it made perfect sense to remove that sort of stopping cue to say, if you're scrolling, just keep loading more. What I didn't realize, as I went around to Google and Twitter and talked to them about this great new, better interface, is that me optimizing for an
Starting point is 00:07:32 individual user might be breaking something at the collective level. That is, you know, I did the calculation a couple of years ago when somebody challenged me on how many hours does that one invention waste? And it's very conservatively over a million human lifetimes per day, that it's convincing people to scroll more. And so I think this is, to me, a really important waking up realization that if you don't understand the context into which something you invent is going to be used, then whatever the perverse incentives of that larger system are, they are going to co-opt that good intention of the thing you create.
Starting point is 00:08:18 That applies to so many things. That's a microcosm for a lot of aspects of our current world and our aspiration. While you were telling that story, it made me think, not only does it waste, in air quotes, a million human lifetimes, but what about non-human lifetimes? Depending on the boundaries, 15 to 20 percent of our total electricity in the world is linked to device use and the servers and the technology that underpins it all. So extra time online and scrolling ends up being more coal-fired servers around the planet, which have interspecies, intergenerational impacts as well. You know, that's also making me think about, you know, the large-scale effects, as you said, intergenerational, of technology. Well, how have civilizations, like,
Starting point is 00:09:23 succeeded. Like, how do they work? Well, you have to have additive, cumulative cultural knowledge transfer. That is, unless, you know, via language, you can communicate what you've learned to the next generation, civilizations quite literally do not exist. And what is technology doing? It's sort of taking scissors and it's cutting the transmission lines from one generation to the next because it sorts everyone into their own generation. So that very thing that let civilization continue is the thing that our current round of technology is snipping. Hmm. That's, I hadn't thought about that that way.
Starting point is 00:10:10 Um, okay. So let's, uh, transition into, um, the current, uh, zeitgeist of advanced technology, which is artificial intelligence. To be blunt, as you know from my work that we've discussed, I know a lot about energy and systems ecology and climate. I don't know a lot about artificial intelligence. I would guess 80% of what I know I've learned from you. So let's unpack this a little bit.
Starting point is 00:10:45 I think a lot of people watch Hollywood movies like Ex Machina and Terminator and things like that. But could you explain what is AI, artificial intelligence? What is it? Let's just start there. So it turns out that AI is not terribly complex. It's massively parallel matrix multiplication that does smart trial and error. That is, even if you don't understand the massively parallel matrix multiplication part of that
Starting point is 00:11:24 sentence, the thing to hold in your mind is that the computer is doing, you know, billions to trillions of calculations to do smart trial and error to figure out what works and what doesn't. Another way of thinking about it for people who have a little bit more of a math background is these techniques have a property called universality, meaning that no matter what function you draw, these algorithms can approximate it. So what does that mean? That means no matter what shape you draw, the computer can recreate that shape. And I think, where do I want to go from here? That's sort of a very theoretical way of saying it. Often the way this works in practice is you are asking the computer to do something like fill in the blanks or predict the thing that
Starting point is 00:12:29 comes next. Almost all of the major advances that you hear about, whether it's GPT-3 and these language models that as of a couple months ago can now generate text that can pass the Harvard essay entrance exams, those are all trained by saying, look, read over the entirety of the internet and see if you can predict the word that comes next. Or if I give you a sentence, if I drop out a couple of the words, see if you can predict the words that were there. And to do that, the computer has to do a whole bunch of things. It has to begin to understand grammar. It has to begin to understand syntax, what word comes after another.
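The "predict the word that comes next" training idea can be illustrated with a very small sketch. This is not how GPT-3 works internally: it swaps the neural network for a plain bigram counter, and the tiny corpus and the function names `train_bigrams` and `predict_next` are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Hypothetical toy corpus standing in for "the entirety of the internet".
corpus = (
    "the ice is cold . the ice is slippery . "
    "the ice is cold today . the fire is hot ."
)
model = train_bigrams(corpus)

print(predict_next(model, "ice"))   # the word seen most often after "ice"
print(predict_next(model, "is"))    # the word seen most often after "is"
```

A real language model replaces the counting table with a network trained by trial and error over vastly more text, but the objective, guessing the next word, is the same.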
Starting point is 00:13:20 If you have a word like ice, you should expect that the word ice will appear next to the word cold more often than ice will appear next to the word fashion. So you're starting to get hints about meaning in the co-occurrence of words. And because the computers are able to do, you know, trillions of trials and errors to be able to sort of fit the shape of what language looks like, eventually it learns how to approximate English very, very well. So why didn't this happen? I mean, that made sense what you just said. It's kind of common sense that we would ask eventually computers to accomplish this. Why didn't this happen 10 or 20 years ago?
Starting point is 00:14:12 What was the limiter? Quite simply, it was compute. It was the ability to do it fast enough. There have been very few major theoretical breakthroughs. I mean, there have. There have been ideas like attention networks. Let me say that again.
Starting point is 00:14:28 There are ideas like attention and attention heads so that the AI can know what is important to pay attention to. But the big change was the rise of, one, GPUs developed for gaming that let you do these matrix multiplications very efficiently. And two, it was the rise of bigger standardized data sets, the most common being or the most famous being ImageNet, created originally by Fei-Fei Li, which created the benchmark by which, or the yardstick, by which everyone was measuring themselves. So big data sets, because, as Peter Norvig says, data ends up being 10 times smarter than
Starting point is 00:15:17 algorithms, the more data you have, the better you can fit it, the better your predictions. And cheap compute. Those were the changes. Okay. So what is the current state of artificial intelligence in 2022? And how does it, how does it work? Um, kind of break that down for us. I think it's instructive to walk through a 2017 breakthrough, which was the sort of
Starting point is 00:15:44 the starting mark for why we said, now is the time to start working on translating non-human languages. Because walking through this example, I think, will give, you know, the audience, everyone listening, the conceptual tools to start to reason and think about AI on your own, which I think is exciting. So here was the insight. We've already been talking about how you can, like, model language. But I want to go a little deeper.
Starting point is 00:16:17 One of the things you can ask the computer to do is build a shape that represents language. And this shape is special. So imagine in your head a galaxy where every star is a word. And words that mean similar things are near each other. And then words that share semantic relationships become and turn into geometric relationships. So an example: king is to man as queen is to woman, right? That's that common analogy,
Starting point is 00:16:51 which means that the relationship between king and man is sort of the same as the relationship between queen and woman. So because you're in a shape, that semantic relationship becomes a geometric relationship. So there's a distance and direction from man to king. And if you sort of imagine that as a vector, and you add that vector to woman, it'll equal queen. If you add that vector to boy, it'll equal prince.
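The king/queen arithmetic can be tried on hand-made toy vectors. The sketch below rests on loud assumptions: the three axes (royalty, gender, age) and every coordinate are invented for illustration, whereas real word embeddings are learned from text and typically have hundreds of dimensions. The helper name `analogy` is also hypothetical.

```python
import numpy as np

# Invented 3-D coordinates: (royalty, gender, age). Not learned from data.
vectors = {
    "man":      np.array([0.0,  1.0, 1.0]),
    "woman":    np.array([0.0, -1.0, 1.0]),
    "king":     np.array([1.0,  1.0, 1.0]),
    "queen":    np.array([1.0, -1.0, 1.0]),
    "boy":      np.array([0.0,  1.0, 0.0]),
    "girl":     np.array([0.0, -1.0, 0.0]),
    "prince":   np.array([1.0,  1.0, 0.0]),
    "princess": np.array([1.0, -1.0, 0.0]),
}

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by adding the a->b arrow to c."""
    target = vectors[c] + (vectors[b] - vectors[a])
    # Pick the nearest word, skipping the three words used in the question.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=lambda w: np.linalg.norm(vectors[w] - target))

print(analogy("man", "king", "woman"))  # follows the man->king arrow from woman
print(analogy("man", "king", "boy"))    # follows the same arrow from boy
```

The same "subtract two words, add the difference to a third" step is what the later hipster and smelly/malodorous examples in the conversation rely on.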
Starting point is 00:17:21 When you say vector, you mean a small binary computer language representation of that concept? Yeah, but I think you could certainly represent it that way, but I like to think of it geometrically. So what is a vector? A vector is just an arrow with a direction and a distance. Like, you go this far in this direction. So if you start at man and you walk so far, in such a direction, you'll end up at king. If you start now at girl and you walk the same distance and direction, you're going to end up at the point which is princess. So was that in 2017?
Starting point is 00:17:58 Was that using the English language? Yeah. Well, actually, what I'm talking about right now is still back in 2013. The ability to build these shapes was an invention called word2vec, where you could do it efficiently. And it was you could take language and create a shape that encodes all of the semantic relationships. And, you know, the first thing I tried when I got my hands on this data set was like, okay, hipster minus authenticity, plus conservative, and that just equaled electability.
Starting point is 00:18:31 You're like, computer, you're not allowed to write better jokes than I can. But, you know, this is a very general tool. So you can do, like, smelly minus malodorous. And malodorous is sort of the pretentious way, if you will, of saying smelly. You add that to book and it'll equal tome. You add that to clever and it'll equal adroit. And when you say when you add that, that means the human interfacing with the computer would put that in the prompt or the text or the voice or something like that.
Starting point is 00:19:05 Correct. You say, okay, there is a point which is smelly and a point which is malodorous. You sort of look at the distance and direction between them. And once you have that relationship, which is pretentiousness as a distance and direction, you just add that to any other word and you get the pretentious version of the word because that semantic relationship of pretentiousness became a geometric relationship. In English or across all languages? And this works across every language.
Starting point is 00:19:36 And where we're going to get to, and it's sort of the mic drop, is that it's not even just language that this works for. This works for every modality. And that's sort of the eye opening, mind opening realization, that, you know, deep learning AI sort of has this one weird trick. And the one weird trick is it can turn whatever data set into points in space where those points in space turn semantic relationships into geometric relationships. So let's think about faces for a second. If you train an AI on a whole bunch of faces, where you end up is like, you now have a galaxy where every point is a face, points that are near each other are similar looking faces. And there is a direction which is make this face smile more. There is a direction which is make this face more male or more female. There is a
Starting point is 00:20:42 direction which is make this face older or younger. What the AI has figured out is how to turn all of these semantic relationships that we know how to reason about in our minds into geometries that we can work with on the computer. Okay, let's take a step back and go back to our pygmies in Congo example. Back in the day, all of us in our ancestral times had pattern recognition. We had our own reality neural net on our relationships of the Dunbar's number in our tribe, on the plants and animals and which things we could eat and which things were poison. And now that whole pattern recognition trial and error is sped up many, many, many, many orders of magnitude when we combine the compute, which is the Moore's law reducing the size
Starting point is 00:21:44 of semiconductors and chips so that we get more power per unit of area, plus the breadth and depth of global knowledge that we have access to, the data set, as it were. So we're applying the same neural net that humans used to do in a manual, look-at-our-reality, very slow day-to-day sense with massive compute and data set applied to it. Something like that? Yeah, I think that's a good way of saying it. It's no doubt that the neural nets we are creating now are nowhere near as efficient as the human brain. In what way?
Starting point is 00:22:26 But we're able to... just in terms of power required, the number of examples that we need to give a computer before it starts to learn. But we can feed in a lot more data a lot quicker. The payoff that I'm getting to with these, like, why am I asking people to imagine these geometric shapes and hold these multidimensional things in their mind? It's because we're about to get two really powerful payoffs. One, and this was the 2017 breakthrough. And it was, okay, so now we're thinking back in language. Imagine dog.
Starting point is 00:23:05 Dog has a relationship to man. Dog has a relationship to wolf, to being a guardian, to howls, to fur. If you think about all of the relationships that dog has, it sort of fixes it in a point in space. And if you solve every concept's relationship
Starting point is 00:23:24 to every other concept, it's like solving a massively multidimensional Sudoku puzzle and out comes sort of a rigid structure that represents a language, which is already really cool. And the insight from 2017 was like, well, the shape which is German and the shape which is Japanese, these can't possibly be the same shape because we have different histories and different cultures, different ways of thinking about the world, different ways of relating to each other
Starting point is 00:23:54 and the natural world, different cosmologies. And yet what they discovered is that when you take German and you take Japanese, you can rotate one on top of the other. And the point which is dog ends up in roughly the same spot in both. And that lets you translate without a dictionary or Rosetta Stone. Explain what you mean, rotate it so that dog is in the same place. What do you mean by rotate it? So, you know, we've been talking about these galaxies, these shapes that represent, well, any data set, but language.
Starting point is 00:24:31 They're, you know, it's easier to think of them in three dimensions. They're generally in 300 to a thousand dimensions. But they have some shape. There's an overall structure to them. And you just take one and you line it up with the other. You just move it around until the two shapes match, and then the concepts that relate in the same way overlay with the concepts that relate in the same way. Across actually not just German and Japanese, but Finnish, which is a really weird language, and Turkish and Aramaic and Urdu, it appears that, you know, almost all human languages share a kind of universal meaning shape. And of course, there are words in one language that don't appear in another, and that means like there's a point in Japanese that's not in the exact same spot as the point
Starting point is 00:25:19 in English, but the overall shapes, if you blur your eyes, are the same. And I just think that's so beautiful. It is beautiful because it almost, to me, it conjures up our shared past of tens of thousands of generations in the Pleistocene before we spread out around the world. There is a common brain structure. And then the differences would obviously come over time due to cultural differences. And it's good to know that the relationship between dog and humans is roughly the same across all cultures. I'm encouraged to hear that. But yeah, okay, keep going. Fascinating.
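The idea of rotating one language's shape onto another can be sketched with NumPy. Two assumptions are worth stating loudly. First, the data here is synthetic: "language B" is built by applying a hidden rotation to "language A" plus a little noise, so the two shapes match by construction. Second, the conversation describes translating without a dictionary, while this sketch uses a simpler, supervised stand-in (orthogonal Procrustes fitted on a handful of known word pairs) and then checks whether the remaining concepts land in the right place. The names `fit_rotation` and `nearest` are hypothetical.

```python
import numpy as np

def fit_rotation(source, target):
    """Orthogonal matrix W minimising ||source @ W - target|| (Procrustes)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

rng = np.random.default_rng(0)

# Synthetic 5-dimensional embeddings for 50 concepts in "language A".
lang_a = rng.normal(size=(50, 5))

# "Language B": the same shape under a hidden orthogonal map, plus small noise.
hidden, _ = np.linalg.qr(rng.normal(size=(5, 5)))
lang_b = lang_a @ hidden + 0.01 * rng.normal(size=(50, 5))

# Fit the rotation on the first 20 concepts only (the known word pairs).
w = fit_rotation(lang_a[:20], lang_b[:20])
aligned = lang_a @ w

def nearest(i):
    """Index of the language-B point closest to rotated language-A point i."""
    return int(np.argmin(np.linalg.norm(lang_b - aligned[i], axis=1)))

# Check the 30 concepts that were NOT used to fit the rotation.
hits = sum(nearest(i) == i for i in range(20, 50))
print(f"{hits} of 30 held-out concepts landed on their counterpart")
```

Real embeddings only approximately share a shape, so in practice the match is far noisier than in this constructed example.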
Starting point is 00:25:58 Okay. So, I mean, that's interesting. And I think where you're going is exactly right in the natural inclination to say that, well, it must be something about our brain structures that is causing the shape to be the same. And of course, the punchline where we're going to go with Earth Species is can we figure out how to use this kind of technology to translate animal communication by building the shape for their language and seeing if it fits? But the way I think about language is that it is a model of the world and how we feel about it. It's not just telling us about our brain shape and structure. It's telling us something about how the relationships in the world work. And so, I'm just going to jump to another fascinating, much more recent breakthrough because we've been talking about how language is aligned in the same shape, but how about images? So I don't know if your listeners have seen DALL-E or DALL-E 2 or another thing called CLIP-guided image diffusion, but DALL-E 2 just came out from OpenAI.
Starting point is 00:27:07 And it has this incredible, uncanny ability: you type whatever it is you want. Any prompt you can think of, faces underwater by Salvador Dalí. And within around a minute, it will generate that image, and it's not just generating an image by finding an image that's similar on the internet. You can come up with things that are absolutely new that the internet has not seen before. You know, a unicorn in a space suit talking with Elon Musk while playing the piano, whatever you want. Yeah, this is in the style of Chagall, and it'll generate it. This totally blew me away when you showed me this at Joni's.
Starting point is 00:27:53 I'll post a couple of the things that you came up with. Was that DALL-E that you used to give me those images? That was not DALL-E. That was something called Disco Diffusion, which actually any listener right now, if you just Google that, you will find a Google Colab that, if you can code just a little bit, you can start playing with this technology on your own. And I actually used that piece of software to make a music video with my partner, Alice, as you know, Nate, which just got shown at TED.
Starting point is 00:28:24 And it surprised me, you know, it took $3 of compute and 48 hours. And it surprised me because it was the first time I've seen AI art that didn't feel amorphous and blobby, but made me feel emotionally. It was beautiful. We'll put that video in the show notes, submarines. But here's my reaction to all this. I find it. So when we were at a little cocktail party and you were using that software that you just mentioned,
Starting point is 00:29:01 and there was an owl on the porch and we said, owl about to hunt for the evening in Jasper Johns style. And it spit out this beautiful abstract art image of an owl, yellow and black. And it was really cool. And my instant reaction was binary. It was two bookends of a reaction. One is this is stunningly cool and beautiful and fun and I love it. And the other was a little bit of horror because first of all, as you know, I just did an Earth Day talk where I hired one of my former students who's a beautiful artist to make 80 tarot cards.
Starting point is 00:29:43 And I paid her a decent amount of money for that because they were beautiful. I could have done it for free using that software and just generated a bunch of tarot cards to represent the image that I wanted. So what does this mean for artists? What does this mean for income and wealth equality? What does it mean for junior level programmers? I mean, one of the things of AI is that AI can actually do basic and intermediate level programming for you.
Starting point is 00:30:16 And maybe you could talk about that a little bit. So I was simultaneously excited and petrified of the possibilities of what you just described. That feeling of excited and petrified is my normal nervous system default state about the future. It is simultaneous utopia and dystopia. And that's what makes it so confusing. We are about to have the greatest golden era humanity has ever seen, both in terms of understanding ourselves, reverse engineering how we work, being able to create art, having anyone be able to make music and visual creations beyond what we could possibly have imagined.
Starting point is 00:30:59 And also that power will be used to increase asymmetric power inequality, wealth inequality, and to exploit, because this is the fundamental paradox of technology: the better we understand ourselves, the better we can serve and protect, and the better that we can exploit. So, well, and the problem is the power law. I believe in general human goodness, which is why I'm really glad that you are deeply in this AI community and I believe in your power for good. But I do think one or two out of a hundred people that seek power to use things for non-prosocial ends end up capturing a disproportionate amount of the space. So I do worry about that. So getting back to... That's exactly right. Yeah. Getting back to this though. So what are the applications of the DALL-E sort of thing, but then beyond art?
Starting point is 00:32:11 What are the applications of that sort of thing? And how soon will that be available for the general public? Yeah. So I think it's just important for listeners to realize, because I've tried to explain this a number of times now, like what DALL-E is, without people getting to see it. And it's hard to grasp how good it is until you go look at it and realize that literally anything you type, it can visualize. And once you have that, then everything I'm about to say will make more sense. And how far are we, Aza, from instead of typing it, you just say, DALL-E, make me a Jasper Johns of an owl, on your phone? Like, how far are we from that? Nothing technically is stopping us from doing that. That could be a right now thing. It's just that
Starting point is 00:33:03 people haven't actually hooked up the pipes. In fact, if I wanted to, or anyone that had the skills could go home and within one evening of hacking could have a version of that running. So the first thing to realize is that language is going to become a joystick by which we control everything. We have a major paradigm shift in computing where things will become, like, creation will become conversational. So very soon we're going to move away from this very sort of old school way of listening to music where if you don't like the music that you're listening to, you have to go find the right song or maybe a playlist that has the vibe. You're just going to start saying things like, well, that's great,
Starting point is 00:33:52 but make this song a little bit more desert house with the feeling of purple nostalgic sunsets and sparkles. And it'll either generate it or find you the music that you're looking for. We're going to see a massive shift. It will generate music like that's not by Rush or Porcupine Tree or Anne Murray. It'll generate new music. Yeah, totally, because it's not so much further to go from, I say something to it generates an image that's never existed before, but that we find aesthetic and powerful and profound to I say something and it generates music that I find beautiful and profound. And what happens if, it's a higher dimensional space, so it'll take longer.
Starting point is 00:34:35 And what happens if, well, it'll take longer, but it'll still be lightning fast by human time scales. What happens if the music preferences of the youth in society end up gravitating towards AI-generated music? What happens to the artists that currently exist that are human, creative minds and not computers? Yeah, great question. So I think the nouns of what we make, the music, the art, those things are going to get commodified. And also I should point out there's a whole set of copyright issues here, too, because I can ask the computer to generate something in the style of, say, my friend, Victor Nye. And it's not any of her work. So she doesn't own the copyright. In fact, the current case law says nobody does. It's in
Starting point is 00:35:24 the public domain. The computer made it. But computers can't take away the verb of the act of creation. So I think handmade art or human made art will become artisanal, sort of like artisanal soap. It's going to be artisanal art. The thing is that humans are going to have a hard time competing. So let's go back to the language as a joystick for a second. We're about to see the rise of emotive media. What does that mean? That means right now when you watch a movie, it's the same every time through. It does not change based on your emotions. You know, with the language as a joystick, you can just say, and this technology already exists, Nvidia put out some demos of this where you can take a scene and re-render it
Starting point is 00:36:15 nearly in real time to make the character feel more anguished or make the character smile more. And this works on CG characters and it works on, like, real life actors. So I can rewrite sort of like a style sheet for the emotions and the way that an actor is acting. That's already cool, right? That means in another, like, five years, we're going to look back at every piece of media we consume now as flat. Like, why is it that when we watch a movie... It should feel more like a play. It should be different every time and it should react to how I feel every time, versus the same every time. And of course, how is this going to show up in the marketplace?
Starting point is 00:36:56 It's going to show up with an engagement-based metric. The easiest thing to measure is the human paying attention. Now you can imagine Netflix turning on the camera. It's not about analytics, right? It's about this better experience where the characters are reactive to you. They gather, you know, hundreds of millions of watches. They find the 10,000 people that are similar to you in your psychological sort of emotional reactive profile. And now the movie is uniquely tailored to your current emotional situation and your people that are like you. These things are going to be so much more sugary and sticky and engaging than anything we've ever had before. I think that's fucking horrible. I have to tell you. I didn't know that. I was just thinking about these images and the political thing, which I want to get into.
Starting point is 00:37:47 So you mean I can, depending on my mood, I can watch an early version of a Star Trek movie or When Harry Met Sally and the emotional timbre of the movie will change based on the computer's perception of my current state of mind and my mood. Yeah, absolutely. And you'll be able to start saying things like, you know, I want to watch When Harry Met Sally, but at the end, I want to feel happy. I want to feel this way. And it'll just gently manipulate the whole thing to get you to feel that way. So Black Mirror was not that sci-fi in reality.
Starting point is 00:38:28 That guy was freaking brilliant, those scenes he made. I don't know if you've watched them all. I have not watched them all, but it's a mix of, like, spot on exactly and also a little chintzy cheesy. And when I watch them, I'm like, oh, this part, that's going to happen. That part's a little silly. So, but are we strong enough emotionally, psychologically, for this sort of phase shift on top of everything else going on with climate and resource depletion and the great simplification? I mean, isn't this just a cultural battle for the brainstem, and technology is going to turn society into an idiocracy where it's all self-medicating with technology, where our world becomes this, like you said, sugary, sweet, immersive tech Netflix marriage? I mean, I can't even process it, to be honest, Aza.
Starting point is 00:39:30 Yeah, this is a continuation of a process that started long ago: magicians, con artists, discovering facts about how the human mind works and then learning how to use them towards whatever end that they have. So, well, the end is profits, right? Profits for the companies that design these technologies. Yeah, that is exactly right. And because they have an asymmetric amount of information about us and are discovering new species of ways to persuade us, unless they are acting as a fiduciary to us, that is, in our interests, we're sort of sunk. We as humanity have told ourselves these just so stories.
Starting point is 00:40:26 Creativity is the thing that defines us and will save us. Empathy is that core part of the human experience that will save us. And actually, isn't it surprising that creativity is sort of the first thing that the AI is coming for? And that empathy, as beautiful as it is, and it is a core part of my life and what Earth Species is about, is also the biggest backdoor into the human mind. And that loneliness is going to be every country's largest national security threat. Oh, my God. Okay. I definitely want to carve out a chunk of time to go into your big project, Earth Species, because I really believe in it. But you also wear another hat as the co-director of the Center for Humane Technology. How does AI merge into these risks of
Starting point is 00:41:29 polarization, social media, hijacking our attention, and then we get further and further apart on the political spectrum so that we can't even really have a discourse about our reality. How is AI going to change your work at CHT? Yeah. Well, there is sort of like narrow AI and this more, like, general sense of AI that we're talking about now. Can you define narrow AI? Yeah. Narrow AI, I just really mean it's like the dumb stuff.
Starting point is 00:42:00 Like if you click on something, show you more of those things. If other people click on something like you do, show you more of what they click on. This is not advanced AI in the same sense as DALL-E. But it's pernicious because you end up having, you know, a trillion dollar market cap company like Facebook pointing. Not anymore. 500 billion, but go on. 500 billion.
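Editor's sketch: the "narrow AI" rule described here (show you more of what people with similar clicks have clicked on) can be illustrated with a few lines of Python. This is a minimal toy, not any platform's actual recommender; the user names, item names, and the overlap-count similarity are all assumptions made for illustration.

```python
from collections import Counter


def recommend(user, clicks, top_n=3):
    """Suggest items clicked by users whose click history overlaps with `user`."""
    mine = clicks[user]
    scores = Counter()
    for other, theirs in clicks.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared clicks act as a crude similarity
        if overlap == 0:
            continue
        for item in theirs - mine:  # only score items the user has not clicked
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical click histories (illustrative data only).
clicks = {
    "ana": {"cats", "cooking"},
    "ben": {"cats", "cooking", "outrage_clip"},
    "cy": {"cats", "gardening"},
    "dee": {"chess"},
}

print(recommend("ana", clicks))
```

Nothing in the rule asks whether an item is good for the viewer; it only follows overlap in past clicks, which is the property the conversation calls pernicious at scale.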
Starting point is 00:42:29 Thanks in part to, well, maybe some of our work, to Frances Haugen, to TikTok eating their lunch. You never know exactly what makes what happen. But 500 billion dollars of market cap powering quite literally the largest deployed AI systems in the world, looking for whatever generates the most engagement. And actually, that's not exactly right. In the beginning of 2018,
Starting point is 00:42:58 Facebook switched to using a metric called meaningful social interaction. And what does that mean? They're up-leveling the content which causes the most reaction from your friends. You post something; if your friends react to it, and if they react to it with an angry face, it gets a 5x boost, then that is the content that gets promoted.
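Editor's sketch: a reaction-weighted ranking of the kind described here can be written in a few lines. Only the 5x weight for an angry reaction comes from the conversation; the baseline weight of 1, the post names, and the reaction counts are assumptions, and this is not Facebook's actual formula.

```python
# The 5x angry weight is the figure quoted in the conversation;
# the baseline weight of 1 for other reactions is an assumption.
REACTION_WEIGHTS = {"like": 1, "angry": 5}


def engagement_score(reactions):
    """Sum reaction counts, each multiplied by its weight (default weight 1)."""
    return sum(REACTION_WEIGHTS.get(kind, 1) * count
               for kind, count in reactions.items())


# Hypothetical posts with hypothetical reaction counts.
posts = {
    "calm_update": {"like": 40},
    "divisive_rant": {"like": 5, "angry": 10},
}

ranked = sorted(posts, key=lambda name: engagement_score(posts[name]), reverse=True)
print(ranked)
```

With these toy numbers the post with 15 reactions outranks the post with 40, purely because ten of them are angry.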
Starting point is 00:43:28 Wait, if my friends hate what I said because they gave it an angry face, that gets a 5x vote in the algorithm? Yeah. Now, I don't know if that's still true, but this is some of the documentation that came out from Frances Haugen's Facebook disclosures. What about a love icon? In that case, I think it was ranked less. But now I don't remember.
Starting point is 00:43:56 The- Well, I'll have to look on that because if hate is valued more as an engagement than love, that's a fundamental problem with our entire system. I completely agree. And Jonathan Haidt just pointed out some new research that the thing that gets the most engagement, the most reaction on social media is hate against outgroups. Like that is number one, the most viral thing. So he's going to be on the show after his book is done. So I love his research. But that is who we are as tribal animals, right?
Starting point is 00:44:44 Ostracizing outgroups is a core conserved aspect of our genome, and we're seeing it unfold in real time. Yeah, but you know, human beings are complex, and the thing I think to take away, it's like, are we narcissistic and tribalistic or are we creative and altruistic? And the answer is like, well, neither. It's that we're both. And it's that the environment that we are living within, I sort of imagine these aspects of ourselves as resonant tuning forks. And if the environment outside is humming at a certain frequency, it'll activate different tones within us. And so if we live in an environment where we are literally getting paid in likes and follows and engagement for hating on the outgroup, it's that part of us that's going
Starting point is 00:45:45 to be most activated. And, you know, it's not that technology is an existential threat. It's the worst of society is an existential threat. And technology is activating the worst of society. So I wanted to give one, like, really concrete example of the way that the sort of narrow AIs cause a kind of global psychosis where we cannot hear each other or even believe that we're coming in good faith. And it sort of works like this. It's like a trauma inflation. So, you know, let's say you have, you have something that you're particularly sensitive to. Let's say it's
Starting point is 00:46:26 Asian American hate. You're scrolling on Twitter, you see an example of a video of Asian American hate. And of course, that activates you, that activates your trauma. And so you click on it. And because you've clicked on it, it's very activating for you. Your feed starts to become the very worst thing you've ever clicked on. You start getting shown more and more first person examples, videos of Asian American hate. So your trauma is inflated, and you're like,
Starting point is 00:47:02 this thing is happening everywhere. Why can't other people see it? On the other side, you might have someone who is, like, their trauma is hit when they see, say, protesters throwing things or beating police, like inflicting violence against police. And so they click on that because there are certainly examples of that out there. They see a first-person video. Their feed starts getting filled with the worst thing they've ever clicked on. And now they have a never-ending infinite feed of the
Starting point is 00:47:35 videos of protesters beating police. So everyone has their own little nightmares expanded on Facebook. If those things are all happening as 1% of our reality, each of us thinks that it's 20% of our reality, whatever we're most triggered by. And so we all have a different sense of what's really going on. All those things are happening, but at a lower percentage than our perception from using social media. Yeah, that's exactly right. It's like a fisheye lens and things are distorted. So it's really happening, but you're getting an incorrect view, a non-representative view of what's happening in the world. And now when I come to talk to you, I know what I've seen. I have seen these first person perspectives. And so if you tell me that's not an issue, I know that you're disconnected from reality. You're not really seeing the real world.
Starting point is 00:48:35 Maybe you're just coming in bad faith. And you know the exact same thing from me. And so you can see how we now cannot come together. We disagree. We both say the other person is not based in reality. If you're not based in reality, how can we even have a conversation in the first place? And that's happening along every division at scale with that $500 billion market cap AI, finding, you know, as Tristan likes to say, finding the fault lines of society and then fracking them for profit. So this is happening now. How are DALL-E and GPT-3 and OpenAI and the other advancements plus compute coming in the next 24 months going to accentuate the problem you just outlined? Well, you know, one of the things I tried when I started playing with these text-to-image translations was explosion over Kharkiv.
Starting point is 00:49:51 And it generated a pretty realistic-looking photo of a breaking news style story. And then I hit another button and I generated 16 of them. We are about to have this sort of, almost think of it as a trust Jubilee in the best possible sense. But it's really like a neutron bomb for trust. We will not trust in the next, you know, I don't know, two years or so any photographic or video evidence. So this is horrifying to me because I'm a teacher. And so I'm trying to construct a 10-hour, this is my project, the second half of this year, the Earth Day Talk was a horizontal,
Starting point is 00:50:35 all the 80 ecological concepts relevant to our future. And now I'm going to do a vertical drop down to do the depth in each of these. But I'm doing these online video educational resources for young people around the world at the same time that AI is stripping out the ability for us to trust anything. So isn't this a threat, in addition to trust, to science and truth and facts writ large? Yes. We as a society do not yet have the antibodies to deal with this. And of course, it's not just the images. And mind you, like Photoshop, we're going to look back as being so antiquated, because with Photoshop you have to manipulate, like, contrast and exposure and pixels. And the new version of Photoshop, you know, again, this is already available in research, so it's just really now about, like, productizing, is that you look at a photo and you'll be like,
Starting point is 00:51:38 well, make that person smile more, add an image of Nate into the background and change it from daylight to sunset. Like that's the kind of semantic edit that's just going to become second nature. We're also going to do that with text, right? You're going to be able to say, hey, GPT-3, generate me a scientific sounding paper for why vaccines are harmful. Please cite real studies, use real graphs, and now generate me a thousand of them, and do a thousand in the other direction for why vaccines are actually the best thing. Now do it for masks.
Starting point is 00:52:24 And what you see here is that it's not about any one viewpoint. It's sort of like how Russia sort of demoralizes its enemies. This is about just flooding the zone with artifacts that the human brain has to process and it's going to be unable to process these things at scale. So really terrifying. My hope is this, that after sort of the game is up, and we know that whatever text we see on the internet, images we see on the internet may not be true.
Starting point is 00:52:58 We're going to hit rock bottom, just like in depression. Sometimes you have an addiction, you have to hit rock bottom before we start coming up and say, okay, from this place where we can all point at the fact that we know none of this stuff is true, how are we going to rebuild our trust and our epistemics? That's, I think, the conversation we need to be having now. And it can start simply as, you know, like maybe our phones start signing our images so that it looks at the depth sensor. It puts all of that into a package, signs it and says, this was taken on a real device. Why a depth sensor? So you can't just take a
Starting point is 00:53:32 picture of a picture. You have to actually take a picture of a 3D object. People then start preferencing images that don't have filters because if it's a filtered thing, it's been modified, you can't trust it. Who knows what's underneath? And we start having a new currency of trust built from the ground up. Okay. So I want to get to your new project, but I have a few more questions to follow up on here. First of all, could you define the Turing test and why is that currently relevant and how close are we to that moment? The Turing test, Alan Turing came up with it as a way of measuring how far,
Starting point is 00:54:24 computers have got in terms of intelligence. Because you don't really know from the outside whether something is conscious or not. You can't tell. And this was the sort of phenomenological way of testing. It's a lot of words for a really simple test. It's sit down. You're typing with a computer or it may be a human. You don't know. And you have to guess in conversation, is this a person or is this a computer? If you cannot tell, then the two things are indistinguishable to you in that medium. Then the computer has passed the Turing test. And you just said that those, they could write a paper referencing real literature on pro-vaccine or anti-vaccine. And could humans tell that it was a fake or not?
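Editor's sketch: the image-signing idea raised a few minutes earlier (a phone packages the pixels together with the depth-sensor reading, signs the package, and anyone can later check that nothing was modified) can be illustrated with Python's standard library. This is a toy under stated assumptions: a real device would likely use an asymmetric key held in secure hardware, whereas this sketch substitutes a shared-secret HMAC; the key value and the function names are hypothetical.

```python
import hashlib
import hmac

# Assumption: a stand-in for a key that would live in the device's secure hardware.
DEVICE_KEY = b"hypothetical-device-secret"


def sign_capture(pixels: bytes, depth_map: bytes, key: bytes = DEVICE_KEY) -> str:
    """Sign the image bytes together with the depth reading from the same capture."""
    # Length-prefix the pixels so the pixel/depth boundary cannot be shifted.
    payload = len(pixels).to_bytes(8, "big") + pixels + depth_map
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_capture(pixels: bytes, depth_map: bytes, signature: str,
                   key: bytes = DEVICE_KEY) -> bool:
    """Return True only if pixels and depth data are exactly what was signed."""
    expected = sign_capture(pixels, depth_map, key)
    return hmac.compare_digest(expected, signature)


signature = sign_capture(b"raw-pixels", b"depth-reading")
print(verify_capture(b"raw-pixels", b"depth-reading", signature))       # unmodified
print(verify_capture(b"filtered-pixels", b"depth-reading", signature))  # edited
```

Any filter or edit changes the pixel bytes, so the check fails, which is the behavior the "new currency of trust" relies on.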
Starting point is 00:55:15 So the computers can't do that well enough yet. They struggle. What we're working on is longer form. So at the paragraph level, the computer stays coherent. At two or three paragraphs it can stay pretty coherent. By the time you get to a full scientific paper, it's not really maintaining coherence. It still takes a lot of brain power to go figure out why. But, you know, it can write a college admissions essay that will pass. So right now it's at the annoyance, time sink, distraction level,
Starting point is 00:55:48 but we may not be far off from us being unable to distinguish what's real and what's fake. That's right. I think of it as like we have just finished crossing through the uncanny valley. We're sort of like climbing up the final slopes and we're heading into the synthetic valley where we don't know what's real and what's synthetic. But mind you, functionally, we've already passed the Turing test. Bots on Twitter, people interact with and think that they're real. A couple years ago, Microsoft released a chatbot.
Starting point is 00:56:23 And really a chatbot sort of like puts it in your mind at the wrong place. I think of them as synthetic humans, and they've been sort of crappy synthetic humans. So they're getting to be better synthetic humans. And it's not that we're just going to like talking with chatbots. We'll have synthetic relationships. A quarter of Xiaoice's users have said "I love you" to their synthetic human friend. And that makes some sense because, you know, real humans are messy. Like you have to work with them. They have their own needs.
Starting point is 00:56:52 They maybe don't know your hobbies. They're not always available. AIs and synthetic relationships, they're always there for you. Well, let me expand on that. So I personally find some of this stuff abhorrent, but maybe it's just because I'm a neo-Luddite and I like forests and dogs and camping and stars and things. But I think to a lot of people, this actually may be a welcome next step from Netflix and the Oculus Rift. And if I have a synthetic human who is kind of built via AI to conform to my emotional needs, what's not to like about that?
Starting point is 00:57:43 It might be hugely popular, yes? I think it will be incredibly popular. And Daniel Schmachtenberger has this really incredible point that in a world of hyponormal stimuli, that is in a world where we've already replaced real connection with the sort of brittle synthetic connection we get when we interact with humans on our phones, when you are feeling understimulated, then your hypernormal stimuli are a lot more powerful. That is, if you live in a world of crappy food, then really sugary food is going to taste even better than if you lived in a world of really nutritious, great food. So this could be a culture-wide coping mechanism to the end of growth and the great simplification in my terms that's ahead of us?
Starting point is 00:58:37 I think so. And I'll just add as another part of the coping mechanism, what happens when the human mind encounters, like, entities or systems that have a complexity, you know, at its scale or bigger that it can't understand? Well, the human mind turns towards religion or spirituality or mysticism to explain that which it can't explain. China, as of this year, made the first, like, massive model that I think had 178 trillion parameters, which is for the first time the same
Starting point is 00:59:23 order of magnitude. Billion or trillion? Really? Trillion. That's the same number as the number of synapses of the human brain. So we are crossing these fundamental thresholds. And so we are going to encounter more and more things in our reality that we have trouble understanding.
Starting point is 00:59:42 In some sense, I think we've hit peak understanding. Because previously, you know, we've understood more and more of the complex systems in our world, even as we've built more and more complex systems. But now there are more complex systems that we're generating than we're understanding. So expect to see a massive shift in the next, you know, 20 years to increasing spirituality and religiosity. I expect that anyways because of the stuff that I work on. But I just wonder, Aza. And, you know, I know we're going to get to some of the positive possibilities from AI. But I just wonder if society is going to bifurcate in such a way that once this is understood,
Starting point is 01:00:29 maybe this gets back to your, we have to hit rock bottom before we find our way. I wonder if 10%, 20%, 30% of society will just say no más and give up all this technology, despite what it offers us in the sugary sweet distraction and release. And if those people aren't the ones that maintain the semblance of human sanity and true north compass of humanity, I don't know, I'm just musing, but all of your last half hour makes me not want to use this much, if at all. And yet, in my attempt to change other people's hearts
Starting point is 01:01:20 and minds about the future, I'm compelled to use these resources and I've actually been using them more than I ever have because I'm trying to get people to watch my videos and listen to podcasts, et cetera. But do you have any thoughts on what I just said? I mean, I think you've articulated what's so inhumane, which is that we are forced to use systems for the things that we need that are fundamentally unsafe. The beautiful hope, right, is that we could be living through the golden era of humanity where we use these tools, you know, to increase our ability to perceive, because our ability to understand is limited by our ability to perceive. And then to really understand what it is to be human. Like, let's understand our ergonomics, understand our cognetics, like how our bodies bend,
Starting point is 01:02:10 how our minds bend and fold. You know, the way you could sum up, like, the interdependent and escalating cascading catastrophes that are about to hit us is that we as a society, our collective power using our technology is outpacing our collective wisdom to wield that power. This is the E.O. Wilson quote: we have Paleolithic emotions, medieval institutions, and godlike technology. But another way of saying it is we could understand sort of the ergonomics, like how collective intelligences bend and fold. We could have a field of collective intelligence interaction design so that we could match the wisdom that we wield collectively
Starting point is 01:03:15 with the power that we wield collectively. That is within reach. It's just on the other side of a set of perverse incentives and systems. Another way of saying it, you know, in sort of your language is that if you could give the super amoeba a mirror that it could look at itself in, not at the individual human level, but at the super amoeba level. So it could understand what its attributes are, what it's good at, what it's bad at. And we started designing at the super amoeba level.
Starting point is 01:03:51 Well, the super amoeba doesn't want to kill itself if it became aware of itself. That's the kind of thing that we could be working on if we could get to the other side of this sort of like perverse incentives valley. Yeah, that's fascinating. So before we get to Earth Species Project, are there any other really exciting positive possibilities from AI that you can envision or are aware of? We've talked a lot about the risks. Are there any really cool positive things that you can think of?
Starting point is 01:04:28 Well, I mean, starting at the simple level, the ability to create art that really speaks to you and the people around you, that I think is wonderful. The ability to be in dialogue with the system, so you can start visualizing the things that you're thinking about, seeing them and reacting. Like, I think of our jobs as communicators is there's a big sphere of things that we can think about. And outside of that are things that are unthinkable. And then inside of the thinkable sphere, there are things that are imaginable,
Starting point is 01:05:10 that we can visualize, that we can touch and taste. And it's our job as communicators. It's like run across the line from the thinkable to the unthinkable, grab an idea and bring it back. That's what metaphor does. And then to go from the thinkable
Starting point is 01:05:21 into the tangible, touchable, feelable. Because once it's tangible, touchable, it's a thing you can share and we can work on together. We have a phrase that Tristan and I share, which is that it's not just enough to make the invisible visible, you have to make the invisible visceral, like feelable. AI, these tools are going to increase the frame rate at which we can think about something and then visualize it, which means it's going to decrease the time between when you can start
Starting point is 01:05:52 to think about something and make it happen in the world. So that's exciting. Then my friend Jeremy Howard, who started fast.ai and Kaggle, started doing research on, like, essentially all of the geniuses, like the Einsteins and the von Neumanns and people like that. And the question that he was asking is, like, what's the same across all of them? And a lot of the, sort of like, the best thinkers, you know, in the past 500 years or more all shared the property, and this is his research and not mine, so, you know, I might be, but the overall contours are right: they all have had tutors, people that were specialized, sitting with them, teaching them to whatever the current
Starting point is 01:06:44 developmental stage was, bringing them to their adjacent possible. And he told me some stat, and again, I'm not going to get it exactly right, but having a tutor for your child puts them in the 98th percentile of kids. What I think is interesting about this is, of course, it's impossible to get tutors for everyone, but this is an area that AI can excel at. Imagine you train an Einstein bot or von Neumann bot based on all of their work, plus great pedagogy. Now, any child or any human, me included, can have a kind of tutor that understands sort of like where I am, asks the right questions, gives me the right problems to get me to
Starting point is 01:07:27 my adjacent possible. And that gets us closer to this idea of an agent that sits alongside of me that can help me in the journey of Bildung, of lifelong human development. And I think that is all really exciting if it doesn't get captured by perverse incentives. Yes, agreed. And one final question on this, you have followed along a little bit with my story about resource depletion and growth and the fact that we're growing our debts. We're doubling our debt every eight or nine years, and we're doubling our GDP, which is the income stream to support the debt, every 25 years. And that is a problem.
Starting point is 01:08:12 And that's before oil starts to decline in this maximum production. How energy intensive is AI? And even if we do have a 20 to 30 percent drop in the size of the global energy availability, is that plenty to scale some of the initiatives that you've been talking about? Well, Nate, I just want to say thank you to you because it was your work that really opened me up to realizing how energy and material blind my worldview has been. And honestly, I'm still integrating all of your lessons into my own thoughts. These things are of course very energy intensive, although not nearly as energy intensive as Web 3 crypto world.
Starting point is 01:09:11 But there is a lot of work now going into making these large models more energy efficient. I am not up to speed on all the latest numbers, so I'm not the best person right now to talk about it, which honestly speaks exactly to your point that even as I'm thinking about these systems, I'm still operating from a place of energy blindness. But if someone's going to use DALL-E or GPT-3 or, in the future, my personal artificial or synthetic human as my friend, where does the energy reside, on a server somewhere or in my house because it's attached to my phone or some central
Starting point is 01:09:56 location? Where does the energy come from? I would assume it's similar to the way we use the internet now, or what? Yeah, that's exactly right. Okay, so the bulk of the energy use is in training these models. So that is collecting, you know, an internet's worth of data, taking, you know, tens of billions to now trillions of parameters and teaching them how to make good predictions. Once you have one, it's actually much cheaper to ask them to make an inference, to make a guess. Often you can now take these models and download them, say, to your local computer or to your phone in smaller versions and have the calculations happen locally. A lot of work is going into making neural net computations happen directly on chip, so it's
Starting point is 01:10:51 not even happening in software. So you'll see these things becoming lightweight enough to be not just in Apple Watches, but, like, on little backpacks that you put onto animals to measure, like, their audio, like what they're saying and how they're moving. Okay. Thank you for that excellent wide ranging introduction to your current project and one that I am fully in support of and I find fascinating, which is the Earth Species Project, one of the many hats you wear. And so now if you could go back to your example of Japanese and German mapping and let me know what you're working on and what your hopes are and what's going on with the Earth Species Project. Yeah.
Starting point is 01:11:41 And just before I dive into the full Earth Species, I wanted to, like, finish a final punchline with DALL-E and how that works, because we've now talked about how you can take languages and turn them into shapes and match them together to do translations. But how DALL-E works is very similar. You build the shape for images. You build the shape for language. You then run over the internet and you look at images and captions. And you say, okay, this point in image space goes to this point in language space.
Starting point is 01:12:15 And you do that again and again. It's not a rotation, but you're finding a way of mapping, of aligning the shapes of images and the shapes of language. And from here, you just say whatever you want, portrait of Chile as a person. That goes into a point in language space, which then gets converted to the point in image space, and you ask the computer, make the image that represents this point, and it does. That's how the technology works. And why I find this so profound is it's not just that you can map one language to another. It's that you can even do translation across modality. Because what is translation?
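Editor's sketch: the pairing step described here (this caption point goes with this image point, repeated many times, then a new phrase is carried across into image space) can be illustrated with a deliberately tiny toy. This is not how DALL-E is implemented. It assumes two-dimensional made-up embeddings and substitutes a nearest-neighbour average for the learned mapping; every value and name is hypothetical.

```python
import math

# Hypothetical paired training data: (caption point, image point).
# Real systems use learned embeddings with hundreds of dimensions.
pairs = [
    ((0.0, 0.0), (10.0, 10.0)),
    ((1.0, 0.0), (12.0, 10.0)),
    ((0.0, 1.0), (10.0, 13.0)),
    ((1.0, 1.0), (12.0, 13.0)),
]


def text_to_image_point(query, pairs, k=2):
    """Carry a language-space point into image space by averaging the
    image points of the k nearest known captions."""
    nearest = sorted(pairs, key=lambda pair: math.dist(query, pair[0]))[:k]
    dims = len(nearest[0][1])
    return tuple(sum(image[d] for _, image in nearest) / k for d in range(dims))


# A new phrase lands near the first two captions, so its image point
# falls between their images.
print(text_to_image_point((0.1, 0.0), pairs))
```

The final step the speaker describes, asking the computer to make the image that represents this point, is the generative model itself and is outside this sketch.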
Starting point is 01:12:53 Translation is a transformation that leaves meaning the same. And why this is so important is, you know, just to give an example from sperm whales, a number of, I should say some scientists, this is definitely not the consensus view. I just like to think about it because it expands the aperture of, when we say language, what could we possibly mean for non-humans. Sperm whales, they have these incredibly powerful clicks, so powerful that if they want to, they could just shake your body to death. But when they click, when they do echolocation on you, they get back a full 3D model of you, right?
Starting point is 01:13:37 They see where your bones are. They see how much you've eaten. They see where your organs are. And that maybe when they speak, they are speaking not in sort of representational language, but they're just beaming back and forth sort of 3D sound holograms, just images. They see an orca. They beam the orca to the other sperm whale. And so it's important because the punchline of you can take English and Japanese and Esperanto
Starting point is 01:14:02 and Finnish and Urdu and Aramaic and fit them into one universal meaning shape is that as we gain the ability to build these shapes for animal communication, you can see which parts overlap does, say sperm whale or beluga or raven. or orangutang, does that, does the shape of their communication fit anywhere into the universal human meaning shape? And if it does, we can get direct translation to the human experience. And we'd expect some part to do that because, you know, animals, they share grief. Like pilot whales will carry around their dead young and push them up to the surface for weeks. Many, there are five types, there are five species that go through menopause. One human.
Starting point is 01:14:48 for a whale, narwhal, pilot whale, orca and beluga. All of those species, like they have culture that gets passed down. They have dialects. Grandmothers are the culture holders. So if you have a really knowledgeable grandmother, their grand offspring survive more. So we have that kind of family structure, like apes. If you show them magic tricks, they will show joy and incredularity when you make. make a ball disappear when they didn't realize it.
Starting point is 01:15:21 So we share that. There is a lot. Actually, lemurs will take centipedes and bite on them to get high. And they go through this whole trance, like crazy thing. It's incredible to watch. Just Google, like, lemurs getting high. Dolphins will inflate pufferfish and pass them around to get high. Oh, my God.
Starting point is 01:15:44 The ultimate puff, puff, pass. And let's see, and there's one more. Oh, and of course, like, you put dolphins in front of a mirror and you paint a white dot on them, they will look into the mirror and see the dot, which they hadn't seen before, and they'll try to, like, rub it or get it off. They'll look inside of their own mouths. Magpies do the same thing. Elephants pass. So they have a rich interiority, a self-awareness.
Starting point is 01:16:11 You have to look in a mirror and understand that that thing I'm looking at is me in order to pass the mirror test. So there's a lot that's the same. So we should expect part of that shape to overlap. And of course, part of that shape is like their world is just so different than ours. Dolphins can speak to two different animals at the same time. They can bifurcate their communication stream. That's something we cannot do. So we'd expect some part to not overlap. So it would be like I was talking to you right now, but also talking to my program manager and talking about our project and doing both simultaneously.
Starting point is 01:16:43 Yeah. Well, to be very specific, what is known is that they can bifurcate their stream and hit two different targets at the same time. Oh my God. It is unknown whether they're using that as a full communication channel for both streams. So I have a ton of questions. Yeah. This is a naive hypothesis. So you said that Japanese maps to German, maps to Aramaic, et cetera. Yeah. And your project is to try and understand what these other languages in non-humans look like, first of all, and then how do they relate to human languages? And ultimately, I think your goal, and one of the reasons that I'm keen to help you is, wouldn't that be a change in consciousness for our culture if people started to recognize
Starting point is 01:17:39 deeply, emotionally that we share this planet with 10 million other species. Many of them are self-aware, conscious, and have rich vocabularies and languages and daily interactions the same way we do. So then we extend the boundary of our empathy to other humans. That's a long shot, but that is the shot that we have to take in the coming century, in my opinion. And so Godspeed to you on that. But my question is, wouldn't it make sense that from an evolutionary standpoint, the nearer in time historically that we diverged from our nieces, nephews, and cousins in nature, the closer our language would be.
Starting point is 01:18:30 In other words, the other great apes, bonobos, gorillas, chimps would map the closest relative to dolphins and cetaceans, which is what, 70 million years ago, the split or something like that from a common ancestor. Do you have any hypothesis or information on that? Or what do you think? I think you're likely correct about the closeness of communication between primates and humans, like we share an umwelt more closely than we do with say, well, so there are our communication.
Starting point is 01:19:07 What's an umwelt? It seems like a good hypothesis. An umwelt is like your worldview, like how you experience your world through your sensory apparatus. And, you know, that's obviously very different for, say, a bat than a human because a bat is mostly seeing, especially at night, in echolocation and like sound imprints. And dogs, like a major portion of their umwelt is via smell. It's how they like relate to the world.
Starting point is 01:19:31 And so like our different ways of relating to the world means we're going to have different representations, which means we're going to communicate about it differently. And yet, this is why I started with that example from sperm whales communicating maybe in like sort of 3D sound holograms, because these tools that I've been talking about like DALL-E let you translate across modalities. It's not even just we're translating from one language to another, we're translating from images to language. And that gives me hope that even something as distant as a dolphin or a whale, we can begin
Starting point is 01:20:10 to map because we can do cross-modal translation. Dolphins also have the added benefit, as do whales, right? One, just I think this is mind-blowing. Humans have been around communicating vocally for 100,000 to 300,000 years. Dolphins and whales in their current form have been communicating vocally, passing down culture for 34 million years. 34 million years passing down culture. How do you know that? Yeah. That's when they, like the current form of dolphins and whales that use echolocation, etc. That's when they evolved. And then you look at their behavior, you know, like humpback whales have pop songs.
Starting point is 01:20:58 There will be a song that will begin off the coast of Australia, that whale will travel up, and soon it's being sung off the coast of Ireland, and it will spread like wildfire to the rest of the population around the world. And that doesn't always happen. It just depends on when it's particularly catchy. What is your grand hope with this project? What are the objectives of Earth Species Project? And in a perfect world, if everything aligns and you get sufficient funding and backing of scientists and data, what do you hope to accomplish? You know, I think I and everyone on my team is really, and I should just stop for a second and point out that,
Starting point is 01:21:49 like, you're hearing just me speak, but I'm really just like the fruiting body of like a fungus network, which is many biologists, many institutions. We have an incredible AI team. I have two other co-founders, Britt Selvitelle and Katie Zacarian. And the ideas that you hear developed here, actually for both Center for Humane Technology and Earth Species,
Starting point is 01:22:11 come from a big network of people like talking with you, Nate, Daniel Schmachtenberger, Tristan, Jonathan Haidt. Anyway, I just, it's, I think, really easy to confuse that, like, one person speaking is the sum totality of it. And that's not the case at all. But you are the fruiting body that I have on my podcast right now, and you've taught me a lot about this. And I am energized and really motivated by what you say.
Starting point is 01:22:39 So what do you, what can you envision this accomplishing? No, all those other people involved, noted. So 1968, Roger and Katy Payne released this album, The Songs of the Humpback Whale. And it's the first time we as a Western society have heard the art, the communication of, you know, whales and it has a profound effect. It creates Star Trek 4, right? Go back in time to save the whale. I grew up never knowing that, like I always knew that humpbacks sang. It goes on Voyager 1 to represent not just humanity, but all of Earth. It's the very first thing on the golden record after human greetings. And it's played in front of the UN General Assembly. And it's sort of
Starting point is 01:23:31 the galvanizing artifact that ends up banning deep sea whaling. It's sort of the reason why we have the humpbacks today. There are these moments in time that become movements that shift our perspective so we can see ourselves from outside of ourselves and in so doing that shifts our identity, who we think we are and our place on the planet. You know, the other classic example of this is the space race going to the moon. You know, Earthrise and Blue Marble are still the two, I think, most viewed photos in world history. They sort of dose humanity with the overview effect. We realize we're this tiny pale blue dot floating in a mote of light, a speck of dust in space. It's why William Shatner, like, freaked out after he got back a few months ago from that outer space thing.
Starting point is 01:24:31 Oh my God, I'd never thought of it this way. Exactly. Like it just shifts your perspective on what is meaningful. And, you know, what did it do to our political environment? Well, when there's a human being standing on the moon, the EPA was born, NOAA came into existence, modern Earth Day was founded, the modern environmental movement got going. The Clean Air Act was passed.
Starting point is 01:24:57 And that was in the Nixon era. So, and I should point out, it's not just that one thing, right? It's not like just standing on the moon did all of that. It came in the context of a larger movement. There was Silent Spring. There was increased inclusion. We were beginning to understand the ozone hole. So it's these moments, though, that happen within movements that create political will.
Starting point is 01:25:26 And that's really a major part of what Earth Species is about. Well, I think it's, it's almost deeper than that. We talked earlier about how antagonism and hatred of outgroups is one of the prime triggers for our phenotype. And what the, um, the overview effect of the moon and some of the, um, the things you just said do is allow us to feel as if we're all one. It expands the boundaries of our in-group.
Starting point is 01:25:58 For instance, if there were an alien race to come and attack Earth, we would do all the things that we need to do now to protect our oceans and to prepare for the end of growth because there would be a common entity outside of our one earth. And so maybe recognizing the brothers and sisters that we share this lonely planet with and all the universe, all of the known complex life who is self-aware are on this planet. And most of them are in the oceans. So carry on with what you hope you can accomplish. Well, you know, I really think of these new AI technologies as the invention of optics and specifically the telescope.
Starting point is 01:26:55 in the sense that the telescope let us look out at the patterns of the cosmos and what did we discover? We discovered that the Earth was not the center. And I think these new tools are going to let us look out at the patterns of the cosmos. And what are we going to discover? That humanity is not at the center. And it's that kind of shift to human ego that I think is the core problem to be solved. That is, even if we invented a technology tomorrow that could draw down all the carbon, which would be amazing and we should do that, it wouldn't solve the core problem, that we would just invent the next runaway system that would tear apart the world. And, you know,
Starting point is 01:27:43 I don't have any illusion that Earth Species is like a silver bullet that's going to make that happen. But I think it's part of a portfolio of things that can shift our relationship to ourselves. And along the way, you know, we went to the moon, we got Velcro out of it. We are inventing a number of technologies which can help conservation right here, right now, and help biologists. And I can talk about some of those.
Starting point is 01:28:19 So I think it's that pairing of, you know, human beings. Like, we pride ourselves on language. Like in the Christian tradition, the universe, like in the beginning, there was the word. And in the Vedic traditions, the universe begins with Om. Identity, our identity is wrapped up in language. It's why it's such a political issue. And so my hope is that by working on this project, and by giving people hope that communication is even possible, that lets us shift our mythopoetics, change the stories we tell ourselves, which then shifts our identity, shifts our behavior. I'm fully on board with that.
Starting point is 01:29:03 I think we are a can-kicking species from the time of Thomas Malthus. We didn't have fossil fuels. From the time of Paul Ehrlich, we didn't have debt and globalization. We continue to kick the can of what growth is possible. And it's my belief that the next can to kick is not a physical one. It's not a technological one. It's in our brains and it's in our understanding, our connectivity. And what is sacred?
Starting point is 01:29:31 Is it traditional religion? Is it economic growth? I believe it is the natural world that resides on this planet. So what would be the intermediate steps if you were to have success with bonobos or crows or humpback whales or dolphins? First of all, is there readily available data to be able to apply AI to start constructing these galaxies of points in AI? What are the bottlenecks?
Starting point is 01:30:04 What's keeping you back from success and what would be the first couple or three hurdles to cross? Great question. So generally, we're all a little data starved, and especially for data that has been annotated by biologists to include whatever behaviors we care about. So like all the context. Luckily, there's been a big push in the last couple of years as sensors have gotten cheaper to put tags on wild animals that have video, that have microphones, that have gyros and accelerometers.
Starting point is 01:30:42 So you can start to reconstruct like what, say, a pod of whales has been doing for the last 24 hours. A guy named Ari Friedlaender, who works in Monterey Bay and down in Antarctica. He's a professor at UC Santa Cruz. Yeah, I met him at our dinner. That's right. Yeah. Well, I mean, what were your impressions?
Starting point is 01:31:06 Yeah, I mean, if I wasn't worried about the great simplification, I would love to have his job. What a fascinating, important thing to do. Yeah, I agree. And so a lot of our early work is starting to work with biologgers, as they're called, on top of whales. We're also working with Christian Rutz, who works on tool-using crows, and putting sensors on many crows in an aviary so we can reconstruct like the motion paths of every crow, turn that motion into meaning. Like what kind of behaviors were they doing? We have a big project right now on that. Then listening, like once you have many animals speaking at once,
Starting point is 01:31:53 the most interesting things are generally being said when they're in big groups. It would be really helpful if you could disentangle them. That is to say, sort of imagine like you're listening to a band. It would be great if you could like take the vocalist and the guitar player and the pianist and separate them out into their own individual tracks. It would be great if you could do that for, say, pods of whales or aviaries full of crows, because then you can have the individual stream by which you can start to build these shapes.
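The band analogy above can be made concrete with a toy example. The sketch below is only an illustration under strong assumptions, not the method used in the Earth Species paper: it assumes two hypothetical callers whose sounds occupy completely separate frequency bands, so a plain discrete Fourier transform and a band split are enough to pull them apart. Real animal calls overlap in frequency, which is why the actual problem needs learned models. All names (`caller_a`, `caller_b`, `separate_by_band`) are invented for the example.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2), fine for a toy signal)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def separate_by_band(mixture, split_bin):
    """Split a mixture into a low-band track and a high-band track.

    Bin k and bin n-k carry the same frequency, so both are routed together.
    """
    spectrum = dft(mixture)
    n = len(spectrum)
    low = [c if min(k, n - k) < split_bin else 0 for k, c in enumerate(spectrum)]
    high = [c if min(k, n - k) >= split_bin else 0 for k, c in enumerate(spectrum)]
    return idft(low), idft(high)


N = 64
# Two hypothetical callers: a low-pitched one and a quieter high-pitched one.
caller_a = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]
caller_b = [0.5 * math.sin(2 * math.pi * 12 * t / N) for t in range(N)]

# One microphone hears both at once.
mixture = [a + b for a, b in zip(caller_a, caller_b)]

est_a, est_b = separate_by_band(mixture, split_bin=8)
max_err_a = max(abs(e - a) for e, a in zip(est_a, caller_a))
max_err_b = max(abs(e - b) for e, b in zip(est_b, caller_b))
print(max_err_a, max_err_b)
```

Because the two toy callers never share a frequency bin, each recovered track matches its original up to floating-point rounding. The hard part of the real cocktail party problem is exactly that this assumption fails.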
Starting point is 01:32:23 One of the things I wanted to do if it's okay, I'm curious if it will come through is just play you a little bit of my current favorite species communication. Yeah, go ahead. I assume it's some cetacean, but it almost sounded like a high. I have no idea. This is a beluga. These are a couple belugas communicating. And when you hear the high fidelity version, it sounds like an alien modem.
Starting point is 01:33:04 And it turns out, you know, so dolphins have contact calls, names that they call each other by, that they learn the first couple of months. But that's a whistle. It sounds like, belugas, when they have contact calls, their names are broadband, more like modem packets. And what's interesting about this data, one, we listen up to 20 kilohertz maybe, belugas are speaking up to 150 kilohertz. Valeria Vergara, who is doing research into beluga names and dialects, in her study, you know, she had tags on whales that had microphones, but because the whales are moving around so much and they can speak into somebody
Starting point is 01:33:49 else's microphone, she doesn't know who's speaking or can't disentangle them. And so in her research, in trying to understand beluga names, she had to throw away 97% of all of her data. And that's, to me, incredibly exciting because the oceans are 5% explored, beluga communication, at least for this kind of data set, for this reason, is 3% explored. Like, this is the next frontier. And so a thing we've been working on is if you could listen to, say, beluga communication or elephant communication or crow communication, separate them out into their own individual tracks, that would give you access to the supermajority of communication data. And so we just had published in Nature's Scientific Reports our first paper showing for the first time you can do this in the bioacoustic domain.
Starting point is 01:34:44 So not only could you decode or create a map of beluga communications, but over time, you could see how belugas' and crows' and elephants' and bonobos' communication maps, like German and Japanese and Swahili, sort of thing. The first thing we worked on was solving this cocktail party problem so we could begin to open up these data sets. It works right now in lab-like settings. There's still significant work to be done to get it to work in the wild, where there's noise and lots of other complicating factors. The other projects we're currently working on are, one, the ability to generate novel calls.
Starting point is 01:35:40 That is, can we have a computer learn to speak like a humpback whale or a beluga or a raven and use that in playback experiments? Could we end up where- Right. So you would need a huge data bank which AI could access to then make inferences, in the same way that we have a synthetic human that's our friend, you would want to get at least somewhere in that direction to be able to converse with another species, yes? Yeah, exactly. You could imagine that way before we understand what any of it means, you could have a whale in dialogue with an AI.
Starting point is 01:36:20 They're going back and forth saying something, but we don't know what it is yet. And that in and of itself, I think, would be fascinating. And what if the very first thing a whale said to us was what in the fuck are you doing to the oceans? Would it surprise you if that's what they ended up saying? Actually, they probably wouldn't swear. They've been around for 34 million years. Either they have the very best swear words that we've never thought of or they're beyond that.
Starting point is 01:36:52 Here's my biggest fear about your project is that if you massively succeed and you can demonstrate that other species have other languages which are mappable and we even can begin to converse with them using some hybrid computer interface that society will just shrug. and meh and go back to their synthetic humans and energy consumption and that it all worked but that we shrugged and missed this opportunity for a change in consciousness that's my fear I think that's a reasonable fear to to hold and then I think it's about how do we bring enough people along on the journey so that they're invested that we can create those moments of collective action and political will to do something. And I think, you know, the next 10 years are going to get increasingly bad. The next 35 are, we're moving into exponential suck.
Starting point is 01:37:57 The landscape into which this kind of knowledge lands is changing, which I mean I think more and more people are ready and receptive to make a change. And they're just looking for that final straw. And I think there are many different types of final straws of which I hope this is a powerful one. So given what you know about your field and the the broader human amoeba situation, what sort of advice would you or do you give to young people today who are discovering and understanding the human predicament, writ large with energy, climate, the oceans, other species, everything. What sort of advice would you give to young people?
Starting point is 01:38:52 Three things. One, read Donella Meadows' Thinking in Systems and Places to Intervene in a System. That I wish I had read much earlier. It really is an important lens to look through. And it teaches you that, you know, embedded in our language is sort of like the adversary, the thing that keeps us from seeing things as they are. The book makes the point that in the term side effect, we're already hiding the thing because it's not like here's the world and we have climate change and systemic inequality happening to it. It's that we've built a machine that when you pedal the machine, it will programmatically create these outcomes.
Starting point is 01:39:44 And so to call them side effects is to already diminish them from view. They're just as principal an effect as every other effect. So I think one, read Donella Meadows. Two, acknowledge that we are living in a time of increasing chaos. What is chaos? Chaos is that you cannot predict what happens next from what is happening now. And that's scary and also hopeful because in chaos, initial conditions matter.
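The claim that in chaos initial conditions matter can be seen in a few lines of code. The sketch below uses the logistic map, a standard textbook example of a chaotic system; it is chosen here purely for illustration and is not something discussed in the episode. The starting value 0.2, the offset of one part in a billion, and the 60 steps are arbitrary assumptions.

```python
def logistic_trajectory(x0, r=4.0, steps=60):
    """Iterate the logistic map x -> r * x * (1 - x), returning every value."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by one part in a billion.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)

# Gap between the two trajectories at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]

early_gap = max(gaps[:6])  # largest gap over the first few steps
late_gap = max(gaps)       # largest gap over the whole run

print(f"largest gap in first 5 steps: {early_gap:.2e}")
print(f"largest gap over 60 steps:    {late_gap:.2e}")
```

For the first few steps the two runs are practically indistinguishable. Later they bear no resemblance to each other, which is both why prediction fails and why a tiny change made early can matter a great deal.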
Starting point is 01:40:18 And that's worth fighting for. It's that little changes now can make huge changes soon. I love that. That's what I'm trying to do. Yeah, exactly. You don't know what your ripples are going to be. And the last is this. And this, you know, in my youth, in my youth, I don't know how young I am, but
Starting point is 01:40:42 when I was younger, I really idolized rationality, that being rational, I thought, was like, the most rational thing. And as I've gotten older, I've realized that to be perfectly rational is, in fact, not rational at all. What do I mean? I mean that every model of the world is wrong because the world is highly dimensional. Every model is a dimensionality reduction. So to see the world as closely as you can, you want to have many different models whose errors cancel out.
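The idea of combining many models whose errors cancel out has a simple numerical form. The sketch below is a made-up illustration: the true value and the four model predictions are invented numbers, and the cancellation only works because the models err in different directions. If every model were wrong in the same direction, averaging would not help.

```python
from statistics import mean

# Hypothetical quantity we are trying to estimate.
true_value = 10.0

# Four imperfect models, each wrong in its own way (invented numbers).
model_predictions = {
    "model_a": 12.0,  # overshoots by 2.0
    "model_b": 9.0,   # undershoots by 1.0
    "model_c": 8.5,   # undershoots by 1.5
    "model_d": 11.0,  # overshoots by 1.0
}

# Absolute error of each model on its own.
individual_errors = {
    name: abs(pred - true_value) for name, pred in model_predictions.items()
}

# Combine the models by simple averaging.
ensemble_prediction = mean(model_predictions.values())
ensemble_error = abs(ensemble_prediction - true_value)

print(individual_errors)
print(ensemble_prediction, ensemble_error)
```

Here the averaged estimate lands closer to the truth than any single model does, because the overshoots and undershoots mostly offset one another.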
Starting point is 01:41:19 So we have many ways of knowing. We have our rationality, which is good at some things and bad at others. We have our emotional way of knowing. We have our embodied way of knowing. We have our pattern-match way of knowing. We have our social way of knowing. And the best thing we can learn to do is to integrate across all of those different systems of knowing so that we can get closest to seeing the world as it is. And as part of that,
Starting point is 01:41:49 I strongly recommend building yourself an intentional movement practice. For me, it's surfing and Ashtanga yoga, but investing in your full spectrum being is the best way to be most effective. What's intentional movement? Oh, it just means something where you are intentionally moving. Like I went for a bike ride before this podcast. Right. And so as long as you're putting your intention into the bike ride versus, say, zoning out and like thinking about something while you're doing it. You want something which saturates your being and takes you out of your mind and back into your body.
Starting point is 01:42:36 You know, surfing is an example of that for me because now it's muscle and breath, pitted against unnegotiable physics, and I have no choice but to be saturated with presence versus that thing that when I say presence is the abstraction of my runaway mind. It's that practice of getting out of my mind that gives me the perspective to come back, right? There's that phrase, staying home versus leaving and returning home are two very different things. I like that. Thank you for that. So I'm curious how you're going to answer this. Aza, what do you care most about in the world? That reminds me of the Wallace Stevens quote, that the most beautiful thing in the world is, of course, the world itself. And I'll just have to pair that with one of my own, which is that life is too short for aphorisms.
Starting point is 01:43:36 Okay. The thing I care most about in the world, that's a really hard one, but the thing that's most alive for me is... In our interactions, I can, I know that you care about a lot. Yeah. Yeah. I really care about finding things worth caring about. And intentionally making meaning, acknowledging that, you know, every ideology is wrong,
Starting point is 01:44:08 but you still have to live within one and that you should choose your ideologies and how you move between them intentionally, and that you do not change when you speak, you change when you listen. It's an awkward thing for me to say, having just spoken a whole bunch. But I really care about learning to listen better because that supports all the other things. It's like the conduit through which care flows. You could include some of the things we talked about on this call because there were plenty of things to worry about or not,
Starting point is 01:44:41 but what are you most worried about in the coming decade or so? The continued reverse engineering of the human mind is the thing I'm most worried about. You sort of zoom out, great filter, why do we not see other alien civilizations? Maybe it's because, you know, every species, every technological species sufficiently advanced will eventually turn the telescope of its own intelligence back on itself, reverse engineer how it works, learn how to open up its metaphorical skull and pull its own puppet strings, and that loop of like pulling your puppet string which jerks your arm more, which pulls the puppet string more and you get into this infinite chaotic feedback loop, sort of like when you'd
Starting point is 01:45:24 point a camera at a screen and you get into that loop, that I think is terrifying and maybe why other species didn't make it in part. And that's also the thing I'm most hopeful about, is in a clear-eyed, compassionate look at who we are as a species, in so doing we can fit our technology around us and our societies like a sort of an exoskeleton glove and serve and support us versus exploiting us. And this is as Buckminster Fuller says, whether it's utopia or dystopia we go to will be a touch-and-go race, and back to the initial conditions, to bend the arc of technology towards the good and the just and the compassionate and the clear-eyed seeing is,
Starting point is 01:46:21 I think, our collective work. Amen to that. Amen to that. If you were, here's a new question that I'm adding. If you were benevolent dictator or could make one wish for humanity in our present circumstances, what would it be? If I could wave a magic wand, I would help people let go of all of their insecurities. Because it's our insecurities and our fears, you know, that they're the map to our freedoms.
Starting point is 01:46:54 They're the thing that fuses stimulus to response. You want to have that space for reflection where awareness gives the opportunity for choice. I think, you know, it's our insecurities that drive our ego. If we could let go of our insecurities, we would see exactly how we were showing up and the way we could change our individual action and our systems to support everyone and all of our beings, because so much of our bad behavior is just insecurity driven. Yeah.
Starting point is 01:47:30 And I think that dovetails with your comment earlier about listening and quieting the ego. It's really hard to do. My friend, thank you for this whirlwind conversation. I definitely want to have you back. You are a good human being. You know a lot of crap that I have no idea about. And I wish you luck on your journey. Do you have any closing words of wisdom, advice, closing thoughts? One, Nate, thank you so much for having me on your show. It's been a real pleasure. As I said, I have learned so much from you. You've really opened my eyes to a much broader
Starting point is 01:48:13 world of how to be less energy and less material blind, to think about the carbon pulse. If there is one final piece of advice or wisdom or thought, it's that I'd just love to leave the listeners with a Brian Eno term, which is scenius. So genius is the genius of an individual. Scenius is the genius of a scene of people. And I love this as a concept, as the thing we should be optimizing for, because we live in a world of increasingly large interconnected hyperobjects that we cannot fit into one human brain.
Starting point is 01:48:52 The only way for us to tackle these things is through collectively load balancing these problems through our collective brains. And so I love walking around with the lens of scenius in my mind. Excellent. Thanks so much, Aza, to be continued, my friend. Thank you so much, Nate. If you enjoyed or learned from this episode of the Great Simplification, please subscribe to us on your favorite podcast platform and visit thegreatsimplification.com for more information
Starting point is 01:49:27 on future releases.
