Lex Fridman Podcast - #151 – Dan Kokotov: Speech Recognition with AI and Humans
Episode Date: January 4, 2021
Dan Kokotov is VP of Engineering at Rev.ai, an automatic speech recognition company. Please support this podcast by checking out our sponsors:
- Athletic Greens: https://athleticgreens.com/lex and use... code LEX to get 1 month of fish oil
- Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premium
- Business Wars: https://wondery.com/business-wars/
- Cash App: https://cash.app/ and use code LexPodcast to get $10
EPISODE LINKS:
Rev: https://www.rev.com
Rev.ai: https://www.rev.ai
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips
SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/LexFridmanPage
- Medium: https://medium.com/@lexfridman
OUTLINE:
Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(09:17) - Dune
(12:34) - Rev
(18:33) - Translation
(25:22) - Gig economy
(34:02) - Automatic speech recognition
(44:53) - Create products that people love
(53:02) - The future of podcasts at Spotify
(1:14:41) - Book recommendations
(1:16:02) - Stories of our dystopian future
(1:19:45) - Movies about Stalin and Hitler
(1:24:59) - Interviewing Putin
(1:30:56) - Meaning of life
Transcript
The following is a conversation with Dan Kokotov, VP of Engineering at Rev.ai, which is, by many
metrics, the best speech-to-text AI engine in the world. Rev, in general, is a company
that does captioning and transcription of audio by humans and by AI. I've been using
their services for a couple of years now and am planning to use Rev to add both captions and transcripts to some of the previous and future episodes of this podcast,
to make it easier for people to read through the conversation or reference various parts
of the episode, since that's something that quite a few people requested.
I'll probably do a separate video on that with links on the podcast website,
so people can provide suggestions and improvements there.
Quick mention of our sponsors.
Athletic Greens, the all-in-one nutrition drink;
Blinkist, the app that summarizes books; Business Wars podcast; and Cash App.
So the choice is health, wisdom, or money.
Choose wisely, my friends, and if you wish, click the sponsor links below to get a discount
and to support this podcast. As a side note, let me say that I reached out to Dan and the
Rev team for a conversation because I've been using and genuinely loving their service
and am really curious about how it works. I previously talked to the head of Adobe Research
for the same reason.
For me, there's a bunch of products,
usually it's software, that come along
and just make my life way easier.
Examples are Adobe Premiere for video editing,
iZotope RX for cleaning up audio,
AutoHotkey on Windows for automated keyboard
and mouse tasks, and Emacs
as an IDE for everything, including the universe itself.
I can keep on going but you get the idea. I just like talking to people who create things
I'm a big fan of. That said, after doing this conversation, the folks at Rev.ai offered to sponsor
this podcast in the coming months. This conversation is not sponsored by the guest.
It probably goes without saying,
but I should say it anyway,
that you cannot buy your way onto this podcast.
I don't know why you would want to.
I wanted to bring this up to make a specific point:
no sponsor will ever influence what I do
on this podcast or, to the best of my ability,
influence what I think.
I wasn't really thinking about this, for example, when I interviewed Jack Dorsey, who is the CEO of Square, which happens to be
a sponsor of this podcast, but I should really make it explicit: I will never take money for bringing a guest on.
Every guest on this podcast is someone I'm genuinely curious to talk to, or I just genuinely love
something they've created. As I sometimes get criticized for, I'm just a fan of people,
and that's who I talk to. As I also talk about way too much, money is really never a consideration.
In general, no amount of money can buy my integrity. That's true for this podcast and that's
true for anything else I do. If you enjoy this thing, subscribe on YouTube,
review it on Apple Podcasts, follow on Spotify, support on Patreon, or connect with
me on Twitter @lexfridman. As usual, I'll do a few minutes of ads now and no
ads in the middle. I try to make these interesting, but I give you timestamps,
so if you skip, please still check out the sponsors by clicking the links in the description.
It is the best way to support this podcast. This show is sponsored by what's quickly
becoming my favorite sponsor, Athletic Greens, the all in one daily drink to support better
health and peak performance. I just in fact actually finished drinking it.
It replaced the multivitamin for me and went far beyond that with 75 vitamins and minerals.
I do intermittent fasting of 16 to 24 hours every day and always break my fast with Athletic
Greens.
I can't say enough good things, can't stop raving about these guys.
It helps me not worry whether I'm getting the nutrients
I need. One of the many reasons I'm a fan
is that they keep iterating on the formula,
they keep improving it,
like all good engineers and scientists always should be.
Life is not about reaching perfection,
it's about constantly striving for it
and making sure each iteration is a positive delta. The other thing I've taken
for a long time outside of Athletic Greens is fish oil. So I'm especially excited now that
they're selling fish oil and are offering listeners of this very podcast a free one-month supply
of wild-caught omega-3 fish oil when you go to athleticgreens.com slash Lex to claim the special offer. By the way,
if the link doesn't seem to work for you for whatever reason, sometimes it doesn't if
you have an ad blocker enabled. So try to turn off your ad blocker for this one particular
case. But they're also trying to fix it, so they're on top of it. Click the athleticgreens.com
slash Lex link in the description to get the fish oil and
the all-in-one supplement I rely on for the nutritional foundation of my physical and
mental performance.
This episode is supported by Blinkist, my favorite app for learning new things.
Blinkist takes the key ideas from thousands of nonfiction books and condenses them down
into just 15 minutes that you can read or listen to.
I'm a big believer in reading at least an hour every day.
As part of that, I use Blinkist to try out a book
I may otherwise never have a chance to read.
And in general, it's a great way to broaden your view of the idea landscape out there
and find books that you may want to read more deeply. With Blinkist,
you get unlimited access to read or listen to a massive library of condensed nonfiction books.
I also use Blinkist's Shortcasts (that's a lot of S's) to quickly catch up on podcast
episodes I've missed.
Right now, Blinkist has a special offer just for the listeners of this podcast. Let me take a sip of this drink first.
If you're listening to this, I dare you to try to guess which drink I'm drinking.
Go to Blinkist.com slash Lex to start your free seven day trial
and get 25% off a Blinkist Premium membership.
That's Blinkist spelled B-L-I-N-K-I-S-T.
Blinkist.com slash Lex to get 25% off
and a seven day free trial.
Blinkist.com slash Lex.
This episode is also brought to you
by Business Wars Podcast.
Tech entrepreneurs are in an all-out race
to cash in on our collective addiction to social media.
That sentence was way harder to pronounce than I thought when I first started saying it.
In the newest season of Wondery's Business Wars, TikTok versus Instagram,
they track the war between two social media giants.
I've spoken about possibly entering this space by helping build a new social network.
This is something I struggle
with quite a bit because it feels like standing on the edge of a cliff hoping to fly. I want to
keep my mind and heart open, fragile, but it seems that the world can too easily destroy such a
mind. And so I wonder if I'm able to face such challenges. Perhaps the choice isn't mine to make.
Perhaps it's already been made.
Anyway, this podcast season looks at just one heated competition in the space,
where the game, in my view, is not one that makes for a better world. Listen to the latest season of Business Wars, TikTok versus Instagram, on Apple Podcasts, Spotify, or listen
ad-free by joining Wondery Plus in the Wondery app.
Just in general, I highly recommend Wondery.
There's a lot of good podcasts on there.
Finally, the show is presented by Cash App, the number one finance app in the App Store.
When you get it, use code LexPodcast.
Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as $1. I'm thinking
of doing some conversations with folks who work in and around the cryptocurrency space.
Similar to artificial intelligence, there are a lot of charlatans in this space, but there's
a lot of free thinkers as well, technical geniuses that are worth
exploring ideas with in depth and with care.
And I think it's pretty clear that cryptocurrency is here to stay.
Bitcoin just hit $32,000.
Anyway, if you get Cash App from the App Store or Google Play and use code LexPodcast, you get
$10,
and Cash App will also donate
$10 to FIRST, an organization that is helping to advance robotics and STEM education
for young people around the world.
And now here's my conversation with Dan Kokotov.
You mentioned science fiction on the phone.
So let's go with the ridiculous first.
What's the greatest sci-fi novel of all time in your view?
And maybe what ideas do you find philosophically fascinating about it?
The greatest sci-fi novel of all time is Dune, and the second greatest is Children of
Dune.
And the third is God Emperor of Dune.
I'm a huge fan of the whole series.
I mean, it's just an incredible world that he created.
And I don't know if you read the book or not.
No, I have not.
It's one of my biggest regrets.
Especially because a new movie's coming out,
everyone's super excited about it.
It's ridiculous to say, and sorry to interrupt,
but I used to play the video game.
It used to be Dune.
I guess you would call that real-time strategy.
Right, right.
I think I remember that game.
Yeah, it was kind of awesome.
90s or something.
I think I played it actually when I was in Russia.
I definitely remember it.
I was not in Russia anymore.
I think at the time that I used to live in Russia,
I think video games were at about the level of Pong.
I think Pong was pretty much the greatest game I ever got to play in Russia, which was still a privilege.
Right, not age.
So you didn't get color?
You didn't get like...
Well, so I left Russia in 1991, right?
1991.
OK, so I was one of the few like it kids, because my mom was a programmer.
So I would go to her work, right?
I would take the metro.
I'd go to her work and play on, I guess,
the equivalent of a 286 PC, you know. Nice. With floppy disks. Yes. So okay, back
to Dune. Back to Dune. And by the way, the new movie, I'm
pretty interested in, but I'm a little skeptical.
I saw the trailer. I don't know. So there's a David Lynch movie,
Dune, as you may know. I'm a huge David Lynch fan, by the way. So the movie is somewhat controversial,
it's a little confusing, but it captures kind of the mood of the book better than, I would say,
most any adaptation. And Dune is so much about kind of the mood and the world, right? Back to the philosophical point. So in the fourth book, God Emperor of Dune,
there's a sort of setting where Leto,
one of the characters, he's become this weird sort of
God Emperor. He's turned into a gigantic worm.
I mean, you kind of have to read the book to understand
what that means. So worms are involved.
Worms are involved. You probably saw the worms in the trailer,
right?
And in the video game. So he kind of merges with this worm and becomes
the tyrant of the world, and he oppresses the people for a long time, right? But he has a purpose.
And the purpose is to kind of break through a stagnation period in civilization, right?
People have gotten too comfortable, right? And so he kind of
oppresses them so that they explode and go on to colonize new worlds and kind of renew the forward momentum of humanity,
right? And so to me, that's kind of fascinating. You need a little bit of pressure and suffering,
right, to kind of make progress and not get too comfortable.
Okay, maybe that's a bit of a
cruel
philosophy to take away, but that seems to be the case.
Unfortunately. Obviously, I'm a huge fan of suffering.
Unfortunately, obviously, I'm a huge fan of suffering.
So one of the reasons we're talking today is that a bunch of people requested that I do transcripts for this podcast and do captioning.
I used to make all kinds of YouTube videos, and I would go on
Upwork, I think, and I would hire folks to do transcription,
and it was always a pain in the ass, if I'm being honest. And then I don't know how I discovered Rev,
but when I did, it was this feeling of, like, holy shit, somebody figured
out how to do it just really easily. I'm such a fan of when people take a problem
and they just make it easy, you know. It's like there are so many things in life that you
might not even be aware of that are painful. Then with Rev, you just give the audio, give the video,
you can actually give a YouTube link, and then it comes back like a day later or
two days later, whatever the hell it is, with the captions, all in a standardized format.
I don't know, it was truly a joy. So I thought I'd, you know, just for the hell of it,
talk to you. One other product that just made my soul feel good, one other product I've used like that,
is, for people who might be familiar, called
iZotope RX.
It's for audio editing.
And that's another one where it was like, you just drop in the audio
and it just cleans everything up really nicely. All the stupid, like, mouth sounds,
and sometimes there's background like sounds
due to the malfunction of the equipment.
It can clean that stuff up.
It has a general voice denoising.
It has like automation capabilities
where you can do batch processing
and you can put a bunch of effects.
I mean, it just, I don't know, everything else sucked
for voice-based cleanup that I've ever used.
I've used Audition, Adobe Audition.
There are all kinds of other things with plugins.
You have to kind of figure it all out.
You have to do it manually. Here, it just works.
So that's another one in this whole pipeline that just brought joy to my heart.
Anyway, all that to say,
Rev put a smile on my face.
So can you maybe take a step back and say,
what is Rev and how does it work?
And is it Rev or Rev.com?
Rev, Rev.com.
Same thing, I guess.
We do have Rev.ai now as well, which we can talk about later.
Do you have the actual domain, or is it just... We have the actual domain, but we also use it as a sub-brand.
So we use Rev.ai to denote our ASR services, right?
And Rev.com is more human and end-user services. So it's like WordPress.com and WordPress.org. They actually have separate brands that, like,
I don't know if you're familiar with what those are. Yeah, yeah. They provide almost like a separate
branch, a little bit. I think with that, it's like WordPress.org is kind of their open
source, right? And WordPress.com is sort of their hosted commercial offering. Yes.
And with us, the differentiation is a little bit different,
but maybe a similar idea.
Yeah.
Okay, so what is Rev?
Before I launch into what is Rev,
I was gonna say, you know, like you were talking about,
like, Rev was music to your ears.
Yeah.
Your spiel was music to my ears.
Yeah.
To us, the founders of Rev, because Rev was kind of founded
to improve on the model of Upwork.
That was kind of the original, or part of the original, impetus.
Our CEO, Jason, was an early employee of Upwork,
so he was very familiar with the company.
And so he was very familiar with that model
and he wanted to make the whole experience better
because he knew, like, at that time, Upwork was primarily programmers.
So the main thing they offered is,
if you want to hire someone to help you code a little site, right?
You could go on Upwork,
you could browse through a list of freelancers,
pick a programmer, have a contract with them,
and have them do some work.
But it was kind of a difficult experience because
for you, you would
have to browse through all these people, right? And you have to decide, okay, like, well,
is this guy good, or is somebody else better? And naturally, you're going to Upwork because
you're not an expert, right? If you're an expert, you probably wouldn't be getting a programmer
from Upwork. So how can you really tell? So there's kind of a lot of potential regret, right?
What if I choose a bad person,
they can be late on the work,
it's gonna be a painful experience.
And for the freelancer, it was also painful
because half the time they spent not on actually doing the work,
but kind of figuring out how can I make my profile
most attractive to the buyer, right?
And they're not an expert on that either.
So our idea was, let's remove the barrier, right?
Like let's make it simple.
We'll pick a few verticals that are fairly standardized.
We actually started with translation,
and then we added audio transcription a bit later.
We'll just make it a website: you go, give us your files,
we'll give you back the results, as
soon as possible. Originally, maybe it was 48 hours, then we made it shorter and shorter
and shorter.
Yeah, there's a rush processing now.
And we'll hide all the details from you, right?
And that's kind of exactly what you're experiencing, right?
You don't need to worry about the details of how the sausage is made.
That's really cool.
So you picked like a vertical. By vertical, you mean basically a kind of category.
Why translation? Is Rev thinking of potentially going into other verticals in the future?
Or is the focus now translation, transcription, like language? The focus now is language or speech services generally: speech-to-text, language services,
you can kind of group them however you want.
But originally, the categorization was work from home.
So, work that was done by people on a computer. You know, we weren't trying to get
into, you know, TaskRabbit type of things.
Something that could be relatively standard,
not a lot of options, so we could present
the simplified interface.
So programming wasn't a good fit because
each programming project is unique.
We were looking for something where,
with transcription, you know, you have five hours of audio, it's five hours of audio.
Translation is somewhat similar in that you can have a five-page document, you know, and then you just
can price it by that, and then you pick the language you want, and that's mostly all there is to it.
So those were a few criteria. We started with translation because we saw the need, and
we picked up kind of a specialty of translation where we would translate things like birth certificates,
those kinds of documents, things like that. And so they were fairly,
even more well-defined and easy to kind of tell if we did a good job. So you can literally charge per type of document.
Was that the, so what is it now?
Is it per word or something like that?
Like how do you measure the effort involved
in a particular thing?
So now, for audio transcription,
it's per audio minute.
Well, that, yes.
For translation, we don't really actually focus on it anymore.
But back when it was
still a main business of Rev, it was per page or per word, depending on the kind of...
You can also do translation now on the audio, right?
Like subtitles, so it would be both transcription and translation, that's right.
I wanted to test the system to see how good it is, to see how well Russian is supported.
Thanks, though. Yeah. And it'd be interesting to try it out. I mean, right now it's
only in the one direction, right? So you start with English and then you can have
subtitles in Russian. Not really the other way. Got it. Because I'm deeply
curious about this. Once COVID opens up a little bit, when the economy, when the world opens
up a little bit... You want to build your brand in Russia?
No, I don't
First of all, I'm allergic to the word brand
Sorry
I'm definitely not building any brands in Russia
Nice
But I'm going to Paris to talk to the translators of Dostoevsky and Tolstoy.
There's this famous couple that does translation.
And, you know, I'm more and
more thinking of how it is possible to have a conversation with a Russian speaker,
because I have just some number of famous Russian speakers that I'm interested
in talking to, and my Russian is not strong enough to be witty and funny. I'm already an idiot in English.
I'm an extra level of, like, awkward idiot in Russian, but I can understand it, right?
And I also wonder how I can create a compelling English-Russian experience for an English speaker.
Like, there's a guy named Grigori Perelman, who's a mathematician, who
obviously doesn't speak English, so I would probably incorporate like a
Russian translator into the picture, and then it'll be like, not to use a
weird term, but like a three-person thing where it's like a dance
where, like, I understand it one way,
they don't understand the other way, but I'll be asking questions in English.
I don't know. I don't know the right way. It's complicated. It's complicated, but I feel
like it's worth the effort for certain kinds of people, one of whom, I'm
confident, is Vladimir Putin, who I'm for sure talking to. I really want to make it
happen because I think I could do a good job
But, you know, understanding the fundamentals of translation is something I'm really interested in, so that's why I'm starting with
the actual translators of, like, Russian literature, because they understand the nuance and the beauty of the language
going back and forth.
But I also want to see like in speech,
how can we do it in real time?
So that's, that's like a little bit of a, a baby project that I hope to push forward.
But anyway, it's a challenging thing.
So just to share, my dad actually does translation, not professionally.
He writes poetry. That was kind of always his...
not a hobby, but he had a job, like a day job,
but his passion was always writing poetry.
And then when we got to America,
he started also translating.
First he was translating English poetry to Russian,
and now he also goes
the other way. He kind of gained some small fame in that world
anyways, because recently this poet, Louise Glück, I don't know if you know, an American
poet, she was awarded the Nobel Prize for Literature, and so my dad had translated one of her
books of poetry into Russian. He was like one of the few. So they asked him, and he
gave an interview to Radio Svoboda, if you know what that is, and he kind of talked about some of the intricacies of translating poetry. So that's like an extra level of difficulty, right? Because
translating poetry is even more challenging than translating, just, you know, interviews.
Do you remember any experiences and challenges of having to do the translation that stuck out, like something he's talked about?
I mean, a lot of it, I think, is word choice, right? The way Russian is structured is, first of all, quite different from
the way English is structured, right? There are inflections in Russian, and genders, and they don't exist in English.
That's one of the reasons, actually, why machine translation is quite difficult for English to Russian and Russian to English, because they are such different languages.
Then English has a huge number of words, many more than Russian actually, I think. So it's often difficult to find...
...kind of born out of trying to take a vertical
on Upwork and then standardize it. So we're just trying to make the freelancer marketplace
idea better, right? Better for both customers and better for the freelancers themselves.
Is there something else to the story of Rev,
founding Rev, like what did it take to bring it actually
to life?
Were there any pain points?
Plenty of pain points.
I mean, as is often the case, it's with scaling it up, right?
And in this case, the scaling is kind of scaling the marketplace,
so to speak, right?
Rev is essentially a two-sided marketplace,
because there's the customers and then there's the Revvers.
If there's not enough Revvers,
Revvers are what we call our freelancers.
So if there's not enough Revvers,
then customers have a bad experience.
It takes longer to get your work done, things like that.
If there's too many, then Revvers have a bad experience
because they might log on to see what work is available and there's not very much work.
So kind of keeping that balance is quite a challenging problem.
That's a problem... I've done psychology experiments on Mechanical Turk, for example.
I've asked people to do different kinds of very tricky computer vision annotation on Mechanical Turk. It's connecting
people in a more systematized way,
I would say, you know, between tasks and...
what would you call that? Worker is what Mechanical Turk calls it.
What do you think about this world of gig economies? Of there being a service that connects
customers to workers in a way that's like massively distributed, like potentially scaling to, it could be scaled to like tens of
thousands of people, right? Is there something interesting about that world you can speak to?
Yeah, well, we don't think of it as kind of gig economy. To some degree, I don't like
the word gig that much, right? Because to some degree, it diminishes the work that's being done, right?
It sounds kind of almost amateurish. Well, maybe in, like, music, gig is
a standard term, but in work,
it kind of sounds like, oh, it's frivolous.
To us, it's improving the nature of working from home
on your own time and on your own terms, right?
And kind of taking away geographical limitations
and time limitations, right?
So, you know, many of our freelancers
are maybe work-from-home moms, right?
And, you know, they don't want the traditional
nine-to-five job, but they want to make some income,
and Rev kind of allows them to do that
and decide exactly how much to work and when to work.
Or by the same token, maybe
someone wants to live the mountaintop life, right, you know, cabin in the woods, but they still
want to make some money. And generally that wouldn't be compatible before this new world;
you kind of have to choose. But with Rev, you feel like you don't have to
choose. Can you speak to what's the demographics, like distribution, like where do Revvers live?
Is it from all over the world? Like, what is the...
what's out there?
It's from all over the world. Most of them are in the US, that's the majority,
because most of our work is audio transcription and you have to speak pretty good English.
So the majority of them are from the US,
though we have people in some of the other
English-speaking countries.
And as far as like US, it's really all over the place.
For some years now,
we've been doing these little meetings
where the management team
will go to someplace and we'll try to meet Revvers.
Pretty much wherever we go, it's pretty easy to find a large number of Revvers.
The most recent one we did is in Utah.
But anyway, are they from all walks of life?
Are these young folks, older folks?
Yeah, all walks of life, really.
Like I said, one category is the work-from-home moms, students who want to make some extra... pretty, pretty wide variety. But on the flip side, for example, one Revver we were talking to
was a person who had a fairly high-powered career before and was kind of taking a break.
She was almost doing this just to explore and learn about the gig economy, quote unquote.
Right? So it really is a pretty wide variety of folks.
Yeah, it's kind of interesting through the captioning process for me to learn about the
Revvers, because some are clearly weirdly knowledgeable about technical concepts.
You can tell by how good they are at capitalizing stuff, like technical terms, like machine learning or deep learning. Like, I've used Rev to caption
the deep learning lectures, or machine learning lectures,
that I did at MIT, and it's funny.
Like a large number of them were,
like I don't know if they looked it up
or were already knowledgeable,
but they do a really good job.
But like, I don't know.
They invest time into these things.
They will do research, they will Google things,
you know, to kind of make sure that they get it right.
But to some of them, it's like,
it's actually part of the enjoyment of the work.
Like, they'll tell us, you know, I'll do this because
I get paid to watch a documentary on something.
And I learn something while I'm transcribing, right?
Pretty cool.
Yeah.
So what does that captioning transcription process
look like for the Revver?
Can you maybe speak to that to give people a sense?
Like how much is automated, how much is manual?
What does the actual interface look like, all that kind of stuff?
Yeah.
So we've invested a pretty good amount of time
to give our Revvers the best tools possible.
So a typical day for a Revver: they might log into their workspace, they'll see a list of
audios that need to be transcribed. And we try to give them tools to pick specifically the ones
they want to do, you know, so maybe some people like to do longer audios or shorter audios.
People have their preferences.
Some people like to do audios in a particular subject
or from a particular country.
So we try to give people the tools to control
things like that.
And then when they pick what they wanna do,
we'll launch a specialized editor
that we built to make transcription
as efficient as possible.
They'll start with a speech rec draft.
So we have our machine learning model for automated speech recognition.
They'll start with that.
And then our tools are optimized to help them correct that.
So it's basically a process of correction.
Yeah, it depends on, I would say the audio.
If the audio itself is pretty good.
Like probably like our podcast right now would be quite good.
So the ASR would do a fairly good job.
But if you imagine someone recorded a lecture,
you know, in the back of an auditorium, right?
Where like the speakers are really far away,
and there's maybe a lot of crosstalk and things like that.
Then maybe the ASR wouldn't do a good job.
So the person might say, like, you know what,
I'm just gonna do it from scratch.
Yeah, it kind of really depends.
What would you say is the speed that you can possibly get?
Like, what's the fastest you can get?
Is it possible to get real time or no?
As you're like listening, can you write as fast as...
Real time would be pretty difficult.
It's actually a pretty, it's not an easy job, you know that.
We actually encourage everyone at the company to try to be a transcriber for a day,
a transcriptionist for a day.
And it's way harder than you might think it is, right?
Because people talk fast and people have accents and all this kind of stuff.
So real time is pretty difficult.
Is it possible?
Like there's somebody we're probably going to use Rev to caption this.
They're listening to this right now.
What's the fastest you can possibly get on this right now?
I think on a good audio, maybe two to three X, I would say.
Real time.
Meaning it takes two to three times longer than the actual audio of the
podcast. This is so meta, I could just imagine the Revvers working on this right now.
They're like, you're way wrong. You're way wrong. This takes way longer. But yeah,
definitely out of me. I could do real time.
Okay. So you mentioned ASR.
Can you speak to what is ASR automatic speech recognition?
How much?
Like what is the gap between perfect human performance and perfect or pretty damn good ASR?
Yeah.
So ASR, automatic speech recognition, it's a classic machine learning problem, right, to take, you know, speech like we're talking and
transform it into a sequence of words essentially. Audio of people talking, audio to words,
and you know, there's a variety of different approaches and techniques, which we could talk about later if you want.
So, you know, we think we have pretty much the world's best ASR
for this kind of speech.
So there's different kinds of domains for ASR.
One domain might be voice assistants, so Siri.
Very different than what we're doing.
Because Siri, there's a very limited vocabulary.
You might ask Siri to play a song or word repeats or whatever. It's
very good at doing that. Very different from when we're talking in a very unstructured
way. Siri will also generally adapt to your voice and stuff like this. For this kind
of audio, we think we have the best. Our accuracy, right now I think it's maybe 14%
word error rate on our test suite that we generally use to measure.
Word error rate is like one way to measure accuracy for ASR.
So what 14% means is, across this test suite of a variety of
different audios, it would get in some way 14% of the words wrong.
14% of the words wrong.
Yeah.
So the way you kind of calculate this, you might add up insertions,
deletions, and substitutions, right?
So insertions are like extra words, deletions are words that were said
but weren't in the transcript, right?
Substitutions is, you said apple,
but the ASR thought it was able, something like this.
Human accuracy, most people think realistically,
it's like 3%, 2% word error rate
would be like the max achievable.
So there's still quite a gap, right?
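The calculation described here can be sketched in a few lines of Python. This is only a minimal illustration, assuming whitespace tokenization and lowercasing; the function name `word_error_rate` and the sample sentences are hypothetical, and nothing here is claimed to be how Rev actually scores its test suite. It counts the minimum number of insertions, deletions, and substitutions needed to turn the reference transcript into the ASR output, then divides by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,                      # deletion: word was said but is missing
                cur[j - 1] + 1,                   # insertion: extra word in the ASR output
                prev[j - 1] + substitution_cost,  # substitution, or a match at no cost
            ))
        prev = cur
    return prev[-1] / len(ref)


# One substitution ("apple" heard as "able") in a five-word reference.
print(word_error_rate("i ate an apple today", "i ate an able today"))
```

With a five-word reference and one wrong word, the sketch gives 0.2, that is, 20% word error rate. Real evaluations typically also normalize punctuation and number formatting before comparing, which this sketch does not attempt.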
Would you say that?
So YouTube, when I upload videos,
often generates automatic captions.
Are you sort of from a company perspective,
from a tech perspective?
Are you trying to beat YouTube?
Google, it's a hell of a, so Google,
I mean, I don't know how seriously they take this task,
but I imagine it's quite serious.
And they, you know, Google is probably up there in terms of their teams on
ASR or just NLP, natural language processing, different technologies. So do you think you can
beat Google? On this kind of stuff, yeah, we think so. Google just woke up on my phone. This is hilarious.
Now Google is listening, sending it back to headquarters.
For these Rev people.
But that's the goal.
Yeah, I mean, we measure ourselves against like Google, Amazon, Microsoft, you know,
some of the smaller competitors.
And we use like our internal test suite.
We try to compose it
of a pretty representative set of audios. Maybe it's some podcasts, some videos, some interviews,
some lectures, things like that, right? And we beat them in our own testing.
And actually, Rev offers automated, like you can actually just do the automated captioning.
So like, I guess it's way cheaper, whatever it is,
whatever the rates are.
Yeah, yeah.
So it's up by the way,
it used to be a dollar per minute
for captioning and transcription,
I think it's like a dollar 15 or something like that.
Dollar 25.
Dollar 25.
Dollar 25 now.
Yeah.
It's pretty cool.
That was the other thing that was surprising to me. It was actually
like the cheapest thing you could, one of the, I mean, I don't remember it being cheaper.
You could on Upwork get cheaper, but it was clear to me that that's going to be really
shitty. Yeah. So like you're also competing on price. I think there were services that you can get like similar to Rev, kind
of feel to it, but it wasn't as automated. Like the drag and drop, the entirety of the
interface. It's like the thing we're talking about. I'm such a huge fan of like frictionless,
like Amazon's single buy button, whatever. Yeah. That one click. The one click. That's genius right there. Like, that is so important for services. Yeah, that simplicity. And I mean,
Rev is almost there. I mean, there's like some...
Trying to think. So I
think I stopped using
this pipeline, but Rev offers it and I like it,
but it was causing me some issues on my side,
which is you can connect it to like Dropbox,
and it generates the files in Dropbox.
It closes the loop to where I don't have to go to Rev at all,
and I can download it. Sorry, I don't have to go to Rev at all to download the files.
It could just automatically copy them.
You put it in Dropbox, like, a day later,
or maybe a few hours later,
it shows up.
Just shows up.
Yeah.
I was trying to do it programmatically too.
Is there an API interface?
I was trying to, through Python,
to download stuff automatically,
but then I realized, this is the programmer in me.
Like, dude, you don't need to automate everything,
like, in life, like, flawlessly,
because I wasn't doing enough captions
to justify it to myself, the time investment,
into automating everything perfectly.
Yeah, I would say, if you're doing so many interviews
that your biggest roadblock is clicking on the Rev download button,
now you're talking about Elon Musk levels of busyness.
But for sure, we have a variety of ways
to make it easy.
The integration you mentioned, I think
it's through a company called Zapier, which kind of connects
Dropbox to Rev and vice versa.
We have an API, if you wanna really customize it,
if you wanna create the Lex Fridman CMS
or whatever.
But this whole thing, okay, cool.
So can you speak to the ASR a little bit more?
Like what does it take, like approach-wise, machine learning-wise, how
hard is this problem? How do you get to the 3% error rate? Like, what's your vision of all
of this?
Yeah, well, the 3% error rate is definitely, that's the grand vision. We'll see what it takes
to get there. But we believe, you know, in ASR, the biggest thing is the data.
Like it is true of a lot of machine learning problems today.
The more data you have and the higher quality of the data,
the better labeled the data,
that's how you get good results.
We at Rev have the best data.
You're literally, your business model
literally is annotating the data.
Our business model is being paid to annotate data.
We're being paid to annotate the data.
So it's kind of like a pretty magical flywheel.
Yeah.
And so we've kind of like ridden this flywheel to this point.
And we think we're still kind of in the early stages of figuring out all the parts of the flywheel to use, you know, because we have the final transcripts,
and we have the audios, and we train on that. But we in principle also have all the edits that the Revvers make. Oh, interesting. How can you use that as data? Well, that's something
for us to figure out in the future. But, you know, we feel like we're only in the early
stages, right? So the data is there. That'd be interesting, like almost like a recurrent
neural net for fixing transcripts. I always remember we did segmentation annotation
for for driving data. So segmenting the scene, like visual data.
And you can get all, so it was drawing people, drawing polygons, around different objects
and so on.
And it feels like it always felt like there was a lot of information in the clicking,
the sequence of clicking that people do, the kind of fixing of the polygons that they
do. Now, there's a few papers written about how to draw polygons with recurrent neural nets
to try to learn from the human clicking, but it was just like experimental, you know,
it was one of those like CVPR-type papers that people do on like a really tiny data set.
It didn't feel like people really tried to do it seriously.
And I wonder, I wonder if there's information
in the fixing that provides a deeper set of signal
than just like the raw data.
Of course.
The intuition is for sure there must be, right?
There must be.
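To make the idea of "edits as data" concrete, here is a small Python sketch that recovers word-level corrections by comparing an ASR draft with the corrected transcript. It uses the standard library's `difflib.SequenceMatcher`; the function name `extract_edits` and the sample sentences are hypothetical, and this is only one possible way to derive such a signal, not a description of anything Rev does.

```python
from difflib import SequenceMatcher


def extract_edits(draft: str, final: str) -> list[tuple[str, str, str]]:
    """Return (operation, draft_words, final_words) for every non-matching span."""
    draft_words = draft.split()
    final_words = final.split()
    matcher = SequenceMatcher(a=draft_words, b=final_words, autojunk=False)

    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # the human left this span of the draft untouched
        edits.append((tag, " ".join(draft_words[i1:i2]), " ".join(final_words[j1:j2])))
    return edits


# The draft misheard one word; the human transcriber corrected it.
print(extract_edits("i ate an able today", "i ate an apple today"))
```

Each tuple pairs what the machine produced with what the human changed it to, which is the kind of correction record that could, in principle, be paired with timing information such as how long an edit took.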
And in all kinds of signals, how long you took to make that
edit and stuff like that. It's going to be like up to us. That's why like
the next couple of years, it's like super exciting for us, right? So that's what like the focus is
now. You mentioned Rev.ai, that's where you want to. Yeah, so Rev.ai is kind of our way of bringing
this ASR to the rest of the world.
When we started, we were human only.
Then we created this Temi service.
I think you might have used it,
which was ASR for the consumer.
If you don't want to pay $1.25,
but you want to pay, now
it's 25 cents a minute,
I think, and you get the transcript,
the machine-generated transcript,
you get an editor and you can fix it up yourself.
Then we started using ASR for
our own human transcriptionists.
Then Rev.ai is
the final step of the journey,
which is we have this amazing engine,
what can people build with it?
What new applications could be enabled
if you have speech rec that's that accurate? Do you have ideas for this or is it just providing
it as a service and seeing what people come up with? It's providing it as a service and seeing
what people come up with and kind of learning from what people do with it. And we have ideas of
our own as well of course, but it's a little bit like, you know, when AWS provided the building blocks, right?
Yeah.
And they saw what people built with it and they try to make it easier to build those things,
right?
And we kind of hope to do the same thing.
Although AWS kind of does a shitty job of like, I'm continually surprised at Mechanical
Turk, for example, how shitty the interface is.
We're talking about like, Rev making me feel good. Like when I first discovered
Mechanical Turk, the initial idea of it was like, it made me feel like Rev does, but then the
interface is like, come on. Yeah, it's horrible. Why is it so painful? Does nobody at Amazon
want to like seriously invest in it? It felt like you could make so much money if you took this effort seriously. And it feels like they have a
committee of like two people just sitting back, like a meeting, they meet once a month, like,
what are we going to do with Mechanical Turk? It's like two websites make me feel like this.
That and craigslist.org, whatever the hell it is.
Feels like it's designed in the 90s.
Well, Craigslist basically hasn't been updated pretty much since the guy was no
Do you seriously think there's a team, like how big is the team working on Mechanical Turk?
I don't know there's some team right? I
Feel like there isn't I'm skeptical. Yeah
Well, it's possible they benefit from, you know,
the other teams like moving things forward.
Right.
In some small way, possibly.
But I know, I mean, we use Mechanical Turk
for a couple of things as well.
And it's painful.
It's painful.
But yeah, it works.
I think most people, the thing is,
most people don't really use the UI, right?
Like, so like we, for example,
we use it through the API, right?
So, yeah.
But even the API documentation and so on,
like, is super outdated.
Like, it's, I don't even know what to,
I mean, same criticism, as long as we're ranting,
my same criticism goes to the APIs
of most of these companies, like Google, for example,
the API for the different services, it's just the documentation is so shitty. It's not so
shitty. I should actually be, I should exhibit some gratitude. Okay, let's practice some gratitude. The, you know, the
documentation is pretty good. Like most of the things that the API makes
available is pretty good. It's just that in the sense that it's accurate,
sometimes outdated, but like the degree of explanations with examples is only
covering, I would say, like 50% of what's
possible.
And it just feels a little bit like there's a lot of natural questions that people would
want to ask that don't get covered.
And it feels like it's almost there, like it's such a magical thing, like the Maps API,
YouTube API.
I got to imagine it's like, you know, there's
probably some team at Google, right, responsible for writing this documentation.
That's probably not the engineers, right?
And probably this team is not, you know, where you want to be.
Well, it's a, it's a weird thing.
I sometimes think about this, for somebody who wants to also build a company.
I think about this a lot.
YouTube, the service is one of the most magical, like I'm so grateful that YouTube exists.
And yet they seem to be quite clueless on so many things, like, that everybody is screaming at them. Like it feels like
whatever the mechanism that you use to listen to your quote unquote customers, which is like the creators, is
not very good. Mm-hmm. Like there's literally people that are like screaming, why? Like
the new YouTube Studio, for example.
There's like features that were like begged for for a really long time, like being
able to upload multiple videos at the same time. That was missing for a really,
really long time. Now like there's probably things that I don't know which is maybe for
that kind of huge infrastructure
is actually very difficult to build some of these features.
But the fact that that wasn't communicated and it felt like you're not being heard,
like I remember this experience for me and it's not a pleasant experience.
And it feels like the company doesn't give a damn about you.
And that's something to think about.
I'm not sure what that is.
That might have to do with just like small groups working on these small features and these specific
features. And there's no overarching like dictator type of human that says like, why the
hell are we neglecting like Steve Jobs type of characters? Like there's people that we
need to we need to speak to the people that like want to love our product and they don't.
That's a big issue.
Maybe at some point you just get so fixated on the numbers.
The numbers are pretty great.
People are watching.
It doesn't seem to be a problem.
You're not the person that built this thing,
who really cares about it.
You're just there.
You came in as a product manager.
You got hired sometime later. Your mandate is like increase this number,
like, you know, 10%, right?
And you just...
That's brilliantly put.
Like if you, this is the, okay,
if there's a lesson in this,
is don't reduce your company into a metric of like,
how much, like you said,
how many people are watching the videos and so on,
and convince yourself that everything's working
just because the numbers are going up.
There's something you have to have a vision,
you have to want people to love your stuff
because love is ultimately the beginning of like,
a successful long-term company
is they always should love your product.
You have to be like a creator,
and have that creator's love for your own thing, right?
And you're pained by these comments, right?
And probably like, Apple, I think, did this generally
like really well.
They're well-known for keeping teams small,
even when they were big, right?
And there was an engineer,
like that book, Creative Selection,
I don't know if you read it, by an Apple engineer named
Ken Kocienda.
It's kind of a great book actually,
because unlike most of these business books
where it's, you know,
here's how Steve Jobs ran the company,
it's more like, here's how life was like for me,
you know, an engineer. Here are the projects I worked on,
and here's what it was like to pitch Steve Jobs, you know, on like, you know, I think he
was in charge of like the keyboard and the auto correction, right?
And at Apple, like Steve Jobs reviewed everything.
And so he was like, this is what it was like to show my demos to Steve Jobs and, you know,
to change them because like Steve Jobs didn't like how, you know, the shape of the little
key was off because the rounding of the corner was not quite right
or something like this. But he was famously a stickler for this kind of stuff.
But because the teams were small, you really owned this stuff, right? So you really cared.
Yeah, Elon Musk does that similar kind of thing with Tesla, which is really interesting.
There's another lesson in leadership in that is to be obsessed with the details and like,
he talks to the lowest level engineers.
Okay, so we're talking about ASR, and so this is basically what we're saying: we're going to take this like ultra seriously. And then what's the mission? To try to keep pushing towards the 3%?
Yeah, and kind of try to build this platform where all of your meetings,
you know, they're as easily accessible as your notes, right?
So like, imagine all the meetings a company might have, right?
Now that I'm no longer a programmer, right,
and a quote unquote manager, like, most of my day is meetings, right?
Yeah.
And pretty often I want to see what was said,
who said it, what's the context?
But it's generally not really something
that you can easily retrieve, right?
Imagine if all of those meetings were indexed, archived,
you could go back, you could share a clip like really easily, right?
So that might change completely.
Everything that's said, converted to text might change completely the dynamics of
what we do in this world, especially now with remote work, right?
Exactly.
Exactly.
With Zoom and so on.
That's fascinating to think about.
I mean, for me, I care about podcasts, right?
And one of the things that was, you know, I'm torn.
I know a lot of the engineers at Spotify.
So I love them very much, because they dream big in terms of like, they want to empower
creators.
So one of my hopes with Spotify was that they would use technology like Rev or something
like that to start converting everything into
text to make it indexable.
Like one of the things that sucks with podcasts is like, it's hard to find stuff.
The model is basically subscription.
You find, it's similar to what YouTube used to be like, which is you basically find a creator that you enjoy
and you subscribe to them and you just kind of follow
what they're doing. But the search and discovery
wasn't a big part of YouTube in the early days,
and that's what it is with podcasts,
like the search and discovery is like non-existent.
You're basically searching for like the dumbest possible thing, which is like keywords
in the titles of episodes.
Yeah, but even aside from search, this is kind of like all the time, so I listen to like
a number of podcasts and there's something said and I want to like go back to that later,
because I'm trying to remember what did he say, like maybe he recommended some cool product that I want to try out, and like it's basically impossible.
Maybe like some people have pretty good show notes, so maybe you'll get lucky and you can find it, right? But
I mean, if everyone had transcripts and it was all searchable, it would be a game changer.
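As a rough illustration of what "all searchable" could mean once transcripts exist, the Python sketch below builds a tiny inverted index over timestamped transcript segments so a listener can jump back to where something was said. Everything in it is hypothetical: the function names, the `(start_seconds, text)` segment format, and the sample segments are assumptions made for the example, not any podcast app's real data model.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")


def build_index(segments):
    """Map each lowercase word to the positions of the segments containing it."""
    index = defaultdict(set)
    for position, (_start_seconds, text) in enumerate(segments):
        for word in WORD.findall(text.lower()):
            index[word].add(position)
    return index


def search(index, segments, query):
    """Return the segments that contain every word of the query, in episode order."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return [segments[position] for position in sorted(hits)]


# Hypothetical (start_seconds, text) segments from one episode.
segments = [
    (754.0, "Rev started as a human transcription service"),
    (2073.5, "word error rate is one way to measure accuracy"),
    (3200.0, "the error was in the transcript"),
]
index = build_index(segments)
print(search(index, segments, "error rate"))
```

Because each hit keeps its start time, a player could seek straight to that point in the audio. A production search would add stemming and ranking, which this sketch leaves out.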
It'd be so much better. I mean, that's one of the things that I wanted to, I mean, one of the reasons we're talking today
is I wanted to take this quite seriously, the Rev thing. I've just been lazy.
So, because I'm very fortunate that a lot of people support this podcast, that there's
enough money now to do transcription and so on, it seemed clear to me, especially like CEOs and sort of like PhDs, like people write
to me who are like graduate students at computer science or graduate students at whatever the
heck field, it's clear that in their mind, like, they enjoy podcasts when they're doing
laundry or whatever, but they want to revisit the conversation in a much more rigorous
way, and they really want a transcript.
It's clear that they want to like analyze conversations. Like so many people wrote to me about a transcript for the Joscha Bach
conversation, and just a bunch of
conversations. And then on the Elon Musk side, like reporters want, like, they want to write a blog post about your conversation,
so they want to be able to pull stuff. And it's like, they're essentially doing transcription on your conversation
privately. They're doing it for themselves and then starting to pick. But it's so
much easier when you can actually do it as a reporter, just look at the transcript.
Yeah. And you can like embed a little thing, you know, like into your article, right? Here
is what was said. You can go listen to like this clip from the section.
I'm actually trying to figure out, I'll probably on the website create
like a place where the transcript goes, like as a web page, so that people can reference it, like reporters can reference it and so on.
I mean, most of the reporters probably
want to write clickbait articles that are
completely falsifying, which I'm fine with. It's the way of journalism. I don't care. I've
had this conversation with a friend of mine, a mixed martial artist, Ryan Hall.
And we talked about, you know, as I've been reading The Rise and Fall of the Third Reich and a bunch of
books on Hitler. And we brought up Hitler and he made some kind of comment where like,
we should be able to forgive Hitler. And, you know, like we were talking about forgiveness.
And we were bringing that up as like the worst case possible thing is like even, you know,
for people who are Holocaust survivors
One of the ways to let go of the suffering they've been through is to is to forgive
And he brought up like Hitler as somebody that would potentially be the hardest to possibly forgive.
But it might be a worthwhile pursuit psychologically so on blah blah blah doesn't matter
It was very eloquent very powerful words. I think people should go back and listen to it
It's powerful. And then all these journalists, there's all these articles written about like
MMA fighter, UFC fighter loves Hitler. No, like, well,
they didn't, they were somewhat accurate. Uh-huh. They didn't say like, loves Hitler. They said, um,
thinks that if Hitler came back to life,
we should forgive him.
It's kind of accurate-ish,
but the headline made it sound a lot worse than it was,
but I'm fine with it.
That's the way the world,
I wanna almost make it easier for those journalists
and make it easier for people who actually care
about the conversation to go and look and see.
Right, they can see it for themselves.
For themselves.
There's the full context.
There's something about podcasts, like the audio
that makes it difficult to go, to jump to a spot
and to look for that particular information.
I think some of it, you know, I'm interested in creating,
like, myself experimenting with stuff.
So, like, taking Rev and creating a transcript,
and then people can go to it.
I do dream that, like, I'm not in the loop anymore,
that, like, you know, that Spotify does it, right?
Automatically for everybody, because ultimately that one click purchase needs to be there.
Like you kind of want support from the entire ecosystem, from the tool makers and the podcast creators,
even clients, right? I mean, imagine if like most podcast apps,
you know, if it was a standard, right?
Here's how you include a transcript into a podcast,
right?
A podcast is just an RSS feed ultimately.
And actually, just yesterday,
so this company called Buzzsprout, I think they're called.
So they are trying to do this.
They proposed a spec, an extension to the RSS format,
to reference transcripts in a standard way.
And they're talking about like there's one client they mentioned
that will support it.
But imagine like more clients support it, right?
So any podcast, you could go and see the transcripts, right,
in your like normal podcast app.
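For a sense of what such an RSS extension might look like from a podcast client's side, here is a hedged Python sketch that reads a transcript link from each feed item. The conversation does not spell out the proposal, so the element name `podcast:transcript`, its `url` and `type` attributes, the namespace URL, and the sample feed are all assumptions made for illustration, modeled on the open Podcast Namespace rather than taken from anything said here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the transcript extension (not stated in the conversation).
PODCAST_NS = "https://podcastindex.org/namespace/1.0"

# Hypothetical feed: one episode links a transcript, the other does not.
SAMPLE_FEED = """<rss version="2.0" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="0"/>
      <podcast:transcript url="https://example.com/ep1.html" type="text/html"/>
    </item>
    <item>
      <title>Episode 2</title>
    </item>
  </channel>
</rss>"""


def transcript_links(feed_xml: str) -> dict:
    """Map each episode title to its transcript URL, or None if the item has none."""
    root = ET.fromstring(feed_xml)
    links = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        transcript = item.find(f"{{{PODCAST_NS}}}transcript")
        links[title] = transcript.get("url") if transcript is not None else None
    return links


print(transcript_links(SAMPLE_FEED))
```

The point of putting the link in the feed itself is that any client reading the same RSS can show the transcript, without the podcast having to live on one hosting platform.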
Yeah. I mean, somebody, so I have somebody who works with me,
works with advertising, Matt is an awesome guy.
He mentioned Buzzsprout to me, but he says it's really annoying
because they want exclusive, they want to host the podcast.
Right.
This is the problem with Spotify too.
This is where I'd like to say like F Spotify.
There's a magic to RSS with podcasts. It can be made available to everyone. And then
there's this ecosystem of different podcast players that emerge and they compete
freely. And that's a beautiful thing. That's why going exclusive, like Joe Rogan went exclusive.
I'm not sure if you're familiar with him, he just went to Spotify. As a huge fan of Joe Rogan, I've
been kind of nervous about the whole thing, but let's see. I hope that Spotify steps up.
They've added video, which is very surprising that they did.
So it's closed, meaning you can't subscribe to the RSS feed anymore?
It's only in Spotify?
For now, you can, until December 1st.
On December 1st, everything disappears,
and it's Spotify only.
And Spotify gave him $100 million for that.
So it's an interesting deal, but I, you know,
I did some soul searching and I'm glad he's doing it.
But if Spotify came to me with a hundred million dollars, I wouldn't do it.
I wouldn't do it. Well, I have a very different relationship with money. I hate money, but I just think
I believe in the pirate radio aspect of podcasting, and that there's something about the open source spirit.
It just doesn't seem right. It doesn't feel right. That said, you know, because so many people care about Joe Rogan's program,
they're gonna hold Spotify's feet to the fire. Like one of the cool things,
what Joe told me is the reason he likes working with Spotify is that they're like
ride or die together, right? So they want him to succeed. So that's why
they're not actually telling him what to do, despite what people think. They
don't tell him, they don't give him any notes on anything. They want him to
succeed and that's the cool thing about exclusivity
with the platform is like, you're kind of wanting each other to succeed. And that process
can actually be very fruitful. Like YouTube, it goes back to my criticism. YouTube generally,
no matter how big the creator, and maybe for PewDiePie, something like that, they want you to succeed.
But for the most part, from all the big creators I spoke with, Veritasium and all those folks,
you know, they get some basic assistance, but it's not like, YouTube doesn't care if you
succeed or not.
They have so many creators.
Yeah, like a hundred other.
They don't care.
So, and especially with somebody like Joe Rogan, who,
YouTube sees Joe Rogan not as a person who might revolutionize the nature of
news and idea space and nuanced conversations. They see him as a potential person
who has racist guests on, or like, you know, they see him as like a headache potentially.
So you know, a lot of people talk about this.
It's a hard place to be for YouTube, actually, is figuring out with the search and discovery
process of how do you filter out conspiracy theories and which conspiracy theories represent
dangerous untruths and which conspiracy theories are like vanilla untruths and then even when you
start having meetings and discussions about what is true or not, it starts getting weird.
Yeah, it starts getting weird. It's difficult these days, right? I worry more about the other side,
right? Of too much, you know, too much censorship.
Well, maybe censorship is the right word.
I mean, censorship is usually government censorship,
but still, yeah, putting yourself
in a position of arbiter for these kinds of things,
it's very difficult.
And people think it's so easy, right?
Like, it's like, well, you know, like no Nazis, right?
What a simple principle.
But, you know, yes, I mean, no one likes Nazis.
Yeah, but it's like many shades of gray.
Yeah.
Like very soon after that.
Yeah.
And then, you know, of course, everybody, you know, there's some people that call
our current president a Nazi, and then there's like, you start getting, Sam
Harris,
I don't know if you know who that is, wasted, in my opinion, his conversation with Jack Dorsey.
And I'll also, I spoke with Jack before in this podcast
and we'll talk again, but Sam brought up,
Sam Harris does not like Donald Trump.
I do listen to his podcast.
I'm familiar with his views on the matter.
And he asked Jack Dorsey, like,
how can you not ban Donald Trump from Twitter?
So, you know, there's a set, you have that conversation.
You have a conversation where some number, some significant number of people think that the current
president of the United States should not be on your platform. And it's like, okay, so if that's
even on the table as a conversation, then everything's on the table for conversation.
And yeah, it's tough.
I'm not sure where I land on it.
I'm with you, I think that censorship is bad,
but I also think the...
Ultimately, I just also think, you know,
if you're the kind of person that's gonna be convinced,
you know, by some YouTube video, you know,
that I don't know, our government's been taken over by aliens.
It's unlikely that you'll be returned to sanity simply because that video is not available
on YouTube, right?
Yeah, I'm with you.
I tend to believe in the intelligence of people and we should trust them.
But I also do think it's the responsibility of platforms to encourage more love in the
world, more kindness to each other, and I don't always think that they're great at doing
that particular thing.
So that, that, there's a nice balance there, and I think philosophically, I think about
that a lot, where's the balance between free speech
and encouraging people,
even though they have the freedom of speech
to not be an asshole?
Yeah, right.
That's not a constitutional.
So you have the right for free speech,
but just don't be an asshole. Like you can't really
put that in the constitution that the Supreme Court can't be like, yeah, just don't be a dick.
But I feel like platforms have a role to be like, just be nicer, maybe do the carrot,
like encourage people to be nicer as opposed to the stick of censorship. But I think it's an
interesting machine learning problem.
Just be nicer.
Machine learning for niceness.
It is, I mean, that's possible.
Yeah, I mean, it is, it is a thing for sure.
Jack Dorsey kind of talks about as a vision for Twitter
is how do we increase the health of conversations?
I don't know how seriously they're actually trying
to do that though, which is one of the
reasons that I'm in part considering entering that space a little bit.
It's difficult for them, right?
Because it's kind of well known that people are kind of driven by rage and outrage, maybe
is a better word, right?
Outrage drives engagement and, well,
these companies are judged by engagement, right? So it's in the short term, but this goes
to the metrics thing that we were talking about earlier. I do believe, I have a fundamental
belief that if you have a metric of long-term happiness of your users, like not short-term
engagement, but long-term happiness and growth
and both like intellectual, emotional health
of your users, you're going to make a lot more money.
You're going to have long, like,
you should be able to optimize for that.
You don't need to necessarily optimize for engagement.
And that'll be good for society too.
Yeah, no, I mean, I generally agree with you,
but it requires a patient person with, you know,
trust from Wall Street to be able to carry out such a strategy.
This is what I believe the Steve Jobs character and Elon Musk character is like,
you basically have to be so good at your job.
Right. You get a pass for anything.
Then you can hold the board and all the investors hostage
by saying like, either we do it my way, or I leave.
And everyone is too afraid of you to leave
because they believe in your vision.
But that requires being really good at what you do.
It requires being Steve Jobs and Elon Musk.
There's kind of a reason why a third name
doesn't come immediately to mind, right? Like, there's maybe a handful of other people,
but it's not that many.
It's not many. I mean, people say that I'm like a fan of Elon Musk.
I'm not. I'm a fan of anybody
who's like Steve Jobs and Elon Musk, and there's just not many of those folks.
It's a guy that made us believe that we can get to Mars in 10 years, right?
I mean, that's kind of awesome.
And it's kind of making it happen, which is like, it's great.
It's kind of gone, that kind of spirit, right, from a lot of our society, right?
We can get to the moon in 10 years, and we did it, right?
Yeah, especially in this time of so much kind of existential dread that people are going
through because of COVID, like having rockets that just keep going out there, not with
humans. I don't know, that it's just like you said, I mean, it gives you a reason to wake
up in the morning and dream, for us engineers too. It's inspiring as hell, man.
Let me ask you the worst possible question, which is, so you're like at the core, you're
a programmer, you're an engineer, but now you made the unfortunate choice, or maybe that's
the way life goes, of basically moving
away from the low level work and becoming a manager, becoming an executive, having meetings.
What's that transition been like?
It's been interesting.
It's been a journey.
Maybe a couple of things to say about that.
I got into this, right, because as a kid, I just remember this like
incredible amazement at being able to write a program, right? And something comes to life
that kind of didn't exist before. I don't think you have that in like many other fields.
You have that with some other kinds of engineering, but you're maybe a little bit more limited
with what you can do, right? But the computer you can literally imagine any kind of program, right?
So it's a little bit God-like, what you do, like, when you create it.
And so I mean, that's why I got into it.
Do you remember like first program you wrote or maybe the first program that like made you fall in love with computer science?
I don't know what was the first program. It's probably like trying to write one of those
games in BASIC, you know, emulate the snake game or whatever.
I don't remember it to be honest.
But I enjoyed it.
That's what I always loved about being a programmer,
is just a creation process.
It's a little bit different when you're not the one
doing the creating.
Another aspect to it, I would say, is
when you're a programmer, when you're an individual
contributor, it's kind of very easy to know when you're doing a good job, when you're not
doing a good job, when you've been productive, when you're not being productive, right?
You can kind of see it, like you're trying to make something and it's like slowly coming
together, right?
And when you're a manager, you know, it's more diffuse, right?
Like, well, you hope, you know, you're motivating your team and making them more productive and
inspiring them, right?
But it's not like you get some kind of dopamine signal because you, like, completed X lines
of code today.
So kind of like you miss that dopamine rush a little bit when you first become a manager, but then
slowly you kind of see, yes, your teams are doing
amazing work, right? And you can take pride in that. You can get like, what is it? Like,
a ripple effect of somebody else's dopamine. Yeah, yeah, yeah. I live off other
people's dopamine. So are there pain points and challenges you had to overcome
from going from a programmer to becoming a programmer of humans?
A programmer of humans.
I don't know, humans are difficult to understand, you know?
It's like one of those things, like trying to understand other people's motivations
and what really drives them. It's difficult, maybe you like never really know, right?
Do you find that people are different?
Yeah.
Like, one of the things, I got a group at MIT that, you know,
I found that like, some people, I could like scream at
and criticize like hard and that made them do like much better work and really push them to the limit.
And there's some people that I had to non-stop compliment because like they're so already self-critical, like everything they do, that I have to be constantly like, I cannot criticize them at all because they're already criticizing
themselves and you have to kind of encourage and like celebrate their little victories.
And it's kind of fascinating how that, the complete difference in people.
Definitely, people respond to different motivations and different modes of feedback and
you kind of have to figure it out.
There's, like, a pretty good book, for some reason the name escapes me, about
management. First, Break All the Rules. First, Break All the Rules. First, Break
All the Rules. It's a book that we generally ask a lot of first-time
managers to read at Rev. And like one of the kind of philosophies is manage by
exception, right?
Which is, you know, don't like have some standard template.
Like, you know, here's how I tell this person to do this or the other thing.
Here's how I give feedback. Like, manage by exception, right?
Every person is a little bit different.
You have to try to understand what drives them and tailor it to them.
Since you mentioned books, I don't know if you can answer this question,
but people love it when I ask it, which is, are there books, technical, fiction, or philosophical,
that you enjoyed or that had an impact on your life, that you would recommend? You already mentioned
Dune. All of the Dune. All of the Dune. The second one was probably the weakest, but anyway,
so yeah, all of the Dune is good. I mean, yeah, can you just go on a little tangent on that?
How many Dune books are there?
Do you recommend people start with the first one if you...
Yeah, you kind of have to read them all.
I mean, it is a complete story, right?
So you start with the first one, you got to read all of them.
Just like a tree, like a like a creation of like the universe
you should go in sequence. You should go in sequence. Yeah. It's a kind of a chronological storyline.
There's six books in all. Then there's like many kind of off-books that were written by
Frank Herbert's son, but those are not as good. So you don't have to bother with those.
Shots fired.
Shots fired.
OK.
But the main sequence is good.
So what are some other books?
Maybe there's a few.
So I don't know that, like I would say, there's a book that kind
of, I don't know, turned my life around or anything like that.
But here's a couple that I really love.
So one is Brave New World by Aldous
Huxley. And it's kind of incredible how prescient he was about like what a brave new world
might be like, right? You know, you kind of see genetic sorting in this book right where
there's like these alphas and epsilons and
from like the earliest time of society, like, they're sorted. Like, you can kind of
see it in a slightly similar way today where
well, one of the problems with society is people are kind of genetically
sorting a little bit, right? Like there's much less, like, most marriages,
right, are between people of similar kind of intellectual level
or socioeconomic status.
More so these days than in the past,
and you kind of see some effects of it
in stratifying society.
He illustrated what that could be like in the extreme.
There's different versions of it on social media as well.
It's not just marriages and so on.
It's genetic sorting in terms of what Dawkins called memes, these ideas,
being put into these bins of these little echo chambers and so on.
Yeah, and that's the book that's, I think, a worthwhile read for everyone.
And 1984 is good, of course, as well. Like, if you're talking about dystopian novels of the
future, it's a slightly different view of the future, right? But I kind of like identify with Brave New World more.
Speaking of not a book, but my favorite kind of dystopian
science fiction is a movie called Brazil, which I don't know
if you've heard of.
I've heard of it, and I know I need to watch it.
But yeah, because it's in English, you know?
It's an English movie, and it's a sort of like dystopian movie of authoritarian
incompetence, right?
And it's like, like, nothing really works very well, you know, the system is creaky, you
know, but no one is kind of like willing to challenge it, you know, and just things kind
of amble along. It kind of strikes me as like a very plausible future
of like, you know, what authoritarianism
might look like. It's not like this, you know,
super efficient evil dictatorship of 1984.
It's just kind of like this badly functioning, you know,
but it's status quo, so it just goes on.
Yeah, that's one funny thing that stands out to me, is whether it's authoritarian dystopian
stuff or just basic, like, you know, if you look at the movie Contagion, it seems that in
movies, government is almost always exceptionally competent. Like, it's like used as a storytelling tool
of like extreme competence.
Like, you know, you use it whether it's good or evil,
but it's competent.
It's very interesting to think about
where it much more realistically is incompetence
and that incompetence itself has consequences that are difficult to predict. Like, bureaucracy
has a very boring way of being evil.
Of just, you know, if you look at the HBO show Chernobyl,
it's a really good story of how bureaucracy, you know,
leads to catastrophic events, but not through any kind of evil
in any one particular place, but more just like the...
It's just the system, kind of.
The...
The distorting of information as it travels up the chain, people unwilling to take responsibility
for things, and just kind of like this laziness resulting in evil.
There's a comedic version of this,
I don't know if you've seen this movie,
it's called The Death of Stalin.
Yeah.
Hi.
I like that.
I wish it wasn't so.
There's a movie called Inglourious Basterds
about, you know, Hitler and World War, you know, so on.
For some reason those movies pissed me off.
I know a lot of people love them, but like,
I just feel like there's not enough good movies, even about Hitler. There's good movies about
the Holocaust, but even Hitler, there's a movie called Downfall that people should watch. I think
it's the last few days of Hitler. That's a good movie. Turned into a meme, but it's good. But on Stalin, I feel like, I'm maybe wrong on this,
But at least in the English-speaking world,
there's not good movies about the evil of Stalin.
That's true.
Let's try to see that.
So I agree with you on Inglourious Basterds,
that I didn't love the movie because I felt like kind
of the stylizing of it, right?
The whole like Tarantino kind of Tarantinoism,
if you will, kind of detracted from it,
it made it seem like unserious a little bit.
But Death of Stalin, I felt differently.
Maybe it's because it's a comedy to begin with,
so you're not kind of expecting, you know, seriousness.
But it kind of depicted the absurdity of the whole situation in a way, right?
I mean, it was funny, so maybe it does make light of it, but it's something that goes
probably like this, right?
Like a bunch of kind of people, they're like, oh shit, right?
Like, you're right.
But like, the thing is, it was so close to like what probably was reality, it was caricaturing reality to where I think
an observer might think that this is not, like, they might think it's a comedy when in
reality, that's the absurdity of how people act with dictators. I mean, that's, I guess it was too close to reality for me.
Yeah.
The kind of banality of like, what were eventually,
like fairly evil acts, right?
But like, yeah, they're just a bunch of people
trying to survive.
And like, I mean, because I think there's a good,
I haven't watched it yet, the movie on Churchill.
With Gary Oldman, I think, it's Gary Oldman.
I may be making that up. I think he won, like he was nominated for an Oscar or something. So I like,
I love these movies about these humans and Stalin, like Chernobyl made me realize the HBO show that
there's not enough movies about Russia that capture that spirit. I'm
sure it might be in Russian there is, but the fact that some British dude
that like did comedy, I feel like he did like The Hangover or some shit like that. I
don't know if you're familiar with the person who created Chernobyl, but he was just
like some guy that doesn't know anything about Russia and he just went in and
just studied it, like, did a good job of creating it and then got it so accurate, like poetically, and
the facts that you need to get accurate, he got accurate, just the spirit of it down
to like the bowls that pets use, just the whole feel of it.
That's good, yeah, that's how they see us.
Yeah, it's incredible. It's made me wish that somebody did a good, like, 1930s,
like starvation under Stalin, like leading up to World War II,
and in World War II itself, like Stalingrad and so on.
Like, I feel like that story needs to be told.
Millions of people died.
And it's, to me, it's so much more
fascinating than Hitler, because Hitler is like a caricature of evil almost, that it's
so, especially with the Holocaust, it's so difficult to imagine that something like that
is possible ever again. Stalin, to me me represents something that is possible.
Like, it's so interesting, like the bureaucracy of it is so fascinating, that it potentially might
be happening in the world now, like, that we're not aware of. North Korea, another one that
like there should be a good film on. And like the possible things that could be happening in China with overreaching government.
I don't know, there's a lot of possibilities there. I suppose, yeah.
I wonder how much you know, I guess the archives should be maybe more open nowadays right?
I mean for a long time they just we didn't know right or anyways no one in the West knew for sure
Well, there's a I don't know if you know there's a guy named Stephen Kotkin
He is a historian of Stalin.
And I spoke to him on the podcast.
I'll speak to him again.
The guy knows his shit on Stalin.
He like read everything.
It's so fascinating to talk to somebody.
Like he knows Stalin better than Stalin himself.
It's crazy.
Like, I think he's at Princeton. It's basically his whole life.
It's Stalin.
Studying Stalin.
Yeah, it's great.
And in that context, he also talks about, and writes about Putin a little bit.
I've also read at this point, I think, every biography of Putin, English biography of Putin.
I need to read some Russian ones. Obviously I'm mentally preparing for a possible conversation with Putin.
So what is your first question to Putin when you have him on the podcast?
It's interesting you bring that up. First of all, I wouldn't tell you, but
But I actually haven't even thought about that.
So my current approach, and I do this with interviews often, but obviously that's a special
one, but I try not to think about questions until last minute.
I'm trying to sort of get into the mindset.
And so that's why I'm soaking in a lot of stuff,
not thinking about questions,
just learning about the man,
but in terms of like human to human,
it's like, I would say it's,
I don't know if you're a fan of mob movies,
but like the mafia, which I am,
Goodfellas and so on,
he's much closer to, like, mob morality, which is like,
mob morality, maybe, I could see that.
But I like your approach anyways of this,
the extreme empathy, right?
It's a little bit like, you know, Hannibal, right?
Like if you have watched the show Hannibal, right?
They had that guy, we know Hannibal, of course, like,
yeah, Silence of the Lambs.
Silence of the Lambs, but there's a TV show as well,
and they focus on this guy, Will Graham, who's a character who's, like, an
extreme empath, right? So the way he, like, catches all these killers is he pretty much,
he can empathize with them, right? Like, he can understand why they're doing the things they're doing,
right? And it's a pretty excruciating thing, right? Like, because you're pretty much, like, spending
half your time in the head of
evil people, right? Like, but I mean, I definitely try to do that with others. So you should do that in moderation, but yeah, I
I think it's a pretty safe place to be. Like, one of the cool things with this podcast, and I don't know,
you didn't sign up to listen to this bullshit, but
uh, that was interesting.
His name is Chris Lattner, who's at Google, he's not at Google anymore, SiFive. He's one of
the most legit engineers. I've talked with him, again, on this podcast, and he gives me private
advice a lot and he said for this podcast I should like interview like I should widen the range
of people because that gives you much more freedom to do stuff like. So his idea, which
I think I agree with Chris is that you go to the extremes, you just like cover every
extreme base. And then it gives you freedom to then go to the more nuanced conversations. It's kind of, I think there's a safe place for that.
There's certainly a hunger for that nuanced conversation, I think, amongst people
where, like, on social media, you get canceled for anything
slightly tense, that there's a hunger to go full.
Right. You go so far to the opposite side.
It does. And, like, demystify that a little bit, right?
Yeah. Yeah. there is a person behind
All of these things. Yeah, and that's the cool thing about podcasting like three four hour conversations that
it's very different than clickbait journalism. It's the opposite, that there's a hunger for that.
There's a willingness for that. Yeah, especially now
I mean, how many people do you even see face-to-face anymore? Right. Like this, you know, it's like not that many
people, like, in my day to day, aside from my own family, that, like, I sit across from. It's
sad, but it's also beautiful. Like, I've gotten a chance to, like our conversation now,
there's somebody, I guarantee you, there's somebody in Russia listening to this, not like jogging.
There's somebody who just smoked some weed, sitting back on a couch and just, like, enjoying.
But I guarantee you that they'll write in the comments right now that, yes, I'm in St.
Petersburg, I'm in Moscow, whatever.
And we're in their head.
And they have a friendship with us.
And I'm the same way.
I'm a huge fan of podcasting.
It's a beautiful thing.
I mean, it's a weird one-way human connection.
Like before I went on Joe Rogan,
and still I'm just a huge fan of his.
So it was like surreal.
We had, I've been friends with Joe Rogan for 10 years,
but one way.
Yeah, from this way, from the St. Petersburg way.
Yeah, the St. Petersburg way.
And it's a real friendship.
I mean, now it's like two-way, but it's still surreal.
It's, yeah.
And that's a magic of podcasting.
I'm not sure what to make of it.
That voice.
It's not even the video part.
It's the audio.
That's magical.
That I don't know what to do with it.
But it's people listen to three,
four hours. Yeah, we evolved over millions of years to be very finely tuned to things like that.
Expression as well, of course, but back in the day on the savannah, you had to be very tuned to
whether you had a good relationship with the
rest of your tribe or a very bad relationship, right? Because if you had a very bad relationship,
you're probably going to be left behind and eaten by the lions.
Yeah. But it's weird that the tribe is different now. Like, you could have a connection, one-way
connection with Joe Rogan as opposed to the tribe of your physical vicinity. But that's why it works with the podcasting,
but it's the opposite of what happens on Twitter,
because all those nuances are removed.
You're not connecting with the person,
because you don't hear the voice.
You're connecting with an abstraction.
It's like some stream of tweets,
and it's very easy to assign to them any kind of like evil intent,
you know, or dehumanize them, which is much harder to do when it's a real voice, right? Because
like, you realize it's a real person behind the voice. Let me try this out on you. I sometimes ask
about the meaning of life. You're a father now, an engineer, you're building up a company.
Do you ever zoom out and think like what the hell is this whole thing for? Like why are
we descendants of apes even on this planet? What's the meaning of it all?
That's a pretty big question. I think I don't allow myself to think about it too often or maybe like life doesn't allow me to think about it too often
But in some ways, I guess the meaning of life is kind of
Contributing to this kind of weird thing. We call humanity, right?
Like, in a way, I think of humanity as a living and evolving organism, right? That we're all contributing to in a way, just by existing, by having our own unique set of
desires and drives, right?
And maybe that means creating something great and bringing up kids who are unique and
different and seeing, like, you know, that they can enjoy what they do.
But I mean, that's pretty much it. I mean, if you're not a religious person, right, which I guess I'm not, that's
the meaning of life. It's in the living and in the creation.
Yeah, there's something magical about that engine of creation. Like you said, programming,
I would say, I mean, it's even just actually what you said with even just programs. I don't care if it's like some JavaScript thing on the, on the, on the website.
It's like magical that you brought that to life.
I don't know what that is in there, but that seems that's probably some version of recreating
of like reproduction and sex, whatever that's in evolution, but like creating that HTML button
has echoes of that feeling and it's magical.
Right, that's great.
If you're a religious person, maybe you could even say,
like, we were created in God's image, right?
Well, I mean, I guess part of that
is the drive to create something ourselves, right?
I mean, that's part of it.
Yeah, that HTML button is the creation in God's image. Maybe it hopefully will be something a little more
dynamic, maybe bigger, sometimes. Yeah, maybe some JavaScript, some React, and so on. But,
no, I mean, I think that's what differentiates us from, you know, the apes, so to speak.
Yeah, we did a pretty good job. Dan, it was an honor to talk to you. Thank you so much for
being part of creating one of my favorite services and products. This is actually a little bit of
an experiment. Allow me to sort of, uh, fanboy over some of the things I love. So thanks for wasting your time with me today.
It was awesome.
Thanks for having me on and giving me a chance
to try this out.
Awesome.
Thanks for listening to this conversation with Dan Kokotov.
And thank you to our sponsors, Athletic Greens
all-in-one nutrition drink, Blinkist app
that summarizes books, Business Wars podcast, and Cash App.
So the choice is health, wisdom or money.
Choose wisely, my friends, and if you wish, click the sponsor links below to get a
discount and to support this podcast.
And now let me leave you with some words from Ludwig Wittgenstein.
The limits of my language mean the limits of my world.
Thank you.