Tech Brew Ride Home - (IHP) Gary Flake On The Search Wars
Episode Date: May 29, 2023 (originally aired February 2017). Gary Flake has been involved with search technology ever since he got turned on to this particular field in college. In this wide-ranging discussion, Gary lays out for us, basically, the history of search technology before Google, the impact of Google, and then, since he lived it, the notion of competing with Google. The reason why Gary can talk so in depth about all of this is that he was Yahoo’s Chief Science Officer in the early 2000s, when Yahoo, via the infamous Project Panama and other initiatives, attempted to keep Google from taking over the entire search market. And because, prior to that, Gary was at GoTo/Overture, he gives us basically the entire story of the birth of paid search as an industry. The story of Google is about two miracles. The first miracle is the Google algorithm that essentially solved search. And the second miracle is paid search (AdWords, AdSense, all of that), which is essentially the greatest advertising machine ever invented. But not a lot of people remember: paid search was actually invented not by Google, but by GoTo/Overture.
Transcript
What I ended up deciding was, since the Google episodes got combined into one, I'd give you this Gary Flake episode, since he has a lot more direct personal stories about the whole Yahoo-Google Search Wars era. Enjoy.
Welcome to the Internet History podcast. I'm your host, Brian McCullough.
Gary Flake has been involved with search technology ever since he got turned on to this particular field in college.
In today's wide-ranging discussion, Gary's going to lay out for us basically the entire history of search technology before Google,
the impact of Google on search as a discipline, and then, since he lived it, the notion of competing with Google.
The reason why Gary can talk so in depth about this is that he was Yahoo's chief science officer
in the early 2000s, when Yahoo, via the infamous Project Panama and other initiatives,
attempted to keep Google from taking over the entire search market.
And because prior to that, Gary was at GoTo/Overture,
he also gives us basically the entire story of the birth of paid search as an industry.
I've mentioned on this show before, and it will be a major part of the book,
but the story of Google is essentially two miracles.
The first miracle is, of course, the Google search algorithm that essentially solves search.
And the second miracle is paid search, AdWords, AdSense, all of that,
which is essentially the greatest advertising machine ever invented.
But not a lot of people remember:
paid search was actually invented not by Google, but by GoTo/Overture.
So, all of this great stuff and more on today's chat with Gary Flake.
Gary Flake, thanks for coming on the Internet History podcast.
My pleasure.
So Gary, it seems like, just based on what I've read about you and looking at your LinkedIn,
when you went to college, you were going for computer science,
but you sort of ended up in what's certainly hot today,
things like machine learning and, you know, all sorts of things, including search.
Just give me a brief outline of what you were into in college and in the 90s, when you were still sort of in the academic setting.
Okay.
Well, that's a wide open question there.
But let me, I'll even go a little bit further back.
Sure.
When I was a kid growing up in the 70s and the 80s, you know, we didn't have computers.
And I remember that the very first computer that became available for me to actually teach myself how to program on was a Timex Sinclair.
I saw it advertised in Popular Science sometime in the 80s.
I think I was 12 or 13 years old when I saw it.
And I worked the summer.
I think I was selling something door-to-door to save money, because it cost about $100.
And I bought a Timex Sinclair.
And in short order, you know, I taught myself BASIC and started coding.
And then eventually I kind of graduated to having an Apple IIe.
I didn't know there was such a thing as an assembler.
And I taught myself how to code in machine code on an Apple IIe.
And it's kind of funny, because that experience
as a 16-year-old kid, teaching myself how to code in hex on an Apple IIe, allowed me to kind of skip over a lot of the curriculum as a computer science major in undergrad.
But in the late 80s, you know, I was
working on my bachelor's degree in computer science, and I was a sophomore sneaking into a
graduate-level seminar that was on this thing about neural networks. It was a speaker speaking about
Hopfield neural networks, and I actually kind of understood what was going on, because I had read
earlier a popular article by John Hopfield, who is the inventor of Hopfield neural networks.
So I was really familiar with it, and I kind of raised my hand and I said,
"So, you know, it's just like an associative memory, right?"
And, you know, and it was kind of funny.
I didn't think that I was showing off or anything like that.
But after I snuck into that graduate seminar, the host of the seminar,
who was a professor at Clemson, walked over.
His name is Ed Page.
And he pretty much just offered me a job right on the spot.
So I'm a sophomore.
I started doing paid research for a professor as part of his research program,
working on Hopfield neural networks,
and then that migrated to a whole bunch of other things in machine learning.
But I actually published my first academic article that year.
I was a sophomore.
And then that led to me working at Los Alamos National Lab, taking off a couple of semesters.
And the funny thing that emerged from that is that I worked so closely with so many physicists at Los Alamos that when I went on to work on my PhD at University of Maryland in computer science, I was actually a research assistant in plasma physics.
And so I had this schizophrenic existence where I was constantly sort of being exposed to ideas in physics and mathematics and computation.
And I was trying to build a bridge in my own mind in terms of how all these things related.
And so, yeah, in the early 80s, I'm sorry, in the late 80s and early 90s, in some ways it's almost like I won the contextual lottery, because I got exposed to all of these amazing ideas
that at the time seemed really avant-garde and out there,
you know, like machine learning and things like data science,
though, you know, we didn't use that term back then,
and all this stuff that was happening on the boundaries between physics and computation.
And so I just got exposed to this stuff
at such a young age that
it really shaped the way that I would think and view the world for decades to come.
Well, I would think that that would be such a boon to being a computer scientist,
because having your finger in different pies, you know, understanding how the brain works,
understanding how physics works on a molecular level or whatever, you know, that has to be
beneficial in terms of, yeah.
It is, but it actually, it's almost like two steps forward, one step back, because you have to kind of learn and unlearn.
And let me give you a case in point.
So when I was at Los Alamos, I didn't have security clearance at the time,
so I was working at the Center for Nonlinear Studies, which was the hot spot of chaos theory at that point in time.
So some of the biggest and brightest minds in the world had converged on the Center for Nonlinear Studies there.
David Campbell was the director, and Yichun Lee was the senior scientist.
Yichun, everyone called him Y.C., would later be
my Ph.D. advisor at University of Maryland.
And so I started looking at big ideas related to the uncertainty that is introduced through a number of different directions, you know, in physics and mathematics and computation.
In all three of those, you know, vastly different domains, uncertainty emerges as this very big concept.
And what happened was that I was so used to working on problems
related to dynamical systems and chaos theory that my understanding of uncertainty and
intractability was very skewed towards a physicist's worldview.
And so I distinctly remember that when I went to grad school, one of the
best classes I ever took was recursion theory, taught by Bill Gasarch at University
of Maryland.
Bill is probably one of the most gifted teachers of CS theory that you'll ever find anywhere.
And we used to have to do these proofs for proving that something was incomputable.
And before I really kind of wrapped my head around the CS theory part of it, I kept trying to construct
incomputable frameworks or problems, or give instances
of incomputable problems, in terms of physical dynamical systems.
And of course, Bill, rightfully so, had to give me a zero on that attempt.
So the irony is that that initial intuition that I had around physics,
I had to kind of put it on hold to really crystallize how all that worked in computer
science.
And later on, I wrote a book in the 90s that attempted to kind of unify
these ideas a little bit, in a somewhat hand-wavy way.
But yes, once you start seeing the patterns across these different domains,
it really just shapes how you think about everything. And of course, all of those things
became segues to other topics that became very important later on, like information retrieval,
data mining, search, and also some of the subtleties that happen with large-scale
social systems, like social networking and network effects and all that stuff. So yeah,
I tell people that I had the good fortune of being born at precisely the right moment to be able
to experience all these different things at the right time.
I believe the book that you mentioned is The Computational Beauty of Nature, right?
Yes, that's correct.
We'll point that out in case people want to look that up.
So like you say,
you've got all these different ideas in your head, but by the late 90s, the web is happening.
Is that sort of excitement that's happening, because it involves computer science,
especially with the web being like this huge database experiment, and search, is it starting
to nudge you and other people like you towards the web at that point?
Yeah, absolutely.
And so here was sort of the general trend of the 90s.
So in the machine learning community, there are lots of different types of machine learning.
But the version of machine learning that has captured a lot of attention because of its practical application is, you know, just basic: you get a target input for something that you're trying to monitor,
you have a target output, and you want to learn that input-output mapping. And so that's a problem where,
you know, you can take a lot of different very practical problems from the real world and map them into that sort of abstract framework.
And so in the early 90s, people like myself would write research articles and research papers where we were studying
data sets and trying to show the quality of the models that we could learn from just looking at data and then making predictions.
The size of the data sets that we were looking at ranged from, in some cases, less than 10 data points.
But if we got up to the big side of things, it might be a couple of thousand data points, because we were looking at, say, a time series from a chaotic dynamical system.
Or in some cases maybe you got up to 10,000 data points, because you were working on a control problem.
In other words, what I'm trying to call out is that in the 90s, the scale of the problems that we used to work on was tiny by today's standards.
And so what everyone was doing was kind of looking at and moving into successive domains where the availability of the data kept getting larger and larger.
So in the early 90s, I was working for the big German multinational Siemens, at their U.S. corporate research lab in Princeton.
And we were working on bioinformatics problems.
And then I was working on industrial process control.
So then I get into the hundreds of thousands of data points.
And then we started doing large-scale data mining, where maybe we're looking at millions of
data points.
But to really go beyond that, you had to start looking at the web.
And so I actually felt that I made the transition late in my career.
So it was 1998.
My book had just come out.
And I was, you know, I was looking to hit the reset button in my life in a lot of different ways.
I wanted to, you know, change research directions.
I wanted to, you know, change whom I was working for.
So I walked around the corner and I got a job at NEC Research Institute, which is a different lab in Princeton that had a more open environment and more freedom for the research scientists.
And, you know, my good friends Steve Lawrence and Lee Giles were at the Research Institute at the time.
And they were trying to convince me: work with us, publish with us, you know, collaborate with us on these web problems.
And so they were the ones that really kind of, you know, pulled me into looking at the web as a single holistic artifact to be studied on its own.
And so it was around 98 and then 99 that I really, in earnest, started getting into web-scale data mining.
And at that time, I had started building out much of the infrastructure that you would see in a commercial search engine at that time.
So, you know, there were some papers that came out of this time frame.
Some of those papers got a lot of attention.
But in truth, I was actually prouder of the infrastructure that we built at NEC, because we designed this system
that was crawling vast portions of the web.
And we were able, in the late 90s and early 2000s,
to crawl hundreds of millions of web pages,
pulling their entire link structure.
I remember everyone thought I was a little bit crazy,
but I was one of the few people that actually invested long on the Itanium architecture,
because it was the first time that you could buy a machine for
less than $10,000 that had 64 gigs of RAM in it.
So I bought a couple of Itaniums with my research budget, loaded them up with RAM,
and I would suck in the whole of, you know, the web graph that we could crawl at that time,
which had billions of edges in it.
So we were doing some really neat stuff.
And it's kind of fun.
I think the piece that I loved working on the most was the web crawler, because
we had a single web crawler on a single machine that
could saturate a T3 connection, and a T3 is 45 megabits.
And so a single crawler saturating, you know, a 45-megabit dedicated T3, which was the sort
of network connection that you would find serving a whole lab, you know,
there was a certain amount of pride in being able to write something that
could saturate a network pipe that big.
That was a lot of fun as well.
So, yeah, a lot of us who had been in machine learning started going towards web-scale stuff, because that's where the data was.
Right.
That's where there was so much data that, like, that was the exciting feeling.
But, you know, we say it's so much data.
But back then, it was Lee Giles and Steve Lawrence, my good friends that I'd mentioned before, who actually did the very first study of the size of the web,
and they published some articles in Science and Nature.
And at that time they estimated the scale of the web to be less than a billion web pages.
And I was part of a research group that validated a later claim,
I think it came out around 2001, by Inktomi,
when they had the first evidence that they had successfully crawled a billion web pages,
and that was like a new milestone. And so I validated their
claims. I worked with them on the side on that. And that was around 2001, I think. And so it's kind of
funny just to think about that. That was a big deal; people were doing press releases:
the web is provably in excess of a billion web pages. And now, you know, a billion pages
doesn't even matter. You know, it's just trivia.
You know what I'd like to do, I want to pause on
your career history right now. And to whatever extent you're interested or feel comfortable,
sort of paint a picture of search at that time for the layperson.
And it's going to be impossible for us to do this without mentioning Google.
So in a sense, let's start on a layman's technical level.
I think most people have a basic understanding of BackRub and the algorithm that made Google work better.
From a layman's sort of technical perspective, what was search before Google,
technically? Was it based off of, you know, what people understood from database searches and
things like that?
Well, we used to joke that the state of the art for search engines was grep for
the web. grep is a Unix command-line tool. I think the "grep" stands for "global regular expression
print" or something like that. Basically, the way you use grep is you type
grep and then a string and a list of files that you want to search over,
and then it pulls occurrences of that string that you're looking for and prints them out on the
command line. So you can see that, oh, the file foo over there had the search term that you're
looking for. And, you know, there was no notion of relevance, and it was very basic. So the very
first search engine that people looked at as having, in some sense, tackled the big problem was
Alta Vista. Now, of course, there were other systems like Gopher and other things that were
nominally solving very similar problems, but from a commercial perspective, people looked at
Alta Vista as like the granddaddy of them all. And what Alta Vista did was, you know, not much more
than grep for the web. You type in a couple of terms, and it just shows you a long list of links to
pages where those terms occur. They, of course, wanted
to make it better. And so that's where a lot of
the early work in information retrieval got to be applied to Alta Vista. But
the techniques that they were using were pretty basic, mostly just little things like this:
we have a very common metric called TF-IDF. That stands for term frequency, inverse document
frequency, and it basically says two things. Number one, the more times that a word occurs in a
particular page, the higher up in the ranking it should be, and notice that if you're
typing in multiple search terms, you've got to aggregate that. And number two, the weighting. So
let's say you do a search that has two terms in it, like foo and bar, and let's say foo occurs a lot in
some pages but not so much in others, and bar occurs a lot in some pages but not so much in others. Any
given result will have a blend of foo and bar, and so you have to figure out
which is more important, the foo or the bar. Well, the inverse document frequency portion of that
metric, the IDF in TF-IDF, that's the piece that says if a term
is less common in the larger corpus, then it's probably more noteworthy that
it's a hit. Okay. And so ranking at that
point was simply saying, if the word occurs and it occurs a lot and it's a rare word, well, then it
should appear up high. And that's kind of like that first generation. Later on, people started
getting a little bit more clever about what they were considering the text of a document and how
they would weight text in certain contexts. So, you know, you have text that is in the body of a document.
You have text that's actually in the first thousand words. Maybe that's a little bit more important.
You have text that's in the title of the document because we have a tag that says title.
And then there's text that occurs in other pages but are part of the anchor text for the link that points back to the site that you're interested in.
And while Alta Vista didn't get a lot of credit for this, I believe they were actually the first ones doing that.
Sometimes Google gets credit for that.
That was another innovation that Google ran with as well.
But this idea of treating anchor text as part of the text of the document had a huge impact,
because oftentimes the phrase that people used to title or call out a link that they were linking to
could actually have been more important, more indicative of the content and of its importance,
than the text that actually occurs in the document itself.
And so that's kind of like what Alta Vista did.
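[Editor's sketch] The contrast Gary draws between "grep for the web" and TF-IDF ranking can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not how Alta Vista was implemented: the toy documents, the whitespace tokenizer, the function names, and the specific weighting (raw term count times the natural log of total documents over documents containing the term) are all choices made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real engines do far more.
    return text.lower().split()


def grep_match(docs, term):
    """'Grep for the web': ids of documents containing the term, unranked."""
    return [doc_id for doc_id, text in docs.items() if term in tokenize(text)]


def tfidf_rank(docs, query):
    """Rank documents by the summed TF-IDF score of the query terms."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, doc_counts in counts.items():
        score = 0.0
        for term in tokenize(query):
            # Document frequency: how many documents contain the term at all.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0:
                continue
            idf = math.log(n_docs / df)       # rarer term -> larger weight
            score += doc_counts[term] * idf   # more occurrences -> higher score
        scores[doc_id] = score
    hits = [d for d in scores if scores[d] > 0]
    return sorted(hits, key=lambda d: scores[d], reverse=True)


docs = {
    "a": "foo foo foo bar",
    "b": "foo bar bar",
    "c": "foo baz",
    "d": "baz baz",
}
print(grep_match(docs, "foo"))     # every page containing "foo", no ordering by relevance
print(tfidf_rank(docs, "foo bar"))
```

In this toy corpus "bar" appears in fewer documents than "foo", so the page with two occurrences of "bar" outranks the page with three occurrences of "foo". That is the "which is more important, the foo or the bar" question being answered by the IDF term.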
There were other search engine companies like Inktomi and Northern Light,
and then AlltheWeb, and they were doing a lot of text-based stuff.
And then, you know, Google with BackRub, which later became known as PageRank,
I mean, their big innovation was taking that link structure
and trying to make some inferences around what it said about the importance of individual results.
So the general intuition is, the more links that point to you,
the more important the page is.
There's some subtlety to it, because
it's not just a matter of pages pointing to you;
it's important that important pages point to you.
And that creates a recursive definition.
So a page is important if other important pages point to it.
The authority of those incoming.
Yeah.
Well, that's actually, so yes, you can think about it as like a measure of authority,
but that's actually kind of an interesting side note.
There's this brilliant, brilliant computer scientist
and friend of mine named Jon Kleinberg.
I think Jon's still at Cornell.
Jon invented an algorithm called the hubs and authorities algorithm that at that time, in the late 90s, was thought of by many people as like a better version of PageRank.
But it never really had the commercial success of PageRank.
But it explicitly sought to identify which sites are authorities
and which sites are hubs.
And so he came up with a two-way recursive definition.
So an authoritative website is a website that is referred to by many hubs.
And a hub website is a website that refers to many authoritative websites.
So you get this co-recursive definition.
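[Editor's sketch] The co-recursive definition can be turned into a simple iteration: repeatedly set each page's authority score to the sum of the hub scores of pages linking to it, and each page's hub score to the sum of the authority scores of pages it links to, normalizing each round. The code below is a minimal sketch of that idea on a made-up five-page graph; the page names, the iteration count, and the normalization are assumptions for illustration, not details from the conversation.

```python
import math


def hubs_and_authorities(links, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page is a good authority if good hubs point to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # A page is a good hub if it points to good authorities.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded from round to round.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical graph: three directory-style pages pointing at two content sites.
example_links = {
    "portal1": ["siteA", "siteB"],
    "portal2": ["siteA", "siteB"],
    "portal3": ["siteA"],
    "siteA": [],
    "siteB": [],
}
hub_scores, auth_scores = hubs_and_authorities(example_links)
print(max(auth_scores, key=auth_scores.get))  # the page most pointed to by hubs
```

Here the portals end up with hub scores and no authority, while the two sites end up with authority and no hub score, which is the separation of roles the algorithm is after.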
And so this is when, in the late 90s, everyone started, you know, thinking about this link structure.
The research that I was doing in that
vein that got the most attention was something that I called the community algorithm. So I invented
something called the web community algorithm that had a similar recursive feel to it. It defined a
community of pages as having that recursive property that they refer to more pages inside their own
community than outside of their community. So you could kind of recursively define what was a
community, what was a tightly cohesive, you know, group of pages
whose linkage seemed to refer to the same sort of topical theme.
And so a lot of us were doing stuff like that.
You know, my results got a lot of attention, not as much as PageRank and
the hubs and authorities algorithm, but, you know, I'll hang my hat on it.
Just no problem with that.
And I should also say, one thing that's really interesting is that around 1999, Brewster Kahle,
who's best known as the founder of the Internet Archive,
reached out to me, Lee Giles, and Steve Lawrence,
you know, my friends that I was referring to earlier,
and we all decided to organize the first ever,
what did we call it, the first Internet Archive
or Internet Colloquium or something like that.
It was an event that only about 50 people attended,
but, you know, Larry Page was the one person that attended to represent Google.
And Northern Light, that was the biggest search engine in the world at that time.
You know, I think their CEO was there.
There was representation from Alta Vista.
I think Jon Kleinberg was there.
So there's this one event in 1999 where, at least in my mind, the 50 most important people in the web at that point, I think, were present
in one room and arguing.
So it was a great moment for me.
Just real quickly, you know, in the business histories especially, it's sort of like
BackRub comes along and Google comes along and, oh, search is solved.
When BackRub's published, what's the reception among people like the community you're
talking about?
Like, was it like, oh, my God, a thunderbolt out of the sky?
This is incredible.
Or was it just another interesting
idea?
So I have to tell you, my first exposure to it wasn't as an academic, it was as a user. And when
Google came out, you know, back when it was at Stanford, it was just a research project. And, you know,
we were all kind of acclimated to using Alta Vista. Google at that point felt revolutionary.
Because it felt almost like it was reading your mind compared to what we were used to. Because the
introduction of PageRank as a score for how you order these things
suddenly became this powerful way of separating the wheat from the chaff.
And, you know, so you would type in something seemingly generic.
Like, you know, back then, what would be a good example?
If you typed in flowers, you know, you would get these definitive sources, like, you know,
maybe it's FTD.com or something like that.
But, you know, notice that the word flower wasn't necessarily in the title,
but what was really important was finding the sort of definitive answer.
And so Google was a little bit of a lightning bolt, I think.
I think we were all kind of blown away at just how big of an improvement it was.
And then as an academic reading the paper, you know, it's a simple paper, actually.
And the math isn't that difficult.
But there's so much elegance in it.
I think of PageRank as this amazing convergence of engineering, mathematics, and nature, in a weird way.
And what I mean by that is, okay, at the heart of the PageRank calculation is this certain type of mathematical operation that deals with advanced math topics like eigenvectors and eigenvalues.
And if you're familiar with linear algebra,
there's this family of algorithms
that are collectively referred to
as power method iterations,
and PageRank is in fact a power method iteration.
And so I just threw out a whole bunch of gobbledygook.
Let me explain it.
Let me bring it to a point.
What's amazing about it is that a power method technique
like PageRank did not have to actually
work. There was nothing that guaranteed that it would be a practical application. And the reason
why is we understand the mathematics of power method iterations very well. And we know that they have a
convergence rate that is a function of the mathematical properties of the network that you're
applying it on top of. Now, here's the thing. If you were to write an
algorithm that would randomly generate networks, you know, for most of the networks and the adjacency
matrices that you would tease out of it, a power method iteration will not be effective.
It will converge too slowly.
But it turns out that with the sort of networks that emerge in this organically grown thing like the
internet, there's an incentive structure underneath the hood for
pages to kind of refer to other popular things and for linkage to be somehow topically related,
and that actually constrains the space of networks to be something that's a little bit more orderly.
And for that family of networks, we have very good mathematical models for their properties.
And it turns out that a power method iteration like PageRank is wildly successful and converges amazingly fast.
So here's the point.
It's simple, elegant math that could have worked or could not have worked.
But it turns out that there are systemic properties of these sort of organically grown networks, like the Internet and web pages and all that usage and all those different things, that almost encourage PageRank-like algorithms to work really well.
And so: interesting math, an interesting kind of natural property
of these organically grown networks, turning into a killer application.
And so I actually used to use that, in some sense, as my yardstick.
With whatever I was going to work on as a researcher, I really wanted to find something where
there's an interesting story in terms of the math, the engineering in terms of what
I could build, and how it related to the natural world in some way.
And that's what lit me on fire,
and I think lit a lot of other academics on fire: God damn, this thing is so useful.
The beautiful serendipity of that, yeah.
Yeah, yeah.
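[Editor's sketch] For readers who want to see what "PageRank is a power method iteration" looks like in practice, here is a minimal Python sketch of the recursive idea from earlier in the conversation (a page is important if important pages point to it), iterated until the scores stop changing. The four-page graph, the damping factor of 0.85, the tolerance, and the handling of pages with no outgoing links are assumptions for illustration, not a description of Google's production system.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-method style iteration; returns (scores, iterations used)."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform guess
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Every page gets a small baseline share, then shares from its in-links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no out-links spreads its score evenly.
                share = damping * rank[p] / n
                for t in pages:
                    new_rank[t] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # converged: another pass would barely move the scores
            break
    return rank, iteration


# Hypothetical four-page site.
site_links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "blog": ["home"],
}
ranks, iterations_used = pagerank(site_links)
print(max(ranks, key=ranks.get), iterations_used)
```

The number of passes needed before the scores settle is the convergence rate Gary is talking about: it depends on the structure of the link graph, and on this toy graph the loop stops well short of the iteration cap.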
All right, so I'm going to try to get you back in the story here.
But to do that, if you could, again, to whatever degree of detail you want to, and I know you weren't there for this, but for the listener, sort of describe what GoTo.com was and how it got started before it turned into
Overture and then you got there.
Yeah, yeah. So GoTo.com was actually founded, I think, in late
1998 or early 1999, very close to the same period that Google was incorporated
as well. And in some ways, GoTo was like the anti-Google. Not in any philosophical way. It's just
that the direction that they were going in was tackling a set of problems that existed on the
web that were very different than what Google was tackling.
Bill Gross, who's a very well-known person in the, you know, entrepreneurial and big-thinker
community in general today, was the founder of GoTo.com, as he was the founder of, I think,
like 50 other companies. He's one of the most prolific people you'll ever meet.
Right.
And in the late 90s, he has Idealab, which is a classic incubator, sort of.
That's right.
Yeah.
That's right.
And they're just churning out startups.
GoTo is probably the biggest success story that ever came out of Idealab.
But the concept of GoTo is, so it's such an amazing story because it's one of these things where
initially everyone thought it was the stupidest idea you could imagine.
But there was something utterly brilliant in it.
And so what GoTo does in 1999, and remember, at that time, banner ads were the thing.
And I can't remember who did it, but I know the person who
created the very first banner ad.
And also, you know, I know, Mark and Dr.
And then Craig Kanarick and a whole group of people, yeah.
Yeah, yeah.
So, and the very first banner ad, I think it went for like $1,000 per impression or something
really ridiculous.
And then there was a crash in that market,
because people couldn't quite justify the advertising costs without any sort of return on it.
So there is this economic uncertainty as to, okay, how is the internet going to make money, and is advertising part of that story?
So Bill Gross has this brilliant idea that seems stupid to everyone else.
And that idea is we're going to go to people that want to advertise.
We're going to have them bid on a cost-per-click model: if we show
their ad, which is going to look like a search result that is a response to a user query,
how much are they willing to pay for the click?
And we're going to rank order the results that the user sees and, you know,
kind of put that as a layer on top of the normal organic search results.
And it will be ordered rank.
It will be ordered by cost per click.
So it was kind of, sometimes we were referred to as pay-for-play, kind of derisively.
And, you know, it seemed to a lot of people,
myself included, as kind of like this slimy idea.
It's like really you're going to tell me that publishers or content owners are going to have websites.
And instead of just showing like pure organic search results, you're going to layer on this layer of results that's actually advertising but looks like search results.
And everything there is going to be rank ordered by their willingness to pay.
That just seems evil or wrong or slimy or something like that.
Let me do this. Let me be explicit about this.
So, because I'm going to base this off of my experience with GoTo.
So you would go to GoTo.com and you would type in flower.
And the results that you would see, it would look like any other search engine,
but what would turn out is that the top result would be the person that paid the most per click.
And then the second would be the second most, the third would be the third most.
And then if you ran out of people that had bid, then you would get actual search results from some other license.
Yeah, that's right.
That's right.
And so GoTo.com, and the name GoTo.com reflects the initial strategy, that it was Bill's ambition that this would be a destination website on its own. And that people would come and just search. And so they would come here and search when they were seeking to buy something. So if you go and you search for flower on GoTo.com, you want a florist. Whereas if you're looking for gardening results or tips for gardening, you would go to someone else. And that was kind of like the basis. Keep in mind, what this meant is that every single query, as a user
types it in, is actually doing a real-time auction that has to have some sort of account settlement
at the end of the click because user searches, maybe they're searching on flowers, the results
come, they click on one or more of those links. Those advertisers, they had a cost per click that
they were willing to pay. That cost per click gets deducted from their account in real time.
And so what this meant is that the paid search system was in some ways at that time scaling up to do more little financial transactions than, say, all the credit card companies combined.
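As a rough illustration of the mechanics described here, the sketch below ranks listings by cost-per-click bid and deducts the bid from the advertiser's account when a click happens. It is a minimal sketch, not GoTo's actual system: the `Advertiser` fields, the prepaid-balance check, and the sample bids are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Advertiser:
    name: str
    bid_cents: int      # hypothetical: cost per click the advertiser agreed to pay
    balance_cents: int  # hypothetical: prepaid account balance


def rank_listings(advertisers):
    """Order listings by bid, highest first, skipping accounts that cannot cover one click."""
    funded = [a for a in advertisers if a.balance_cents >= a.bid_cents]
    return sorted(funded, key=lambda a: a.bid_cents, reverse=True)


def settle_click(advertiser):
    """Deduct the bid from the advertiser's account at click time and return the charge."""
    if advertiser.balance_cents < advertiser.bid_cents:
        raise ValueError("insufficient balance")
    advertiser.balance_cents -= advertiser.bid_cents
    return advertiser.bid_cents


# Three made-up florists bidding on the query "flower".
bidders = [
    Advertiser("florist_a", bid_cents=100, balance_cents=4000),
    Advertiser("florist_b", bid_cents=150, balance_cents=4000),
    Advertiser("florist_c", bid_cents=60, balance_cents=4000),
]
ranked = rank_listings(bidders)
charged = settle_click(ranked[0])  # the user clicks the top listing
```

Every query runs the ranking step and every click runs the settlement step, which is why the volume of small transactions grew so quickly.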
And so what happened was that the success of GoTo as its own destination site was, you know, hit and miss.
It wasn't quite the case that people would necessarily want to come directly to it.
But where it really, really took off was when, and I, you know, honestly, I don't know who started
pushing the syndication model first. But let me just say, I think, you know, I worked very
closely with Ted Meisel, who was the CEO of Overture, which was what, you know, GoTo would
later change its name to Overture. That change was a change in strategy, not just a change in
name. So they went from being a destination website to being a platform. And as a platform, it meant
that they had three constituents.
One, they were doing deals with the likes of AOL and MSN and Yahoo and Ours and AltaVista
in order to get a deal where when the search query was issued, say, on a Yahoo, the query
would be federated out to Overture.
Overture in 40 milliseconds would send back the top N results back to Yahoo,
and then Yahoo would interleave or give a presentation that showed the Overture paid results on top, say three results there, and the rest of the organic ones below.
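The page assembly on the partner side can be sketched roughly as follows. This is only an illustration under assumptions: the function name, the `("ad", ...)` labels, and the default of three paid slots (taken from the "say three results" remark) are hypothetical, and the real partner integrations are not described in this conversation.

```python
def assemble_page(paid_results, organic_results, paid_slots=3):
    """Put up to `paid_slots` syndicated paid listings on top, organic results below."""
    top = [("ad", listing) for listing in paid_results[:paid_slots]]
    rest = [("organic", listing) for listing in organic_results]
    return top + rest


# Made-up results for the query "flowers" on a partner site.
paid = ["florist_b", "florist_a", "florist_c", "florist_d"]  # already ranked by bid
organic = ["gardening tips", "flower encyclopedia"]
page = assemble_page(paid, organic)
```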
You know what?
So let me – sorry.
Okay.
So you went to Yahoo.
You typed in flowers.
You would still get the results that Yahoo traditionally gave you.
It's just that now, because of this partnership with Overture, the top three, and they would probably be labeled ads, would be coming in from Overture in this sort of partnership.
That's right.
And so the beautiful thing about this, I said before everyone thought it was stupid, and then it turned out that it's absolutely like one of the most brilliant ideas ever on the internet.
The reason why it was so brilliant is that there are three constituents here, if you think, actually, maybe four.
Okay.
So there's, well, there's Overture itself, which is the platform, so let's leave them off.
But there are the destination websites like the Yahoos, the MSNs, the AOLs of the world.
Number one, there are the advertisers that are willing to pay to have their stuff shown.
And then there's the end users.
And so in economic theory, there's this concept that's sometimes referred to as incentive alignment.
And so in economics, it's a good thing when there is a choice, when many independent actors can make a choice and kind of converge on a shared outcome.
And if their incentives are aligned in the sense that if you make a choice, it's good for everyone rather than
good for only one of them, you end up with really interesting outcomes because it almost
has like a non-zero-sum dynamic. It's win-win-win. And so let me expand on that for a little bit
what that means. So in the past for the user, if they were searching for flower, maybe they had
that commercial intent, maybe they had an informational intent. But now with the web page being
split so that there was commercial stuff on top and non-commercial stuff below, it now suddenly
got a lot easier for a user to quickly say, oh, I want to buy, boom, it's up top. Oh, no, I want to
research. Boom. It's below. Okay. So that actually improved the user experience from a
relevance perspective in a way that no one had really anticipated. And at Overture, we had done
tons of studies that showed that people really did like the Overture results, especially when
they had a commercial intent. So good for the users, right? Number one. Number two, the advertisers
absolutely loved it because it gave the highest quality pairings that any of them had ever seen
before. And we had also done lots of analysis to figure out what the cost was to acquire
a new customer, comparing paid search to yellow pages, to advertising on radio, to advertising
on television, to doing an internet banner ad. And it was shocking how efficient paid search was.
Paid search was in many cases 10 to 100 times cheaper for acquiring a customer than any of the other channels.
I so rarely do this.
I'm going to interject a personal experience here.
I so rarely do this, but I think it's relevant.
1999, I just finished college, and I'm starting a little thing that will eventually evolve into my first company,
editing college essays and term papers and things like that.
And I throw $40 on to GoTo.
I know it was GoTo.
It was not Overture yet.
I remember this super clearly.
I throw $40 on, not much, but I can afford to lose $40.
And I got $80 back in 24 hours.
And I said to my, I remember thinking, I can do this all day every day.
If every $40 I throw down gets me $80.
Yeah, yeah.
It was magical.
And I have to tell you, that effect wasn't just for advertisers that benefit.
this is actually kind of like a, I want to complete sort of the story of how all these different constituents related and the incentive alignment piece.
But in a similar way, that observation that you just made, there was a variation of that that literally changed how I would think about how to run a research lab.
And because of that property that you just mentioned, but I'll circle back to that.
Yeah.
So anyhow, the third constituency here is the owners of
the destination sites like Yahoo and AOL or whatever.
And you have to remember, these were companies that were, you know, in some way they were
flying high because everyone wanted a piece of the action, but they weren't making any money.
Okay.
They were losing money.
And suddenly, Overture comes along and is minting money.
And they're not just minting money for themselves.
They're minting money for everyone else.
Because, so let's say someone searches on flower,
and the advertiser is willing to pay a dollar per click for that click-through.
So, Overture gets the buck after the user clicks on it.
They would tend to pay somewhere in the early days, like 70 cents on the dollar.
It would later get into the 80 cents on the dollar.
So they'd pay 80 cents or so to a Yahoo.
And they'd keep 20 cents for themselves, right?
And then the user would end up with something really good as well.
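The split described here is simple arithmetic, sketched below. The 70% and 80% partner shares come from the figures mentioned above; the function name and the use of integer cents are assumptions made for the example.

```python
def split_click_revenue(cpc_cents, partner_share_pct):
    """Split one paid click between the destination site and the platform."""
    partner_cents = cpc_cents * partner_share_pct // 100
    platform_cents = cpc_cents - partner_cents
    return partner_cents, platform_cents


# A one-dollar click under the early (70%) and later (80%) partner shares.
early_partner, early_platform = split_click_revenue(100, 70)
later_partner, later_platform = split_click_revenue(100, 80)
```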
And so when Yahoo became profitable, it was entirely because of their relationship with Overture and what Overture was doing for their search economics.
And so that was amazing to kind of witness this sort of shift in the advertising industry and the internet and everything else.
And to kind of complete the circle and go back to the point that I was making about how I would do research.
Well, when you view this economic system that has these three constituents, and queries are coming in on one end, and clicks and dollars are coming out on the other, it's now kind of like this living, breathing algorithm that's running in the wild.
And to kind of think about this now, it was 2004 that Yahoo acquired Overture.
I never even said what my role was at Overture.
So in 2002, I joined Overture as their first
and only chief science officer.
And you were recruited?
Yeah.
Oh, yeah, yeah.
So I had been living the fat academic life of sorts at the NEC Research Institute.
I was doing some of my web scale data mining research, also some machine learning stuff.
I was very big in support vector machines at that time.
And the role that I had at NEC was the cushiest role you could imagine.
I had a big research budget.
I could do whatever I wanted to do with it.
And I was allowed to occasionally lecture at Princeton.
I had two really awesome PhD students that year.
You know, so it was like the best of all worlds.
I was an academic, but I didn't have to pay the academic price of living this publish-or-perish track or, you know, because I had tenure also at NEC.
So it was like this, this ideal academic existence for a while.
And so my research was getting some attention,
you know, like mainstream press and others were talking about it.
But Overture comes knocking on my door.
And, you know, to kind of put it in perspective, in 2002 I was living on a tree farm with my wife and had this, you know, this idyllic academic life.
And I thought that I had my, I thought that I was living in my forever home and working in my forever job.
I had no intention of moving.
And when I learned about Overture, and I spent days
and days out there.
You know, the interview loop with Overture was no less than three interviews,
and the last one was three days long.
It was what I learned, I felt like that it was, my gosh,
I'm like, it's like Gutenberg walked up to me and said,
hey, I have this idea for this thing I call a printing press with movable type,
and would you like to help me work on it, right?
It was, I felt like I was invited to participate in something that was going to be
historic. And so I sold everything. My wife and I, we literally loaded up in a one-way RV rental. We loaded our two
dogs. We didn't have kids at the time. And we drove cross-country because we didn't want to fly our dogs.
And we moved out to Pasadena. I lived in a, you know, a rental place in downtown Pasadena that
was like 500 square feet with my wife and two giant dogs. And I worked in an office that there was, my
research team had about a dozen people at that time. And we were all
crammed in one room that may have been maybe, you know, like 150 square feet. And I had a desk
that was made out of the stereotypical, you know, door on top of like filing cabinets. And I
shared that desk with someone else. And we didn't even have air conditioning in the room. So it was,
it was like this crazy startup environment, even though it was already a public company. And so my
challenge, as I go from this academic environment to working a corporate environment, now I'm like
an executive, and that's something that I never thought I was going to be doing. My job was to figure
out how to take that living, breathing, organic ecosystem of paid search and make it better, right?
And up until that point, and I want to give you a little bit of context. Now, I've worked at that
point for university research labs, government research labs, corporate research labs, and I had seen
the rise and fall of research labs where they have that pendulum swinging from pure research to
applied. And that pendulum swinging oftentimes resulted in entire research labs collapsing. So I would,
I mean, maybe not collapse in the sense that it's a complete failure, but it could be a
collapse where like the best talent leaves. And so I saw that happen at Siemens. I saw that happen at the
NEC Research Institute. So I had very, very strong ideas around what made research labs fail. And part of
that had to do with the funding models for how they justified their existence.
And when the pendulum would swing towards, okay, you got to show some value for your work, people, researchers would spend so much time trying to take their research and turn it into an application.
And oftentimes they weren't very good at doing that, that they would fail on both producing an application and in producing novel research.
And so it was almost like doomed to fail.
And so when I came to Overture, here was this living, breathing ecosystem.
And the first research project that we had lined up, which I didn't pick, I kind of inherited, was we need to introduce spell correction.
And I know that sounds like so trivial, okay?
But spell correction in a way where the spell corrector you're building is derived from data was kind of a new thing at that time.
And here was the challenge.
We had committed to the whole world that we were only going to do exact match search, okay, at Overture.
And the reason why we said we're only going to do an exact match search is because the advertisers were worried that if we didn't do exact match, that we would be scamming them.
And so to be clear, what this means is an advertiser comes and they're placing a bid.
They're bidding on the word flowers.
They're not bidding on roses.
They're not bidding on flowers, Los Angeles.
They're bidding on flowers.
And they only wanted things that matched exactly flowers.
Now, the challenge is is that, okay, flowers, that's easy, right?
But, you know, back in that time, Britney Spears was, you know, kind of like on top of the pop charts.
And there were on the order of like two dozen ways that people spelled Britney Spears.
Only one of them was correct.
But the whole world thought there were at least 24 different ways that you spelled Britney Spears.
And for the advertiser, that meant that they were only getting one of those spellings, the correct one.
But they missed out on all the incorrect ones.
So a compromise that we kind of bridge with our advertisers is like, look, okay, how about
we go beyond exact match, but we do things like spell correction.
So you're getting all the relevance that you wanted before, but we're just correcting
for these inefficiencies.
And so the technique that we used, we had a, I love saying these names because it's so
neat to see where these people went.
So there was a PhD student that was part of my research team.
His name is John Carnahan.
He went on to become like the chief
data scientist or the chief data officer, I think, of News Corp or, you know, like a major, you know, major company.
But John and I and a bunch of other people were sitting down in this room.
And John's PhD work, he was working on a technique to do approximate matching in long discrete sequences for bioinformatics.
Okay.
So he has this technique that he's been working on for how do you see if two different strings of DNA are
close enough, right? And this technique that he came up with was a weighted edit distance. And so what
that means is that you say that things are similar if the number of, like, inverting the characters
or deleting one or inserting one, if you count the number of edits that it takes to make
one thing match another, that's the edit distance. Now what John observed in his research is that
certain edits are more common than others. And so we should weight them differently when we try to decide
whether two things are similar or not. So, for example, because of the layout of the keyboard,
and going back to Britney Spears, it might be more common for people to accidentally type a T instead of an
R because T is next to R on the keyboard. However, if they put in a vowel, an O, instead, which is
way over on the other side of the keyboard and doesn't sound anything like that, then even though
from an edit distance perspective those are the same edit distance,
we would conclude that T is much more likely to have been a mistake for R. You know,
B-T is closer to B-R than B-O is to B-R,
you know, if you're thinking about the initial characters of Britney.
Right.
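The idea of a weighted edit distance can be sketched as below. This is not John Carnahan's actual technique or Overture's production spell corrector. It is a textbook dynamic-programming edit distance where the substitution cost is hand-set: 0.5 for keys that sit side by side on a QWERTY row and 1.0 otherwise. Those weights, the keyboard table, and the function names are assumptions for illustration; per the conversation, the real weights were learned from a corpus of misspellings.

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def adjacent_keys(a, b):
    """True if a and b sit side by side on the same QWERTY row (simplified layout)."""
    for row in QWERTY_ROWS:
        if a in row and b in row:
            return abs(row.index(a) - row.index(b)) == 1
    return False


def substitution_cost(a, b):
    """Assumed weights: neighbouring-key slips are cheaper than other substitutions."""
    if a == b:
        return 0.0
    return 0.5 if adjacent_keys(a, b) else 1.0


def weighted_edit_distance(source, target, indel_cost=1.0):
    """Edit distance with per-character substitution weights (insert/delete cost fixed)."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost
    for j in range(1, cols):
        dist[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,      # delete from source
                dist[i][j - 1] + indel_cost,      # insert into source
                dist[i - 1][j - 1] + substitution_cost(source[i - 1], target[j - 1]),
            )
    return dist[-1][-1]


near_miss = weighted_edit_distance("btitney", "britney")  # T typed for R (adjacent keys)
far_miss = weighted_edit_distance("boitney", "britney")   # O typed for R (far apart)
```

Under plain edit distance both misspellings are one edit away from "britney"; with the assumed weights the T-for-R slip scores as closer than the O-for-R one.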
And so John came up with this technique that he had developed for bioinformatics,
but he scanned the entire corpus of, you know, misspellings
for things that we had, to learn what the proper weightings were for different
types of misspellings, and came up with this technique for correcting. You know, yes,
we could identify that all those different misspellings of Britney should in fact be mapped to the
one canonical one. And that seems like such a simple innovation, but, you know,
at this time, keep in mind, Overture is handling 60% of the search queries of the world
because they're partnered with all the big guys at this point.
We were the big boys.
In like a 2002-2004 time frame, we were the big boys.
We had 60% of the query traffic of the whole world going into our system.
And so we would have to do that complete match and real-time auction.
And about 50 milliseconds was our limit.
And we had to return the results in 50 milliseconds.
So that meant that our budget for doing the spell corrector was 5 milliseconds.
when you piece it all together.
So going back to the research and everything like that,
so we had this whole project, whatever,
we inserted this spell corrector into the pipeline.
It's hitting 60% of the world's traffic.
And in one year, I think we calculated it was going to make, you know,
at least $20 million, I think.
Okay, so this is like, you know, this is around 2002 or something like that.
I might be, God, I hope I didn't drop a zero.
Yeah, I think it was going to,
it was going to add $200 million, no, $20 million in revenue to the Overture bottom line.
And that was like, that was almost like a five or 10% lift in the revenue.
You know, so this was, so going back to the whole research thing, I, it used to be, if you
worked in a research lab, you'd have to wait five years for your research to finally make it
into a product that you can point to and say, yeah, I made the world a better place.
And here what we did was we shipped an algorithm.
And in 24 hours, we're estimating, holy crap, we're going to make $20 million for the company.
And so when I had my conversations around what my research budget should be, for the first time ever,
you know, and at this point, I've been, you know, I've been doing research for like 15 years at this point.
I could actually have a very, I could be standing on firm ground when I was justifying the expense for having a research lab.
So that completely, I think what a lot of people don't realize is that the data-driven nature of the Internet has fundamentally changed how we think about executing a research agenda within a company because the ROI now is so much shorter term and clearer and more transparent in terms of figuring out whether the research actually paid off or not.
And so this was kind of like, you know, what I was thinking about in this 2002 to 2004 time frame.
But we were, we were slowly innovating on top of what we had already built.
And Yahoo would come to acquire us in 2004.
But then there was this juggernaut called Google.
All right.
Let me slow you get our heels.
Let me slow you down for a second.
I'm going to, for the context, I think we should underline.
This is after the bubble bursts.
And a lot of people are not making money.
You know, people like Amazon, Yahoo are down to $5 stocks,
and people are wondering if they're going to last.
And so there's this window of time where Overture is basically the only guys left,
Overture and eBay maybe, that are still making money.
We were on track to make, I think around the time frame we're making about $800 million in revenue.
We had about $200 million in the bank.
And we saw that there was a war emerging.
So what happened was, and I have to, and I'm going to make a little kind of meta aside here.
I can keep talking about this for a long time and I don't have a hard stop.
So I would love to tell the complete story.
Yeah.
Rather than artificially kind of time bound this.
Yeah.
What's your hard stop, by the way?
I don't have one.
Oh, you don't.
Okay.
Okay.
So anyhow, so keep in mind what a lot of people don't realize.
So, you know, Google is not a public company at this time, but they are a company.
And they're making maybe on the order of like $20 million in revenue a year.
And they're making it on their search appliance.
They were selling a piece of hardware that had a search appliance built into it.
And that was, you know, in that kind of early 2000, I'm thinking now closer around 2000, 2001.
Their search appliance at that point may have been their biggest moneymaker.
Overture is killing it.
Overture is making more money.
Overture is serving more queries, has more advertising partners, has more destination sites, has better monetization, better cost per click, everything, killing it in every way.
But here's the thing that I don't think a lot of people understood.
And I want to make a little aside.
There's this concept called the Innovator's Dilemma that was introduced by Clay Christensen.
Christensen, yes.
I think he's a, is he still at Harvard or Cornell?
Cornell, I think he's at Cornell.
Anyhow, Clay Christensen coined the phrase innovator's dilemma.
And it's a beautiful concept that has been used to study economic cycles and trends and winners and losers and things like that going back a long time.
And what a lot of people don't realize is that Overture had in some ways painted itself into a corner in a manner that was a nearly
perfect example of the innovator's dilemma. So the innovator's dilemma, in the simplest terms,
is that the first in an industry, whatever it is, will tend to focus on a small number of big
customers that have a lot of money. And they try to solve the problems of the big customers with a lot
of money first, because that's where the easy money is, right? I mean, it's like you do one deal and that
deal is big, right? And because the innovators, the first in the industry, are focusing on the
smaller number of big customers, what they ignore are the larger number of small potential
customers. And so the theory of the innovator's dilemma then says, well, once the incumbent
builds out and proves out that something could, you know, is that a marketplace is valuable,
a newcomer will come on the scene and they want to compete.
Now, the newcomer can say, I want to compete head to head and try to beat them at their own game.
But that would be stupid because the innovator already, you know, the first in the market already has that advantage.
So the newcomer says, you know what, I'm going to attack.
I'm going to go after the customers that they're ignoring.
The smaller, I'm sorry, the larger number of smaller customers with less money.
And so if they tackle that, then they're going to have to actually learn to be more
efficient in what they do because they have to deal with more customers.
Each sale or transaction is smaller in size.
The margins, therefore, may not be as good.
And so they've got to make up for that with those economic inefficiencies with being
more internally efficient for how they do it.
And then as you play this out, those newcomers, the efficiencies that they learn from having
to work in that manner eventually allow them to then compete head-to-head
with the first original market maker that had been in that market
and eventually potentially eat their lunch.
And so the conclusion of the innovator's dilemma is that if you want to survive long term,
you have to be willing in some ways to act like the newcomer and destroy your own business.
And so the classic example of this is like initially there were supercomputers and mainframes
and then there were scientific workstations and then there were expensive PCs
and then there were cheap PCs and then there's cell phones and then there's commodity cell phones.
And each time that these generations would happen, we'd almost see an erosion of the market strength of the original innovators that had been focusing on the bigger things.
So no one talks about Cray computers anymore.
No one talks about Sun computers anymore.
Not a lot of people are even talking about Dell as much anymore.
And a lot of the power in the ecosystem is like where cell phones, you know, everything, it kind of goes downhill where the strength and where the leverage is.
And that's the innovator's dilemma.
Now, Overture was an innovator.
And they were an innovator in the sense that they had a business idea that everyone thought was stupid.
And so they made a small number of very critically important decisions in order to under, I'm sorry, to overcome that
uncertainty. So I mentioned one of them already. They did exact matching on the query terms. They could
have actually done fuzzy matching right from the get-go, approximate matching, phrasal matching,
string, you know, fuzzy matching from the get-go, but the advertisers would have never signed up
because of their uncertainty as to the model. They also could have gone very broad and said,
we want to make this appear on every website in the world, you know, going out, you know, thinking
about the distribution of different destination websites, but instead they went to AOL and Yahoo and
MSN, the big boys.
They could have gone to every advertiser in the world and had a self-serve model, and they didn't.
Instead, they went to advertising agencies and big advertisers.
They could have also said, well, from a quality perspective, we're going to be somewhat
accepting of things of different quality from a, from a matching perspective.
But what they also did was they put in place an editorial staff that was over 100 people
that would have to editorially approve before a paid search listing ever went live,
that that listing was editorially relevant to that keyword.
And until a human checked off the box and said, yes, we conclude that
that is editorially relevant and high quality, the ad never went live.
So in a very real way, they chose like a newcomer in any new industry to focus on the head of the
distribution and they completely ignore the long tail.
And they did that.
As opposed to Google and AdWords, which...
That's right.
And that's right.
And so what a lot of people don't realize is that what Google did, so Google went after the
same model, it was a model that Overture had proven, they didn't have to overcome
that disbelief that the first generation of paid advertisers and other partners had to overcome.
And so they could change the model in ways that Overture couldn't change the model.
So Google, instead of doing exact match, right away, right from the get-go, they did fuzzy
matching.
Instead of having human editors in the loop to approve them, right away, they had click-through
rates as a proxy for editorial approval. And so they did this click-through rate-based ordering. And if you had a low
click-through rate, they drop you. So that was in some sense automating editorial to the long tail.
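A minimal sketch of click-through-rate-based ordering with a drop rule is shown below. It illustrates the idea only, and is not Google's actual ranking: the `min_ctr` and `min_impressions` thresholds, the dictionary layout, and the sample numbers are all hypothetical.

```python
def click_through_rate(clicks, impressions):
    """Fraction of impressions that led to a click."""
    return clicks / impressions if impressions else 0.0


def rank_by_ctr(ads, min_ctr=0.005, min_impressions=1000):
    """Order ads by CTR, dropping ads that had enough impressions but a low CTR."""
    kept = []
    for ad in ads:
        rate = click_through_rate(ad["clicks"], ad["impressions"])
        if ad["impressions"] >= min_impressions and rate < min_ctr:
            continue  # low click-through rate: treated as not relevant and dropped
        kept.append((rate, ad["name"]))
    kept.sort(reverse=True)
    return [name for _, name in kept]


# Made-up performance data for three ads on one keyword.
ads = [
    {"name": "ad_a", "clicks": 50, "impressions": 1000},
    {"name": "ad_b", "clicks": 2, "impressions": 1000},
    {"name": "ad_c", "clicks": 10, "impressions": 1000},
]
ordering = rank_by_ctr(ads)
```

Here user clicks stand in for the human editor's checkbox: an ad that nobody clicks is removed without anyone reviewing it.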
Right away, instead of going right away to big advertisers, they introduced a self-serve system
where an advertiser could just go on to the website and manage the whole thing and would never
have to talk to a person. And then they created their contextual advertising system that turned
potentially any blogger into a Google AdWords partner.
And so it wasn't just like the AOLs and the Yahoos and the MSNs of the world.
It could be, you know, your, you know, Joe's Fish and Tackle, you know, and doing contextual match on that or something like that.
And so on those four dimensions, they went to the long tail, whereas Overture had kind of painted itself into a corner with working on the head of the distribution.
And so we saw this coming.
It was kind of like a slow motion train wreck.
And because around 2002, 2004, we saw that Google was, you know, we were out-monetizing them.
But they had another trick up their sleeve, which was they were able to subsidize the business deals basically through the revenue that they were generating on their own website.
Now, they had Google.com.
They're running AdWords on Google.com.
They're also, you know, they went to AOL and they convinced AOL that you should switch over from Overture to Google AdWords.
And one of the things that made that deal work the very first time, it wasn't, they weren't able to compete with us.
We actually just based on the economics, we were absolutely certain at the time that we had better economics.
And just based on the advertising system itself, we could actually give AOL more
money than Google could. But what Google did that we couldn't do was that they took the revenue that was being generated from Google.com and in some ways they shared it with their partners in order to sweeten the deal. And we didn't have a destination site anymore.
Okay. I want to underline that because that seems so glaringly obvious to me. Because you even said early on that Bill Gross wanted GoTo, originally GoTo.com, to be this destination site in terms of shopping,
in terms of intent.
That never worked out, but then stumbled on to this.
We had to switch to a platform play of being like a platform.
That's right.
And so the problem is, is from a business model perspective, you're dependent on your
partners because if you lose your Yahoo deal, you lose your AOL deal, you don't have,
you don't have anything to back up.
You would have lost distribution.
That's exactly right.
And you know what's really interesting, this exact same pattern has played out later with
Microsoft and Apple.
Okay, think about it this way.
It's the exact same thing.
Microsoft chose not to sell predominantly to consumers,
but to base their partner network on building bridges to hardware manufacturers,
to IT departments, to, you know, I said hardware,
but that big OEM ecosystem, the software developers,
but not so much to end consumers.
And they made it a point for
a very long time to never manufacture their own hardware, because to manufacture their own hardware
would be to compete with their own partners, right? And it wasn't until Apple had that complete
integration of the whole stack, where that vertical integration showed its value,
and Apple seemingly coming to eat Microsoft's lunch because of the whole integrated experience
that they were able to produce, that Microsoft finally said, you know what, we're going to
start manufacturing hardware as well.
So Microsoft resisted manufacturing hardware because they didn't want to be in the consumer business directly.
And they didn't want to compete with their own partner network.
And we were in the same boat.
We thought for a long time that if we had our own destination site, we would be competing with our own partners.
But Google had their own destination site and started beating us on deals by virtue of having it.
So this is when we made the mad scramble to try to acquire a destination site.
So my first day on the job at Overture, in 2002, I actually spent on the ground in Trondheim, Norway.
And that's because that's where the main technical team was for a website and a company.
So the website was known as AlltheWeb, and the company was known as Fast Search & Transfer, or FAST.
And so I was making a world tour with the purpose of acquiring a search engine.
And we were talking to everyone at that time: Direct Hit, Inktomi, AltaVista, AlltheWeb.
Then there were others that I can't remember.
And ultimately, and this is a really complicated story that I think we should skip.
Okay.
But ultimately what happened was, Inktomi was our first choice.
But for purely business and finance reasons, there were some things that kind of became a deal breaker.
And Yahoo ended up acquiring Inktomi, and so we acquired AltaVista and AlltheWeb and announced those acquisitions in the same week.
Either on the same day or in the same week.
And so it was kind of a confusing thing for a lot of people to witness.
Why are they acquiring two search engines and announcing it at once?
And so we were, with some urgency, trying to figure out how we could revitalize AltaVista and AlltheWeb so that we could actually have our own source of steady revenue that was independent of a partner and use that to sweeten deals.
And the thesis that we had was that AltaVista had a brand.
AlltheWeb had technology that we liked better.
We'd smash them together and that would work out.
But we never really got a chance to execute on that strategy because then, in the same year, Yahoo announced that it was acquiring Overture.
And I think the deal closed after that in 2005.
Right.
So, yeah.
So, to kind of wrap a bow around that whole thing.
We were scrambling to try to create a destination site, but in the middle of that plan, that's when Yahoo acquired us.
For me, I would go on and move up to Sunnyvale, and I founded Yahoo Research Labs and was the principal scientist of Yahoo for a while and the head of corporate R&D for Yahoo.
And during that time, now that Yahoo had Overture, that's when they wanted to take a stab at reinventing the whole platform from the bottom up. And so it was this 2005-and-beyond time frame that the project now known as Panama got underway at Yahoo, which is an attempt to, because people forget this, Yahoo never had its own search. It started out as a directory. It licensed search from other people. That's right. And so now when it sees that Google with AdWords and AdSense has tied this all together, and Overture.
And it's a juggernaut now.
And it seems like, you know, in their embracing of the long tail, they cracked the code on the economics.
And so the idea is, you're bringing in Overture.
They also acquire Inktomi.
And they're going to try to tie these together to do this amazing economic genre. Yeah. Yeah. And the holy grail for them was to be able to have an answer to the entire addressable marketplace on all those dimensions that I outlined. So they wanted to have really killer exact match, but also really killer fuzzy match. They wanted to be able to have self-serve for the long tail of advertisers, but also really, you know, premium branded relationships that would be managed through the, you know, the whole Yahoo network.
They wanted to be able to be a syndication partner and run this whole syndication thing for other parties like Microsoft, which they did at the time.
And so they were trying to kind of rebuild the whole thing from scratch.
And, you know, I think it'd be fun to get in a room sometime with some of my colleagues from Yahoo and kind of debate the pros and the cons. In some ways, I think the Panama effort was too ambitious.
And I don't want to say it was set up to fail, but it was the dynamics of the situation.
I don't think anyone made a choice that you could point to and say, yes, that set up Panama irreversibly on a path to fail.
But part of what was going on is that, okay, you had Overture, and Overture had acquired AltaVista and AlltheWeb.
You had Yahoo, and Yahoo had acquired Inktomi.
Okay, so you now have basically five different groups of people that are all supposed to get in the same room and collectively work towards throwing away everything that they had ever built and rebuilding a single unified platform.
Well, and you had a preexisting culture of Yahoo's only a few years out from having been the king of the web, you know, in the late 90s.
So all of a sudden, you're coming in there to save and revolutionize their business and there's got to be residual culture that's like, yeah.
Yeah, yeah, and that was part of it.
And I want to say, you know, all of my Yahoo colleagues that I worked with at that time are people that I keep in touch with to this day.
And I have love and affection and they're awesome people and I've learned so much from them.
But at the same time, I don't think that there was as much appreciation from the Yahoo leadership for the complexity of what Overture had built. And they looked at it through the lens that they were familiar with, and from their perspective, it looked to them like they could easily tear it apart, rebuild it, and do a better job.
And what they hadn't really kind of understood was, number one, how technically difficult some of the problems were that we had solved at Overture.
Number two, how difficult it was actually to build out the ecosystem, because you have to think about it this way.
When you have a company like Overture or even a company like Microsoft where you have these two-sided network effects, it's a living, breathing ecosystem.
And you have a cold start problem in terms of how you build it and how you grow it.
And these systems do not work if you hit the pause button.
Because if you hit the pause button, the ecosystem kind of starts to collapse out from underneath you.
And so what I don't think everyone appreciated was that when we hit a partial pause on the development of Overture and putting resources in Overture, and instead shifted everything over to this thing called Panama, which was going to take a couple of years to build out, that allowed Google to perfect everything that they were working on and basically eat Yahoo's lunch.
And so there was a time where it just seemed like Google, in terms of their economics, got so good at matching, so good at self-serve, so good at syndication. Whereas in the past I had said that our economics were better than theirs, and by that I mean we had better cost per click, better total revenue, and we were able to give better revenue share to our partners. And I had said that in some ways Google was, you know, being kind of tricky in that they were subsidizing their deals by throwing in their own destination site revenue in order to make the economics add up.
They eventually reached a point where they were legitimately outperforming us economically, and then they no longer had to subsidize the deals, strictly speaking.
And also, and I'm sorry, I'm going to have to interject again, personal recollections here.
But from an outsider, it was always Panama's coming, Panama's coming.
It's going to be as good as Google or better.
But because I'm running businesses that are dependent on paid search at this point, I don't think we were able to actually use Panama until like 2007 or something.
And at that point, is it a question of, well, Google's wrapped up the market share at that point.
Yeah, yeah, Google owns it.
That's right.
That's right.
And, you know, so I was part of the original Panama team.
And, you know, this probably says more about me than it does about anyone else.
But keep in mind, I had seen the signs of what I thought was, like, the smell of death, if you will.
And I thought I could detect that Panama was going to fail.
Even from the earliest days, I thought it was going to fail.
Because, you know, it wasn't easy to move it forward.
There was a lot of disagreement on all sides as to what it even was and what the goals were.
And keep in mind, there were so many smart people working at Yahoo at the time.
One player that people forget about, ironically enough, because he had such a big impact on the world after his Yahoo days, was Jeff Weiner.
Okay, Jeff Weiner was the head of the product side and the business side of Yahoo search.
And Jeff was a key stakeholder at that point as well.
But anyhow, 2005, I wasn't planning on leaving, but I had a good working relationship with Microsoft for years because they were an Overture and a Yahoo partner.
And so, you know, I've been talking to Microsoft execs for, you know, for what seemed like forever.
And they had been trying to recruit me for a long time.
But it was in 2005 that the combination of the carrot that Microsoft was kind of dangling in front of me and the implicit stick that I was feeling at Yahoo, because I felt like there was no way that that effort was going to succeed, that's when I personally made the leap and said, you know what? I'm going to move up to the Redmond area and join Microsoft.
So, you know, from 2005 to 2007, I was a spectator on Panama, not in the trenches with everyone else.
And so the stuff that happened afterwards, I can kind of only speculate.
Before we do leave Yahoo, this is something that maybe you'll have to speculate on if you're willing, because obviously you would not have been privy to this sort of thing.
But I have to point out, because I feel like everyone looking back on this, and history looking back on this, is screaming, wait a minute.
So Google basically becomes Google by not stealing, but by copying.
No, no, they literally copied.
Right.
And there was actually, there were lawsuits.
Yes.
So that's what I want to say.
Yeah, there was, like, what do you say to history about, like, eventually, I got to point out that Google and Yahoo settle that lawsuit.
Yahoo gets a ton.
Like, I think they eventually sell it for a billion and a half dollars of Google stock.
Yeah, it was like $1.5 billion.
But in that kind of Clay Christensen sort of model, like, what do you say to history? How could Yahoo have just given away this amazing business model that Google just runs with?
Yeah.
And, you know, I don't want to pin the blame on the Yahoo execs at this time, because I was a Yahoo exec at that time, but I don't think anyone really fully understood how time was of the essence.
And, you know, the innovator's dilemma that I spoke of before, about things happening in the hardware industry, that stuff historically takes decades to play out.
This shift in the search industry, the paid search industry, that we just described happened in like two years.
Okay.
So we got to see a complete, full-on, you know, gosh, I'm trying to come up with the right metaphor, but it's almost like, not a morality play, but it was like this triumph and tragedy.
Play out.
Triumph to tragedy in two years.
You know, because again, 2004, Overture owns the paid search industry.
2005, Yahoo hit the pause button.
They acquired Overture and everything's kind of stagnating.
2006, Google owns paid search.
And at that point, they had escape velocity.
And what I mean by escape velocity is that they could pay destination sites so much more for their search traffic than anyone else could pay that that created this virtuous cycle.
The distribution syndication partners would come to them.
The advertisers more than anything else wanted traffic and volume of traffic.
And the reason why is that paid search was so much cheaper relative to every other form of advertising that the advertisers weren't really particularly price sensitive.
Okay.
So if they could only work with one platform, they wanted to go with the one that had the most traffic.
So then you get more advertisers there.
With more advertisers, you get more bidders, greater bid density.
So you get higher cost per click.
And with higher cost per click, you get improved economics by which to get more syndication and distribution partners.
So it created this virtuous cycle where it became winner take all.
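The link in that cycle between more bidders and higher cost per click can be sketched with a toy simulation. This is only an illustration under loud assumptions: a single ad slot, a second-price rule (the winner pays the runner-up's bid), and bids drawn uniformly at random. The conversation does not say which pricing rule either company used, and the function names here are hypothetical.

```python
import random

def second_price_cpc(bids):
    """Clearing price in a single-slot second-price auction (assumed rule):
    the winner pays the second-highest bid, or 0.0 if there is no rival."""
    ranked = sorted(bids, reverse=True)
    return ranked[1] if len(ranked) > 1 else 0.0

def average_cpc(num_bidders, trials=20000, seed=0):
    """Average clearing price over many simulated auctions.
    Each bid is drawn uniformly from [0, 1), purely for illustration."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        bids = [rng.random() for _ in range(num_bidders)]
        total += second_price_cpc(bids)
    return total / trials

if __name__ == "__main__":
    # More bidders on the same keyword means greater bid density,
    # which pushes the average price per click upward.
    for n in (2, 5, 20):
        print(n, "bidders -> average CPC", round(average_cpc(n), 3))
```

Under these assumptions the average price rises with the number of bidders, which is the "more bidders, greater bid density, higher cost per click" step of the cycle; the other steps (traffic attracting advertisers, revenue attracting partners) are not modeled.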
And literally, Yahoo blinked.
That's what happened.
Yahoo blinked.
Okay, just looking out for the listeners here, because we're approaching an hour and a half, and I want to ask you what you're up to these days, I'm going to slightly yada yada. I'm going to say, after this, like you said, you go to Microsoft Live Labs.
Yeah, yeah, so I founded Microsoft Live Labs. Live Labs was a really kick-ass kind of part research institute, part, you know, startup spin-out.
We produced a ton of really amazing technologies and product innovations.
I'll call out some of them.
Seadragon, as kind of like a data visualization framework, which became Deep Zoom in Silverlight.
Photosynth, which a lot of people may have heard of: when Barack Obama was inaugurated the first time, CNN literally crowdsourced creating a 3D environment of the inauguration when the moment happened, and they called it The Moment. And also, you know, Photosynth was featured on a CSI crime show because it was used to solve a crime.
The last project, and in fact the one I'm most proud of, is one called Pivot. Pivot was a big, bold, ambitious take on data and content visualization and interaction and how you would merge the modes of searching and browsing and discovery.
And there's a great TED Talk if you want to see him describe what that is.
Oh, thank you.
Yeah, I think I was on the main stage at TED, maybe in 2009, 2010.
So, yeah, so I was a technical fellow at Microsoft and I founded and ran Live Labs and had a blast there.
But in 2010, I left.
I did my own startup called Clipboard.
Clipboard, if you didn't use it, was kind of like the intersection between Evernote and Pinterest.
You know, it had some of the social dynamics and visual flare that a Pinterest would have,
but it was actually a little bit more useful because you could clip anything,
like, and it would preserve the look and feel and fidelity and functionality of whatever it was that you were clipping.
But we, you know, I was stupid as a CEO.
I defaulted it to private instead of public.
And that probably made a big difference in my competition with Pinterest.
But Salesforce acquired Clipboard.
And then I was the CTO of search and data science at Salesforce.
Which I was interested to read.
People wouldn't think of this, but like search is actually the most widely used feature.
On Salesforce.
Yeah.
Yeah.
And the biggest piece of technological infrastructure.
And so I ran all of that and I rebuilt all of that during my tenure there.
And one thing is that kind of like go full circle.
Remember how I said earlier in the podcast that Lee Giles and Steve Lawrence were the first to, you know, estimate the size of the web and it was like 700 million.
And then I worked with Inktomi to prove that it was a billion.
Okay.
At Overture, our index was 700 billion business objects.
Okay.
And so, you know, and I don't know the exact, and most people peg the, the web index of Google at around 150 billion.
And at Bing at around, you know, maybe 80, 90 or 100 billion.
You know, Salesforce, 700 billion, you know, records.
So that kind of like, I think that's like a really great touch point on, on, you know, how things change in as little as like a decade and a half.
But anyhow, I left Overture.
I'm sorry, I left Salesforce in the late spring of 2016.
And basically to just kind of do my own thing.
So what's the better question to ask you: what are you working on today, or what are you interested in today in terms of what's going on?
Well, it's why don't we merge them?
Okay, okay.
So I spend about half my time working with other companies and half my time working on my own personal passions.
The stuff that's externally facing with others, I work with a lot of awesome, awesome companies that range from billion-dollar companies to single-person, pre-funding, you know, startups that haven't had their first bit of funding yet.
And everywhere in between.
There's about a dozen companies that I help in that regard.
And I typically do that in the context of being an advisor.
The things that I do for them range from, you know, overall business strategy to technology strategy to helping them build out their data science and machine learning roadmaps, to even product design.
Or I do.
And actually the thing that's even most common is I mentor a lot of CTOs and some CEOs as well.
But if you think about a lot of CTOs and startups, that's the first time they've ever had a job like that.
And since I've, you know, it's not my first time, not my first rodeo.
Yeah, yeah, I help them out a lot with that.
The other stuff that I work on that's my own passions is, oh, and I should say from a topical perspective, you know, my client companies include companies that work in D.R., EEG, machine learning, e-commerce, search, search, search.
There's a bunch of search companies.
And so it's all over the map, and it's all really interesting as a result.
The personal stuff that I work on is it will probably seem a little bit schizophrenic and all over the map.
But it's number one, it's about some emerging technologies.
Number two, it's about some old things that I've been working on for many years, and I want to kind of continue.
and I'm also working on another book.
But the things that I'm working on now include IoT, home automation, mobile development, 3D printing, deep learning and machine learning.
That's a big push.
And some data visualization.
And yeah, and then the other book that I'm working on, which is in some ways kind of like almost a philosophical follow-on to my first book.
Well, let's end with this final one then, because I'm interested in, like, those are almost buzzwords again now: deep learning, machine learning, that sort of thing. You know, I'm old enough to feel like there's been several times when I've been told that things like machine learning were going to change everything. Yeah. And now I'm being told that again. And so I guess in a way my final question is, like, should I believe it this time? Are we on the verge of, you know, personal assistants and things like that revolutionizing everything?
Yeah, so great question. And so I have a little bit of, it's not a contrarian take on this, but I want to answer it in kind of two parts. One is, the way I like to segment the universe of artificial intelligence is I think about it as consisting of three different layers. At the base are things and capabilities that look a lot like perception. Okay. And so this is technologies like image classification. So is there a dog or a cat in the image? And if it's a dog, is it a border collie or a beagle, right? Or face recognition: is that Bob or Sue? That is something that's very much based on the notion of taking a very raw input and coming up with some way of identifying which among many possibilities that thing is. So think about it as, like, classification.
So that's something that we've been wildly successful at, deep learning in particular.
And that is really on the threshold, not on the threshold.
That is at superhuman capability right now.
We know how to do that at superhuman capability or at human capability.
So we're building these visual classification systems and other classification systems and other domains that are better than humans.
Okay, but it is in some sense a different type of brute force.
You know, you don't look at those systems and say, ah, they are inherently intelligent.
No, they are intelligent.
They're inherently vastly broad in terms of the universe of data that they've been trained over.
And deep learning has given us a framework to kind of handle big data sets now.
The second layer up, the layer above that, is what I think of as knowledge representation.
And so here is where you would find some sort of mapping between what sort of knowledge and intuition and understanding that people have about how the world works, about how things relate to one another. And we see some examples of making progress in this direction with, for example, machine translation. So when you do automatic machine translation of a document from one source language to another target language, state of the art for that now is machine learning based.
And it is learning in some ways in a manner that's very similar to the visual systems that we alluded to before.
But now it's embedding in that structure systems that actually get to the conceptual model of how the world works.
Because to perform that translation layer, you have to know that this utterance in this source language like English refers to the concept of like a person walking a dog or something like that.
And then you need to know how to invert that going, going from a conceptual representation to a concrete representation that's in another language and what's the right way of converting that utterance.
And so that's like the second layer.
The third layer is when you start getting into things that are much more that touch upon volition and control.
And so things like self-driving cars are, you know, on the forefront of that. And so what I will claim is that as you go from the base up to the next level, you know, another interesting challenge is to make it broadly applicable as opposed to single purpose. And so the pattern that we're seeing now is that at the perception level,
we're going to see systems that on a regular basis are just simply better than humans.
And so I am predicting that, for example, in medical diagnosis, that's something where just like how Google today, you type in a query on the image search and it seems to like to know that there's a border collie inside that image.
Okay.
We're going to see mammograms and MRIs and x-rays being read by computers at some point in the near future.
And they're going to be vastly better than the human radiologists that are reading those.
And that doesn't mean that the job of being a radiologist is going to go away.
It means it's going to change.
It, in some sense, is almost going to become a supervisor of these AIs that are really great at reading at the very low level of detail.
Well, in the same way that on factory floors for hundreds of years now, people have just been
managing the machines and making sure they don't break down.
That's right.
You know, and the biggest bottleneck to reading an X-ray today is the time of the radiologist having to pore over it with, you know, a magnifying lens over their eye, zooming in on minute detail to try to see what they see and actually scanning over the entire area.
That's something that's just ripe for automation.
And so if you think about it, there are problems where one half of the problem is identifying whether something is novel and worth calling out.
And there's another problem of verifying that, yes, that thing that's novel is in fact really novel and important.
And so the reason why I'm kind of teasing these apart is that if you can automate the part of calling out any potential area of an X-ray that is noteworthy or novel, that is worthy of having the radiologist look at closely, then what you've done is you've bisected that problem into a piece that is ripe for automation and another piece that's ripe for wisdom.
Okay?
And so one of my favorite expressions, and I said this in my TED talk, and it's a twist on something that I think Frank Zappa said, is the following.
Knowledge is not data.
What do I say?
I said, data is not knowledge.
Knowledge is not, no, shit, not is it.
No, data is not information.
Information is not knowledge.
Knowledge is not wisdom.
And my whole point in all that is that, as you start aggregating bits and pieces of information at a higher level of abstraction, you actually require greater and broader context to elicit something that one might term as being wise or would require a wise person to identify.
And this is why I think, like, in radiology or in X-rays, you're going to see a whole bunch of single-purpose AIs that are reading X-rays, but it's going to be the job of the radiologist now to validate that, yes, what they discovered is noteworthy, and doing that through the lens of their wisdom, right?
Wisdom as compared to accuracy,
because machines will be able to get accurate,
but it's the wisdom beyond the accuracy.
That's right.
The machine doesn't know how operable or inoperable that thing is, or what the history is of the patient as to whether they tend to be cysty or not, or whether maybe we already knew about these things, but we are looking for something new over there.
So there's a context that's going to be really hard for any machine to actually know, and it'll be the job of the radiologist to view those discoveries from the AIs through the lens of that context. So that's one thing that we'll see that will be very revolutionary. On the other end of the continuum, where the AI is actually literally and figuratively in the driver's seat, that's not going to be as general purpose. What I mean by that is I think that we're going to see systems like that that are built to be very specialized for that particular task, like driving, right?
But that driving car isn't suddenly going to become self-aware and knowledgeable on the history
and the evolution of the U.S. road system.
It has no need to know that.
Right? But it will get really, really good and better than a human at driving.
In between and where we have knowledge representation and other things in that middle layer,
that's where that becomes in some sense almost like the interface between these things.
And I think that as a community, AI researchers have to put a lot more thought into figuring out how we bridge between those layers.
Because when you're down at the low level of perception, it's really easy to go broad with that, because perception is, you know, whether you're reading X-rays or whether you're doing voice recognition, a lot of these things actually start looking very similar, and the same tools actually become applicable across a wide variety of application domains.
But the sort of techniques that we're using for like self-driving cars, those are much more
specialized.
Now, what has changed with deep learning, and I think DeepMind and other groups have really kind of paved the way for this, is that where it used to be that, you know, to make a good chess player, a good Go player, or a good self-driving car, you'd have to do a lot of specialized algorithms for that particular domain, what has changed is that we are now building deep learning systems that are attempting to solve those same applications, but they're doing it from tabula rasa. And what I mean by that is that they're doing it from a blank slate. They haven't been designed or built in a way that has any special-purpose knowledge about, say, the game of Go or the game of chess or driving a car. Very little, I should say.
And from scratch, and through nothing more than reinforcement learning, they are able to learn a strategy for how to play very advanced games at a superhuman level or to become better at driving a car than any human could be.
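The "blank slate plus reinforcement learning" idea can be made concrete with a very small sketch. This is not what DeepMind or any group mentioned here actually built: it swaps the deep network for a plain lookup table (tabular Q-learning) and uses a made-up task, a short corridor where the only reward is for reaching the far end. The environment, parameter values, and function names are all assumptions for illustration.

```python
import random

def train_q_table(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy 1-D corridor (hypothetical task).
    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    The only reward is 1.0 for stepping into the last state."""
    rng = random.Random(seed)
    # Blank slate: every action value starts at zero, no built-in strategy.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Explore at random sometimes, and whenever values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            best_next = 0.0 if next_state == goal else max(q[next_state])
            # Standard Q-learning update toward reward plus discounted future value.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == goal:
                break
    return q

def greedy_policy(q):
    """Best learned action for each non-goal state (0 = left, 1 = right)."""
    return [0 if left > right else 1 for left, right in q[:-1]]

if __name__ == "__main__":
    print(greedy_policy(train_q_table()))
```

Nothing in the code says "walk right"; that strategy emerges only from trial, error, and reward. The systems discussed in the conversation replace the table with deep networks so the same idea can scale to games like Go, which this sketch does not attempt.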
And so this is where things get very interesting.
But note that coming up with a deep learning solution to, say, playing a video game, again, that's not going to suddenly make it self-aware.
I think where all this goes is I think we're all destined to become Borgs.
And that might be another podcast for another day.
But I ultimately think that for anything that we do with an AI, there is going to be a way of marrying that with a human, so that the combination of human and computer is better than any AI.
And I have a whole theory as to how that'll all play out, and I think about the research that I do as how it feeds into those scenarios, but that's ultimately what I'm betting.
Well, maybe we'll have to have you back to expound on that theory at some point.
But first of all, I appreciate you bringing Star Trek: The Next Generation back into it, since we were geeking out on that over email.
Gary Flake, you're so smart on all this stuff. Thank you for giving us all that stuff about where computing and all that stuff is going.
But thank you also for telling the story that I contacted you to tell: GoTo, Overture, Yahoo, Google, all that stuff.
It's so fantastic.
Thank you so much for sharing all that.
My pleasure.
Thank you for doing these podcasts and kind of being there to help tell the story.
I think you're doing a wonderful thing for society because hundreds of years from now, this is going to be the record.
You know, I think, and so thank you for your part as well.
If this is the first time you're listening to this podcast, please subscribe to us on your podcast app of choice.
There's plenty more great internet history where that came from.
And if you're a longtime listener, then you know what to do to help us out.
Rate and review us on iTunes.
Because iTunes gives credit to reviews and ratings, and the more great reviews we get, the more people will discover us.
As always, there's more info on our website, www.com.
The show's Twitter handle is at NetHistoryPod, and my personal Twitter is at Brian MCC.
Thanks for listening.
