The a16z Show - The Cool Stuff Only Happens at Scale
Episode Date: June 5, 2015
Distributed computing frameworks like Hadoop and Spark have enabled processing of "big data" sets -- but that's not enough for modeling surprise/rare "black swan" or complex events.... Just think of scenarios in disaster planning (earthquakes, terrorist attacks, financial system collapse); biology (including disease); urban planning (cities, transportation, energy power grids); military defense ... and other complex systems where unknown behaviors and properties can emerge. They can't be modeled based on (by definition impossible) limited data. And parallelization for this is hard. But what if companies and governments could answer these seemingly impossible questions -- through simulations? Especially ones where we can directly merge in knowledge and cues from the real world (sensors, sensors everywhere)? CEO of Improbable Herman Narula and Stanford University professor-in-residence at a16z Vijay Pande discuss this and more with Chris Dixon in this episode of the a16z Podcast. And as Herman says, "the cool stuff only happens at scale".
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services.
Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information. 
Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.
Hi, this is Chris Dixon with the a16z Podcast. I'm here with Herman Narula from Improbable and Vijay Pande from Stanford, who's also a professor in residence here at a16z.
Hey guys, let's talk about distributed computing.
So my own view, I guess, I'll just start it off, is that over the next few decades, distributed computing will be a particularly important topic because we're now with things like AWS awash in computing resources.
You know, the cost of compute is approaching zero, and the same goes for storage and networking.
But most of this is on multiple physical machines.
And it's very, very hard for software developers to build
software that distributes well. And so we have things like, I think, Hadoop and its successor,
or what we think of as its successor, Spark, as frameworks for doing distributed computing for a specific
application, which is data processing. And I think we'll probably see more of that kind of pattern
among other verticals. And also, at the same time, more infrastructure, you know,
programming languages, frameworks, et cetera, that helps us do this. So, I don't
know, maybe Vijay, if we could start off.
How do you view this?
Yeah, I mean, I think this is the key issue, right?
Because everyone can program the standard C paradigm on one processor core.
Now actually, you know, going to multi-core is not so hard, but start thinking about multiple boxes, a thousand boxes, 10,000 boxes.
This will drive you insane, right?
We can't be spending our time programming thinking about each of these things.
We need to have some type of abstraction.
And that seems to be the key.
That's how we've been successful in general with coding.
And Hadoop and its successors are great abstractions for certain things,
but you can't do MapReduce with everything.
And I think that's going to be the challenge.
And I think what we're going to see is verticals moving in certain directions that can do different things.
It's hard to imagine the magic language that solves all these problems.
And so picking things that can sort of attack certain key domains actually could have a huge impact.
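Illustrative sketch (not from the episode): the MapReduce abstraction mentioned above works well when a job splits into independent pieces. The word-count example below is a minimal, single-machine sketch in Python; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a thread pool stands in for a cluster of machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit (word, 1) pairs for one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key independently."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks, workers=4):
    # Each chunk is mapped with no knowledge of the others,
    # which is why this kind of problem is easy to split up.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))
```

Calling word_count(["the cool stuff", "only happens at scale"]) returns a dictionary of word counts. A simulation step in which every entity depends on its neighbors does not obviously decompose into independent chunks like this, which is one way to read the point that MapReduce does not fit everything.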
Yeah, I mean, I'd agree.
I think, Vijay, when you say you can't use MapReduce for everything, that suggests an even deeper issue,
which is that a lot of the approaches people have today for scale and for parallelizability
really boil down to problems that are quite easy to scale because they naturally
have a simple abstraction, a way of splitting them up.
I think the next few years are going to be about attacking problems
which are naturally harder to split, harder to scale.
And how we go about that, I mean,
a whole other layer of reliability is going to be required as well
for computations that are more challenging to split up.
And that suggests to me that even in terms of the cloud infrastructure
that's available, there may be completely different characteristics
that are necessary.
So I think there'll be lots of winners and losers in this space
that no one can yet predict.
And the thing to emphasize here is that the possible win is huge:
the difference between what you could do on one box versus 100,000 boxes.
I mean, that's not just like I'm being impatient.
That's being transformative.
There's a couple things happening, right?
One is, I mean, they always say Moore's Law is ending.
People are saying that now.
I mean, then again, people said that every year for the last 50 years.
But to the extent that maybe it's slowing down or whatever,
the way you're going to get sort of the additional effects of Moore's Law,
additional compute, is going to be going across machines rather than putting more transistors on a single core,
number one, right?
So you need to.
Number two, as you said, it's different.
It's not just like you're doing three times as much.
If you're doing 100,000 times as much, it unlocks a whole new class of potential applications.
Yeah, exactly.
So, like, from your own area, like, I don't know if you could, like, so in biology, for example.
Yeah, in biology, you know, we launched Folding@home in October of 2000, so we're coming up on...
Can you tell people what that is?
Yeah, so Folding@home is a large-scale distributed computing project where people go to our website,
folding.stanford.edu.
They download the software.
And right now, we have about 40 petaflops worth of performance out of maybe about, you know,
maybe 400,000 processors.
It's actually interesting.
This was inspired by SETI@home?
Yeah, inspired by SETI@home, which also was inspired by GIMPS,
and there's a few other ones.
I think we were the first to do sort of something in science.
I think, you know, you can debate whether finding aliens is science or not.
But, you know, something where in biology...
Nonfiction science.
Yeah.
In biology, you know, the challenge was we wanted to tackle problems
that would take, let's say, a million CPU days to do.
You know.
But now, would you...
Would you still need to
do that kind of approach today, or could you do it on cloud computing?
Yeah, I think you could do it on cloud computing, but if you think about it, I think Amazon
has roughly maybe 300,000 boxes. So if we wanted to buy all of Amazon, that would be a little pricey.
Okay.
But you know, if you think about what we do with Folding@home, it's kind of like a time machine,
because what we did 10 years ago, people can now do on GPUs.
What we are doing now with 10,000 GPUs people will probably do in the future with maybe
a small GPU cluster. But the paradigm, the programming paradigm, the way you think about it, is
all the same. And therefore, we are taking advantage of Moore's law as time goes on.
And so in the future, let's say you can have access to a million boxes
or something like that, what does it mean for the applications in biology and health care?
Yeah, I think there's a couple of different ways to think about it. I think there's sort of
processing lots of data and then doing calculations; for us, we do a lot of simulation.
I think people usually think about the data side, but the simulation side is actually
pretty intriguing too. It's interesting. I think, you know, all that power, it unlocks
so many more problems. I mean, how do I test my
distributed application? How do I even run, you know, sensible diagnostics on it?
Plus, not to mention the skills shortage in people who can really build distributed applications
well. Thinking about that alone, I think, will start to create a whole movement of people
where this skill set becomes incredibly valuable. You know, even in terms of languages and tools,
we're not well served today with good abstractions to think about distributed systems. Most of the
ideas that are being used now, like actor paradigms, for example, which some people may be
familiar with. I mean, these are from the 80s, right? Nothing's really changed in
how we attack distributed systems.
So it should be fun to see that revolution.
Is there anything promising you see in terms of languages, frameworks, infrastructure, software?
I think right now everyone is sort of rolling their own for the domain.
I mean, that's true for us.
And there's these problems that we all have that are hard to handle sort of in a generic way.
Like you have to deal with fault tolerance.
You've got a million boxes or even just 10,000 boxes.
Most likely one of them will die or have some problem along the way.
And MPI, which has been the standard in supercomputing and HPC,
is very fault intolerant.
You know, the whole job will crash and things like that.
And so those paradigms really have to change.
And I think the prominence of companies like Google,
which have all this on the back end,
I think have gotten people thinking about this,
and MapReduce and things like that,
those abstractions have played a huge role.
I really am looking forward to seeing where people will go.
And I think Hadoop and Spark are a good example,
but I think we need much more.
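Illustrative sketch (not from the episode): the contrast drawn above, between a job that crashes as a whole when one machine fails and a system that tolerates individual failures, can be shown by retrying single tasks. This is a minimal Python sketch under stated assumptions: all names (run_with_retries, flaky_square, run_job) are hypothetical, the machine failure is simulated, and a real system would reschedule the task on a different machine.

```python
def run_with_retries(task, arg, max_attempts=3):
    """Run one task, retrying on failure instead of aborting the whole job."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(arg, attempt)
        except RuntimeError as err:
            # In a real cluster this is where the task would be
            # rescheduled on another machine.
            last_error = err
    raise RuntimeError(
        f"task {arg!r} failed after {max_attempts} attempts"
    ) from last_error


def flaky_square(x, attempt):
    """Hypothetical task: the 'machine' handling x == 3 dies on its first attempt."""
    if x == 3 and attempt == 1:
        raise RuntimeError("simulated machine failure")
    return x * x


def run_job(inputs):
    # The tasks are independent, so one failure costs only a retry
    # of that task rather than a restart of the entire job.
    return [run_with_retries(flaky_square, x) for x in inputs]
```

With this structure, run_job([1, 2, 3, 4]) completes even though one task fails once along the way. The design choice is to make the unit of failure a single task, not the whole computation.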
Yeah, I mean, these problems are profoundly different
from scaling web services.
And I think another interesting point is that we traditionally assume large tech companies
are going to have a hegemony over large compute problems.
But with this new space, I wonder whether existing infrastructure really will be that.
One pattern we've noticed is that whereas industry, meaning probably Google and maybe Facebook
led data center innovation and Amazon, sorry, over the last 15 years, we're seeing more
and more academic led stuff.
So Spark, as an example, came out of Berkeley.
A lot of interesting stuff at Stanford, Berkeley, MIT, kind of the usual suspects.
And I think that the theory we have is that that's because, you know, industry is very good at kind of a depth-first search, right? Like, you know, continuing to iterate on something. But when you need to go back and fundamentally rethink how you do something, that's probably better done in academia.
Completely. But I guess industry always needs motivational problems, right? And while the issues within biology are of profound academic and potentially commercial importance, I think if you want to get the mass of hackers and developers out there behind something, we need to start seeing some interesting
problems that are kind of solvable, but through interesting innovation are going to result in
new companies. And I think there's tons of stuff out there, be it in gaming or wherever.
I think also there's an interesting intersection between academia and industry. At Stanford, there's this
Pervasive Parallelism Lab, which brings together companies in the Valley, mostly big companies,
but I think startups could certainly play a role there too, because I think academics are
interested in these questions, but it's useful to have some grounding for where are the big
problems that really need to have the biggest impact right now.
Completely. And we see a lot of academics that we speak to in kind of
Cambridge, Oxford, using supercomputer methods right now, unaware that actually a distributed
systems approach might be more cost-effective or even easier to think about from that perspective.
So, Herman, your company, Improbable, builds simulations. Can you talk about, you know, why are
simulations important? Sure, absolutely. Well, I mean, I guess there are many paths to knowledge,
right? And one of the ones that people are very familiar with now is, I guess, big data,
collecting huge amounts of information about the world and running pattern analysis on top of it.
Another approach, which we're passionate about, is completely recreating
a phenomenon from the real world. I mean, I guess this is something Vijay would be familiar with
from a biological perspective, but imagine being able to model cities, model power grids, model telecoms
networks. Actually achieving any of that, though, you know, involves solving some of the distributed
systems problems we just talked about, and also being able to think about simulation in a totally
new way. And why would someone want to model those? I mean, sure, so you can answer questions,
right? Answer those what-if questions. What happens if, you know, a disease is released in this,
in this crowd? What happens if we shut down this Tube station? Questions which governments and
companies want to answer, but which, you know, you can't answer just looking at data,
particularly when you're considering situations that have never happened or trying to project
or understand.
So it lets you kind of A/B test the real world, in the way that you can do...
Yeah, but I think the problem's even deeper than that, right?
Like the problem, the big problem is, you know, stepping away even from technology,
the problem is, how do we make choices when our world is full of so many interrelated
complex systems that no one person can actually hold in their head.
And that's where simulation comes into its own, right?
It's interesting, like today, like, so one, I'll tell you my pet theory, which is sort of
in the same way that, so if you go back to like the 80s,
machine learning was kind of this rebel enclave of AI, right?
Like the mainstream AI thought you could do,
use rule-based systems.
And this rebel enclave was like, no, no,
you need to create statistics and have machines that learn.
And now it turns out, of course,
that the enclave became the dominant movement.
In fact, AI and machine learning are basically synonymous today.
Today, simulations and agent-based kind of reasoning,
it's like just kind of like the Santa Fe Institute
and all these kinds of, quote, eccentric thinkers.
The mainstream, if you look at the social sciences,
all the mainstream kind of thought leaders use analytic approximations.
If you look at macroeconomics, for example,
they use whatever.
They have their set of equations,
and they have a very, very poor track record in predicting the future.
There's these rebel enclaves of agent-based thinking
who haven't actually been able to really run their...
I mean, so my own pet theory is that it's a little bit like machine learning in the 80s or something,
which is that machine learning couldn't happen for real
until you had the infrastructure, right?
You needed to have massive amounts of data.
You couldn't do the kind of things you see in, to take an example, Google Translate.
Like rule-based systems were better than statistical systems
until you had the ability to scan, you know, corpora of millions of books or something, right?
I mean, it's particularly important when you consider that the times you're going to want to run a simulation,
you're often dealing with phenomena that have emergent complexity.
The cool stuff only happens at scale, right?
So, I mean, for example, we're dealing with a group called the Institute for New Economic Thinking at Oxford.
These are some amazing scientists, and they want to model the UK housing economy.
Now with 10,000, 20,000 houses, there's only so many actors, there's only so much you're going to be able to deduce, right?
I mean, I wonder, Vijay, actually, are there some things in your domain where this emergent complexity property becomes...
Yeah, I think those are the things that we're most excited about because I think, you know, analogous to what Chris was talking about,
there's tons of pencil-and-paper analytic work that's done in physics and chemistry, but it's reaching its limits.
The approximations you have to make really sort of take out a lot of the things that are the hope for what would be interesting and complicated.
So we turn to simulations to give us new insights that we couldn't get from other things.
If anyone listening to this wants to get an example of emergent complexity,
I recommend just Googling Conway's Game of Life.
It's this little cell-based automaton that you can play with,
but you see so much beauty emerging from such simple rules.
Very simple.
Yeah, very simple.
Yeah.
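Illustrative sketch (not from the episode): the rules of Conway's Game of Life fit in a few lines, which is what makes it a handy example of complex behavior from simple rules. The Python sketch below stores live cells as a set of (x, y) coordinates on an unbounded grid; the names step and glider are choices made for this example.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) live cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly 3 live
    # neighbors, or if it is alive now and has exactly 2.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The well-known "glider" pattern, with y increasing downward.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
```

Applying step to the glider four times yields the same shape shifted one cell diagonally: a pattern that travels across the grid even though no rule mentions movement.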
So I think actually in physics and chemistry, this is,
simulation is a dominant paradigm.
It is actually really intriguing to imagine taking this to social areas,
social science.
Yeah.
And so if today you had your fantasy simulation scenario
where, say, for example, you know, you wanted to
model a cell or something. Can you explain how that would work and what kind of questions
you might be able to answer? Well, you know, the cell is interesting because a cell is more like
New York City, you know, than like a sort of dirt path or something like that. There's a lot
going on. And it's that complexity that leads to all these emergent properties. And so the hope
about simulating a cell is that we'd be able to sort of gain some understanding that you couldn't
get from just doing the experiment alone. Because if you could just do the experiment, then that would be
fine. But there's just so much going on, it's hard to really sort of capture everything.
And so I think there's been a lot of excitement in cellular biophysics and cellular biology on the disease side
because we think diseases are sort of systemic problems, not having to do with any one single point.
I think you can imagine these systemic things are sort of what's interesting to go after,
and it's not just simulating a cell and looking at the systemic properties.
It would be simulating a city where maybe shutting down one bridge has effects sort of all over
and making small changes even could have major effects.
And it's those counterintuitive things that are the things I get excited about,
because the kind of things that are obvious, the simulation could verify,
but that's not very interesting.
It's the discoveries of things that you would never think would be connected are, I think,
where the real excitement is, and that's where we've had the biggest wins.
I think it scares people a little bit, particularly in some domains like economics.
I mean, analytical methods, they come with a certain degree of certainty and trust, right?
I can exactly explain to you how and why this works, but emergent complexity is unpredictable.
It's scary.
The results may not be what you expect, and it has the potential to upset a lot of preconceived
notions about what's possible.
The other aspect that I think is interesting is this concept
of A/B testing: you can't do this in real life as easily.
You know, you can't shut down this bridge during this time and see what would happen.
And so the ability to do this type of A/B testing and optimizing things before you bring them into
the real world is actually also exciting.
I think, finally, I think one of the things that we've seen in terms of these types of areas
is that it always starts with heuristics.
When people built bridges, people built bridges in Roman times, you know, they didn't have
simulations of bridges or F equals ma or anything like that.
But now with the Bay Bridge, you know, that multi-billion dollar thing,
you wouldn't just do that empirically. And so I think as the simulations and sort of analytics become
better and better, we don't have to use our best guesses, which is what heuristics are. We can
actually really just see what's going to happen. So there's a common preconception that I wonder
if Vijay or maybe you, Chris, can attack, and it's an interesting idea that you often hear when thinking about
simulation, which is: how do you know the model is right? And could your simulation be useful unless
your model is 100% accurate? I mean, what do you think of that, Vijay? Yeah, I think, you know,
there's a couple of things. One is that there's ways of testing on back data to see if you're right. But
you're correct that actually the simulation doesn't have to be perfect to be provocative.
You know, the ideal is something where it's perfect and it gives quantitative predictions,
whether of the stock market or of traffic or something like that.
But a lot of times a simulation is useful even if it just gives you an idea or a hypothesis or a new insight that you wouldn't have gotten just by thinking about it or by doing pencil-and-paper math.
And then that hypothesis can be tested in other ways.
So Herman, your customers are using Improbable to simulate cities.
Can you talk about like how that might work?
Sure, awesome.
Well, to be quite concrete and to give them a little mention,
Matthew Ives at Oxford and the ITRC group
are very interested in modeling large city infrastructure.
So from their perspective, they see cities as interconnected layers of infrastructure
where each layer actually isn't as significant as the sum total of all the layers
working together in interesting ways.
Now, again, the limiting factors in being able to see that emergent complexity
and to be able to poke at it are enough scale and enough detail.
So, you know, what we're hoping to do is provide a platform,
almost an OS where they can build those sorts of simulations.
But actually, I think the cool things will happen when we can go a little further than that.
Not simply creating standalone models, but actually instrumenting real cities.
Imagine a simulation buoyed by sensor data from millions of sensors placed around a city that actually lets you...
So a car in the simulation actually corresponds to a car with an Internet of Things device sitting on it.
Absolutely.
And so maybe half the entities in the simulation are actually real world entities and half could be modeled or something.
And for every event, there are knock-on consequences.
If there's an accident or a traffic jam, you know, it's possible to then extrapolate from that potential other outcomes and scenarios that would be of interest to a wide variety of people.
I mean, I guess that at Improbable, we don't really believe that a simulation should be this, like, standalone box that knowledge comes out of.
We see it as an operating platform, somewhere that you can actually make decisions and build applications that consume that simulation.
That's also something that's been missing.
I mean, the community, I think, is very much, Vijay, correct me if I'm wrong here, I might be jumping out of my purview.
But the community is very much inspired from supercomputer research, which was always about putting in data and getting an answer out.
Thinking of simulations in this more, like, almost Web 2.0 app way, as quite flexible constructs, is a little alien to people in the space at the moment.
And I think we're going to see more of this because the desire to have one place where you can integrate simulation predictions and sort of experimental data, whether it's IoT experiments or whatever, that becomes very powerful.
Because imagine having a million IoT devices.
How do you even understand what's going on and how do you put that together and how do you get a picture of what it means?
Yeah, exactly.
I mean, it's quite weird.
A lot of customers we've spoken to, they just want to visualize the current state of a large complex system, let alone simulating it.
Turns out to be quite a hard challenge.
And then, you know, there are dark aspects you can't see and if simulation can fill in the gaps between that, then suddenly you have a full picture.
Completely.
I mean, there are undoubtedly amazing companies out there that have built massive distributed computing infrastructures that live on their own proprietary hardware.
The question, though, as we start to think about the kinds of applications that Chris and Vijay are talking about, where problems are not so easy to parallelize, is how useful those software infrastructures are going to be in the future.
I mean, I think ultimately the companies and solutions that are going to start dominating the space are going to have to redo a lot of stuff at quite low layers in order to make it effective.
So there may be a need to throw out some of the work that's gone before.
So just on the business side of simulations, one area that people have been interested in is that increasingly there's pressure on companies, particularly around,
sort of core infrastructure like financial services, public safety, et cetera, to do serious
disaster recovery planning. And that includes both in the case of, let's say, cyber attacks,
also physical, like terrorist attacks. So, you know, if there is a terrorist attack,
God forbid, you know, against some financial institutions or something, you know, we want to make
sure that the system as a whole is robust and survives that. And so there's lots and lots of thought
being put into this concept of sort of disaster planning.
And it seems like an area where simulations can be quite useful.
Yeah, I mean, I think even conceptually, simulations tend to allow you to consider the vulnerabilities
in your infrastructure a little bit more objectively than you would if you were being
totally analytical, because it doesn't require a human being to, you know, maybe focus on what
vulnerabilities seem most obvious.
For example, when we look at cascading failures in power grids, which is an area that,
you know, we've explored with Improbable, how those failures arise can often be the
accumulation of many, many disparate, like seemingly irrelevant events and slight vulnerabilities
which add up together to cause a big catastrophe. Again, I think this might even relate perhaps
to some of the stuff. These events are sort of black swan events.
Exactly. Exactly. If you look at, as an example, the airline industry,
air travel has gotten much safer over time. Unfortunately, they had a number
of crashes to learn from, you know, which of course is tragic, but is also, from a
disaster recovery planning point of view, a positive, because they had many data points, right?
When you talk about things like massive terrorist attacks, you have one or two data points.
So you have no historical pattern from which to train.
Even like a nine or greater earthquake, let's say, in the Bay Area, what do you tell people to do?
First, it would be really powerful to be able to simulate different possibilities.
And the second one is in the moment having IoT-like information for what people are doing to be able to feed into making predictions for where to go from here.
Instant decision making.
Yeah.
That combination, you can imagine the team has seen this 100 times
because the simulations have been running over the last two years.
And then in the moment, they're sort of ready for the game plan.
They're getting information from IoT,
and they're sort of in real time deciding what to do based on what they've already run
or what the simulation would even predict would be the best thing.
Indeed, or even considering how a situation like a riot or civil unrest
or a problem might evolve over time, given its current situation
and given the mechanics of that group of people that the simulation is able to model and explore.
I mean, these are all things that people don't even dream of doing today.
I mean, and to make them possible and usable over so many different domains, I think is an immense challenge.
And it's not something where you can just have a bunch of guys and say, oh, I think this is what we should do.
I mean, to be data driven and to really use the data on the ground in a way that no human could wrap their head around could really be something fantastic.
And could be the difference between life or death for many people.
And it's another area where you can't just simulate one thing, right?
It's not just about the physical effects of the earthquake on building integrity.
It's about everything together.
It's about social conditions.
It's about whether...
And the car's out here but not there and so on.
Exactly. And it's often the little details that slowly accumulate and add up and make a simulation meaningful.
I mean, I'm always reminded of the...
I don't know how particularly relevant this would be.
I'm always reminded of the prisoner's dilemma simulations that took place a long time ago, very simple simulations.
But as you add more detail, the results become completely different.
You know, when you just run prisoner's dilemma-style games between participants, okay, you get one outcome.
But when you start to introduce geographic components to those simulations, suddenly it all changes again.
I mean, that's why we need a system, something that will let people introduce detail at pretty much arbitrary scales, in order to really get more and more accurate models.
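Illustrative sketch (not from the episode): one way to add a "geographic component" to a prisoner's dilemma simulation is to place players on a grid so that each one plays only against its neighbors. The Python sketch below is a minimal version under stated assumptions: the payoff values (1 for mutual cooperation, b for defecting against a cooperator, 0 otherwise), the wrap-around grid, and the imitate-the-best-neighbor update rule are all choices made for this example, not details given in the conversation.

```python
def step(grid, b=1.9):
    """One round of a spatial prisoner's dilemma on a wrap-around grid.

    grid[r][c] is True for a cooperator and False for a defector.
    """
    rows, cols = len(grid), len(grid[0])

    def neighbors(r, c):
        # The eight surrounding cells, wrapping around at the edges.
        return [((r + dr) % rows, (c + dc) % cols)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)]

    def payoff(me, other):
        if other:                       # the opponent cooperates
            return 1.0 if me else b     # defecting against a cooperator pays b
        return 0.0                      # nothing is earned against a defector

    # Each player's score is the sum of its games against all neighbors.
    score = [[sum(payoff(grid[r][c], grid[nr][nc]) for nr, nc in neighbors(r, c))
              for c in range(cols)] for r in range(rows)]

    new_grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Copy the strategy of the best scorer among self and neighbors;
            # listing self first means the cell keeps its strategy on ties.
            candidates = [(r, c)] + neighbors(r, c)
            best_r, best_c = max(candidates, key=lambda rc: score[rc[0]][rc[1]])
            row.append(grid[best_r][best_c])
        new_grid.append(row)
    return new_grid
```

Starting from a grid of cooperators with a single defector and calling step repeatedly shows how strategies spread through space, behavior that has no counterpart when players are simply paired off without any notion of location.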
Okay, thanks guys.
