The a16z Show - a16z Podcast: Big Data Goes Really Big
Episode Date: July 24, 2015
Big Data is evolving. It’s moving from the sole domain of the high priests of data science, to something that practically every organization -- big and small -- and every group within that organization can get its hands on. So what happens now? The implications of the democratization of Big Data are bigger than just big, says Prat Moghe, CEO and Founder of Cazena. And it’s not just the corporate world that will benefit, he adds. Having access to Big Data tools will change how all kinds of organizations, including government agencies and other social services, operate and solve their particular problems. Joining Prat on the pod to describe this changing world of Big Data in the Cloud is a16z’s Peter Levine. If every company, and every organization is becoming a data-driven company, it makes sense to start to put that data to work, Levine says.
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
Welcome to the a16z Podcast. I'm Michael Copeland.
Big data is evolving. It's moving from the sole domain of the high priests of data science
to something that practically every organization, big and small, and every group within that organization, can get its hands on.
So what happens now? The implications of the democratization of big data are bigger than just big, says Prat Moghe, CEO and founder of Cazena.
The long-term implication of big data, cloud, making it really democratic, is access to everyone,
flattening of the organization, collaborative culture, and ultimately faster decision-making.
And it's not just the corporate world that will benefit, he adds.
Having access to big data tools will change how all kinds of organizations,
including government agencies and other social services, operate and solve their particular problems.
You know, like in New York City, I think this was New Year's Eve, and they collected statistics on random gunfire, and they used that data to predict where they should deploy police so that they could ensure safety, right?
And what they realized is that they drastically cut down on random gunfire, but they needed fewer police actually to be deployed.
So you can actually cut costs, increase safety, all
because you're smart about, you know, collecting data.
Joining Prat on the pod to describe this changing world of big data in the cloud
is also a16z's Peter Levine.
If every company and every organization is becoming a data-driven company, he says,
it makes sense to start to put that data to work.
This not only democratizes the use of big data,
it democratizes the organizations that use big data,
such that it's not limited to only the Fortune 2000,
who have the capabilities to set up these large data centers.
Prat Moghe, welcome.
It's great to be here.
All the way from the greater Boston area.
That's right.
And Peter Levine, welcome as always.
Thank you.
Good to be here.
Go, Pat.
We're going to talk about why big data, for starters, hasn't made it to the cloud.
And I know Cazena, Prat, you're the CEO and founder,
is all about moving big data to the cloud and offering it as a service.
But let's back up.
Why has it been so hard for big data
to move to the cloud?
It's a great question.
I think if you look at a typical enterprise,
and in the last 15 years,
if you look at all the other application stacks
that have moved to the cloud, CRM, ERP,
they were all siloed applications.
If you look at big data,
big data is not a siloed application.
It gets infused through the organization.
So it's usually part of some operations,
business process, and it's really hard to lift and shift it into the cloud. It's got
tentacles all over. You've got to figure out how to move the data. You've got to figure out
security. A lot of that data could be customer data, employee data, and all that patient data.
So security, complexity: real challenges that are holding big data back. And right now it's on
premise.
I mean, Peter, when you look at the big data landscape, how do you sort of
see what's happening with Cazena and others, for that matter, versus how big data has kind of
been rolled out over the last couple of years?
I sort of characterize Big Data 1.0 as being an
on-prem offering. It's been great, but again, that requires a deep knowledge of infrastructure,
buying out computing, all of the, you know, it's basically an on-prem data center to run
big data applications. And now what we're seeing is just like we saw applications move from
on-prem to a SaaS offering, we are thankfully seeing big data enter a Big Data 2.0 era,
which is moving big data from on-prem into the cloud.
And there's just tremendous benefit by having big data 2.0 in the cloud as opposed to on-prem.
The advantages that we see with SaaS applications are similar to the advantages that we now
have with Big Data 2.0.
So I want to get into that because I think for a lot of people,
big data just by itself is a bit of a vague term,
but what are those, like why do it?
Why is it so important?
Why do I as an enterprise or a government agency for that matter?
Why do I care about big data so deeply now?
Yeah.
So just to add to Peter's comments,
like when he talks about 2.0 versus 1.0,
I just want to make clear that 1.0,
which is where most of the world is.
It's a $10 billion market,
largely going to a few incumbents, right?
The old stack, largely relational stack,
Hadoop is a decimal point, Spark's just coming along.
And if you look at it,
and if you go to a marketing guy in one of these companies,
if you go to the guy who runs risk in some of these companies,
if you go to somebody who's running intelligence, like security,
and you say, do you understand what's going on
in your company? Like, do you understand what your customers are doing, what the terrorists are doing,
all that? You sort of realize that many of these guys will basically say, I don't have access to data.
Somebody else has that data. They're going to generate a report, get it to me six months later,
and that's how I'm going to make that decision. So I think the biggest issue I hear from many
of these companies, and it's a horizontal problem, it's not just a vertical problem, and size
doesn't matter, large or small.
People just don't have access to data.
They can't make decisions fast enough.
They're usually reacting to something after it has happened.
And this idea that they can rack and stack gear and make these projects happen,
you know, the cycle time is way too long.
Six to nine months to deploy millions of dollars of spend every year.
And the data is growing 200% every year.
So that was my question. Like, big or small,
you say it doesn't matter, but we know that Google, for example, is a data-driven company,
but how many Googles are there out there?
One.
But is every company now a data-driven company?
I'm just wondering, why do I care about big data if I'm, you know, a consumer packaged
goods company or if I'm, I don't know, selling shoes, why do I care?
You care because I think, you know, it's the world of hyper-targeting.
The days of, you know, mass direct marketing are over.
You could do a big ad for everyone.
Now it's all about, hey, Peter, I want to get after Peter.
Who is Peter?
What's that demographic?
What's that profile?
And that's different from you, I'm guessing.
Well, look at our...
Yeah, just look at him.
He's wearing a blue shirt.
Anything but.
And so it's all now.
I think social, mobile, it's all about, you know, who's my customer, who's my prospect.
What are they like?
What are they saying about my product, like my CPG product, and how do I get to them?
I think all of that starts with getting the data fast, that data is being generated in the cloud,
and so you want to be close to that, and then you want to be making fast decisions.
So you can't wait for six to nine months to understand your customer.
Every organization is becoming a data-driven organization.
And, you know, with the whole advent and use of mobile devices, we're all holding
supercomputers in our hands and, you know, kind of targeting information and using data.
Imagine if big data were available on every device that we hold access to. And I think that's a
very compelling use case that doesn't necessarily exist if I have silos
of on-premise kind of data centers. The whole beauty of cloud computing and SaaS kind of applications
is that the information and the data is available to a wide variety of users.
So if I'm in a company and I have my supercomputer in my hand,
I now can go ask questions to my data set that exists in the cloud
and get answers back in the same way that Salesforce.com allows the sales organization
to go get information about the sales organization.
And we just haven't seen that transformation yet on the big data and analytic side because it has been a horizontal process.
And so, you know, what we're seeing, this 2.0 kind of step takes all of the benefit of on-prem with the beauty of SaaS and cloud and combines those things together.
And I think that will create an agile use of data that we just don't see right now.
So it sounds to me like big data is in some sense getting democratized.
And if that kind of superpower was only available to large companies that were data driven like Google and Facebook, can you expand the sort of thinking and the possibility?
I mean, what happens when, like you say, everyone has access to their data set.
And that sounds somewhat sterile, but that could be anything, right?
That could be customers.
It could be people.
It could be outcomes.
So what happens?
It's very interesting. I think,
like Peter mentioned, it's agility first.
That happens immediately.
Like you don't wait to make decisions.
To find out why you're going out of business.
Yeah.
Yeah. And by the way, even if you got it wrong, you fail fast.
Right.
And then you do iterate, right?
That's the first immediate impact.
The second impact is cost.
There's a definite impact on cost because you're no longer building, you know, big iron,
which is not amortized across anyone.
It's just for you.
The third long-term impact, I think, is culture. I recently visited an e-tailer that's competing with Amazon, right? And so the contrast in this e-tailer compared to, like, a typical retailer: in a typical retailer, you would have the supply chain guys, the merchandising guys, all these guys are siloed, they don't talk to each other.
Here, this e-tailer, I walk into the CEO's office, he's got Tableau on his desktop, hooked into the whole company, and all the thousand people in this company have the same Tableau thing. So guess what? They all look at the same data. They all can figure out, what's my issue? How's that tied to somebody else sitting next door? That's what you want.
So I think the cultural distance between the supply chain specialist in that company and the CEO is almost not there, right? I think that's the Google-like or the Facebook-like culture you want to see permeated across all top 2,000 enterprises, and it's not there today.
So I think the long-term implication of big data, cloud, making it really democratic,
is access to everyone, flattening of the organization, collaborative culture,
and ultimately faster decision-making.
Peter, how else do you see that then changing sort of the nature of companies as an organization
and even work, how we get it done?
Once big data moves to the cloud, it unlocks a number of possibilities, including the ability to iterate over data sets that may not be accessible in silos internally.
Think about Big Data 2.0 this way. The analogy is that what Amazon did for compute, Big Data 2.0 is doing for the databases that exist on-prem, right?
And so you think about what an Amazon cloud has unlocked in terms of compute.
One, I don't need to build out any data center.
I pay as I go.
And I can do as many big operations, little operations as I need to with the Amazon infrastructure.
So think about now big data moving into the cloud.
It's got a lot of the same characteristics.
The agility, the benefit of iterating over data, the ability to collect data on a fungible
basis. So as data grows, I have more capacity. If I need less capacity, it sort of, you know,
grows and shrinks with that environment. And as a result, I think decisions get made way better
by having this iterative, collaborative sort of process on data that exists, not in silos,
but in a centralized place. So that, to me, is the value. It's the same as what happened
with SaaS applications moving to the cloud: people got a lot more productive.
Right.
Because we all had access to lots of applications where, you know, when things were on-prem,
a few people had access to the applications.
And everyone else was downstream of that, just, you know, kind of twiddling their
thumbs, waiting for the output to come to them.
So the risk, it sounds to me, of not doing this is you're still one of those folks
twiddling your thumbs,
and everyone else is racing ahead or at least asking more questions.
You're not staying relevant.
You know, having, again, the cloud model works very well for exactly this use case, right?
You have a large amount of information centrally located with lots of users accessing that information.
It is the perfect use case for cloud computing.
And so with this, I do believe that organizations will be
much more effective in analyzing their information through this movement from big data 1.0 to 2.0.
And Prat, you and I have discussed this in the past a little bit. It's not just Fortune 1000 companies.
It's kind of any organization, right? I mean, like you were saying, Peter, most organizations,
if not all organizations, now throw off just tons and tons, boatloads of data, right?
And do you see this then kind of being applied outside the realm of, you know, the Fortune 1000,
into other kinds of worlds?
I think it's pervasive.
This could apply to governments.
This could apply to all kinds of social services.
I was reading recently about predictive policing.
So the idea is not new, but just this idea that, you know,
like in New York City, I think this was New Year's Eve,
and they collected statistics on random gunfire.
And they used that data to predict where they should deploy police so that they could ensure safety.
Right.
And what they realized is that they drastically cut down on random gunfire, but they needed fewer police actually to be deployed.
So you can actually cut costs, increase safety, all because you're smart about, you know, collecting data.
So I think, you know, I think this idea of analytics is pervasive.
And I think bringing it to the cloud, I think the way it helps you is now, instead of having New York as a district, think of itself as a silo, you can start to collect this data and pool this data across lots of data.
And the more data you collect, the more accurate you are.
What's to say New York is different from Chicago, really?
Is that different from Los Angeles?
You know, and so I think the power of big data in the cloud is that you can start to see signals
in the noise a lot faster.
You can collaborate and you can spot trends before they become incidents, right?
Big incidents.
Let me also add that large organizations who have the expertise and budgets to go create
on-prem data centers have a huge competitive advantage
over the hundreds of thousands of small to mid-sized companies that are probably using Excel
spreadsheets as their data analytics tool right now.
So what this is going to do, in the same way that, you know, I keep going back to
SaaS applications being moved to the cloud, is give everyone access to these great
tools and services.
So what happens here, I believe, is that Big Data 2.0 in the cloud is going to empower
hundreds of thousands of smaller organizations who don't have any access to big data analytics
because they don't have the expertise or budgets to go create an on-prem data center.
And so this not only democratizes the use of big data, it democratizes the organizations
that use big data such that it's not limited to only the Fortune 2000 who have the
capabilities to set up these large data centers.
And I think that that is going to be very empowering for all organizations.
That's a great point.
Something to add to that.
One other challenge that we've seen keep companies from jumping to the cloud is that the clouds are really expensive real estate to go to if you have to keep hopping back and forth.
What I mean is, because it's expensive to move data into the cloud,
if you have to do some compute and you've got to move data back, there's a lot of friction.
So many large enterprises look at the cloud and say they're really excited about it.
They'll usually get into AWS.
They'll start playing around with Redshift.
They'll start playing around with Hadoop, Spark.
And then they say, wait a minute, for me to take all my $100 million infrastructure.
It's an incredibly scary prospect.
And to move it, I can make a bet, right?
Yeah, and to move it, I can't do that if it's just a one-off, right?
I got to be able to go there, land there.
It's like landing on the moon, right?
Landing on the moon, if it's just one trip, how many trips have we made to the moon?
Five, six.
But if you really want to go, stake it out.
It's got to be like a civilization, right?
And for me to go land there, I got to be able to do all kinds of processing there.
And so the other challenge with, let's call it, landing on the cloud
is the stack challenge in big data.
So the stack has evolved a lot
over the last five to 10 years.
It was all SQL 10 years back.
Now Hadoop came along five years back
and has a lot of strength and promise.
Spark's now the latest addition to the stack,
very promising.
And so the enterprises are really struggling
to figure out which technology to use.
Yeah, do you guys see winners emerging
or is that even the right question?
I think it's the wrong question because I think it's ultimately about can you get your work done at a certain price point, certain price performance.
And the way we look at this is every technology is well suited for a specific problem.
And it's not about one technology being a silver bullet for everything.
The point is, though, why it becomes really interesting when it comes to the cloud is for an enterprise to move to the cloud, you've got to solve both those problems in one shot.
You've got to figure out how to make the cloud be really connected and easy, secure.
But you also got to solve the second problem, which is you've got to solve that platform problem for them,
where you say it's all about the work, and somebody's going to figure out how to map that work to the right technology.
If not, somebody's going to be forced to just move BI stuff to the cloud, and they've got to hop back again,
or they've got to do Spark jobs and then move back again, and you can't do that.
Right.
And so that's the way we see this big data as a service as a concept,
which is what Cazena's hatching and launching. It's all about, make cloud your next platform.
So you can do all your work there.
So you're not necessarily, you know, signing on for one particular flavor of the stack.
You're just getting work done.
You could.
But, you know, once you see that succeed, you shouldn't be forced to hop back.
Right.
You can do more and more.
You can do a data lake.
You can do a data mart.
You can do a sandbox.
You can connect Hadoop to Spark to SQL.
And that is not a technology question.
It's a workload question.
So just wanted to add that as another barrier, right?
If big data, so, Peter, your description of, you know,
all of these tens of thousands, hundreds of thousands of small and medium-sized businesses,
if they're all, you know, big data armed up and, you know, ready to go,
where do we go next?
Like, where's the next competitive advantage then if we've all got this capability?
Is there something else that you see out there that's just kind of emerging?
Where do we go from here?
My belief is that once we solve the big data 2.0 problem, we'll have the big data 3.0 problem.
But in seriousness, there's always the next generation.
I think that big data evolves to, once we have the platform in the cloud,
which is sort of the 2.0 aspect that Prat just brought up,
we now can build on top of that, whether it's
machine learning and machine intelligence on top of the base that's being created as one aspect
and or applications that start to get built for this big data pool that the application itself
becomes big data aware. So that way we are building applications that are inherently tied to
the big data information system such that we're not going through
an analytics process separate from the application.
I believe that application-aware big data is sort of the next driver in this space,
and we will get an application layer that will become very intelligent as a result of this underlying big data.
But the first movement has to be that we enable a very powerful real-time system that is offered
in the cloud such that we can move to the next, you know, kind of the machine learning and
application-aware big data as some of the next pieces. So that's sort of my belief on where
things are headed. I mean, remember, we're only in 1.0 now. Well, yeah. You know, so I think
that this movement of big data into the cloud with the right platform and the right
tools is, you know, the next several years of sort of this big data transformation, and then
we start layering on these.
And then there's this foundational layer of data available to everyone, right?
Correct.
Interesting. Okay. So you've convinced me that I'm headed towards the big data
world, but let's say I am walking into my board meeting tomorrow. We've done this move
to the cloud, and now I'm going to convince them that our next move is big data.
What's my opening argument?
What's my opening gambit?
My opening gambit, if I were walking into that boardroom,
I mean, I'd be talking to the CEO and basically be saying,
look, these new initiatives we are rolling out,
you know, they are one year, two years, three years late.
Name your time period.
You know, our competition is kicking our ass.
And if you really want to change the game or you want to catch up,
cloud is the way to go.
We've moved our CRM stack there.
We're moving our ERP stack now.
We've got to put big data on the truck now.
And if we do that, we can roll things out much faster.
And by the way, I can do it for one-fifth the cost.
I turn to the CFO next, who's sitting there,
and you can see him nodding, right?
And then the question is, can you get the security guys
comfortable?
Comfortable.
Can you get the CIO there,
who's concerned about running the trains
and making sure nothing gets disrupted?
And if you do that,
then you come out of the board meeting
with actions and a project.
Peter, anything to add?
I think that's spot on.
You know, it's about,
I think it's, again, it's about agility
and empowerment that really is the big sell.
The cost element, of course, is very important.
I don't have to invest
and it's the same movement that, you know, basically it's the same argument that has already
happened with SaaS applications.
It's the same thing.
You don't have to build your whole server, data center infrastructure.
Arguably, this is even more expensive than those sort of on-prem applications because
there's huge storage requirements and networking and all kinds of stuff.
It's a very complex environment.
And so to the extent that I don't have to go build that anymore, I can offload, you know,
kind of the expense and this whole notion of building
out a data center. I think that is a very compelling argument, provided that we can, of course,
get over the security and performance issues. And like Prat said before, the performance issue
is only an issue when we're going back and forth from the cloud to on-prem. So there is a leap of
faith that, look, we're going to go and move everything to the cloud, not unlike what we've done
with Salesforce. Like, we don't arbitrate back and forth. You do all
your stuff on Salesforce and you get data back.
It's not like the information is forced to have to come back to be pre-processed on one side
and the other side.
So those elements obviously have to be solved and have to be taken care of.
But once we're there, there's a lot of very good reason to go and do this.
Well, Big Data 1.0 is now moving to Big Data 2.0, and someday we'll get to 3.0. Peter,
you'll lead us to the promised land.
Prat, Peter, thank you guys so much.
Thank you, Michael.
It was great.
Yeah.
