a16z Podcast - a16z Podcast: Big Data Goes Really Big
Episode Date: July 24, 2015
Big Data is evolving. It’s moving from the sole domain of the high priests of data science, to something that practically every organization -- big and small -- and every group within that organization can get its hands on. So what happens now? The implications of the democratization of Big Data are bigger than just big, says Prat Moghe, CEO and Founder of Cazena. And it’s not just the corporate world that will benefit, he adds. Having access to Big Data tools will change how all kinds of organizations, including government agencies and other social services, operate and solve their particular problems. Joining Prat on the pod to describe this changing world of Big Data in the Cloud is a16z’s Peter Levine. If every company, and every organization is becoming a data-driven company, it makes sense to start to put that data to work, Levine says.
Transcript
Welcome to the A16Z podcast. I'm Michael Copeland.
Big data is evolving. It's moving from the sole domain of the high priests of data science
to something that practically every organization, big and small, and every group within that
organization can get its hands on. So what happens now? The implications of the democratization
of big data are bigger than just big, says Prat Moghe, CEO and founder of Cazena.
The long-term implication of big data, cloud, making it really democratic, is access to everyone,
flattening of the organization, collaborative culture, and ultimately faster decision-making.
And it's not just the corporate world that will benefit, he adds.
Having access to big data tools will change how all kinds of organizations,
including government agencies and other social services, operate and solve their particular problems.
You know, like in New York City, I think this was New Year's Eve, and they collected statistics on random gunfire,
and they used that data to predict where they should deploy police so that they could ensure safety, right?
And what they realized is that they drastically cut down on random gunfire, but they needed fewer police actually to be deployed.
So you can actually cut costs, increase safety, all
because you're smart about, you know, collecting data.
Joining Prat on the pod to describe this changing world of big data in the cloud
is also A16Z's Peter Levine.
If every company and every organization is becoming a data-driven company, he says,
it makes sense to start to put that data to work.
This not only democratizes the use of big data,
it democratizes the organizations that use big data,
such that it's not limited to only the Fortune 2000,
who have the capabilities to set up these large data centers.
Prat Moghe, welcome.
It's great to be here.
All the way from the greater Boston area.
That's right.
And Peter Levine, welcome as always.
Thank you.
Good to be here.
Go, Pat.
We're going to talk about why big data, for starters, hasn't made it to the cloud.
And I know Cazena, Prat, you're the CEO and founder,
is all about moving big data to the cloud and offering it as a service.
But let's back up.
Why has it been so hard for big data
to move to the cloud?
It's a great question.
I think if you look at a typical enterprise,
and in the last 15 years,
if you look at all the other application stacks
that have moved to the cloud, CRM, ERP,
they were all siloed applications.
If you look at big data,
big data is not a siloed application.
It gets infused through the organization.
So it's usually part of some operations,
business process, and it's really hard to lift and shift it into the cloud. It's got
tentacles all over. You've got to figure out how to move the data. You've got to figure out
security. A lot of that data could be customer data, employee data, and all that patient data.
So security and complexity are real challenges that are holding big data back. And right now it's on premise.
I mean, Peter, when you look at the big data landscape, how do you sort of
see what's happening with Cazena and others, for that matter, versus how big data has kind of
been rolled out over the last couple of years?
I sort of characterize Big Data 1.0 as being an
on-prem offering. It's been great, but again, that requires a deep knowledge of infrastructure,
buying out computing, all of the, you know, it's basically an on-prem data center to run
big data applications. And now what we're seeing is just like we saw applications move from
on-prem to a SaaS offering, we are thankfully seeing big data enter a big data 2.0 era,
which is moving big data from on-prem into the cloud.
And there's just tremendous benefit by having big data 2.0 in the cloud as opposed to on-prem.
The same advantages that we see with SaaS applications are the advantages that we now have
with big data 2.0.
So I want to get into that because I think for a lot of people,
big data just by itself is a bit of a vague term.
But what are those, like why do it?
Why is it so important?
Why do I as an enterprise or a government agency for that matter?
Why do I care about big data so deeply now?
Yeah.
So just to add to Peter's comments, like when he talks about 2.0 versus 1.0,
I just want to make clear that 1.0, which is where most of the world is,
is a $10 billion market, largely going to a few incumbents, right?
The old stack, largely relational stack. Hadoop is a decimal point. Spark's just coming along.
And if you look at it, and if you go to a marketing guy in one of these companies,
if you go to the guy who runs risk in some of these companies,
if you go to somebody who's running intelligence, like security,
and you say, do you understand what's going on in your company?
You're like, do you understand what your customers are doing, what the terrorists are doing, all that?
You sort of realize that many of these guys will basically say, I don't have access to data.
It's somebody else has that data.
They're going to generate a report, get it to me six months later, and that's how I'm going to make that decision.
So I think the biggest issue I hear from many of these companies, and it's a horizontal problem.
It's not just a vertical problem.
And size doesn't matter, large or small.
People just don't have access to data.
They can't make decisions fast enough.
They're usually reacting to something after it has happened.
And this idea that they can rack and stack gear
and make these projects happen,
the cycle time is way too long:
six to nine months to deploy,
millions of dollars of spend every year,
and the data is growing 200% every year.
So that was my question. Like, big or small, you say it doesn't matter,
but we know that Google, for example, is a data-driven company, but...
How many Googles are there out there?
One.
But is every company now a data-driven company?
I'm just wondering, why do I care about big data if I'm, you know, a consumer packaged
goods company, or if I'm, I don't know, selling shoes, why do I care?
You care because I think, you know, it's the world of hyper-targeting.
The days of, you know, mass direct marketing are over, where you could do a
big ad for everyone. Now it's all about, hey, Peter, I want to get after Peter. Who is Peter? What's that
demographic? What's that profile? And that's different from you, I'm guessing. And so look at our...
Yeah, yeah, just look at me. He's wearing a blue shirt. You, anything but. Yeah. And so it's all now,
I think, social, mobile. It's all about, you know, who's my customer? Who's my prospect?
What are they like, what are they saying about my product, like my CPG product, and how do I get
to them? I think all of that starts with getting the data fast. That data is being generated
in the cloud, and so you want to be close to that, and then you want to be making fast decisions.
So you can't wait for six to nine months to understand your customer.
Every organization is becoming a data-driven organization.
And, you know, the whole advent and use of mobile devices, we're all holding supercomputers in our hands and, you know, kind of targeting information and using data.
Imagine if big data were available on every device that we hold access to.
And I think that's a very compelling, a very compelling use case that doesn't necessarily exist if I have silos of on-premise kind of data
centers. The whole beauty of cloud computing and SaaS kind of applications is that the information and
the data is available to a wide variety of users. So if I'm a, if I am in a company and I have my
supercomputer in my hand, I now can go ask questions to my data set that exists in the cloud
and get answers back in the same way that Salesforce.com allows the sales organization to go get
information about the sales organization. And we just haven't seen that transformation yet on the
big data and analytics side because it has been a horizontal process. And so, you know,
what we're seeing, this 2.0 kind of step takes all of the benefit of on-prem with the beauty of
SaaS and cloud and combines those things together. And I think that will
create an agile use of data that we just don't see right now.
So it sounds to me like big data is in some sense getting democratized.
And if that kind of superpower was only available to large companies that were data driven
like Google and Facebook, can you expand the sort of thinking and the possibility?
I mean, what happens when, like you say, everyone has access to their data set.
And that sounds somewhat sterile, but that could be anything, right?
That could be customers, it could be people, it could be outcomes.
So what happens?
It's very interesting, I think.
Like Peter mentioned, it's agility first.
That happens immediately.
Like you don't wait to make decisions.
To find out why you're going out of business.
Yeah, yeah.
And by the way, even if you got it wrong, you fail fast.
Right.
And then you do iterate, right?
That's the first immediate impact.
The second impact is cost.
There's a definite impact on cost because you're no longer building, you know, big
iron, which is not amortized across anyone. It's just for you. The third long-term impact,
I think, is culture. I recently visited an e-tailer that's competing with Amazon, right? And so
the contrast in this e-tailer compared to like a typical retailer, in a typical retailer,
you would have the supply chain guys, the merchandising guys, all these guys are siloed. They
don't talk to each other. Here, this e-tailer, I walk into the CEO's office.
He's got Tableau on his desktop hooked into the whole company.
And all the 1,000 people in this company have the same Tableau thing.
So guess what?
They all look at the same data.
They all can figure out what's my issue.
How's that tied to somebody else sitting next door?
That's what you want.
So I think the cultural distance between the supply chain specialist in that company and the CEO is almost not there.
Right?
I think that's the Google-like or the Facebook-like culture you want to see permeated
across all top 2,000 enterprises, and it's not there today.
So I think the long-term implication of big data, cloud, making it really democratic, is
access to everyone, flattening of the organization, collaborative culture, and ultimately
faster decision-making.
Right.
Peter, how else do you see that then changing sort of the nature of companies as an organization
and even work how we get it done?
Once big data moves to the cloud, it unlocks a number of possibilities, including the ability to iterate over data sets that may not be accessible in silos internally.
Think about Big Data 2.0 by analogy: what Amazon did for compute, Big Data 2.0 is doing for the databases that exist on-prem, right?
And so you think about what the Amazon cloud has unlocked in terms of compute.
One, I don't need to build out any data center.
I pay as I go and I can do as many big operations, little operations as I need to with the Amazon infrastructure.
So think about now big data moving into the cloud.
It's got a lot of the same characteristics, the agility, the benefit of iterating over data,
the ability to collect data on a fungible
basis. So as data grows, I have more capacity. If I need less capacity, it sort of grows and shrinks
with that environment. And as a result, I think decisions get way better made by having this
iterative, collaborative sort of process on data that exists, not in silos, but in a centralized
place. So that to me is the value of having it. It's the same as what happened with SaaS applications
moving to the cloud: people got a lot more productive because we all had access to lots
of applications where, you know, when things were on-prem, a few people had access to the
applications. And everyone else was downstream of that, just, you know, kind of twiddling their
thumbs, waiting for the output to come to them.
So the risk, it sounds to me, of not doing
this is you're still one of those folks twiddling your thumbs. And everyone else is racing ahead
or at least asking more questions.
You're not staying relevant.
You know, having, again, the cloud model works very well for exactly this use case, right?
You have a large amount of information centrally located with lots of users accessing that information.
It is the perfect use case for cloud computing.
And so with this, I do believe that organizations will be much more effective in analyzing
their information through this movement from big data 1.0 to 2.0.
And Prat, you and I have discussed in the past a little bit, it's not just Fortune
1,000 companies, it's kind of any organization, right?
I mean, like you were saying, Peter, most organizations, if not all organizations,
now throw off just tons and tons, boatloads of data, right?
And do you see this then kind of being applied outside the realm of, you know, the Fortune
1,000 into other kinds of worlds?
I think it's pervasive.
This could apply to governments.
This could apply to, you know, all kinds of social services.
You know, I was reading recently about predictive policing.
So the idea is not new, but just this idea that, you know, like in New York City, I think
this was New Year's Eve, and they collected statistics on random gunfire,
and they used that data to predict where they should deploy police so that they could ensure safety, right?
And what they realized is that they drastically cut down on random gunfire, but they needed fewer police actually to be deployed.
So you can actually cut costs, increase safety, all because you're smart about, you know, collecting data.
So I think, you know, I think this idea of analytics is
pervasive. And I think bringing it to the cloud, I think the way it helps you is now, instead of
having New York as a district, think of itself as a silo, you can start to collect this data and
pool this data across lots of data. And the more data you collect, the more accurate you are.
What I'm saying is, is New York different from Chicago, really? Is that different from Los Angeles?
And so I think the power of big data in the cloud is that you can start to see signals in the
noise a lot faster. You can collaborate, and you can spot trends before they become
incidents, right? Big incidents.
Let me also add that large organizations who have the expertise
and budgets to go create on-prem data centers have a huge competitive advantage over the
hundreds of thousands of small to mid-sized companies that are probably using Excel spreadsheets
as their data analytics tool right now.
So what this is going to do by moving in the same way that, you know, I keep going back
to the SaaS application being moved to the cloud, then everyone has access to these great
tools and services.
So what happens here is small, I believe that big data 2.0 in the cloud is going to empower
hundreds of thousands of smaller organizations who don't have any access to big data
analytics because they don't have the expertise or budgets to go create an on-prem data center.
And so this not only democratizes the use of big data, it democratizes the organizations
that use big data, such that it's not limited to only the Fortune 2000 who have the
capabilities to set up these large data centers. And I think that that is going to be very
empowering for all organizations.
That's a great point. Something to add to that. One
other challenge that we've seen keep companies from jumping to the cloud is that the clouds
are really expensive real estate to go to if you have to keep hopping back and forth.
What I mean is because it's expensive to move data into the cloud, if you have to do some
compute and you've got to move data back, there's a lot of friction.
So many large enterprises look at the cloud and say they're really excited about it.
They'll usually get into AWS. They'll start playing around with Redshift. They'll
start playing around with Hadoop, Spark. And then they say, wait a minute, for me to take all
my 100 million dollar infrastructure, it's an incredibly scary prospect.
And to move it, I can't do that if it's just a one-off, right? I got to be able to go there,
land there. It's like landing on the moon, right?
Do you land on the moon with just one trip?
How many trips have we made to the moon?
Five, six.
But if you really want to go, stake it out, it's got to be like a civilization, right?
And for me to go land there, I got to be able to do all kinds of processing there.
And so the other challenge with landing on, let's call it landing on the cloud,
is to take the other challenge, which is the stack challenge in big data.
So the stack has evolved a lot over the last five to ten years.
It was all SQL 10 years back.
Now, Hadoop came along five years back and has a lot of strength and promise. Spark's now
the latest addition to the stack, very promising.
And so the enterprises are really struggling to figure out which technology to use.
Yeah, do you guys see winners emerging, or is that even the right question?
I think it's the wrong question because I think it's ultimately about can you get your work done
at a certain price point,
certain price performance.
And the way we look at this is every technology
is well suited for a specific problem.
And it's not about one technology
being a silver bullet for everything.
The point is, though,
why it becomes really interesting
when it comes to the cloud
is for an enterprise to move to the cloud,
you've got to solve both those problems
in one shot.
You've got to figure out how to make the cloud
be really connected and easy, secure.
But you also got to solve the second
problem, which is you've got to solve that platform problem for them, where you say it's all
about the work, and somebody's going to figure out how to map that work to the right technology.
If not, somebody's going to be forced to just move BI stuff to the cloud, and they've got to
hop back again, or they've got to do Spark jobs, and then move back again, and you can't do that.
Right.
And so that's the way we see this big data as a service as a concept, which Cazena's hatching
and launching, is all about make cloud your next platform
so you can do all your work there.
So you're not necessarily, you know,
signing on for one particular flavor of the stack.
You're just getting work done.
You could, but, you know, once you see that succeed,
you don't want, you're not, you shouldn't be forced to hop back.
Right.
You can do more and more.
You can do a data lake.
You can do a data mart.
You can do a sandbox.
You can connect Hadoop to Spark to SQL.
And that is not a technology question.
It's a workload question.
Right.
So just wanted to add that as another
barrier, right, for big data.
So Peter, your description of, you know, all these tens of thousands,
hundreds of thousands of small and medium-sized businesses: if they're all, you know, big data
armed up and, you know, ready to go, where do we go next? Like, where's the next competitive
advantage then, if we've all got this capability? Is there something else that you see out
there that's just kind of emerging? Where do we go from here?
My belief
is that once we solve the big data 2.0 problem, we'll have the big data 3.0 problem.
But in seriousness, there's always the next generation.
I think that big data evolves to, once we have the platform in the cloud, which is sort of
the 2.0 aspect that Prat just brought up, we now can build on top of that, whether it's
machine learning and machine intelligence on top of the base that's being created as one aspect
and or applications that start to get built for this big data pool that the application itself
becomes big data aware.
So that way we are building applications that are inherently tied to the big data,
the big data information system such that we're not going through an analytics process
separate from the application.
I believe that application-aware big data is sort of the next driver in this space
and we will get an application layer that will become very intelligent as a result of this underlying big data.
But the first movement has to be that we enable a very powerful real-time system
that is offered in the cloud such that we can move to the next, you know,
kind of the machine learning and application aware big data as some of the next pieces.
So that's sort of my belief on where things head.
I mean, remember, we're only in 1.0 now.
Well, yeah.
So I think that this movement of big data into the cloud with the right platform and the right tools
is the next several years of sort of this big data transformation,
and then we start layering on these other pieces.
A foundational layer of data available to everyone.
Correct.
Interesting.
Okay, so you've convinced me that I'm headed towards the big data world, but let's say I am
walking into my board meeting tomorrow.
We've done this move to the cloud, and now I'm going to convince them that our next
move is big data.
What's my opening argument?
What's my opening gambit?
My opening gambit, if I were walking into that boardroom, I mean, I'd be talking
to the CEO and basically be saying, look, the, you know, these new initiatives we are rolling
out, you know, they are one year, two years, three years late. Name your time period. You know,
our competition is kicking our ass. And if you really want to change the game or you want
to catch up, cloud is the way to go. We've moved our CRM stack there. We're moving our ERP
stack now. We got to put big data on the truck now. And if we do that, we can roll things out
much faster. And by the way, I can do it for one-fifth the cost. I turn to the CFO next, who's sitting
there, and you can see him nodding, right? And then the question is, can you get the security
guys, you know, comfortable? Comfortable with that, yeah. Can you get the CIO there? Who's
concerned about running the trains and making sure nothing gets disrupted? And if you do that, then
you come out of the board meeting with actions and a project.
Peter, anything to add?
I think that's spot on.
You know, it's about, I think it's, again, it's about agility and an empowerment that really
is the big sell.
The cost element, of course, is very important.
I don't have to invest in the same movement that, you know, basically it's the same argument
that has already happened with SaaS applications.
It's the same thing.
You don't have to build your whole server, data
center infrastructure. Arguably, this is even more expensive than those sort of on-prem
applications because there's huge storage requirements and networking and all kinds of stuff
as a very complex environment. And so to the extent that I don't have to go build that anymore,
I can offload, you know, kind of the expense and this whole notion of building out a data
center, I think is a very compelling argument and provided that we can, of course, get over
the security and performance issues. And like Prat
said before, the performance issue is only an issue when we're going back and forth from
the cloud to on-prem. So there is a leap of faith that, look, we're going to go and move
everything to the cloud, not unlike what we've done with Salesforce. Like, we don't arbitrate
back and forth. You do all your stuff on Salesforce and you get data back. It's not like the
information is forced to have to come back to be pre-processed on one side and the other side.
Those elements obviously have to be solved and have to be taken care of.
But once we're there, there's a lot of very good reason to go and do this.
Well, Big Data 1.0 is now moving to Big Data 2.0, and someday we'll get to 3.0, Peter.
Yes, we will.
You'll lead us to the promised land.
I don't know about me, but Prat, Peter.
Thank you guys so much.
Yeah.
Thank you, Michael.
It was great.