Drill to Detail - Drill to Detail Ep.6 'Data Capital and the Economics of Big Data' with Special Guest Paul Sonderegger
Episode Date: October 25, 2016
Mark Rittman is joined by special guest Paul Sonderegger, Oracle's Big Data Strategist, to talk about Data Capital and the economic impact of big data...
Transcript
Hello and welcome to Drill to Detail, the podcast series about the world of big data,
analytics and data warehousing, and the people and technology behind the headlines.
Each week I'm joined by someone who either works with or builds the analytics platforms driving innovation in the industry,
or, like me, analyses the market and thinks about
what the impact of this is going to be on the economy and on business and so on.
Just like my guest in this week's episode, none other than Paul Sonderegger,
who some of you might know from presentations he's given
at OpenWorld or other events around the world,
in his role as Oracle's big data strategist.
Paul came to Oracle through the Endeca acquisition a few years ago.
And whilst the technology from Endeca went on to inspire
many of the new BI and big data products Oracle are now deploying,
Paul's been traveling the world talking to executives about big data
and most recently, a concept he's termed data capital,
which is a way of thinking about data and big data initiatives from an economic perspective.
So, Paul, thanks for coming on the show.
And why don't you introduce yourself properly to the audience?
Sure. Well, it's great to be here, Mark.
I appreciate the opportunity to sit down and chat about data capital.
But by way of introduction, yes, I'm a big data strategist at Oracle. And before that, Endeca, and before
that, I was an analyst at Forrester Research for a number of years. And the thing that I do now at
Oracle is lead our work on data capital. And the essence of that work is to flesh out the economic story behind big data and tease out the implications for competitive strategy. I talk with executives about what it means to their businesses that the entire world is being digitized and datafied.
What are the implications for competitive strategy? And then what kind of technology
would then support these new competitive strategies that they want to put into place?
Thanks, Paul. It's great to have you with us. So I first heard you speak a
couple of years ago, at Oracle OpenWorld, I think it was, where you talked about how everything is
now being digitized and datafied and explained, in a really accessible, business-relevant way,
why customers should be thinking about what that means and the opportunities and threats it brings.
And then I've followed your writing and presentations since then and seen that evolve into what you're
now referring to as data capital. So can you go into a bit more detail? What is data capital?
Sure. Well, let's begin at
the beginning and talk about what data capital is. You know, I should say first that data capital
is not a metaphor. This is not data is the new oil, data is the new gold, data is the new
electricity, although that one is particularly good. Of the metaphors, that's the best one. But this is not a metaphor. What we're saying with data capital is that data fulfills the literal economic textbook definition of capital. So capital, in the eyes of economists, is a produced good as opposed to a natural resource.
And it is produced through some process and there's investment that goes into building that process and whatever technology gets used in it.
But the output of that process is then an input into creating another good or service. It's what economists call an economic
factor of production. So without going any further, let me just give you our definition
of data capital. Data capital is the recorded information necessary to produce a good or
service, which is really boring. So let me give you an example. If you think about
a retailer who wants to go into a new market, a new geographical market, they have to build new
facilities or buy facilities. They have to extend their supply chain, build out the inventory.
If the retailer lacks the financial capital to make all of those investments, it cannot go to that new region. By the same token,
if that retailer wants to create a new dynamic pricing algorithm or a new recommendation engine,
but lacks the data to feed that engine, they cannot create that new service.
Data is a kind of capital, an economic factor of production in new digital products and services.
So, Paul, I've read in articles you've written for Forbes and so on there, this idea that
data capital can actually substitute for other forms of capital, which obviously has some fairly
kind of significant implications in terms of kind of new entrants to a market and how you get
companies started and how you kind of, I suppose, gain market share. What do you mean by that? Explain a bit more about that idea.
Sure.
The essence of this claim that data capital can be substituted for financial capital,
for human capital, comes from an activity-based perspective on how companies work.
And this is really based on the work of Michael Porter
at Harvard Business School.
And the activity-based perspective of how firms work
is that everything that happens in a company is an activity.
So creating customer segmentation for a marketing campaign,
handling a trouble ticket when a customer calls up, distributing product to warehouses.
All of those things are activities.
Every single activity has a financial component, a skill component.
That's where your people are.
It has a process component, a technology component, and an information component.
That information component is now being digitized and as that happens, as you
dramatically cut the cost to capture data, capture information from these
activities and use information in these activities, as
you dramatically cut the cost of doing that, it turns out that you can do things like run
that process at higher levels of throughput, with greater quality, with lower error rates,
with less actual financial capital, with less money to do it.
And we're not the only ones with this perspective.
You're actually starting to see McKinsey talk about the fact that these new online consumer services,
they call them asset-light companies because their tangible assets are so small
relative to their valuations, especially.
And so there is where you get a vivid example
of data capital substituting
for traditional financial capital.
So it's interesting you say there
about the principles of data capital.
And I've read before some articles of yours
and seen some presentations
where you talk about the idea
of there being laws of data capital
or three principles.
So hearkening back to, I suppose, the kind of laws of motion and so on. So just define for us, first of all, what do you mean by the principles of data capital?
Here are the three principles of data capital. One, data comes from activity. And so that means
that if your company is not part of the activity when
it happens, your chance to capture its data is lost forever. It doesn't come back. Two,
data tends to make more data. And here, the real focus is on algorithms. So while analytics for
people are great, algorithms operate far beyond human scale. And more importantly, they create
data about their own performance that can be fed
back into the model to improve their future performance. And three, platforms tend to win.
Platform competition is a normal thing in information intensive industries and the
digitization and datafication of more activities brings platform competition to industries that have never seen it before.
Yeah, certainly. I mean, if you think about some of the disruption in markets by the likes of,
say, Uber with transportation, hotels with Airbnb, I guess probably the most striking example is
Google with the advertising industry.
Oh, yeah. I mean, Google is just such an extraordinary example of not only the insight into data as a kind of capital,
but just really early on, you know, their index,
just their search index,
that is a proprietary data capital asset.
And then, you know, another great insight is that they then use that
proprietary data capital asset to feed their search ranking algorithms and to deliver this
search service, which then produces new proprietary data capital. So they capture data about everything
you and I click on in the search results. And perhaps more importantly, they collect data
about everything you and I do not click on in the search results. And the record of our actions
there then becomes an input into the performance of the search algorithm the next time a user like you
searches on the term that you just searched on. So that is using data to make data, which is
actually one of the key principles of data capital.
I'm particularly interested by that last
principle you mentioned there, the idea that platforms tend to win. I think we can all kind of,
you know, we all get that in terms of things like email and
communications and kind of cloud and so on there.
That, you know, if you own the platform that runs the transactions, that hosts people's email,
then you've got particular insights there into people's behavior and activities and
desires that, you know, other people wouldn't have, other companies wouldn't have.
But where have you seen that apply outside of those areas in more traditional industries?
Sure, sure.
Well, the idea that platforms tend to win is something that we observe in a lot of information intensive industries, technology industries especially.
So you, of course, remember the battle for the desktop operating system,
and that was platform competition:
to be the platform between application developers and end users.
Similarly, that's what we see with the battle for the mobile operating system.
It's what we see in video games,
where a gaming console is
a platform between developers on one side and gamers on the other. But I should be careful to
point out here that what we're talking about in this case is platforms through the eyes of economists. So not the way that technologists think about a platform as, you know,
a fundamental technology on which you build higher levels of the stack.
For a moment, step back and let's look through the eyes of economists
at the idea of platform.
To do that, let's look at credit cards.
So a credit card is a payment platform. And what distinguishes a platform
business is that it serves a two-sided market. And there could be more, but there are at least
two sides. And so with credit cards, you have consumers on one side, those of us using credit
cards to pay for things. And then on the other side, the other
side of the market is merchants, retailers, restaurants who are taking credit cards for
payment for things. And the interesting thing about platforms, economic platforms, is that
growth on one side of the market tends to encourage growth on the other side of
the market. So we as consumers, we want to have in our pocket the card that more merchants will take.
Merchants want to take the card that more consumers have in their pocket.
And economists call this indirect network effects. And the reason it matters so much is that it tends to lead to a winner-take-all outcome.
And the reason that's important when we think about the rise of data capital is that the digitization and datafication of more activities bring platform competition to industries that have never seen it before.
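The indirect network effect described here can be made concrete with a toy simulation. What follows is only a rough sketch under stated assumptions, not a model anyone in the conversation proposed: two hypothetical card platforms, a fixed number of new consumers and merchants joining each round, and an assumed `pull` exponent above 1, meaning joiners favour the bigger opposite side more than proportionally. The function names and all the numbers are made up for illustration.

```python
def simulate_two_sided_market(consumers, merchants, rounds=50,
                              joiners_per_round=100.0, pull=1.5):
    """Grow two competing platforms and return final (consumers, merchants).

    consumers, merchants: starting counts per platform, e.g. [110, 100].
    pull: assumed exponent; values above 1 mean joiners favour the bigger
          opposite side more than proportionally.
    """
    consumers = [float(c) for c in consumers]
    merchants = [float(m) for m in merchants]
    for _ in range(rounds):
        # New consumers lean toward the card that more merchants accept.
        merchant_pull = [m ** pull for m in merchants]
        # New merchants lean toward the card that more consumers carry.
        consumer_pull = [c ** pull for c in consumers]
        new_consumers = [joiners_per_round * w / sum(merchant_pull)
                         for w in merchant_pull]
        new_merchants = [joiners_per_round * w / sum(consumer_pull)
                         for w in consumer_pull]
        consumers = [c + n for c, n in zip(consumers, new_consumers)]
        merchants = [m + n for m, n in zip(merchants, new_merchants)]
    return consumers, merchants


def share_of_first(counts):
    """Fraction of the total held by the first platform."""
    return counts[0] / sum(counts)
```

Starting the first platform with a small lead on the consumer side, for example `simulate_two_sided_market([110, 100], [100, 100], rounds=200)`, and comparing `share_of_first` after 10 and after 200 rounds shows the lead widening on both sides of the market. With `pull=1.0` the feedback is only proportional, so the drift toward a single winner in this sketch depends on that assumption.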
And this digitization and recording of everything we do, the activities and so on,
isn't just limited to companies. We're all doing it ourselves using health bands and
health trackers and so on.
That's exactly right. I mean, for example, some insurers are worried that Apple, with its Watch and with its HealthKit, may be able to gather sufficient data about consumers' health that they might be able to price risk better than the insurance companies. And this is simply what happens when you digitize activities and you dramatically cut the cost of that information component.
You cut the cost to capture the data.
You cut the cost to use the data.
And then as a result, you can then use that unique data in unique ways.
And it creates all kinds of new competitive dynamics.
You know, one of the crazy places where you see this is in agriculture. And it's now possible
to have drones that will take photographs of the crops and they'll do these spectrographic
analyses of that picture looking for how much green is in it because that's a
proxy for how much chlorophyll is in the leaves of the plants. And that data is then fed to the
fertilizer spreader, which changes the mix and the volume of fertilizer that the spreader puts
as it moves across the field. So now the mix and the volume is tailored to the plants.
And the tractor in the middle is in competition
to be the platform for digital agricultural services.
And that is not the way that makers of heavy agricultural equipment
think about competition.
But now they must.
So one of the arguments I hear from customers
who perhaps haven't invested in big data and these new kind of platforms and so on is that, first of all, there's too much of it to make sense of.
And secondly, it's so freely available that actually, you know, what's the value in having data?
What's the value in owning the platform?
Yeah, the common understanding of data is that it's everywhere and there's so much of it that we can't pay attention to all of it.
That is true, but it really misses the point.
Data is not abundant.
Data consists of countless unique observations.
And that sets up a competition for data.
So this really comes from the first principle of data capital.
Data comes from activity.
And what this means is that if your company is not part of an activity when it happens,
your chance to capture its data is lost forever, doesn't come back.
And the reason is that an interaction with a customer that takes place at a particular time,
with a particular context, with particular market dynamics, it only happens once.
And so if you're not part of that activity at the time that it happens, that's it.
You don't get the data from that activity.
What's worse is that if your rival handles that sale, supports that research for a product,
handles that ride from point A to point B, they get the data and you don't.
And they now have this unique observation,
set of observations, really, from that whole activity, which they can then combine, you know,
with the other data that they may have accumulated to then deliver a unique service. And you can't.
And so the big implication here is that companies have to develop the skill to look out at the world and see the data that isn't there.
Not yet.
And then put new sensors into the world, new mobile apps, new ways of digitizing and datafying those activities before their rivals do.
Companies are in competition for data.
And this also makes things very interesting for IT, where something that was considered a cost
in the past, a burden in a way, handling transactions, dealing with all the aspects
of putting orders through and so on, that can now be an asset and a unique bit of insight that
others don't have.
Yeah, the real switch in perspective here is to realize that data is not merely a record of what happened.
It is also a raw material for creating new digital services.
That's really the shift that large enterprises have to make.
And, you know, this is a shift that is, it's actually a little
bit difficult. It's a change of perspective that's a little bit difficult to make because we have
lots of highly skilled, highly accomplished executives who are very used to the idea
that the business is the business, and then they get reports that help them answer the question,
how are we doing? The reports then enable them to make better decisions. And that's sort of like,
we have lots of leaders who think of that as the life cycle of data. And don't get me wrong,
that is a good life cycle of data. That is a life cycle of data. However, however, there is this new thing, which is you can now create a digital service delivered straight into consumers' pockets that can only be driven by aggregated data.
And so data is not merely a record of what happened.
It is a critical raw material in creating those digital services.
And that's the new perspective that executives have to embrace.
And, you know, being the organization
that has unique access to this data, because you own the platform, because you have the records and
transactions and so on, the effect of that kind of multiplies, really, when you think about how
you can use that to improve the products and services you offer, the accuracy of predictions and so on.
Absolutely.
We see this in fraud detection, for example.
So in fraud detection, we have a customer who checks person-to-person mobile payments for fraudulent activity.
So, you know, let's say that you go out to
dinner with a friend, you forgot your wallet, you need to split the tab later, and you just want to
make a payment from your checking account to their checking account. You do it on your phone,
and it's all digital really fast. Well, all of those transactions get sniffed in real time for
possible fraudulent activity. And the challenge with fraud detection is that you have to,
you want to avoid false negatives where you miss actual fraudulent activity because then the bad guys get away with it. And you also want to avoid
false positives where you flag a legitimate transaction as fraudulent because then your
actual customers get really irritated. And so this company, they get information about both of these
things, about scams that they missed through both customer complaints and their
own offline investigations, and they get information about legitimate transactions that the algorithm
wrongfully flagged, again, from customer complaints.
Well, that data goes back into the model to improve the scoring mechanism for fraud, improving that algorithm's future ability to catch the real bad guys without disrupting real transactions.
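As a rough illustration of that feedback loop, here is a minimal sketch, assuming a tiny online logistic-regression scorer. The customer's actual system is not described in the conversation, so the `FraudScorer` class, the two features, the threshold and the sample payments are all hypothetical.

```python
import math


class FraudScorer:
    """Tiny online logistic-regression scorer (illustrative only)."""

    def __init__(self, n_features, learning_rate=0.5, threshold=0.6):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate
        self.threshold = threshold

    def score(self, features):
        """Estimated probability that a payment is fraudulent."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def flag(self, features):
        """True when the score reaches the flagging threshold."""
        return self.score(features) >= self.threshold

    def learn(self, features, was_fraud):
        """Feed one confirmed outcome back into the model (one SGD step)."""
        error = (1.0 if was_fraud else 0.0) - self.score(features)
        self.bias += self.learning_rate * error
        self.weights = [w + self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]


# Hypothetical features: [normalised amount, 1.0 if the payee is new]
missed_scam = [0.9, 1.0]           # confirmed fraud the model let through
blocked_dinner_split = [0.2, 0.0]  # legitimate payment a customer complained about

scorer = FraudScorer(n_features=2)
score_before = scorer.score(missed_scam)
for _ in range(20):  # replay the confirmed outcomes a few times
    scorer.learn(missed_scam, was_fraud=True)
    scorer.learn(blocked_dinner_split, was_fraud=False)
score_after = scorer.score(missed_scam)
```

Each confirmed outcome, whether a missed scam or a wrongly blocked payment, nudges the weights, so the score for scam-like payments rises while the score for ordinary ones falls. That is the "data makes more data" cycle in miniature: the model's own mistakes become its next training input.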
And that cycle then enables them to get better faster than the other guy.
So this is great, but the reality for a lot of companies is that there's a kind of lack of people with skills in this area, what we call data
scientists. And the tools, the BI tools we use currently, typically, you know, they're very
backward looking and they tell you what data is, what things have happened in the past, but
their ability to do kind of algorithms and predictive models and so on is lacking. So
do you think, you know, that we need to get more data scientists
or do you think we need to kind of invest the time in tooling
so that tools do this for us instead
and take away some of the complexity
in doing stuff now that requires a data scientist to do?
We're going to need both more data scientists
and better tools that enable more citizen data scientists.
And so, yes, we're going to need more.
And we should be specific here by what we mean by data scientists.
We're going to need more people who can represent real-world relationships mathematically.
And that's not a skill that should be limited to PhDs in statistics. I mean, that really is a
skill that should become part of the regular, you know, bag of tricks for
any effective manager from here on out. And then, you know, we're also going to need
more of the people who support
those people who can represent real-world relationships mathematically. So the people
who can actually write the code, who can do the data wrangling, you know, who can keep the systems
happy and all that kind of thing. In addition, though, we're also going to need tools that make
it easier for managers who have a familiarity with statistical inference and other statistical
techniques, but whose real expertise is in understanding how medical device manufacturers
understand whether the therapies they enable are better than what was there before. We need those,
we need to enable those people to ask more of the questions they have in their minds more easily.
So Paul, just changing sort of tack a little bit, really, I'm interested in the job that you do. So
your job title is big data strategist for Oracle. You go and talk to customers. I'm curious
to know how that goes and what you do. But also, I'm curious to understand how you
add value in industries that you're not so aware of, really.
How do you take the ideas you've got and make them relevant to each person's, each company's industry?
So just tell us how the job works and what you do, first of all.
Yeah, a big part of my job is giving our customers eyes to see.
And a lot of that is the conversation that we're having right here. You know, the idea
that data comes from activity and you're actually in competition to get it. Data tends to make more
data, so algorithms that produce data about their own performance to improve their future
performance are really big. And platform competition is coming to industries that have never seen it before. So those ideas are really just lenses
through which to look at the world.
But then what happens is that our customers will say,
hey, wait a minute.
We've been thinking about, you know, we make cat food.
And we don't know anything about how cats eat.
You know, don't get me wrong.
Like our product people, they know a lot about how cats eat and what cats need to eat.
But we don't actually know how the cats eating our food eat it.
Like how often? Like do the majority of cats kind of
eat once a day? Or do the majority of cats have food available all the time and they just kind of,
you know, graze on it? And what does that mean for their health? Is there a way that we could
capture data about that? And then we can have a discussion about, of course, of course, there's
a way that you could capture data about that. Talk about belling the cat. This is better than that. This
is following the cat around all day. You could, you know, you could put sensors on the
bowl itself so that it detects when the cat is near, because you have now got, you know, an object
on the collar. And they're like, hmm, well, who sells those? Well, you could. It's like, oh,
wait a minute. So wait, we could actually turn the collar, the bowl into accessories around our cat
food product. Yes, you could. This is what it means to look out at the world and see the data
that is not there. And, you know, this is a deliberately trivial example.
And yet, you know, business empires have been built on less.
So far, we've talked very generically about big data and about data capital and so on.
But let's kind of move on a little bit to think about Oracle.
So why is big data so important to Oracle?
How is Oracle's platform differentiated?
And why do you think that Oracle are in a particularly strong position to help with this idea of data capital and be the platform that customers use to get the most out of this
new opportunity in this new area and this new technology?
Well, big data is hugely important
to Oracle because Oracle makes the things that make data more valuable: all through, you know, all of the data management and integration and security
pieces that you find in that kind of platform layer, all the way up to applications, analytics
and algorithms that you, you know, think of as sort of the consumption points that you
can talk about as living in that upper tier, that software layer. And now, of course, Oracle is reinventing enterprise computing for the cloud.
And what that really means, as Oracle's strategy, is that when you're talking about large enterprises with huge [...], they are going to have a mix of cloud and on-premise computing for a decade, probably
more. And so the question then becomes, well, how do you deliver enterprise computing as just a set
of services as far as the business is concerned when some of them are going to come out of a public cloud
and then some of them still are going to come from behind the firewall,
but you want them to appear to the business
as just being a bunch of services that they can consume.
Well, Oracle uniquely can do that.
And Oracle has adopted this really interesting and I think really compelling vision that says, hey, let's build out a public cloud using a particular technical architecture, which we can also deliver behind the firewall in your data center if that's what you'd like and make the two of them work together.
You know, you talk about smoothing the path of migration
to cloud computing, which is absolutely necessary
for crunching these huge amounts
and huge diversity of new data sources.
And that is just a really great approach.
I think I mentioned this to you on the call
when we kind of set this up
that I was at OpenWorld recently
and I was really seriously impressed
with some of the stuff
that is coming out of Oracle now
around deploying big data into the cloud, for example,
and just generally the effort and initiative
and the kind of ruthlessness in a way
of Oracle getting involved in cloud
and starting to kind of move into that market
is very interesting. And I guess probably, you know, one observation about big data is that
Oracle perhaps doesn't get the credit it's due sometimes because it doesn't tend to tell the
story and focuses more on kind of like, you know, technology and so on. But certainly, I was very
impressed with what I saw at OpenWorld and the way Oracle is kind of really seriously moving into this market now.
Yeah, you know, there's a knock against Oracle that there's just no romance at Oracle.
You know, there's just no, Oracle's not great at telling stories.
There's no, you know, high-minded talk about really changing the world, you know, that you get from other firms.
And frankly, I think that's true.
I think that criticism hits the mark.
I think the flip side of that, though, is that when you talk with executives at Oracle about how are we going to answer this cloud challenge?
There is no romance in the analysis.
It is hard-minded and perceptive.
And the basic evaluation is,
oh, cloud is inevitable.
I mean, it's obvious.
The whole history of computing is about essentially increasing scale
and access at lower cost.
And that's what cloud does.
So fine.
So we're going to reinvent enterprise computing for the cloud.
And here's what it takes.
It takes the entire stack, reinventing the entire stack.
So we're going to do infrastructure as a service.
We're going to do platform as a service.
We're going to do software as a service.
And there's just no romance in that
evaluation whatsoever. It's really clinical, it's really analytical, and it's really effective.
So, you know, I'd agree that this strategy is certainly effective, but I do think there's
romance in it, actually. At least, you know, romance in terms of what it means for customers,
the capabilities it gives them, and, you know, the ability to take an idea that's small and build on it and deploy it to the larger scale.
But also the new product areas that it really opens up in terms of things for Oracle. You
think about some of the services and capabilities that this platform is going to enable really to
go well beyond just basic hosting and running of Hadoop in the cloud.
That's exactly right. And there is, in fact, an entirely new kind of business
that Oracle offers, data as a service.
And this, again, fits squarely into the data capital story.
So the data as a service business
has got three billion consumer profiles
in it, 400 million business profiles. It's got data from $4 trillion worth of online and offline
transactions. And this is data that is available to marketers, advertisers, people who might be doing market assessments for
new products. We see CPG companies, consumer packaged goods companies, using this huge
amount of data (it's actually the largest data marketplace), using data from this data marketplace to wrap context around their own customer data, which then enables them to see attributes of their customers that they were previously blind to. You know, the fact that, for example, a particular
shopper who is buying healthy frozen meals is really interested
in time saving. It's not the healthy aspect. That was not the key. The key was this real need around time saving. As a result [...] platform as a service, infrastructure as a service, business. And it's one that is entirely based on creating new services from unique data capital.
But isn't the kind of move towards cloud and deploying things in the cloud and managed
services, isn't that really something that, you know, whilst it's good for the business,
it's a threat to IT?
And isn't the move towards, say, Hadoop and systems like that, a threat to traditional kind of IT workers working in, say, Oracle and so on?
Is this all really at the expense of the IT department?
Well, I would put the situation a little bit differently.
I would say that there is an enormous opportunity here
for those IT departments that would grab onto it.
You know, the story about data capital is that this is now a new kind of unique resource
that enables competitive advantage.
And, you know,
the argument that the IT department
has been making for decades now is,
look, you can't run your business.
You can't run this oil refinery without our data. You cannot prove out your oil reserves without our data. You can't price risk without the data assets that support it all. IT is critical to the business. And, you know, that last part was where business executives would always go, I don't know about IT because, I mean, you can't run this business without all of our people. [...] which the IT department stewards and shepherds, these data assets are unique assets that sit
behind some of our critical aspects of competitive advantage. And that then elevates the conversation
to one that's more about business value. It's more about strategy. It invites the conversation about,
well, okay, what new kinds of unique value can we create from these unique assets? And then you're
into algorithms and data science activity that demonstrate new problems that can be solved or demonstrate new solutions to critical problems
and lead the path to new value from new services.
So as this idea evolves of data being a form of capital
that's as important as any other form of capital to a business
in terms of growth and so on,
I suppose it really opens up quite a few questions
about things like ownership of data,
how easily it's shared or not shared, privacy and so on. You know, what do you think the implications are of this new
way of thinking of data and the value that's now placed on it?
Well, I think there are a number of
implications from this idea of data as a kind of capital, you know, that we're just beginning to explore.
For example, you raised one about ownership of that data capital.
Well, what is the proper role for a company who has a unique data capital asset that they invested to create, but which records the actions and preferences of consumers.
There's still a lot of work to do there to flesh out that responsibility and figure out how to discharge that responsibility effectively while still enabling innovation.
One of the other areas, though, is in government.
And not just, you know, tangentially, about privacy and
things like that, but in thinking about, well, data assets that are created by government-owned activities and facilities: who owns that and who should benefit from it?
So, you know, some of the largest ports in the world are facilities that are owned by governments and highly, highly
automated. Those data assets are unique. They are highly valuable. Should, let's say, should
entrepreneurs in those countries get privileged access to those data assets? Should data science
students in universities in those countries get privileged access to those data assets?
If so, on what terms?
How do the citizens of that country then benefit from that usage?
All of these are new questions that really lay out kind of a curriculum of work to do in data capital for a long time to come.
And we're starting to see this debate happening, particularly in Europe at the moment.
Google News had to close down its operations in Spain recently because the law came through
to say that even linking to newspaper websites without paying for it is not allowed. So that
service is now gone. And also, I think I read somewhere as well, again in Europe, where people and companies were allowed to use data from public websites as long as they didn't use it to create new businesses.
In a way, that's the kind of one use of it that would actually give value to the country.
Well, and I think it's important to say that that idea that, okay, you can use this data as long as you don't create a company from it. That's not necessarily wrong.
There is a really valid argument there.
However, there is another valid argument that makes the opposite point, you know, that says, no, look, one of the greatest uses of this thing would be to create economic value from it.
It creates jobs. It actually possibly
would create revenue, which, well, maybe it could be taxed, or maybe there's an implicit tax or some
kind of tariff that then, you know, benefits the citizens of that nation, much the way
that there have been oil dividends paid to citizens of oil-rich countries. The thing is that those debates need to be elevated.
They need to be raised up.
We need a more active conversation around those.
Well, Paul, we're just about out of time now.
So thank you very much for coming on the show.
And it's been great going through, I suppose,
actually the economic side of big data,
the implications of that, and this idea, obviously,
of data capital that really,
I think it really kind of puts the value of what we're doing
with kind of data systems and big data architectures and so on
in a real kind of economic context.
And certainly a lot to think about.
That's exactly right.
And thank you very much for having me, Mark.
It's just been a pleasure.
And that's the end of the show now.
And as usual, you can find show notes and links
to previous editions of the podcast on the website www.drilltodetail.com. Thank you.