Drill to Detail - Drill To Detail Ep.1. 'After The Gartner BI&A Magic Quadrant 2016', With Special Guest Stewart Bryson
Episode Date: September 20, 2016...
Transcript
Hello and welcome to Drill to Detail, a new podcast series hosted by me, Mark Rittman,
where I'll be commentating on the big data, business analytics and data warehousing industry,
along with a guest who's either building the analytics platforms that sit behind all those web startups and companies disrupting every industry
these days, or like me, actually implements, consults, analyzes, and commentates on what
must be one of the hottest, most newsworthy and topical areas of the IT industry these days.
So I was inspired by John Gruber's Apple-focused podcast series, The Talk Show,
where he interviews a mix of industry commentators, Apple execs in his case, and a sort of one-off guest with insights into a particular
area. And in this inaugural episode of the podcast series, I'm very pleased to be joined by
Stewart Bryson, who's an old friend, a colleague from the past, and someone who's probably very
well known in the Oracle BI, data warehousing and analytics industry. Probably the second most popular speaker on the conference circuit.
So Stewart, why don't you introduce yourself to the listeners and tell us who you are.
Absolutely, yeah. So my name is Stewart Bryson.
Like Mark, I'm an Oracle ACE Director in the Oracle space.
I'm a co-founder of a company called Red Pill Analytics.
And I've been working with Oracle Technologies for about 18 years.
I started soon after college.
I took a bit of a walk on the wild side, worked for Informix Software for a couple of years.
I was on their SWAT team.
And that kind of got me into professional services.
And it's also what really introduced me to data warehousing and BI.
So I started doing data warehousing on a different database than Oracle, but I soon saw the light,
got back on the red path. And so, you know, I worked for myself for quite a long time as a
consultant slash contractor. But all that while I was following this very good blog from this interesting gentleman
over in the UK who seemed to lay it all out there every couple times a week, really. You were a
pretty verbose blogger at that time, Mark. And I just knew that if I ever had the chance, I'd want
to work with you. And that dream came true.
I worked six or seven years there at Rittman Mead, quite proud of my time there.
And recently, we're looking at the two-year anniversary of our company. About two years ago,
I joined another colleague I respected, Kevin McGinley,
and we started Red Pill Analytics, and the rest is sort of history.
Excellent, Stewart. Thanks. Well, it's great to have you on here. And I couldn't think of anybody
I'd rather have on the first ever episode, really, because like me, you've got a lot of opinions.
You've been doing this for a long time. And really, in this first podcast, I wanted to
talk about some things that are very topical in our industry, but actually
are very relevant to you as well in the kind of areas that you work in. So some people within, you know,
the BI kind of world, a lot of people probably, would have seen the Gartner
Magic Quadrant that came out recently, the BI and Analytics Magic Quadrant. So that was, I think,
around about sort of January, February of this year, and it was fairly kind of dramatic, very big
news, because Oracle, for example, the company that we work with mostly, was actually dropped
from the Magic Quadrant, whereas before that it was in the Leaders Quadrant.
And so what happened in that period, as Stewart knows, was that Gartner kind of redefined what they considered to be a modern kind of BI tool.
And they started talking about the fact that really BI these days is led by the users rather than by IT.
And really, what constitutes in their mind a kind of modern BI tool is not a tool that is led by IT.
So, and there were concepts in there of things like
bimodal BI and that sort of thing.
And so I thought it'd be interesting to kind of talk about,
in this first bit, Stewart, as
someone who has worked in this area,
you know, are you seeing this kind of idea?
Are you seeing this thing in the market where,
you know, I suppose tools like Tableau are being used more?
Or is that more kind of just hype from Gartner?
There's a couple of things going on here, I think.
First off, you know, it's my understanding and, you know, I can't quote anybody on this, but Oracle chose not to participate.
And that's why they're not in there at all.
Now, I can't validate that, but the suggestion is that they knew that they would not be judged
on their newest product set and chose not to participate. I don't know if that's the case.
Regardless, I don't think they would be probably where they're used to being anyway. I mean,
I think there's something interesting about what Gartner is doing. I have to agree with them to
one degree in that they are certainly describing what's happening.
I think that the idea to break enterprise reporting, as they've done, away from BI and analytics is probably a smart approach as far as how to categorize these things.
At the same time, there's a difference between describing and prescribing what we think necessarily should happen. And I think when you look at, it's not just in analytics really,
it's in a lot of what Gartner is doing now.
They sort of have this distaste for IT in general.
And I think that's coming out in the way they're categorizing things.
But it certainly does, Mark, describe what's happening.
And I think maybe even this has been happening with analytics longer than perhaps other, you know, we'll say development platforms or infrastructures or applications.
And that, you know, even with BI reporting tools, you know, folks have been pulling that data out and trying to put it in other more adaptive tools.
I think that's been going on for a long time.
You know, wrapping it up into a term like bimodal or third platform, or however Gartner wants to describe it,
I think, you know, we've been seeing that for a while with desktop tools, for instance.
So I think for anybody who's on this who hasn't read the Gartner report,
or certainly the detail of it, just to highlight what that is really.
So the Gartner report said that really, and this is a theme I think they've had
through a lot of their reports they do, because they obviously have so many segments
and the markets they cover, the concept of bimodal IT,
where there are some types of projects
that are really kind of done best as a very structured project by IT, over many
months, in many phases and so on. So putting in your ERP system, or something
kind of fairly big. But some projects are better done more kind of, maybe even by
that shadow IT, so where the initiative comes from the business
and they buy what they want
and they start small and build out from there.
Now, what they're saying, obviously,
in this magic quadrant is that the innovation
and the license spend is happening in this area here.
Now, where that affects Oracle and vendors like Oracle,
so for example, IBM and Cognos and so on,
is if people aren't buying your tools, then that's obviously an issue.
But I think for us as practitioners, you know, is this just hype?
Is this self-service way of doing things, you know, with
no data modeling or kind of optional data modeling, something real? I mean, Stewart, this has been your
industry. This has been your kind of lifetime. You know, do you think there's value in the
statement to say BI should make data modeling and data curation optional?
So optional permanently, I don't believe.
However, I do think the concept of not necessarily starting there is meaningful. I mean, if you look at what, you know, if you were to launch an analytics platform from scratch today, you would certainly do something Lambda-like, meaning that you would try to address the streaming side of it and
address the sort of more batch-oriented side, you'd probably try to do both of those.
Because if you're building anything new today, it's going to have a very heavy mobile side
of it.
And rather than sitting around waiting, a mobile application needs access to things quickly, especially if it has anything to do with customers, etc.
So I think that there's certainly value in making data readily available, and it doesn't have to be heavily curated.
I mean, it's usually some small bite-sized morsel of analytics that needs to go into those applications or go into mobile apps. So we
certainly need to make analytics available quickly. But we don't necessarily need full-fledged,
curated, conformed models available immediately. So I think that, you know, when I started talking
about sort of agile BI or sort of model-driven, I think is what I called it back in the day, and I kind of took a step from you on that when I started looking at that stuff, was that we wanted to just be more adaptive.
We wanted to try to push these enterprise tools to their limits.
I think what I've discovered between that time and now is that there's still a lot of value in approaching enterprise reporting platforms in that way.
But really leading with easier analytics tools, sort of having what we call at our company an innovation stream.
And we may have stolen that from someone.
I'm not sure exactly where we got it. But sort of leading with an innovation stream and kind of following with enterprise IT tools, I think, is a little bit more realistic and a little bit more what's happening anyway.
I think that that gap between leading with something a little bit more innovative and following with something a little bit more structured is where the gap is today.
I think that that follow on is not happening.
Yeah, so a lot of work I've been doing,
particularly in the last kind of year,
has been around sort of big data and data reservoirs
and this sort of thing.
And there's a couple of things that come out of that.
So you're never going to use a tool like OBI, for example,
or Cognos or whatever,
probably as your first tool you'd access Hadoop data with.
Because of the schema-on-read part,
typically you're spending as much time getting schema out of it.
But typically you want to just see what data is in there.
And particularly now with tools like Apache Drill,
where it can point to data that has schema built into it,
or where schema certainly can be read out of there,
so JSON and CSV and so on.
I think that certainly the ability to select the way
in which you deal with metadata,
you know, that metadata might come from the data itself. It might be added by the users
themselves as a part of their curation. I think that rather than saying all metadata
must be formally defined, you know, through a dimensional model and so on, you know, it
has its place. And we, you know, you and I have been teaching it for a long time. But
I think, you know, it's about having the right approach at the right time, really. Where it gets interesting for vendors
like Oracle, for example, is how they kind of deal with that world, and certainly I'd be interested in
your view in a second on tools like DV Desktop, for example, with Oracle. I mean, one of the things
we're trying to do with OBI 12c projects, and any project really where traditionally it's been
an enterprise BI project
for us, is to think how can we incorporate some of this thinking in there? How can we get the
users to be involved in the curation process and so on? Have you
thought about different ways of doing projects now because of this, or different tools?
Absolutely. I mean, if you really look at it, even for enterprise BI tools, there's always been a certain set of users.
And for some organizations, a large set of users that dumped that data out and took it to something
like Excel anyway. So I think that all tools like Tableau have done is make that a little more
visual and perhaps a little easier. I think that we can harness these things that are going to happen anyway. I mean,
trying to cut off this discovery and innovation side of analytics is a fool's errand. I think that
you're not going to be able to stop it, number one, nor should you. I mean, innovation should drive execution, really. And I think that, you know,
not acknowledging that is sort of a problem, and, you know, IT has itself to blame for a lot of
this. I mean, if you look at, you know, over the last year, I've been doing a lot of talks on
mashups, self-service, that sort of thing. And it gets the developers and business
folks in the room kind of jazzed and nodding their heads with me. But there's always someone
from IT that says, you know, we can't open Pandora's box. We can't allow this. And for that
reason, I think IT has just kind of put blinders on to a certain degree, because it's going to happen.
It's just the question is, is it going to happen in an environment that you can control or are you going to simply push folks out?
And I think that's what's happening.
They're going to the cloud.
They're going to shadow IT.
They're going to sanctioned shadow IT, if that's such a thing now, because that's
where the money's flowing to. So in that case, it's almost not shadow IT, it's sanctioned IT,
but just from a non-enterprise architecture. I think that IT, you know, has to bear some blame
for this. I mean, what do you think?
Well, it does. I mean, we'll get onto that in a second.
And really the second topic I want to talk to you about is how IT plays its part in this kind
of thing, really, and whether agile is still relevant and so on. But I guess before we go into that, just
a quick thing, really. The area we work in, Stewart, is kind of Oracle BI, and
there's been a load of new BI tools come out from them recently. So there's been Data
Visualization Desktop, there's been the cloud things and so on.
Certainly from my side being candid,
you know, certainly when BI cloud service came out,
I struggled, we struggled really to think
of what we could do with it
because it was such a different market
to the market we're in.
You know, very departmental,
very self-service and so on.
But we've been trying to think, you know,
how can we use tools like Data Visualization Desktop,
which is their equivalent of kind of Tableau Desktop? How can we use those tools, put them in
users' hands for that initial work, really, of doing some kind of discovery and some curation and so
on, but then somehow feed that back into a central model? I mean, I suppose, in a way, have
you thought about ways you can use those tools, and use cloud and use desktop tools, in a way that isn't thrown away, but actually they can contribute back to what the bigger model is?
Absolutely.
I mean, what's missing is some sort of a migration from the desktop tool to the enterprise tool.
Maybe we'll see that in Oracle's product line at some point.
But I certainly think that it's changed my perspective.
I mean, you know, it used to be Tableau was mentioned occasionally in our customers.
And now it's in all of them.
And, you know, we have a lot of customers where, a lot of times, it's IT representation speaking to
us and saying, yeah, how do we get Tableau out of their hands? And I just think that's the wrong
way to look at it. At the same time, there's a lot of things that Tableau and all of the tools
like Tableau don't do well. There is a place for curated content. I think that the problem is that,
you know, at least in the world that I grew up in,
in BI and data warehousing, and I think it's the same with you, Mark, is that we always started
with the curated content, we had to sort of lead with that. I think that the tool-based
or discovery-based approach, or however you want to describe it, does two things for us that
are very, very valuable, and that we should endorse.
That is, one, it helps get our requirements for us, right?
So instead of IT handing over some archaic, never-read-again requirements document and making someone fill it out in triplicate,
we give them a tool that allows them to express what it is they're looking for.
And that could be very, very valuable for trying to capture what it is that's truly in their hearts and what they really want to see.
And there's no better requirement for building a dashboard than something that's dashboard-like.
I think that's a big piece of this. And the second real piece of value that
we get from that is that while IT is building it, the business has something that they can use in
the interim. And I think that that's what IT sort of forgot about is that you'll get it in a year.
It's just not relevant today. And if they don't accept some sort of change to that, they're
going to find themselves even more dinosaurs than perhaps they're already looking like now.
Okay, so that's, I mean, that's an interesting kind of lead into the next thing I want to talk to you
about, really. So, Stewart, you know, you have talked a lot recently about agile,
about IT delivering agile projects for BI, you know, and we have as well. It's been a common theme.
And we've kind of, I suppose the BI industry
has evolved from these very kind of long,
waterfall data warehouse projects
to BI ones that have agile in there.
But the kind of irony in a way with this,
with the whole kind of Gartner report
and the bimodal and so on,
is that it almost kind of obsoletes
everything we talk about, or potentially,
because the last thing, in a way,
that business users want to talk about is agile
and kind of methodologies and so on.
So my question to you, Stewart, is, in a way,
do you think this move towards the business running BI projects
and this bimodal approach and shadow IT,
is it going to kind of lead to, you know, first of all,
is it going to lead to almost like a dark ages of users kind of, you know, individual silos of data?
I mean, what do you think is relevant for agile methodologies and IT methodologies going forward in BI?
Do you think we're getting a listen really?
I think more so than ever.
And let me make that case. So, if we do sort of endorse and embrace the idea that the business is going to lead IT from a discovery perspective, there's still a sort of backfill for enterprise reporting,
at least I would argue there is.
Not everything that comes out of their desktop tool
necessarily needs to go into the enterprise reporting tool.
I think that's the first thing that IT gets wrong
in trying to bridge this gap is thinking that everything must go in.
Well, actually, if Tableau is solving some departmental need that no one else cares about, and that data is not something that anyone else really needs, then it doesn't necessarily need to go in. But if the IT organization, or the more structured architecture side of the house, wants to follow and take some of these requirements that the business is finding on its own,
then what better way is there to sort of rapidly take those requirements and build them than an Agile methodology?
We no longer are looking at the target state of an entire sort of BI application like there's some series of checkboxes that we can hit and be done. I think the idea that they're going to constantly be ahead of IT is a, you know, it's a challenge, but it's,
you know, it's an encouraging challenge. And I think agile methodologies for IT will help them
bring the reporting side closer to the business. And of course, they'll never overtake it,
but perhaps they'll follow closer behind using those sorts of approaches.
It's interesting, isn't it? Because I suppose it's an interesting thing in that the market for cloud BI and these desktop BI, it shouldn't need kind of training and that sort of thing.
And also, in a way, it's all about scratching your own personal itch. But for me, the interesting thing is how we go into customers with the latest generation of BI tools, with the idea of bimodal IT and so on, in a way that reflects that typically these are now
led by the business, they're not led by IT. And so part of the issue we have is trying
to get IT to stop being a blocker and to be an enabler and that sort of thing. But certainly,
I suppose, the business aren't really interested in methodologies. They're not really
interested in the big picture and so on.
And I suppose the hard thing is how do you – we often talk in Agile. You've talked and we've talked about Agile in the past,
having technical debt and having sort of sprints that do things
that don't really kind of do anything that displays on the screen
but make reports run faster or more stable.
And we've often said we would lead that, for example,
via a sprint that's about kind of performance, for example. But in a way, you know, the difficult thing here is,
you know, things go in cycles here, and you can imagine a project that has been led off of things
like sort of Tableau and so on. How do we get across to people the ideas about things like
slowly changing dimensions and things like, you know, the parts of a warehouse that are things you have to do, but don't really kind of give end user value? I mean, how do you kind of deal with
that sort of thing, really?
So I think one of the things to note is that when you talk about like
taking a sprint to do a refactoring of architecture or something like that,
the business has no appetite for that. But generally, they don't have an appetite for that when they're in an environment where they're constantly waiting on IT to give them things.
When they're in an environment where they can lead and they have tools and capabilities that allow them to answer their personal questions or at least their departmental questions, something like Tableau, something like data visualization that can give them answers quickly to what they're looking for,
then they're not necessarily waiting for IT to produce every single piece of data that they're
going to consume. So I think what happens is, and, you know, bimodal implies that there are two modes,
if we embrace that there will be a mode that
innovates and leads and the customer can be in the driver's seat there, then they'll
have more patience for things like refactoring sprints, slowly changing dimensions and those
things because they're not waiting for the first cut of data. What we've always done in data warehousing
prior to this paradigm shift
is not give them anything
until it's fully done or conformed
or at least some piece of it is fully done.
And that's one that they don't have patience for.
But if they have something at their disposal
to answer some questions for them
while they're waiting for that next sprint,
I think they'll be more patient.
Yeah, yeah. I mean, I suppose another big trend in our industry
has been cloud. So if you think about now, users are as likely to go and get a
cloud service to do BI. I mean, I noticed that there was an announcement this
week: Google had their own kind of analytics tool, very similar to the kind of Power BI, and DVCS and that sort of thing from Oracle.
What's it called?
It's called, I mean, it's on Twitter, I
mentioned it. It's got a name that is some
variation of the usual words used for these kind of tools. The idea is it basically acts as a BI tool
on top of Google apps data, so spreadsheets, Google BigQuery and Google Analytics,
but also has, yeah, it's interesting. But one thing, one trend I have noticed in cloud apps
and cloud BI is you can start to end up with silos again. So, you know, every kind of SaaS app that
has a BI reporting tool, you know, is a separate one. And again, leading into some of the things
that you've been talking about in the past, like continuous integration and all the things that we do around BI projects to make sure that
environments match and so on, they're hard to do in the cloud when you've got disparate kind of
sources and so on. You know, in a way, going back again to this doom-mongering thing, are we now
heading into a period when we're going to have lots and lots of silos of BI apps that really
don't have any kind of configuration management across them because they're in the cloud by
different vendors?
Do you see that happening, almost a balkanization of BI going forward?
Well, so there are trends to cloud-based data.
Sorry, cloud-based continuous integration tools.
I mean, there's certainly a lot of CI tools, CircleCI, Bamboo's in the cloud.
There's several Jenkins in the cloud implementations
that will connect to different things that are in AWS, etc.
So they haven't boiled down to BI necessarily, but I am encouraged when I hear what you're
describing with the Google BI tool, and it sounds very similar to AWS's QuickSight, which is similar in that, you know,
in its first go-around, it connects to all the AWS things, right? So it easily can
report and federate over those things, which sounds similar. I think that it's encouraging
that those things are, that those tools and products are running in cloud services that
also have CI and pipeline tools. So I think it's a natural sort of merging of, you know, maybe perhaps finally BI gets
real CI because, you know, we still see with almost every customer we go to, lack of source
control, lack of automated testing, lack of automated migrations.
And that's for any tool.
I mean, take almost any enterprise reporting tool.
What's encouraging about the bimodal approach is that the new tools that are coming out,
BI analytics tools that are running in the cloud, are tending to be on architectures
that are part of the third platform, that are more
developer-centric and developed in more of these CI-type environments.
So it might be that there's a natural extension to providing these capabilities in the cloud.
So I understand what you're saying as far as the business is not necessarily going to say,
hey, we're opening Pandora's box and going with this BI analytics tool.
By the way, we're also going to plug it into the cloud CI capabilities and automated regression capabilities that some other cloud vendor has.
They're not going to naturally do that.
But if there are hooks, the problem is there's no hooks in the BI tools today for doing any
of this.
So it has to be very, very manual.
If there are hooks, and the cloud is always good at providing hooks, we might see some
of these things become natural sort of tiers to delivering BI.
I'm at least hopeful of that.
Yeah, it's interesting.
I think really, you know, to kind of sum up my views on the whole bimodal thing and mode two, mode one, and all this kind
of side is, it's just reflecting the fact that, you know, not every approach is appropriate for
everything. And in the same way, in the same way that you hopefully wouldn't try and build your
entire enterprise reporting off of kind of, you know, a desktop tool, you also wouldn't want to
incur the kind of cost of doing curation and enterprise stuff for everything else. And I think where Oracle had got
themselves into a situation, and probably a lot of the big vendors, is that's the
only way you could do things, really. And the challenge for them a little bit,
and all these vendors, is how to adapt and so on. But, I suppose,
my other side is it's always good if there's customer interest in this thing,
and the fact that there is this innovation, there is this kind of change and so on, it shows there's interest at least. At least
we're not in some kind of area that is kind of out of date, really.
Could I make one more point on that before we move on?
Yeah.
So what's interesting is that Oracle's in a unique position
now with like data visualization desktop and the data visualization service, along with sort of the
enterprise side of the house. They're in a unique situation to perhaps deliver both sides in at
least a consolidated tool set. And I think that's probably where they're betting on their message.
I think it's a good message. But the ironic thing, Mark, is that I'm not sure who's going to hear it, right?
Because the business doesn't want to hear about Oracle's capabilities of bridging that gap because they don't recognize the gap.
IT is too angry that the gap exists to even recognize the need for, at least, you know, I'm painting with a broad stroke here, but a lot of IT
organizations don't want to hear about the tools that bridge the gap because they want to eliminate
the gap by eliminating one side of it. So I'm not sure who's going to hear that message. And I don't
know if there's a cost center in an enterprise anymore that's got the appetite to try to bridge
a gap because the cost center is usually either with IT or the business.
Interesting. Yeah, interesting.
So the last thing, Stewart, I want to talk to you about,
again, is quite relevant to you and me and so on.
Obviously, for anyone, again, who knows myself and Stewart,
we've worked over the years, and the company we work for
has actually collaborated with Oracle
on some reference architectures
for data warehousing in the past, BI and so on.
And then Stewart and I, with Andrew Bond and Doug Cackett,
worked on an updated Oracle information management reference architecture
last year. The idea is that it's then taken by implementers
and by Oracle and by customers and gives a kind of like a canonical example of how you'd architect,
in this case, a big data extended data warehouse platform.
So one that had both kind of, you know, a data warehouse in there,
it had BI and so on, but it tried to incorporate some of the thinking
about data reservoirs, things like real-time loading and
events and so on and so forth. And it had a number of kind of aspects to it, and I thought it'd be
interesting, in a way, to look back to that and think about, in a way, has that
come to pass? Because with a lot of reference architectures, you know, they're
sort of, to an extent, a sales aid, and they're also a slight kind of punt as to what things might turn out to be. And actually, the point here is, when we publish this podcast,
there'll be some show notes, and in the show notes there'll be links to all these things. So if you're
kind of wondering what we're talking about, there'll be links in the show notes to the white
papers, for example, for the architecture and so on. But let me just kind of go through a few
parts of that architecture and just talk through with yourself, Stewart, if it was relevant or not. So
first of all, one of the kind of main innovations in this kind of reference architecture for BI and
data warehousing and big data was the idea of splitting out the innovation layer and the
execution layer. So Stewart, do you want to explain what that is, what it's about, first of all?
Certainly. I mean, if you look at the reference architecture slides,
or at least the diagrams, what you see is a very sort of distinct line between the execution,
which is on top, and the innovation, which is below. And sort of execution kind of talks about,
you know, I don't want to pigeonhole it too much
and say it's kind of what happens with IT. But in some ways, it is sort of the packaging and
hardening of processes. And then below it, you see the innovation side of the diagram. And it talks,
you know, there's the output for the discovery lab there. And, you know, I don't know what your thought is, Mark, but it almost is the two sides of the bimodal.
And so when you think about that side of it, as far as, have they sort of, you know, prescribed what Gartner's talking about? Have they, you know, sort of,
you know, guessed it? It's, you know, it's not really clear. But that execution and innovation
side is trying, at least in my mind, the way I read it, trying to provide a place for
the innovation that we've been describing that might be led by the business,
or at least folks that are working on behalf of the business.
Yeah, I think it certainly has parallels, I think, with the bimodal IT sort of idea.
I mean, I suppose a particular feature of big data and sort of new world projects really for BI
is that typically you don't know what it is you're doing at the start.
You certainly don't know what tools you're going to use.
You don't know what, for example, algorithms you're going to use and so on.
So when we start big data projects, we typically start with, you know, a selection of tools,
maybe some VMs, a set of data.
Now, during that process of working on that, you're going
to arrive at the routines and the algorithms and tools and so on. Now, nothing in that area,
those tools themselves, ideally should go into production with that kind of approach.
They should be taken, hardened and so on and so forth, and put into a kind of business-as-usual
area. Now, where we find projects can then go wrong is, first of all, trying to do that innovation
within a production-style environment. And I've worked on projects with banks, for example, where they've applied the same level of
governance and so on to doing a project in R and Python, where, you know, every five minutes we're
asking for a new package or new whatever. Or where the project has gone well in that first phase,
in the pilot, but then as soon as it starts to be used, the whole
environment gets locked down and then you can't do any more. And that splitting of things, and saying that to
innovate further you need to make sure that that's kept separate, that's really important, I think. And
also understanding that the same level of governance and things like controls over what
tools are used can't apply to the innovation area as well, but you can't then put that into
production afterwards, because it hasn't had that same level of scrutiny and so on. Right. So, you know, we found that an important area.
Yeah, I mean, there always is that question. When you talk about, you know, we've called it the
innovation stream, and maybe this is where we stole it from. I'm sure we must have stolen it
from somewhere. But we call it the innovation stream, and we talk about taking a first pass or second pass or third pass at things, sort of triaging it and trying to figure out whatever we produce out of here, where should it go next?
And I think that the idea of IT and security always comes into play at some point, sometimes sooner than later.
What's interesting about it is that they'll, you know, and again, sometimes this is a broad stroke,
but IT will try to lock down these layers and lock down these sort of discovery labs and whatever
we want to call these shadow IT, if we want to call it that,
applications to do analytics. But then they'll give everybody Toad access to the database,
right? It's like, there really is not that much difference between what's going on in a discovery
lab and somebody querying the source database in real time using Toad.
I think that you've got to get on board with getting people what they need because they simply will go connect to Toad, export the data out,
run around with it on their laptops, look at it in Excel, et cetera.
I mean, I understand that IT
has the best of intentions when they're trying to lock things down, but
people are going to find ways to answer the question so that they can make
better decisions.
Rightly so. Exactly. I mean, so I suppose a broader question
really is the whole idea. I mean, so this was meant to be the successor, in
this case, wasn't it, this reference architecture, to the previous one that
was more based around, you know, a traditional data warehouse and BI and so
on. And every one of these reference architectures that Oracle do,
which we get involved in, each one evolves a bit from the previous one. So
I remember years ago there was one that was very kind of, I've got a
diagram somewhere, it's all in kind of black and red and so on.
It was all the kind of.
You must have liked that black and red.
Yeah, yeah, yeah.
But it was, so that was very database centric.
And then there was a BI one as well.
A question to you is, as much as we talk about big data and we talk about data reservoirs,
are you actually seeing that being used on projects?
Are we actually seeing as much use of Hadoop and Schema on Read and all these things on customer projects, even new ones, as the architecture would suggest?
Or is it a bit of a kind of emperor's new clothes?
Or what are you seeing out there?
Is big data being used in BI?
It's not.
I mean, it's a discussion point.
It's not that it never is.
I mean, obviously, we've seen some of it,
and we've implemented some of it. But, you know, it's always a discussion point,
and everyone's trying to get their head around exactly what is it going to do. I mean,
I've always believed that, you know, if you have the need, you'll find the solution. And I think
that's why companies like LinkedIn, Facebook, Google have produced so many of these technologies is that the sort of consumer set of products, you know, what's available off the shelf just doesn't do this stuff.
But in reality, companies don't – a lot of companies don't have these needs.
We have a lot of companies that are saying we need to implement, of course we have to
implement big data. And it's like, but you know, we also don't have our standard reporting done yet.
And it's the idea of, you know, if you don't even have your known questions
answered, are you ready to take on new questions? So I think what we are seeing though, Mark,
is a bigger focus on the Lambda architecture
and things like Kafka, things like data streaming. I think, you know, for a lot of people,
when Hadoop was sort of the go-to for big data and you start to figure, well, do I load Hadoop
and do I load it first? Do I load my data warehouse from Hadoop? Hadoop's not really great at that.
And it's actually increasing the latency of data getting into my data warehouse.
So I think what we've seen a lot and customers are a lot more responsive to is talking about the streaming stuff because what that does is address a very specific business requirement of getting them data faster. When you start talking about what is the value of
Hadoop, it's sort of amorphous about what that value is. I mean, if you have the need for
offline processing, machine learning, those sorts of things that those innovative companies I
mentioned before are doing, then you need that. But the question is, you don't even know if
you need it. But if you start talking about data streaming, that's immediate value. We can get,
you know, data to both your data warehouse, which is a known quantity, but also perhaps some of
these sort of small fit for purpose, maybe even call it microservices,
these little analytic apps that sort of spawn up all over the place, we can get data in there faster and in a more sort of structured way. I think that's, we're seeing a lot more,
you know, response to that sort of message. One of the things I struggle with on the reference
architecture, you know, I have a lot of respect for Andrew and his whole team, but I've often
tried to figure out if this is describing what's happening or if it's sort of prescribing how it
should happen. And I think that's sort of back and forth on this about whether or not this is a guide to how you should do it, or is it sort of a description of what's happening, and you just sort of accept it.
I'm not positive about that.
What do you think?
And the big data reference architecture in this case came out of, you know, the enterprise architecture team, who aren't particularly aligned to particular products, and they are looking to, I guess they're
looking to find success, in quotes, from our customers. And so this particular reference
architecture, I think, was driven a lot by a particular implementation in a
bank in Spain that was doing this, and I think what they're trying to do is to
reflect things that are working and are providing competitive advantage for customers. So
I don't think this is your classic kind of marketecture, in that it's something
there just to sell products and to put things into slots. I think
there's truth in it, although obviously some of the concepts
in the particular Oracle one to do with, say, sort of event stores and so on are almost to the point of being so abstract as to be, you know, unrecognizable,
really. But I think, certainly, back a bit to is big data being used in BI projects, I mean,
the one thing I don't see being used is big data in the way that it's always described. So whenever
you see anyone talking about big data and the awful kind of three V's, you know, volume, velocity and all that, and it's like sensor readings from smart devices and all that.
But, you know, we don't, I don't, the projects I deal with as a BI implementer typically aren't that kind of thing.
You know, it would be, it would be a customer, for example, who is looking to do, and actually these are ones we actually are doing with customers where they're doing the kind of customer 360 sort of project where they're bringing in data from external sources and it's arriving in sort of JSON format, for example.
And so it's not big data in as much as there's lots of it, but it's coming in a format that probably lends itself better to being stored in HDFS, for example.
And then a tool like, say, Drill is much faster getting to it than, say, a relational tool. And so my analogy is
it's like the space program. You know, in the space program in the 60s, the Americans
put someone on the moon, and there were technologies that came out of it that have been
used ever since. So my point is that some of the things that come out of
it, the flexible storage, and the biggest innovation for me that I'm starting to see customers pick up
on now, is this idea that, you know, you can create a platform that lets you store all the data and then use different processing frameworks on that same data.
So if you think about, say, an Oracle Big Data Appliance, you can put the data on there and you can use Spark, graph analysis, you can do SQL analysis.
It's this ability to land
all your data and actually to do anything with it that's starting to get traction now, really. And
so we're starting to find it's happening. I think there is a slight element of a problem
looking for a solution, yeah. I think certainly every customer, every big customer that we deal
with that is adding capacity to their data warehouse is doing it through Hadoop. So they're not spending any less, for example, on, or having any fewer,
big analytic databases, but any extra capacity is being added by Hadoop. I think, as you said,
streaming is now the new batch. You know, in a way, streaming is how we're doing things.
But what we don't see is almost the things that you hear people talking about.
And if you think about,
one of the challenges that we find in actually implementing big data projects for customers
is even more so than reporting projects.
They kind of don't know,
they often have no idea what they want to do.
And I think there's more of an onus
on us as implementers to go in with ideas.
And I think that's the challenge a little bit
for vendors like Oracle and other ones that only really have a kind of a software play, Microsoft for
example, is they don't go in with the ideas, they just have the platform. And, yeah, you
know, customers, particularly anybody now that's thinking about doing a big data initiative
in 2016, is probably, you know, with respect, not the kind of the Ubers of this world
and so on. And that kind of whole thing of, you know,
we want to do a big data project, but we don't know why,
is a challenge really, isn't it?
Do you think, though, that, you know,
if you're sort of a standard or traditional organization
and you've decided to do a BI initiative,
or sorry, a big data initiative,
and you're sort of trying to, you know,
find out what to use this
for, are you going to get any value? I mean, from my perspective, and the answer may be yes, but
from my perspective, you're looking for a big data implementation because you have a need and
you haven't been able to solve it with traditional tools. You know, you've got a lot of files coming from some sort
of system or systems. They're dumping in a directory, you know, just gigabytes, millions
or billions of rows a day. And you've been trying to load those into a relational database for years
and you've been, you know, you haven't been able to solve that. Or, you know, I
think when those sorts of organizations are looking for some sort of a solution and
they find big data, then, you know, everything aligns. But it's when the organization says,
okay, we're not doing big data, we know we should be, so let's do big data, that there's not going
to be anything that comes out of that in most cases, at least in my opinion.
But I think that you're obviously correct. So I think any IT project or any
project that hasn't really got a kind of business purpose is, right. But sometimes that can help
when someone's boss, for example, suddenly decides they want to have a big data project, and up until
then, you know, their staff have been ignored on it. And I think that it has been a
catalyst in some cases for us where, you know,
a boss several levels above has said we must do a big data project, but then
actually below them they've got loads of ideas. And so sometimes that kind of very, yeah, very
kind of just spur-of-the-moment thing can be good. But we certainly find that a project
that hasn't got a purpose is doomed, and the ones that we find that have got success are
either the straightforward IT ones,
where they've got a very clear idea of creating a platform or doing things cheaper and so on,
or the ones that have really come out of a kind of a skunkworks thing in, say, marketing, where they've
started probably with a shadow IT project and very quickly realized then they can't store the data
and so on, and they almost then welcome IT, because at that point it's too big to handle
and so on, really.
And so, I mean, another interesting area of this
is the vendors selling tools in this area.
So again, talking of Oracle, for example,
where I'm not seeing a lot of take-up
is tools like, say, Big Data Discovery,
these kind of big data enabled BI tools, where they're trying to
sell too many concepts into the customer. If you think about, I mean, every vendor has these things
there where you're trying to sell not only the concept of kind of data discovery, for example,
but Hadoop as well. And it's an interesting place to be in really, where there's a lot of demand
for these tools and projects, but I'm not sure that every implementer and every vendor is going to be successful with this, really.
If you think about where we sort of started our conversation today, which was about sort of end user business focused discovery.
And, you know, that doesn't necessarily align with big data implementations, right? Because users are, of course, if they're
really sophisticated users that are looking for, you know, and I'm thinking data scientists here,
those people will have a lot of success if IT happens to implement an enterprise-wide
big data initiative. But if we're looking at sort of the bimodal side where sort of
the business is leading, the last thing they want is something even more difficult to query
than a relational database. What they ideally want is Excel files, right? So I think that,
you know, but I absolutely see your point about what happens when a CIO or someone steps into a position and says we must have big data.
And it just so happens that down below there have been a lot of requirements that would have been satisfied by big data earlier.
Then there's a lot of success.
I think what's interesting is that it – I just did a podcast earlier today with someone that works for you, Michael Rainey, and he mentioned organizations like Google and Facebook where everyone's an engineer.
That was the terminology he used.
Those organizations will thrive with big data initiatives where you've got a lot of coders and all they need is access to the data.
But I think what's usually more the point is that in standard organizations, there's very few engineers.
And perhaps that's the play for implementers like us to try to bridge that gap. But I think it's problematic when really what they have is a Tableau-like
tool where they want really not very complex data stores. And Hadoop is not that. You have
to apply the schema as you read it. And that's even sometimes more challenging for the business,
if that makes sense.
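As a rough illustration of the "apply the schema as you read it" point, here is a minimal Python sketch. The sample records, the field names and the read_with_schema helper are all hypothetical and were not discussed on the show; the idea is only that the raw data is stored untouched and each reader has to supply its own types and defaults at query time.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (nothing enforced on write).
RAW_LINES = [
    '{"customer_id": "42", "spend": "19.99", "channel": "web"}',
    '{"customer_id": "7", "spend": "5", "extra_field": true}',
    '{"customer_id": "13"}',
]

# The schema lives with the reader: field name -> (type converter, default value).
SPEND_SCHEMA = {
    "customer_id": (int, None),
    "spend": (float, 0.0),
}


def read_with_schema(lines, schema):
    """Parse each raw line and project it onto the reader's own schema."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            # Missing fields fall back to the default; unknown fields are ignored.
            row[field] = convert(value) if value is not None else default
        yield row


rows = list(read_with_schema(RAW_LINES, SPEND_SCHEMA))
total_spend = sum(row["spend"] for row in rows)
```

The work of deciding types, defaults and which fields matter has moved from the load process to every consumer, which is the extra burden on business users being described here.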
Yeah, certainly. So, I mean, yeah, absolutely. Yeah. And I'm conscious of time now.
So there's one last thing I wanted to ask you really,
which is I've been,
I mean, one thing I've been doing recently
is thinking about what would,
in five years' time, 10 years' time,
you know, what would an analytic platform look like?
So if you were to go into one of our customers
in a few years' time,
what would it look like?
I mean, a statement that I've been making,
you know, I tend to make fairly sweeping statements, is I think that all analytic
workloads will move to Hadoop or Hadoop-style technology in the next five years or so.
And certainly transactional workloads, you know, EBS and ERP-type things, will
be on, say, Oracle or SQL Server or whatever.
But all analytic workloads, I think, will move to Hadoop.
What do you think of that statement?
And do you think that's true or what would you say?
I think it may not happen in that timeframe that you mentioned,
but I do agree with you that it'll happen.
And I'll have one small caveat. I could see a world where relational is dead,
but we still have Hadoop or Hadoop-like, something Hadoop-like, and frankly, something EPM-like, right?
So, so much of the business close – so many businesses close their books on, you know, talking about Hyperion financial reporting-like technologies.
I can't imagine closing the books on Hadoop, right?
So I can certainly see a world where the relational gets squeezed out.
It's getting squeezed from both sides if you think about it.
It's getting squeezed from the architectural IT play from cheaper and in some ways more robust solutions from the big data world, or maybe
not more robust, but at least more fast moving.
And then they're getting squeezed, relational technology is getting squeezed on the other
side by the business and the OLAP style, Hyperion type products that are specifically focused
at finance, closing the books and financial reporting.
You could see both sides of that squeezing relational out in such a way that it's gone
with data processing frameworks that become more and more robust. I'm thinking Kafka and Spark
that start providing data directly to EPM-style solutions for closing the books,
I could definitely see a world where that happens.
It's interesting, isn't it?
I mean, I think that we're probably saying this,
and there'll be people who would listen to us and say,
that's the worst thing you could say.
How can you say that relational databases would be wiped out?
But I suppose one thing I do think is there were probably massive advocates of hierarchical databases back in the sort of 70s and 60s with COBOL
who would say, you know, why would you have things differently? And I think everything moves on. And
I mean, to my mind, you know, if you take a high-end relational database like,
say, Oracle or DB2, there's so much in there that Hadoop is not trying to build out,
so things like, you know, rollback and stuff, and even things like primary keys and so on. And that
workload, I think, certainly for now, the business of doing transactions and having relational data
sets and so on, for businesses it makes sense. But certainly I think there's two things
that particularly mean that analytic workloads will move to Hadoop. And one is just everything for that kind of workload is about doing it fast and large and so on.
And it scales so well.
Second thing is just the frightening pace of innovation.
I mean, if you look at, say, what Cloudera have been doing, you know, with things like Kudu and Impala, and there's Drill, obviously, from MapR.
How can Oracle, how can anybody keep up with this? It's at the point where I would say that an analytic platform built now on, say, Cloudera CDH or
Hortonworks or whatever is probably actually, you know, feature-wise better than some of the
stuff we see from the big vendors. And I think certainly, in the same way that Linux
wiped out the commercial, you know, proprietary Unix. That was just natural, wasn't it?
But there's still going to be Windows on desktops and so on.
But I know it's an interesting kind of thing, really.
And I think that certainly, you know, streaming,
doing things in memory, clustered and all that is where it's going.
The one thing is, will it all be abstracted away in the end, though,
and just run in the cloud?
And actually, in the end, it's just this amorphous sort of service.
I think you're right.
I mean, at least, you know,
if you think of the reason that data warehouses initially
went into relational databases,
there was nothing else.
And it was the easiest thing to do.
I think that now, with so many other options,
and, you know, we spend a lot of time,
especially in the Oracle database or
databases that initially began as transactional databases, when we're doing analytics
in those databases, trying to find ways to get around the basic constraints of a transaction.
You know, an Oracle database is built around the concept of a transaction, and that slows down, you know,
the bigger loads and bigger queries. So I think that there is a natural
extent. We don't need transactions necessarily in your average analytic app, right? We
don't need ACID. We don't need those things. And we just need to get the data in and
we need to get rough estimates on the numbers. So I think it makes a lot of sense for organizations
to start with analytics as a place to cut costs. I think there's a cost-cutting side that's driving this, and I never want to necessarily dispute that that's a reason to do things.
But in the big data world, the schema on read side is where so much value can be had from these processing frameworks.
As you said, it becomes sort of unimportant how you process the data.
You choose the framework or the service or whatever that's the easiest for you.
That being said though, Mark, one of the things that data warehouses have always provided
and semantic layers and all of that is code reuse, single version of the truth, etc.
Do you think we'll lose that in the Hadoop powered data warehouse? Or do we leave that
sort of single version of the truth for the finance and the Hyperion style applications?
I think, certainly from my perspective, I think one of the things that will change about
data warehouse projects in the future is just automating
and abstracting away and taking away some of the work that we do now. If you think about
doing a slowly changing dimension data warehouse, you know, even now with ETL tools and
so on, there's still work involved, isn't there? You know, it's not nearly as much as before.
Should that need to be something we have to select and build each time? And
if you look at some of the things coming through from the vendors now around data preparation,
where they're using, for example, kind of classification routines to automate the kind of data profiling,
but also to do things like trying to work out the name of a column based on the values in there.
I think a lot of the stuff, in the same way that cloud in general is trying to consumerize things,
I think the work involved in doing this will kind of go away. And the days when we
sit there and choose which, in ODI terms, knowledge module we pick for a
slowly changing dimension, that will go away. But there's no such thing as a free lunch, and I think
you still have to think about how do you integrate data, how do you handle history
and so on. And really, when we talk about
cloud in general and we're talking about companies and customers and so on, you know, there's always
still work to be done. Integration and cleansing and things, they still take time, really. And
I think that those things will always involve work, but if we can automate part of it, and if we can,
you know, take away some of the low-level detail, that's good, really. The
other thing really is, it's an interesting thing to think that we sell these kind of things,
but then we go into a customer site, you know, in a few weeks' time, where they're using BI tools from
10 years ago. And I think that's the, as much as we're sitting here thinking, you know, is it going
to be the future, is the future kind of classification or Drill, you know, in a way it's like, who
was the author that said that, you know, the future is here, but it
arrives in kind of, you know, different speeds in different places.
It's, you know, things change a lot, but things never change much at all today in some respects.
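For listeners who haven't built one, the slowly changing dimension handling mentioned above can be sketched in a few lines of Python. This is a minimal Type 2 example under stated assumptions: the row layout, the apply_scd2 helper and the sample customer are all hypothetical, and it is not how ODI knowledge modules or any particular ETL tool implement it.

```python
from datetime import date


def apply_scd2(dimension, key, new_attrs, effective):
    """Type 2 change: close the current row if attributes differ, then append a new version."""
    current = next(
        (row for row in dimension if row["key"] == key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # nothing changed, so history is left alone
        current["valid_to"] = effective  # end-date the old version
    dimension.append(
        {"key": key, "attrs": dict(new_attrs), "valid_from": effective, "valid_to": None}
    )
    return dimension


# Hypothetical customer who moves city once; the repeated load in March changes nothing.
customer_dim = []
apply_scd2(customer_dim, "C1", {"city": "Leeds"}, date(2016, 1, 1))
apply_scd2(customer_dim, "C1", {"city": "Leeds"}, date(2016, 3, 1))
apply_scd2(customer_dim, "C1", {"city": "London"}, date(2016, 6, 1))
```

Even in this toy form, someone still has to decide which attributes are tracked and what counts as a change, which is the "how do you handle history" work that doesn't go away.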
Yeah, I mean, I agree.
It's interesting to see, like you said, that we still go into customers and they're using BI tools from 10 years ago.
And the thing I would add on that is, and not using them in the proper way or even effectively. So what's interesting is that they're behind, well, not all customers,
obviously, and especially not mine. You are all wonderful. But there are some customers out there
or organizations out there that, you know, they're looking at, well, we should probably go do a BI,
sorry, a big data project. But their BI project is so poorly done. So maybe those organizations,
it's time to rethink maybe a greenfield implementation using more sort of modern
analytic tools. I want to make one more point. You said, well, what will the platform look like? I think that what we might see
in the future around analytic apps is there is no central place for them,
except for the finance side. And I think the finance side might completely be offloaded to
Hyperion style applications at some point. But then everything else just is being built and maybe it's being built in custom code.
Maybe it's being built but almost in a microservices style.
And if you sort of get on board with the data streaming frameworks, that becomes much easier to produce just-in-time analytics in small doses.
And I think if finance can close the books some way, maybe we don't
need that single version of the truth for everything else. I don't know.
It's interesting. I think for me, I mean, it's great that the industry is still kind of relevant.
It's great that we're still talking about it. And it's great that there's, if you think about
the fact that you pick up a copy of The Economist or Time and they lead stories about a business
that is innovating through analytics and so on.
And it's great to be in a part of the industry, you know, of IT, that is so popular.
And I think from a personal side, you know, a lot of the work in IT was outsourced to India and places like that,
and really analytics was the one thing that was saved because it is so personal to the people you're doing it with.
And it is so kind of important to the business.
And it's such a kind of thing that, you know,
it benefits from consultancy
and it benefits from expertise, really.
So I think we lucked out really, in a way,
being in this kind of area.
And it's interesting to see how the vendors go
and how that might change over time and methodologies.
But it's great to be in an area
that is still so kind of popular and buzzing, really.
Yeah, it's where so much innovation is happening.
It's crazy.
I mean, when you go look at startups, you know, you're going to hear about, you know,
10 new startups, 10 new hot startups.
And, you know, seven of them are doing something around analytics.
Yes.
I mean, it's really, you know, we're at the cusp, man.
Excellent.
And I'm here just because I followed you.
Exactly.
So Stuart, it's been great speaking to you. I mean,
it's great to have you as our first guest.
So thank you very much for coming on the show.
Oh, it's my pleasure, Mark. I look forward to
listening to this
and all the future episodes
you're going to have. I mean, it's just going to be fantastic.
I can't wait to
catch the feed in my iTunes. It should be excellent.
Cheers, Stuart. Okay, thank you. And thanks, everyone. Goodbye.