Drill to Detail - Drill To Detail Ep.1. 'After The Gartner BI&A Magic Quadrant 2016', With Special Guest Stewart Bryson

Episode Date: September 20, 2016

...

Transcript
Starting point is 00:00:00 Hello and welcome to Drill to Detail, a new podcast series hosted by me, Mark Rittman, where I'll be commentating on the big data, business analytics and data warehousing industry, along with a guest who's either building the analytics platforms that sit behind all those web startups and companies disrupting every industry these days, or, like me, actually implements, consults, analyzes, and commentates on what must be one of the hottest, most newsworthy and topical areas of the IT industry these days. So I was inspired by John Gruber's Apple-focused podcast series, The Talk Show, where he interviews a mix of industry commentators, Apple execs in his case, and a sort of one-off guest with insights into a particular area. And in this inaugural episode of the podcast series, I'm very pleased to be joined by
Starting point is 00:00:54 Stewart Bryson, who's an old friend, a colleague from the past, and someone who's probably very well known in the Oracle BI, data warehousing and analytics industry. Probably the second most popular speaker on the conference circuit. So Stewart, why don't you introduce yourself to the listeners and tell us who you are? Absolutely, yeah. So my name is Stewart Bryson. Like Mark, I'm an Oracle ACE Director in the Oracle space. I'm a co-founder of a company called Red Pill Analytics. And I've been working with Oracle technologies for about 18 years. I started soon after college.
Starting point is 00:01:30 I took a bit of a walk on the wild side, worked for Informix Software for a couple of years. I was on their SWAT team. And that kind of got me into professional services. And it's also what introduced me, really, to data warehousing and BI. So I started doing data warehousing on a different database than Oracle, but I soon saw the light, got back on the red path. And so, you know, I worked for myself for quite a long time as a consultant slash contractor. But all that while I was following this very good blog from this interesting gentleman over in the UK who seemed to lay it all out there every couple times a week, really. You were a
Starting point is 00:02:15 pretty verbose blogger at that time, Mark. And I just knew that if I ever had the chance, I'd want to work with you. And that dream came true. I worked six or seven years there at Rittman Mead, quite proud of my time there. And recently, we're looking at the two-year anniversary of our company: about two years ago, I joined another colleague I respected, Kevin McGinley, and we started Red Pill Analytics, and the rest is sort of history. Excellent, Stewart. Thanks. Well, it's great to have you on here. And I couldn't think of anybody else I'd want to have on the first ever episode, really, because like me, you've got a lot of opinions.
Starting point is 00:02:51 You've been doing this for a long time. And really, in this first podcast, I wanted to talk about some things that are very topical in our industry, but actually are very relevant to you as well in the kind of areas that you work in. So some people within, you know, the BI kind of world, a lot of people probably, would have seen the Gartner Magic Quadrant that came out recently, the BI and Analytics Magic Quadrant. So that was, I think, around about sort of January, February of this year, and it was fairly kind of dramatic, very big news, because Oracle, for example, the company that we work with mostly, was actually dropped from the Magic Quadrant, when, you know, a year before that, it was in the Leaders Quadrant.
Starting point is 00:03:27 And so what happened in that period, as Stewart knows, was that Gartner kind of redefined what they considered to be a modern kind of BI tool. And they started talking about the fact that really BI these days is led by the users rather than by IT. And really, what constitutes in their mind a kind of modern BI tool is not a tool that is led by IT. And there were concepts in there of things like bimodal BI and that sort of thing. And so I thought it'd be interesting to kind of talk about in this first bit, Stewart, as someone who has worked in this area,
Starting point is 00:03:59 you know, are you seeing this kind of idea? Are you seeing this thing in the market where, you know, I suppose tools like Tableau are being used more? Or is that more kind of just hype from Gartner? There's a couple of things going on here, I think. First off, you know, it's my understanding and, you know, I can't quote anybody on this, but Oracle chose not to participate. And that's why they're not in there at all. Now, I can't validate that, but that they knew that they would not be judged
Starting point is 00:04:26 on their newest product set and chose not to participate. I don't know if that's the case. Regardless, I don't think they would be probably where they're used to being anyway. I mean, I think there's something interesting about what Gartner is doing. I have to agree with them to one degree in that they are certainly describing what's happening. I think that the idea to break enterprise reporting, as they've done, away from BI and analytics is probably a smart approach as far as how to categorize these things. At the same time, there's a difference between describing and prescribing what we think necessarily should happen. And I think when you look at, it's not just in analytics really, it's in a lot of what Gartner is doing now. They sort of have this distaste for IT in general.
Starting point is 00:05:15 And I think that's coming out in the way they're categorizing things. But it certainly does, Mark, describe what's happening. And I think maybe even this has been happening with analytics longer than perhaps other, you know, we'll say development platforms or infrastructures or applications. And, you know, even with BI reporting tools, folks have been pulling that data out and trying to put it in other more adaptive tools. I think that's been going on for a long time. Whether you wrap it up into a term like bimodal or third platform or whatever, however Gartner wants to describe it, I think, you know, we've been seeing that for a while with desktop tools, for instance. So I think for anybody who's on this who hasn't read the Gartner report,
Starting point is 00:06:09 or certainly the detail of it, just to highlight what that is really. So the Gartner report talked about, and this is a theme I think they've had through a lot of the reports they do, because they obviously have so many segments and markets they cover, the concept of bimodal IT, where there are some types of projects that are really kind of done best as a very structured project by IT over many months, in many phases and so on, so putting in your ERP system or something kind of fairly big. But some projects are better done more kind of, maybe even by
Starting point is 00:06:42 that shadow IT so where the initiative comes from the business and they buy what they want and they start small and build out from there. Now, what they're saying, obviously, in this magic quadrant is that the innovation and the license spend is happening in this area here. Now, where that affects Oracle and vendors like Oracle, so for example, IBM and Cognos and so on,
Starting point is 00:07:03 is if people aren't buying your tools, then that's obviously an issue. But I think for us as practitioners, you know, is this just hype, this self-service IT, you know, way of doing things, with no data modeling or kind of optional data modeling? I mean, Stewart, this has been your industry. This has been your kind of lifetime. You know, do you think there's value in the statement to say BI should make data modeling and data curation optional? So optional permanently, I don't believe. However, I do think the concept of not necessarily starting there is meaningful. I mean, if you look at what, you know, if you were to launch an analytics platform from scratch today, you would certainly do something Lambda-like, meaning that you would try to address the streaming side of it and
Starting point is 00:08:06 address the sort of more batch-oriented side, you'd probably try to do both of those. Because if you're building anything new today, it's going to have a very heavy mobile side of it. And sitting around waiting, a mobile application needs access to things quickly, especially if it has anything to do with customer, etc. So I think that there's certainly value in making data readily available, and it doesn't have to be heavily curated. I mean, it's usually some small bite-sized morsel of analytics that needs to go into those applications or go into mobile apps. So we certainly need to make analytics available quickly. But we don't necessarily need full-fledged, curated, conformed models available immediately. So I think that, you know, when I started talking
Starting point is 00:09:01 about sort of agile BI or sort of model-driven, I think is what I called it back in the day, and I kind of took a step from you on that when I started looking at that stuff, was that we wanted to just be more adaptive. We wanted to try to push these enterprise tools to their limits. I think what I've discovered in that time and now, I think there's still a lot of value in approaching enterprise reporting platforms in that way. But really leading with easier analytics tools, sort of having what we call at our company an innovation stream. And we may have stolen that from someone. I'm not sure exactly where we got it. But sort of leading with an innovation stream and kind of following with enterprise IT tools, I think, is a little bit more realistic and a little bit more what's happening anyway. I think that that gap between leading with something a little bit more innovative and following with something a little bit more structured is where the gap is today. I think that that follow on is not happening.
Starting point is 00:10:05 Yeah, so a lot of work I've been doing, particularly in the last kind of year, has been around sort of big data and data reservoirs and this sort of thing. And there's a couple of things that come out of that. So you're never going to use a tool like OBI, for example, or Cognos or whatever, probably as your first tool you'd access Hadoop data with.
Starting point is 00:10:23 Because of the schema-on-read part, typically you're spending as much time getting schema out of it. But typically you want to just see what data is in there. And particularly now with tools like Apache Drill, where it can point to data that has schema built into it, or where it certainly can be read out of there. So JSON and CSV and so on. I think that certainly the ability to select the way
Starting point is 00:10:44 in which you deal with metadata, you know, that metadata might come from the data itself. It might be added by the users themselves as a part of their curation. I think that rather than saying all metadata must be formally defined, you know, through a dimensional model and so on, you know, it has its place. And we, you know, you and I have been teaching it for a long time. But I think, you know, it's about having the right approach at the right time, really. Where it gets interesting for vendors like Oracle, for example, is how they kind of deal with that world. And certainly I'd be interested in your view in a second on tools like DV Desktop, for example, with Oracle. I mean, one of the things
Starting point is 00:11:18 we're trying to do with OBI 12c projects, and any project really where traditionally it's been an enterprise BI project for us, is to think how can we incorporate some of this thinking in there? How can we get the users to be involved in the curation process and so on? For you, have you thought about different ways of doing projects now because of this, or different tools? Absolutely. I mean, if you really look at it, even for enterprise BI tools, there's always been a certain set of users, and for some organizations a large set of users, that dumped that data out and took it to something like Excel anyway. So I think that all tools like Tableau have done is make that a little more
Starting point is 00:12:02 visual and perhaps a little easier. I think that we can harness these things that are going to happen anyway. I mean, trying to cut off this discovery and innovation side of analytics is a fool's errand. I think that you're not going to be able to stop it, number one, nor should you. I mean, innovation should drive execution, really. And I think that, you know, not acknowledging that is sort of a problem that, you know, IT has themselves to blame for a lot of this. I mean, if you look at, you know, over the last year, I've been doing a lot of talks on mashups, self-service, that sort of thing. And it gets the developers and business folks in the room kind of jazzed and nodding their heads with me. But there's always someone from IT that says, you know, we can't open Pandora's box. We can't allow this. And for that
Starting point is 00:13:00 reason, I think IT has just kind of put blinders on to a certain degree, because it's going to happen. The question is, is it going to happen in an environment that you can control, or are you going to simply push folks out? And I think that's what's happening. They're going to the cloud. They're going to shadow IT. They're going to sanctioned shadow IT, if that's such a thing now, because that's where the money's flowing to. So in that case, it's almost not shadow IT, it's sanctioned IT, but just from a non-enterprise architecture. I think that IT, you know, has to bear some blame
Starting point is 00:13:38 for this. I mean, what do you think? Well, it does. I mean, we'll get onto that in a second. And really the second topic I want to talk to you about is how the IT part plays its part in this kind of thing really, and whether agile is still relevant and so on. But I guess before we go into that, just a quick thing really. The area we work in, Stewart, is kind of Oracle BI, and there's been a load of new BI tools come out from them recently. So there's been Data Visualization Desktop, there's been the cloud things and so on. Certainly from my side, being candid, you know, certainly when BI Cloud Service came out,
Starting point is 00:14:11 I struggled, we struggled really to think of what we could do with it because it was such a different market to the market we're in. You know, very departmental, very self-service and so on. But we've been trying to think, you know, how can we use tools like data visualization desktop
Starting point is 00:14:25 which is their equivalent of kind of Tableau Desktop, how can we use those tools, put them in users' hands for that initial work really of doing some kind of discovery and some curation and so on, but then somehow feed that back into a central model? I mean, have you, I suppose in a way, have you thought about ways you can use those tools, and use cloud and use desktop tools, in a way that isn't thrown away, but actually they can contribute back to what the bigger model is? Absolutely. I mean, what's missing is some sort of a migration from the desktop tool to the enterprise tool. Maybe we'll see that in Oracle's product line at some point. But I certainly think that it's changed my perspective.
Starting point is 00:15:08 I mean, you know, it used to be Tableau was mentioned occasionally in our customers. And now it's in all of them. And, you know, we have a lot of customers, and a lot of times it's IT representation speaking to us, saying, yeah, how do we get Tableau out of their hands? And I just think that's the wrong way to look at it. At the same time, there's a lot of things that Tableau and all of the tools like Tableau don't do well. There is a place for curated content. I think that the problem is that, you know, at least in the world that I grew up in, in BI data warehousing, and I think it's the same with you, Mark, is that we always started
Starting point is 00:15:51 with the curated content, we had to sort of lead with that. I think that the tool-based or discovery-based approach, or however you want to describe it, does two things for us that are very, very valuable, and that we should endorse. That is, one, it helps get our requirements for us, right? So instead of IT handing over some archaic, never-read-again requirements document and making someone fill it out in triplicate, we give them a tool that allows them to express what it is they're looking for. And that could be very, very valuable for trying to capture what it is that's truly in their hearts and what they really want to see. And there's no better requirement for building a dashboard than something that's dashboard-like.
Starting point is 00:16:41 I think that's a big piece of this. And the second real piece of value that we get from that is that while IT is building it, the business has something that they can use in the interim. And I think that's what IT sort of forgot about, is that "you'll get it in a year" is just not relevant today. And if they don't accept some sort of change to change that, they're going to find themselves even more dinosaurs than perhaps they're already looking at now. Okay, so that's, I mean, that's an interesting kind of lead into the next thing I want to talk to you about, really. So, Stewart, you know, you have talked a lot recently about IT delivering agile projects for BI, you know, and we have as well. It's been a common theme.
Starting point is 00:17:25 And we've kind of, I suppose the BI industry has evolved from these very kind of long, waterfall data warehouse projects to BI ones that have agile in there. But the kind of irony in a way with this, with the whole kind of Gartner report and the bimodal and so on, is that it almost kind of obsoletes
Starting point is 00:17:42 everything we talk about, or potentially, because the last thing, in a way, that business users want to talk about is agile and kind of methodologies and so on. So I put a question to you, Stewart: in a way, do you think this move towards the business running BI projects, and this bimodal approach and shadow IT, is it going to kind of lead to, you know, first of all,
Starting point is 00:18:01 is it going to lead to almost like a dark ages of users with kind of, you know, individual silos of data? I mean, what do you think is relevant for agile methodologies and IT methodologies going forward in BI? Do you think we're getting a listen really? I think more so than ever. And let me make that case. So, if we do sort of endorse and embrace the idea that the business is going to lead IT from a discovery perspective, there's still a sort of backfill for enterprise reporting, at least I would argue there is. Not everything that comes out of their desktop tool necessarily needs to go into the enterprise reporting tool.
Starting point is 00:18:59 I think that's the first thing that IT gets wrong in trying to bridge this gap, is thinking that everything must go in. Well, actually, if Tableau is solving some departmental need that no one else cares about, and that data is not something that anyone else really needs, then it doesn't necessarily need to go in. But if the IT organization, or the more structured architecture side of the house, wants to follow and take some of these requirements from the business that they're finding on their own, then what better way is there to sort of rapidly take those requirements and build them than an Agile methodology? We no longer are looking at the target state of an entire sort of BI application like there's some series of checkboxes that we can hit and be done. I think the idea that they're going to constantly be ahead of IT is, you know, it's a challenge, but it's an encouraging challenge. And I think agile methodologies for IT will help them bring the reporting side closer to the business. And of course, they'll never overtake it, but perhaps they'll follow closer behind using those sorts of approaches.
Starting point is 00:20:20 It's interesting, isn't it? Because I suppose it's an interesting thing in that the market for cloud BI and these desktop BI tools, it shouldn't need kind of training and that sort of thing. And also, in a way, it's all about scratching your own personal itch. But for me, the interesting thing is how we go into customers with kind of the latest generation of BI tools, with the idea of kind of bimodal IT and so on, given that typically these are now led by the business, they're not led by IT. And so part of the issue we have is trying to get IT to stop being a blocker and to be an enabler and that sort of thing. But certainly, you know, I suppose the business aren't really interested in methodologies. They're not really interested in the big picture and so on. And I suppose the hard thing is how do you... We often talk in Agile, you've talked and we've talked about Agile in the past, about having technical debt and having sort of sprints that do things
Starting point is 00:21:14 that don't really kind of do anything that displays on the screen but make reports run faster or more stable. And we've often said we would lead that, for example, via a sprint that's about kind of performance. But in a way, you know, the difficult thing here is, you know, things go in cycles, and you can imagine a project that has been led off of things like sort of Tableau and so on. How do we get across to people the ideas about things like slowly changing dimensions and things like, kind of, you know, the parts of a warehouse that are things you have to do, but don't really kind of give end user value? I mean, how do you kind of deal with that sort of thing, really? So I think one of the things to note is that when you talk about like
Starting point is 00:21:55 taking a sprint to do a refactoring of architecture or something like that, the business has no appetite for that. But generally, they don't have an appetite for that when they're in an environment where they're constantly waiting on IT to give them things. When they're in an environment where they can lead and they have tools and capabilities that allow them to answer their personal questions or at least their departmental questions, something like Tableau, something like data visualization that can give them answers quickly to what they're looking for, then they're not necessarily waiting for IT to produce every single piece of data that they're going to consume. So I think what happens is, you know, bimodal implies that there are two modes. If we embrace that there will be a mode that innovates and leads and the customer can be in the driver's seat there, then they'll have more patience for things like refactoring sprints, slowly changing dimensions and those
Starting point is 00:22:57 things because they're not waiting for the first cut of data. What we've always done in data warehousing prior to this paradigm shift is not give them anything until it's fully done or conformed or at least some piece of it is fully done. And that's one that they don't have patience for. But if they have something at their disposal to answer some questions for them
Starting point is 00:23:23 while they're waiting for that next sprint, I think they'll be more patient. Yeah, yeah. I mean, another, I suppose another big trend in our industry has been cloud. So if you think about it, now users are as likely to go and get a cloud service to do BI. I mean, I noticed that there was an announcement this week: Google had their own kind of analytics tool, a bit, very similar to the kind of Power BI, and DVCS and that sort of thing from Oracle. What's it called? It's called, uh, I mean, it's on Twitter, I mentioned it. It's got a name that is some variation of the usual words used for these kind of tools. The idea is it basically acts as a BI tool on top of Google apps data, so spreadsheets, Google BigQuery and Google Analytics,
Starting point is 00:24:05 but also has, yeah, it's interesting. But one trend I have noticed in cloud apps and cloud BI is you can start to end up with silos again. So, you know, every kind of SaaS app that has a BI reporting tool, you know, is a separate one. And again, leading into some of the things that you've been talking about in the past, like continuous integration and all the things that we do around BI projects to make sure that environments match and so on, they're hard to do in the cloud when you've got disparate kind of sources and so on. You know, in a way, going back again to this doom-mongering thing, are we now heading into a period when we're going to have lots and lots of silos of BI apps that really don't have any kind of configuration management across them because they're in the cloud by
Starting point is 00:24:44 different vendors? Do you see that happening, almost a balkanization of BI going forward? Well, so there are trends to cloud-based data... sorry, cloud-based continuous integration tools. I mean, there's certainly a lot of CI tools: CircleCI, Bamboo's in the cloud. There's several Jenkins-in-the-cloud implementations that will connect to different things that are in AWS, etc. So they haven't boiled down to BI necessarily, but I am encouraged when you see what you're describing, the Google BI tool, and it sounds very similar to AWS's QuickSight, which is similar in that, you know,
Starting point is 00:25:26 they're sort of their first go around, it connects to all the AWS things, right? So it easily can report and federate over those things, which sounds similar. I think that it's encouraging that those things are, that those tools and products are running in cloud services that also have CI and pipeline tools. So I think it's a natural sort of merging of, you know, maybe perhaps finally BI gets real CI because, you know, we still see with almost every customer we go to, lack of source control, lack of automated testing, lack of automated migrations. And that's for any tool. I mean, take almost any enterprise reporting tool.
Starting point is 00:26:07 What's encouraging about the bimodal approach is that the new tools that are coming out, BI analytics tools and that are running in the cloud, are tending to be on architectures that are part of the third platform that are more developer-centric and developed in more of these CI types environments. So it might be that there's a natural extension to providing these capabilities in the cloud. So I understand what you're saying as far as the business is not necessarily going to say, hey, we're opening Pandora's box and going with this BI analytics tool. By the way, we're also going to plug it into the cloud CI capabilities and automated regression capabilities that some other cloud vendor has.
Starting point is 00:27:00 They're not going to naturally do that. But if there are hooks, the problem is there's no hooks in the BI tools today for doing any of this. So it has to be very, very manual. If there are hooks, and the cloud is always good at providing hooks, we might see some of these things become natural sort of tiers to delivering BI. I'm at least hopeful of that. Yeah, it's interesting.
Starting point is 00:27:23 I think really, you know, to kind of sum up my views on the whole bimodal thing and mode two, mode one, and all this kind of side is, it's just reflecting the fact that, you know, not every approach is appropriate for everything. And in the same way that you hopefully wouldn't try and build your entire enterprise reporting off of kind of, you know, a desktop tool, you also wouldn't want to incur the kind of cost of doing curation and enterprise stuff for everything else. And I think where Oracle had got themselves into a situation, and probably a lot of the big vendors, is that's the only way you could do things, really. And the challenge for them a little bit, and all these vendors, is how to adapt and so on. But it's kind of, I suppose,
Starting point is 00:27:59 my other side is it's always good if there's customer interest in this thing, and the fact that there is this innovation, there is this kind of change and so on, it shows there's interest at least. At least we're not in some kind of area that is kind of out of date really. Could I make one more point on that before we move on? Yeah. So what's interesting is that Oracle's in a unique position now with, like, Data Visualization Desktop and the data visualization service, along with sort of the enterprise side of the house. They're in a unique situation to perhaps deliver both sides in at least a consolidated tool set. And I think that's probably where they're betting on their message. I think it's a good message. But the ironic thing, Mark, is that I'm not sure who's going to hear it, right?
Starting point is 00:28:46 Because the business doesn't want to hear about Oracle's capabilities of bridging that gap because they don't recognize the gap. IT is too angry that the gap exists to even recognize the need for, at least, you know, I'm painting with a broad stroke here, but a lot of IT organizations don't want to hear about the tools that bridge the gap because they want to eliminate the gap by eliminating one side of it. So I'm not sure who's going to hear that message. And I don't know if there's a cost center in an enterprise anymore that's got the appetite to try to bridge a gap because the cost center is usually either with IT or the business. Interesting. Yeah, interesting. So the last thing, Stewart, I want to talk to you about,
Starting point is 00:29:31 again, is quite relevant because of you and I and so on. Obviously, for anyone, again, who knows myself and Stewart, we've worked over the years, and the company we work for has actually collaborated with Oracle, on some reference architectures for data warehousing in the past, BI and so on. And then Stewart and I, with a guy called Andrew Bond and Doug Cackett, worked on an updated Oracle information management reference architecture
Starting point is 00:29:58 last year, and the idea is that it's then taken by implementers and by Oracle and by customers, and gives a kind of canonical example of how you'd architect, in this case, a big data extended data warehouse platform. So one that had both kind of, you know, a data warehouse in there, it had BI and so on, but it tried to incorporate some of the thinking about data reservoirs, things like real-time loading and events and so on and so forth. And it had a number of kind of aspects to it. And I thought it'd be interesting, in a way, to look back to that and think about, yeah, in a way, has that
Starting point is 00:30:35 come to pass? Because, you know, with a lot of reference architectures, they're sort of, to an extent, a sales aid, and they're also a slight kind of punt as to what things might turn out to be. And actually, the point here is when we publish this podcast, there'll be some show notes, and in the show notes there'll be links to all these things. So if you're kind of wondering what we're talking about, there'll be links in the show notes to white papers, for example, for the architecture and so on. But let me just kind of go through a few parts of that architecture and just talk through with yourself, Stewart, if it was relevant or not. So first of all, one of the kind of main innovations in this kind of reference architecture for BI and data warehousing and big data was the idea of splitting out the innovation layer and the
Starting point is 00:31:20 execution layer. So Stuart, do you want to explain what that is, what it's about, first of all? Certainly. I mean, if you look at the reference architecture slides, or at least the diagrams, what you see is a very sort of distinct line between the execution, which is on top, and the innovation, which is below. And sort of execution kind of talks about, you know, I don't want to pigeonhole it too much and say it's kind of what happens with IT. But in some ways, it is sort of the packaging and hardening of processes. And then below it, you see the innovation side of the diagram. And it talks, you know, there's the output for the discovery lab there. And you see, you know, it almost, you know, I don't know what your thought is, Mark, but it almost is the two sides of the bimodal. It almost, and so when you think about that side of it as far as have they sort of, you know, prescribed what Gartner's talking about? Have they, you know, sort of,
Starting point is 00:32:27 you know, guessed it? It's, you know, it's not really clear. But that execution and innovation side is trying, at least in my mind, the way I read it, trying to provide a place for the innovation that we've been describing that might be led by the business, or at least folks that are working on behalf of the business. Yeah, I think it certainly has parallels, I think, with the bimodal IT sort of idea. I mean, I think, I suppose a particular feature of big data and sort of new world projects really for BIs is that typically you don't know what it is you're doing at the start.
Starting point is 00:33:07 You certainly don't know what tools you're going to use. You don't know what, for example, algorithms you're going to use and so on. So when we start data, sorry,
Starting point is 00:33:15 big data projects, we typically start with, you know, a selection of tools, maybe some VMs, a set of data. Now, during that process of working on that, you're going
Starting point is 00:33:25 to arrive at the routines and the algorithms and tools and so on. Now, those tools themselves ideally shouldn't go into production with that kind of approach; they should be taken, hardened and so on and so forth, and put into a kind of business-as-usual area. Now, where we find projects can then go wrong is, first of all, trying to do that innovation within a production-style environment. And I've worked on projects with banks, for example, where they've applied the same level of kind of governance and so on to doing a project in R and Python, where, you know, every five minutes we're asking for a new package or new whatever. Or where the project has gone well in that first phase, in the pilot, but then as soon as it starts to be used, the whole
Starting point is 00:34:04 environment gets locked down and then you can't do any more. And that splitting of things, and saying that to innovate further you need to make sure that that's kept separate, that's really important, I think. And also understanding that the same level of governance, and things like controls over what tools are used, can't apply to the innovation area as well, but you can't then put that into production afterwards, because it hasn't had that same level of scrutiny and so on. Right. So, you know, we found that an important area. Yeah, I mean, there always is that question when you talk about, you know, we've called it the innovation stream, and maybe this is where we stole it from. I'm sure we must have stolen it from somewhere. But we call it the innovation stream, and we talk about taking a first pass or second pass or third pass at things, sort of triaging it and trying to figure out whatever we produce out of here, where should it go next?
Starting point is 00:34:56 And I think that the idea of IT and security always comes into play at some point, sometimes sooner than later. What's interesting about it is that they'll, you know, and again, sometimes this is a broad stroke, but IT will try to lock down these layers and lock down these sort of discovery labs and whatever we want to call these shadow IT, if we want to call it that, applications to do analytics. But then they'll give everybody Toad access to the database, right? It's like, there really is not that much difference between what's going on in a discovery lab and somebody querying the source database in real time using Toad. I think that you've got to get on board with getting people what they need because they simply will go connect to Toad, export the data out,
Starting point is 00:35:56 run around with it on their laptops, look at it in Excel, et cetera. I mean, I understand that IT has the best of intentions when they're trying to lock things down, but people are going to find ways to answer the question so that they can make better decisions, rightly so. Exactly. I mean, so I suppose a broader question really is the whole idea, I mean, so this was meant to be the successor in this case, isn't it, this reference architecture, to the previous one that was more based around kind of, you know, a traditional data warehouse and BI and so
Starting point is 00:36:32 on. And every one of these reference architectures that Oracle do, which we get involved in, each one evolves a bit from the previous one. So I remember years ago there was one that was very kind of, I've got a diagram somewhere, it's all in kind of black and red and so on. It was all the kind of. You must have liked that black and red. Yeah, yeah, yeah. But it was, so that was very database-centric.
Starting point is 00:36:50 And then there was a BI one as well. A question to you is, as much as we talk about big data and we talk about data reservoirs, are you actually seeing that being used on projects? Are we actually seeing as much use of Hadoop and Schema on Read and all these things on customer projects, even new ones, as the architecture would suggest? Or is it a bit of a kind of emperor's new clothes? Or what are you seeing out there? Is big data being used in BI? It's not.
Starting point is 00:37:18 I mean, it's a discussion point. It's not that it never is. I mean, obviously, we've seen some of it, and we've implemented some of it. But, you know, it's always a discussion point, and everyone's trying to get their head around exactly what is it going to do. I mean, I've always believed that, you know, if you have the need, you'll find the solution. And I think that's why companies like LinkedIn, Facebook, Google have produced so many of these technologies is that the sort of consumer set of products, you know, what's available off the shelf just doesn't do this stuff. But in reality, companies don't – a lot of companies don't have these needs.
Starting point is 00:38:01 We have a lot of companies that are saying we need to implement, of course we have to implement big data. And it's like, but you know, we also don't have our standard reporting done yet. And it's, it's the idea of, you know, if you don't even have your, your, your known questions answered, are you ready to take on new questions? So I think what we are seeing though, Mark, is, is a bigger focus on the Lambda architecture and things like Kafka, things like data streaming. I think, you know, for a lot of people, you know, when Hadoop was sort of the go-to for big data and you start to figure, well, do I load Hadoop and do I load it first? Do I load it, do I load my data warehouse from Hadoop? Hadoop's not really great at that.
Starting point is 00:38:52 And it's actually increasing the latency of data getting into my data warehouse. So I think what we've seen a lot, and customers are a lot more responsive to, is talking about the streaming stuff, because what that does is address a very specific business requirement of getting them data faster. When you start talking about what is the value of Hadoop, it's sort of amorphous about what that value is. I mean, if you have the need for offline processing, machine learning, those sorts of things that those innovative companies I mentioned before are doing, then you need that. But the question is, you don't even know if you need it. But if you start talking about data streaming, that's immediate value. We can get, you know, data to both your data warehouse, which is a known quantity, but also perhaps some of these sort of small, fit-for-purpose, maybe even call it microservices,
Starting point is 00:39:52 these little analytic apps that sort of spawn up all over the place, we can get data in there faster and in a more sort of structured way. I think we're seeing a lot more, you know, response to that sort of message. One of the things I struggle with on the reference architecture, you know, I have a lot of respect for Andrew and his whole team, but I've often tried to figure out if this is describing what's happening or if it's sort of prescribing how it should happen. And I sort of go back and forth on this, about whether or not this is a guide to how you should do it, or is it sort of a description of what's happening, and you just sort of accept it. I'm not positive about that. What do you think? and big data reference architecture in this case came out of, you know, the enterprise architecture team, who aren't particularly aligned to particular products, and I guess they're looking to find success, in quotes, with our customers. And so this particular reference
Starting point is 00:40:54 architecture, I think, was driven a lot by a particular implementation in a bank in Spain that was doing this. And I think what they're trying to do is to reflect things that are working and are providing competitive advantage for customers. So I don't think this is your classic kind of marketecture, in that it's something there just to sell products and to put things into slots. I think there's truth in it, although obviously some of the concepts in the particular Oracle one, to do with, say, sort of like event stores and so on, are almost to the point of being so abstract as to be, you know, unrecognizable really. But I think that, certainly, back a bit to 'is big data being used in BI projects', I mean,
Starting point is 00:41:36 the one thing I don't see being used is big data in the way that it's always described. So whenever you see anyone talking about big data and the awful kind of three Vs, you know, volume, velocity and all that, it's like sensor readings from smart devices and all that. But, you know, the projects I deal with as a BI implementer typically aren't that kind of thing. You know, it would be a customer, for example, and actually these are ones we actually are doing with customers, where they're doing the kind of customer 360 sort of project, where they're bringing in data from external sources and it's arriving in sort of JSON format, for example. And so it's not big data in as much as there's lots of it, but it's coming in a format that probably lends itself better to being stored in HDFS, for example. And then a tool like, say, Drill is much faster getting to it than, say, a relational tool. And so my analogy is it's like the space program. You know, in the space program in the 60s, the Americans put someone on the moon, and there were technologies that came out of it that have been
Starting point is 00:42:34 used ever since. So my point is that some of the things that come out of it, the flexible storage, and the biggest innovation for me that I'm starting to see customers pick up on now, is this idea that, you know, you can create a platform that lets you store all the data and then use different processing frameworks on that same data. So if you think about, say, an Oracle Big Data Appliance, you can put the data on there and you can use Spark, graph analysis, you can do SQL analysis. It's this ability to land all your data and actually to do anything with it that's starting to get traction now, really. And so we're starting to find it's happening. I think there is a slight element of a problem looking for a solution. Yeah. I think certainly every customer, every big customer that we deal
Starting point is 00:43:22 with that is adding capacity to their data warehouse is doing it through Hadoop. So they're not spending any less, for example, on, or having any less, big analytic databases, but any extra capacity is being added by Hadoop. I think, as you said, streaming is now the new batch. You know, in a way, streaming is how we're doing things. But what we don't see is almost the things that you hear people talking about. And if you think about, one of the challenges that we find in actually implementing big data projects for customers is, even more so than reporting projects, they kind of don't know,
Starting point is 00:43:54 they often have no idea what they want to do. And I think there's more of an onus on us as implementers to go in with ideas. And I think that's the challenge a little bit for vendors like Oracle and other ones that only really have a kind of a software play, Microsoft for example, is they don't go in with the ideas, they just have the platform. And yeah, you know, customers, particularly anybody now that's thinking about doing a BI, a big data initiative in 2016, is probably, you know, with respect, not the kind of Ubers of this world
Starting point is 00:44:24 and so on. And that kind of whole thing of, you know, 'we want to do a big data project, but we don't know why', is a challenge really, isn't it? Do you think, though, that, you know, if you're sort of a standard or traditional organization and you've decided to do a BI initiative, or sorry, a big data initiative, and you're sort of trying to, you know,
Starting point is 00:44:43 find out what to use this for, are you going to get any value? I mean, from my perspective, and the answer may be yes, but from my perspective, you're looking for a big data implementation because you have a need and you haven't been able to solve it with traditional tools. You know, you've got a lot of files coming from some sort of system or systems. They're dumping in a directory, you know, just gigabytes, millions or billions of rows a day. And you've been trying to load those into a relational database for years and, you know, you haven't been able to solve that. I think when those sorts of organizations are looking for some sort of a solution and
Starting point is 00:45:32 they find big data, then, you know, everything aligns. But it's when the organization says, okay, we're not doing big data, we know we should be, so let's do big data, that there's not going to be anything that comes out of that in most cases, at least in my opinion. But I think that you're obviously correct. So I think any IT project, or any project that hasn't really got a kind of business purpose, is, is, right. But sometimes that can help, when someone's boss, for example, suddenly decides they want to have a big data project, and up until then, you know, their staff have been ignored on it. And I think that it has been a catalyst in some cases for us, where, you know, a boss several levels above has said we must do a big data project, but then
Starting point is 00:46:09 actually below them they've got loads of ideas. And so sometimes that kind of very, yeah, just spur-of-the-moment thing can be good. But we certainly find that a project that hasn't got a purpose is doomed. And the ones that we find that have got success are either the straightforward IT ones, where they've got a very clear idea of creating a platform or doing things cheaper and so on, or the ones that have really come out of a kind of a skunkworks thing in, say, marketing, where they've started probably with a shadow IT project and very quickly realized then they can't store the data and so on, and that they almost then welcome IT because at that point it's too big to handle
Starting point is 00:46:45 and so on really. And so, I mean, another interesting area of this is the vendors selling tools in this area. So again, talking of Oracle, for example, what I'm not seeing a lot of take-up of is tools like, say, Big Data Discovery, these kind of big data-enabled BI tools, where they're trying to sell too many concepts into the customer. If you think about, I mean, every vendor has these things
Starting point is 00:47:11 there where you're trying to sell not only the concept of kind of data discovery, for example, but Hadoop as well. And it's an interesting place to be in really, where there's a lot of demand for these tools and projects, but I'm not sure that every implementer and every vendor is going to be successful with this, really. If you think about where we sort of started our conversation today, which was about sort of end user business focused discovery. And, you know, that doesn't necessarily align with big data implementations, right? Because users are, of course, if they're really sophisticated users that are looking for, you know, and I'm thinking data scientists here, those people will have a lot of success if IT happens to implement an enterprise-wide big data initiative. But if we're looking at sort of the bimodal side where sort of
Starting point is 00:48:07 the business is leading, the last thing they want is something even more difficult to query than a relational database. What they ideally want is Excel files, right? So I think that, you know, but I absolutely see your point about what happens when a CIO or someone steps into a position and says we must have big data. And it just so happens that down below there have been a lot of requirements that would have been satisfied by big data earlier. Then there's a lot of success. I think what's interesting is that it – I just did a podcast earlier today with someone that works for you, Michael Rainey, and he mentioned organizations like Google and Facebook where everyone's an engineer. That was the terminology he used. Those organizations will thrive with big data initiatives where you've got a lot of coders and all they need is access to the data.
Starting point is 00:49:06 But I think what's usually more the point is that in standard organizations, there's very few engineers. And perhaps that's the play for implementers like us to try to bridge that gap. But I think it's problematic when really what they have is a Tableau-like tool where they want really not very complex data stores. And Hadoop is not that. You have to apply the schema as you read it. And that's even sometimes more challenging for the business, if that makes sense. Yeah, certainly. So, I mean, yeah, absolutely. Yeah. And I'm conscious of time now. So there's one last thing I wanted to ask you really, which is I've been,
Starting point is 00:49:48 I mean, one thing I've been doing recently is thinking about what would, in five years time, five years time, 10 years time, you know, what would an analytic platform look like? So if you were to go into one of our customers in a few years time, what would it look like? I mean, a statement that I've been making,
Starting point is 00:50:04 you know, I tend to kind of make fairly sweeping statements, is I think that all analytic workloads will move to Hadoop or Hadoop-style technology in the next kind of five years or so. And certainly transactional workloads, you know, EBS and kind of ERP-type things, will be on, say, Oracle or SQL Server or whatever. But all analytic workloads, I think, will move to Hadoop. What do you think of that statement? And do you think that's true or what would you say? I think it may not happen in that timeframe that you mentioned,
Starting point is 00:50:38 but I do agree with you that it'll happen. And I'll have one small caveat. I could see a world where relational is dead, but we still have Hadoop or Hadoop-like, something Hadoop-like, and frankly, something EPM-like, right? So, so much of the business close – so many businesses close their books on, you know, talking about Hyperion financial reporting-like technologies. I can't imagine closing the books on Hadoop, right? So I can certainly see a world where the relational gets squeezed out. It's getting squeezed from both sides if you think about it. It's getting squeezed from the architectural IT play from cheaper and in some ways more robust solutions from the big data world, or maybe
Starting point is 00:51:28 not more robust, but at least more fast moving. And then they're getting squeezed, relational technology is getting squeezed on the other side by the business and the OLAP style, Hyperion type products that are specifically focused at finance, closing the books and financial reporting. You could see both sides of that squeezing relational out in such a way that it's gone with data processing frameworks that become more and more robust. I'm thinking Kafka and Spark that start providing data directly to EPM-style solutions for closing the books, I could definitely see a world where that happens.
Starting point is 00:52:09 It's interesting, isn't it? I mean, I think that we're probably saying this, and there'll be people who would listen to us and say, that's the worst thing you could say. How can you say that relational databases would be wiped out? But I suppose one thing I do think is there were probably massive advocates of hierarchical databases back in the sort of 70s and 60s with COBOL who would say, you know, why would you have things differently? And I think everything moves on. And I mean, to my mind, you know, if you take a high-end relational database like
Starting point is 00:52:37 say, Oracle or DB2, there's so much in there that Hadoop is not trying to build out, so things like, you know, rollback and stuff, and even things like primary keys and so on. And that workload, I think, certainly for now, the business of doing transactions and having relational data sets and so on for businesses, it makes sense. But certainly I think there's two things that particularly mean that analytic workloads will move to Hadoop. And one is just everything for that kind of workload is about doing it fast and large and so on. And it scales so well. Second thing is just the frightening pace of innovation. I mean, if you look at, say, what Cloudera have been doing, you know, with things like Kudu and Impala, and there's Drill, obviously, from MapR.
Starting point is 00:53:21 How can Oracle, how can anybody keep up with this? It's at the point where I would say that an analytic platform built now on, say, Cloudera CDH or Hortonworks or whatever is probably actually, you know, feature-wise better than some of the stuff we see from the big vendors. And that, I think, is certainly in the same way that Linux wiped out the commercial, you know, proprietary Unix. That was just natural, wasn't it? But there's still going to be Windows on desktops and so on. But I know it's an interesting kind of thing, really. And I think that certainly, you know, streaming, doing things in memory, clustered and all that is where it's going.
Starting point is 00:53:57 The one thing is, will it all be abstracted away in the end, though, and just run in the cloud? And actually, in the end, it's just this amorphous sort of service. I think you're right. I mean, at least, you know, if you think of the reason that data warehouses initially went into relational databases, there was nothing else.
Starting point is 00:54:13 And it was the easiest thing to do. I think that now, with so many other options, and, you know, what we spend a lot of time on, especially in like the Oracle database or databases that initially began as transactional databases, when we're doing analytics in those databases, we're trying to find ways to get around the basic constraints of a transaction. You know, an Oracle database is built around the concept of a transaction, and that slows down, you know, the bigger loads and bigger queries. So I think that there is a natural
Starting point is 00:54:53 extent, we don't need transactions necessarily in your average analytic app, right? We don't need ACID, we don't need those things. And we just need to get the data in and we need to get rough estimates on the numbers. So I think it makes a lot of sense for organizations to start with analytics as a place to cut costs. I think there's a cost-cutting side that's driving this, and I never want to necessarily dispute that that's a reason to do things. But in the big data world, the schema on read side is where so much value can be had from these processing frameworks. As you said, it becomes sort of unimportant how you process the data. You choose the framework or the service or whatever that's the easiest for you. That being said though, Mark, one of the things that data warehouses have always provided
Starting point is 00:55:57 and semantic layers and all of that is code reuse, single version of the truth, etc. Do you think we'll lose that in the Hadoop-powered data warehouse? Or do we leave that sort of single version of the truth for the finance and the Hyperion-style applications? I think, certainly from my perspective, one of the things that will change about data warehouse projects in the future is just automating and abstracting away and taking away some of the kind of work that we do now. If you think about doing a slowly changing dimension data warehouse, you know, even now with kind of ETL tools and so on, there's still work involved, isn't there? You know, it's not nearly as much as before.
Starting point is 00:56:38 Should that need to be something we have to kind of select and kind of build each time? And if you look at some of the things coming through from the vendors now around data preparation, where they're using, for example, kind of classification routines to automate the kind of data profiling, but also to do things like trying to work out the name of a column based on the values in there. I think a lot of the stuff, in the same way that cloud in general is trying to consumerize things. And I think the work involved in doing this will kind of go away. The days when we kind of sit there and choose which, in ODI terms, knowledge module we pick for a slowly changing dimension, that will go away. But there's no such thing as a free lunch, and I think
Starting point is 00:57:17 you still have to think about how do you integrate data, how do you kind of handle history and so on. And really, when we talk about cloud in general and we're talking about company and customers and so on, you know, there's always still work to be done. Integration and cleansing and things, they still take time really. And I think that those things will always involve work, but if we can automate part of it, and if we can kind of, you know, take away some of the kind of low-level detail, that's good really. The other thing really is, I doubt very much, it's an interesting thing to think that we sell these kind of things, but then we go into a customer site, you know, in a few weeks' time, when they're using BI tools from
Starting point is 00:57:52 10 years ago. And I think that's the, as much as we're sitting here thinking, you know, is the future going to be kind of classification or Drill, you know, in a way it's like, who was the author that said that, you know, the future is here, but it arrives in kind of, you know, different speeds in different places. It's, you know, things change a lot, but things never change much at all today in some respects. Yeah, I mean, I agree. It's interesting to see, like you said, that we still go into customers and they're using BI tools from 10 years ago. And the thing I would add on that is, and not using them in the proper way or even effectively. So what's interesting is that they're behind, well, not all customers,
Starting point is 00:58:30 obviously, and especially not mine. You are all wonderful. But there are some customers out there or organizations out there that, you know, they're looking at, well, we should probably go do a BI, sorry, a big data project. But their BI project is so poorly done. So maybe those organizations, it's time to rethink maybe a greenfield implementation using more sort of modern analytic tools. I want to make one more point. You said, well, what will the platform look like? I think that what we might see in the future around analytic apps is there is no central place for them, except for the finance side. And I think the finance side might completely be offloaded to Hyperion style applications at some point. But then everything else just is being built and maybe it's being built in custom code.
Starting point is 00:59:27 Maybe it's being built but almost in a microservices style. And if you sort of get on board with the data streaming frameworks, that becomes much easier to produce just-in-time analytics in small doses. And I think if finance can close the books some way, maybe we don't need that single version of the truth for everything else. I don't know. It's interesting. I think for me, I mean, it's great that the industry is still kind of relevant. It's great that we're still talking about it. And it's great that there's, if you think about the fact that you pick up a copy of The Economist or Time and they lead stories about a business that is innovating through analytics and so on.
Starting point is 01:00:06 And it's great to be in a part of the industry that is, you know, IT that is kind of so popular. And I think from a personal side, you know, when a lot of the work in IT was outsourced to India and places like that. And really analytics was the one thing that was saved because it is so personal to the people you're doing it with. And it is so kind of important to the business. And it's such a kind of thing that, you know, it benefits from consultancy and it benefits from expertise, really. So I think we lucked out really in a way
Starting point is 01:00:34 being in this kind of area. And it's interesting to see how the vendors go and how that might change over time and methodologies. But it's great to be in an area that is still so kind of popular and buzzing, really. Yeah, it's where so much innovation is happening. It's crazy. I mean, when you go look at startups, you know, you're going to hear about, you know,
Starting point is 01:00:52 10 new startups, 10 new hot startups. And, you know, seven of them are doing something around analytics. Yes. I mean, it's really, you know, we're at the cusp, man. Excellent. And I'm here just because I followed you. Exactly. So, Stuart, it's been great speaking to you. I mean,
Starting point is 01:01:09 it's great to have you as our first guest. So thank you very much for coming on the show. Oh, it's my pleasure, Mark. I look forward to listening to this and all the future episodes you're going to have. I mean, it's just going to be fantastic. I can't wait to catch the feed in my iTunes. It should be excellent.
Starting point is 01:01:28 Cheers, Stuart. Okay, thank you. And thanks, everyone. Goodbye.
