Drill to Detail - Drill to Detail Ep.8 'Self-Service BI, Data Prep & Big Data Vendor Strategy' with Special Guest Jen Underwood
Episode Date: November 8, 2016
Mark Rittman is joined by Jen Underwood to discuss the aftermath of the Gartner BI&A Magic Quadrant 2016 and the rise of self-service, Mode-2 analytics; innovation in predictive analytics and data preparation tools, and how the big data cloud vendors are differentiating themselves (or not)
Transcript
Hello and welcome to Drill to Detail, the podcast series about big data, analytics and data warehousing with me, your host, Mark Rittman.
So my guest this week is Jen Underwood, someone who those of you who've worked with Microsoft products and Tableau and so on in the past might know.
But in my case, came to my notice after I stopped working full-time with Oracle BI tools
and started looking at the wider market out there.
I have to say at the time, I was getting a little bit disillusioned really with BI and data integration tools in general.
I didn't tend to see a huge amount of innovation out there. But when I came across Jen's website and blog,
and some of the tools that she was looking at and some of the new vendors she was talking about,
it actually kind of got me interested again in the whole topic. So I'm really pleased to have
Jen on the show this week. And Jen, do you want to introduce yourself properly?
Well, thank you. Thank you for having me, Mark. I've
been a fan for a long time of your content as well, and I didn't realize you were inspired by
some of the products. That's exciting. It is really an interesting time in the industry
right now. I'm seeing the maturity to predictive happening. I'm also seeing convergence with collaboration and
visualization technologies from the AR, VR gaming industry that's influencing products.
So it's a fun time in the industry.
Exactly, exactly. So Jen, what's your background really?
Because you've had in a way a bit of a kind of parallel career to me. I mean, you've worked in
implementation and so on, and also for vendors. What's your kind of route to where you are now, really?
I'm glad you asked that, because I started in development, literally developing web applications,
and I fell in love with data, and I've always been in love with data.
And those two types of things, for 15 years or so, I did implementations of data warehouses,
web applications, and web reporting applications. And a lot of folks don't realize that I'm actually
a developer. I just happen to love the data and love the front end. And I went crazy on the data
side and dug into predictive analytics.
Oh, my gosh, it's probably already 10 years ago.
No, 13 years ago.
And I just think when you folks see the visualization side, the creative side that I have or the artistic side, they assume that I'm just non-technical.
But actually, I'm quite deep on the technical side because of the background.
Yeah, I mean, looking at your websites.
So what's the URL of your website again, Jen?
It's an easy one.
My name, jenunderwood.com.
The amount of work you put into that website, Jen,
in terms of just the actual presentation is fantastic.
But the range of kind of technologies and vendors you look at there,
I think at the moment you've talked about, you're talking about IBM Watson,
and we'll talk about that later on in the show.
You've got predictive analytics.
You've also got things like Spark and so on.
And I thought, there's a kind of female version of me there,
which is kind of probably a scary thought,
but it was, but again,
what sort of inspired me really, Jen,
was the fact that all these vendors you were talking about
and these products and so on that are out there,
and I suppose in a way, completely different channels to deliver kind of BI as well now. So,
um, we're going to talk later on about BeyondCore, for example, and that's part of the kind of,
you know, the Salesforce Analytics Cloud, a whole kind of area of products and infrastructure and
kind of ecosystem that someone like myself from, say, an Oracle background would never know about.
And, um, it was kind of interesting, but coupled with very good kind of, you know, presentation and so on, it was kind of fantastic.
So great to have you on the show this week.
And, you know, what we're going to do in terms of kind of the conversation today is I want to talk to you about some of the things you're seeing out there,
some of the changes in the market, some of the innovations and so on there as well,
and a little bit around kind of product strategy as well,
because I know that you do a little bit of work around that area as well, same as myself.
So it's good to have you on here.
And what I thought I'd start off with is actually something that was very topical
when I actually first came across your website and the work you do,
which was the whole business around the Gartner
BI and analytics Magic Quadrant coming out back in, I think it was about the end of the year, sort of December, January, that sort of time.
And as you know, Jen, there was a huge amount of kind of interest in that.
Yeah, there was.
And I'm just going to summarize it now, and I'm going to get you to kind of talk about what you think about it and the impact and so on.
But for anybody who's been living in a cave for the last kind of year and works in BI, uh, you know, it was all about the move, you know, really Gartner effectively saying that the BI market
had changed. It was about self-service, it was about kind of, by the business, making decisions,
and, as we know, a whole kind of, like, you know, new definitions of what they see as being a BI tool
and so on. But for myself, with the business I was involved in at the time, the massive impact was that Oracle got dumped out of the Magic Quadrant, and we've had sort of
actually several podcasts in this series just talking about that. And, uh, and so, um, I mean, Jen,
what was your take on this? What was your take on the Magic Quadrant and the
changes at that time, really?
So I have answered this as a product manager. I led Microsoft's answers in 2012.
Let's see, make sure I get the years right.
And then I also led the last one.
They're currently going through the process right now, the vendors are.
So I've been through what I call, you know, the exercise.
And it is a massive exercise.
There's literally hundreds of different questions spanned across different areas
of the business intelligence solutions. And one thing that I've blogged about for years has been,
when I would have to answer this for Microsoft, oh my gosh, I have to answer thousands of, you know,
questions and videos, and it's a massive, massive exercise. But somebody else only has to answer maybe
30 questions because they're a niche vendor in one area. But we're put on the same exact
chart, and really we're comparing apples to oranges. So I always take these with a little
bit of a grain of salt, um, just because of having gone through the exercise and saying,
well, hey, they didn't even have to answer 500 of the 1,000 questions. And I'm exaggerating a little bit, but it is a long exercise. You go through
an RFI process initially to make sure that you have market presence, interest, there's references
from your customers, a long survey, or not survey, that's not the right, it's not a survey, it's an RFI, really.
Uh, you go through that exercise, then you show your products, and then you talk about road
maps and whatnot with the analysts. And they are silent, and that always
drives me crazy. You know, they may ask one or two questions. And I really enjoy working, um,
you know, I've known Cindi Howson for a long time,
um, so I understand and I can pretty much preempt, or know, we think a lot alike, so I have a feeling
for what she might ask or, you know, when you start reading the content, what they'll ask.
Rita was funny. She surprised me, and I didn't even know she liked me. It was so funny, because
she came out and was very confrontational,
and, one of them, with the first question, I thought, oh my gosh, she hates it. Um, but I ultimately
found out she really liked it. Um, but I will tell you, they don't tell you much of anything. They're
very quiet on the phone. Um, but they do ask decent questions. Going to, you know, your shock, we had
no idea as vendors that they were that the Gartner analysts were
going to go in this direction. I was shocked at the results myself. I know what I presented. I
know the vendors out there. I consider myself fairly deep. I'm obsessed with reviewing everybody
every day. I was very surprised at the results. Now, I have given Tableau a hard time in the past.
I think two years ago I had, I went to the conference in 2014 and said, it was fun, but ho-hum.
Um, I wanted to see more innovation. I wanted to see the types of things that I'm seeing now
happen from them two years ago. Um, so for them to come down a little bit made sense to me. Qlik
did better than I was expecting, just because I've heard a lot of conversations in the market
with regards to the change to Qlik Sense, good conversations with regards to, hey,
they finally kind of got on track this year. But they did a little better than I expected versus the conversations
that I was having and some migrations that were happening last year that I knew about. So I was
surprised on them. The whole move together to say, well, self-service and this whole modern,
I actually, and it's funny, I can't say who I talked to, but it was another very, very
well-known thought leader in the industry.
We were going back and forth on this saying, you know, I still believe there's very much
a need for the IT semantic layer for these solutions to be successful.
This whole concept of ungoverned self-service BI, I, as a consultant, have cleaned these messes,
and I think there's really truly a need for it. And what I think is going to happen,
so a lot of times I've been very accurate with my forecasts in the past, what I think is going to
happen, and what I'm starting to see in the numbers and surveys and whatnot that are coming in,
is that there's a renewed appreciation for the enterprise semantic layer.
But I think, I say, ultimately, I think,
I don't love that direction that it went.
I think there's still huge value.
And when you're picking one of these vendors,
you probably do want data quality,
or you probably do want master data management.
And you do want to make sure that, you know, you still have operational reports. Um, it was interesting.
It was a very different, um, direction than I was expecting, having answered it.
Yeah, I mean, it was,
um, I mean, I've got the report in front of me now actually, and, um, I mean, certainly,
as you said, if anybody's not seen it, just to kind of summarize
again, there was, so the Magic Quadrant, it splits the, I suppose, the kind of vendors into
for completeness of vision and for ability to execute. And in previous years, Oracle, for
example, have been in the kind of what's called the leaders quadrant. And at that point, I mean,
certainly for myself, a BI project typically would have,
I don't know, sort of 10, 20, 30 people on there, all being paid a lot of money, building a very kind of,
like, well-engineered BI system that had kind of very well-defined data sources and so on and so
forth. Um, and that was, I suppose, to my mind, the kind of heyday of those big enterprise
BI projects. Um, and what, yeah, what Gartner was saying was that all
the spend now, the important spend, was happening in these self-service tools. And, you know what,
it reflects the reality that, if you look at what we were doing in the business, for example, you know,
we would use Tableau, we'd use kind of, like, you know, cloud-based BI tools to do the analysis we
need. Um, but as you say, there's always a need for
curation and so on.
I guess it's, we had a big internal debate back in the company I was with at the time about, you know, how do we kind of talk to customers about this?
You know, do we just deny it?
Do we kind of say, well, actually, it's rubbish or whatever?
And I think, you know, it probably was, I mean, analysts have to have something to talk about.
There's no point having an analyst report that just says everything's fine, nothing's changed. But I think there was some value in it really. And I think that, um, there's some interesting things coming out of that report really
that I just want to kind of run through with you, just in terms of, I suppose, what Gartner,
you know, what Gartner define as being a kind of a modern BI platform and so on really. And the
self-service thing is interesting, isn't it? I mean, there's been almost a backlash at this, isn't there? I mean, what's your view on self-service? What's your
view on people going out there doing it themselves? And kind of, is it possible, is it doable, is it
reality? You know, what's your view on that?
I have a, there's a survey
coming up where the results are going to be revealed. There's been a decline in self-service. When I say a decline, the growth.
And I've said for a long time that if you can't get somebody to read a simple parameterized report,
I doubt they're going to build their own report, even if it's super simple to do. It's an interest
level thing. It's, hey, I've got Excel. I'm married to an Excel analyst and oh my gosh, I show him these tools. He likes them. He goes right back to Excel.
That's just where he's comfortable. He doesn't have the time or the, you know, the genuine
interest. So when people talk about, oh, we're going to have everybody doing this. Well, you
might have more people viewing it because
it looks better. We definitely have made a lot of progress in that 20 percent of power
users and BI pros. It's fun to work with those tools. I love it. Uh, and I didn't love it before.
I would think about, uh, you know, what I would
have to go through to just get it to render and whatnot. It was fairly painful 15 years ago. So
I think that with self-service, certainly it's mainstream. If anything, what we're finding, and
certainly the numbers too, even when you look at Tableau, Qlik went public, or Tableau's numbers dropped considerably this year on the growth,
and Qlik went private, so they can kind of revisit and restructure.
I think what they're running into is this, hey, people that have already wanted this,
they saw it.
I mean, I started seeing this in 2009 when Qlik went with their Seeing is Believing
campaign. I think folks have pretty much bought the tool that they liked and they learned it if
you're one of those power users. And, you know, I just don't see a ton more people piling on here.
Yeah, it's interesting. I mean, I actually was, uh, sitting in on a
Tableau, um, roadmap presentation yesterday, which I'm obviously under NDA, I can't talk about it. But,
yeah, obviously, you know, the thing you say there really about, you know, where
do you go really from this is interesting because you know if you're if the definition of your of
your of your planet what you do is different is that you empower individuals to do things with
their own set of data and so on, it then becomes, you know, much harder to scale that up and start
to kind of, like, make a, I suppose, a single version of the truth and to take it to
kind of enterprise scale and so on. And, um, I mean, what do you think about vendors like, say,
Oracle or Microsoft that come from more of an enterprise, uh, sort of space and then
try and bring out these desktop tools
and then try and make their sales force use these kind of land-and-expand strategies and so on?
Do you see that much traction with that?
Is it, I hear you laughing there.
So, yeah, is that working?
Are they going to take over from Tableau or what though, you think?
I think it's, yes, I giggle because I'm very familiar with this.
It's very different.
I've worked at Tableau in the sales.
I mean, Tableau is completely dedicated to it.
The support structures, the whole company thinks visually.
It's a phenomenal organization for that specific area. When you start to stretch into
a Microsoft ecosystem or another massive vendor's ecosystem, the sales forces
will not make it, they will not survive selling a free desktop tool or $9.99 per month or whatever
it is that you have. So it's an interesting time in the industry as well
as a lot of these folks, I mean,
when I was in the sales field at Microsoft
selling SQL Server BI,
really those engagements were minimum $200,000
license sales and higher.
It's really different and the approach is different.
And what those types of sales reps now are doing is they may gloss over the Power BI or whatever the other free desktop ones are and really be talking about, well, the Azure Data Warehouse or the Data Lake.
You know, they're selling the premium data stores that will keep them employed.
That is essentially the driving factor on a lot of these.
It's a loss leader product. So they won't get, the desktop will not get nearly the amount of
attention. What you're seeing is the community. Oh my gosh, the Microsoft community has been dying
for good BI for as long as I remember. And I am happy that I was a part of helping that turn around,
because I've been on the other side of, on the mic for that, "Oh, Microsoft doesn't
have BI." I mean, when I was growing up, it was Cognos, right? Cognos
was it, or Business Objects. So it's nice to see that they have a, um, a decent front end now, because the back ends
have always been really quite good.
Yeah, I mean, were you there when, um, when ProClarity was around
and then there was a whole business with, yeah, that was an interesting, it was, I remember at the
time, I was, when we spoke on the phone before this call, you know, you talked about
how it's actually Oracle 8i, 9i you used and so on. I remember the days when I was
trying to go out there doing solutions, to sell, um, Oracle 9i OLAP, which was the only
OLAP server that was slower than everything that you ever put, yeah, we took the data from
really, and it was relational and it was kind of whatever, and we had to deal with kind of, and our
competition was Analysis Services, or OLAP Services then.
And that back end was fantastic.
But then Microsoft at the time,
and probably insulting you here,
is the front ends were just a kind of series
of shooting themselves in the foot, really.
You know, buying ProClarity and then kind of,
what was all that?
I mean, I suppose it's now come together, isn't it?
And now you've got, or not you,
but they've got Power BI, which seems to be fantastic.
I mean, where were you involved in that?
What period were you there and then?
This is actually a really fun.
So this is a really interesting story if you did not know this.
So I left Microsoft.
I was on the product team for BI.
So remember when I mentioned I answered the Gartner Quadrant in 2012.
The direction at the time from the chief architect on the BI team was,
we'll put everything in Excel.
We're going to learn to love Excel.
This, this, and that. And I had a lot of market feedback and whatnot from customers and partners, and even myself, a lot of times I just know. I'm like, no, it's going
to get rejected because people will define projects in Excel and then they want the real thing done,
and it's going to be
really hard to justify, well, hey, you're just giving it to me. And so why do we hire this firm
to do it? You know, even mentally, emotionally on so many different levels. So I had this,
the good and bad of Excel deck, this really exists. And I gave it, I gave a presentation
to the executives at Microsoft on this. And the architect says, well, you're going to learn to love Excel.
Okay.
And I literally put my resume out to,
at the time it was MicroStrategy and Tableau.
I knew, I literally cried
when I saw the Kraken keynote,
it's the most beautiful keynote I ever saw in my life.
So I was hoping that they would pick me,
meaning Tableau.
It was so beautiful with tree maps. There was all the top features that folks, you know,
as a product manager was asking me and I'm like, oh my gosh, I want to, I want to go there.
And I sent, it was a recorded song video of how beautiful this was. And I had to have it in my
life. And oh my gosh, well, they did choose. And I had a lot of resistance from the sales field, not the management at Microsoft.
But the sales field says you have to wait a year before you go anywhere.
So I stayed at Tableau for five months.
And finally, the non-compete pressure got so strong.
I said, all right, I'll sit out.
I'll just do my own thing.
And that's how I started doing
my own thing. Well, a couple of years later. Yeah, it's fun. I actually enjoy working with
multiple vendors. I don't think I'll ever go back to one. I'll be surprised if I do,
but I got a call, oh gosh, the summer of 20, is it 14? Yeah, the summer of 2014, that a new, um, a new leader, James Phillips, had been
brought in and they wanted an analyst briefing on the situation. And the folks that called
me were on the engineering team i said well if you want me to come in and reinforce what you're doing, I can't do that. I don't like
where you're going and what you're doing. And the market, I love other solutions right now.
It wouldn't be a very warm, fuzzy meeting. And they said, no, we need to have the honest truth.
And we know that you'll be honest and you'll tell it like it is.
And he needs to hear that.
And I said, all right, well, then I'll come.
And I sent him a little engagement.
I gave him, call it, a friends and family rate because those folks were always quite nice to me, even when I left the first time.
So I went in and I gave a briefing to James Phillips, who is probably the most brilliant man I've ever met. I don't, it's a close, it's a close call between him and the fellow at Frontline System Solvers.
But that is how I got back engaged with that team. I gave them a lot of guidance on what I
loved in the market, where I saw the market going, where I saw was key flaws in the product.
And James got it. When I would say something, he would instantly understand.
And that was actually how they recruited me back to help me go to market with some of those
ideas.
Yeah, I mean, so you had quite a big influence then on Power BI,
and we met, was that partly to do with you then, or what are you saying?
Oh, I wouldn't say that. I cannot take credit for, there are certain things, direct query,
I definitely pushed that one. I said, you know what, we live in a big data world and the engine
itself was quite silly. There's no way I could recommend it to anyone. And it was interesting.
I did get feedback post that meeting that the direct query was not something they had thought
about, which was amazing to me. But that was one of the things. So there's certainly things in there that
I've talked about, you know, that the canvas itself, making sure that it's artistic and that
you can do anything you want. One of the things I love about Tableau is it's literally like Photoshop
with data for me. So there's certain things that I certainly was big on. I liked, I'd seen what Watson was
doing, Watson Analytics, and being able to create some of the quick insights. So I certainly had
ideas. Um, but so many, many other people and, and even the customer feedback that was collected.
So I will say one thing in the past when I went and said, Oh, I want to, I have thousands of
people that have asked
me about tree maps, and I would get shot down on that. Oh, too bad, so sad, how much more Excel
will it sell? And I'm like, oh, yeah. This time, if I come to James, and I literally, so here's a
fun fact, the R in Power BI was going to get shot down. And I said, take a look at my blog and look at the top blogs.
They're all R.
And that's interesting.
That's interesting.
I mean, in a way, we'll get onto that in a second
because R in BI tools is a kind of interesting area
I want to talk to you about as we go on.
So just taking a step sort of like away from that.
I mean, so going back to, I mean, going back to that
kind of report, um, the whole kind of thing around bimodal and Mode 2 IT and so on, analytics,
and I suppose the fact that, you know, you've got the users out there
going out there and doing self-service and curation themselves and so on, and you've got IT
that, you know, we all know does a good job, but sometimes, you know, we, you know, I count
myself in this area,
we do tend to kind of get slightly dogmatic on things.
I mean, what's your view on,
so we all know the IT is needed,
but business aren't interested and so on.
I mean, how do you see this kind of,
how do you see IT getting back into the game really
when it comes to kind of BI?
I mean, is it something that,
do you advise people on this, vendors and so on?
How does IT get back into the kind of conversation really around BI?
Well, it's interesting. One of the things
that i'm looking at is data warehouse automation if i would have had this 15 years ago my life
would have been so much easier. Engagements, again, you called it the heyday, I knew that if I got
a project i'd probably be busy for six months to a year.
It's very different now. The automation of data warehouses, I think, is a game changer for IT to develop a good, solid system, slowly changing dimensions that can accurately report over time
very quickly. So why did the business bypass IT? Because they were too darn slow. So I think that if IT can partner with the business and find someone in the business
and prove some small wins, they can certainly do things right together.
And they should be doing things right together.
And I keep hearing that in Europe, the laws with data security are so strong now, even
lawyers are sitting inside of vendor selections, not just for the contract, but early, to review the
tool because of the data security concerns. I have to imagine that the business has to be a little bit
more responsible and at least consulting a little bit with IT.
Yeah, yeah, I mean, we'll get on, I think
we'll get on to that area about, you know, I suppose the area you're talking about there, about IT, you
need to get involved because of kind of data governance and data locality and so on. That
obviously goes into the whole kind of cloud question as well, and, um, also, I suppose,
big data and so on. Let's get on to that in a bit actually. I think that's quite an interesting area
to kind of look at. But, um, so, I mean, in terms of the vendors, I mean, for Oracle, I mean, I'm not inside there, I don't
know, but obviously they didn't, um, they didn't get put into the kind of
quadrant now and so on. And what was the impact on the vendors at the time around this? Is
it something where, I mean, I thought it had quite a big impact, but, you know, you're pretty
close.
I do too, I do too. So I have a couple of different popular blogs.
One of them is an old compare of Power BI and Microsoft
and the first one, it's completely out of date at this point
for the most part.
But that gets hit really heavily and so does my Gartner.
I have different conversations or whatnot.
Whenever the Gartner report comes out, I give my two cents
what I agree with and what I disagree with. And folks seem to hit on that one pretty hard.
But from a vendor perspective, yeah, I mean, this gives a lot of credibility to
the three vendors that were in the leaders quadrant. And, you know, folks, it's going to
lead every single sales effort. You're going to see that quadrant being used in the cycle.
Yeah, I mean, it was, I mean, I sometimes wonder
whether it was a bit of a storm in a teacup really, that, you know, within our industry it was big news.
But I was certainly going to see customers and so on, and it was, um, the number one topic of
conversation though. It's kind of interesting, and I think, um, I think you had to
read it in a certain way. And I think, to my mind, the essence of it kind of
rang true, that it reflects the things that I was doing, it reflects things that I was seeing
there. Um, I mean, I don't know how closely you follow Oracle, for example, but their strategy
around kind of their BI tool set is to try and do both things, is to try and have, you
know, desktop tools that, um, are the same kind of, like, you know,
built from the same code base as the enterprise tools
and maybe at some point in the future
be able to kind of to exchange metadata
and curate it and so on.
Do you think that is ever going to happen
or is that a bit of a pipe dream?
I mean, is it other vendors doing that as well
or will there always be two markets?
There'll be the enterprise BI
and there'll be these self-service tools.
They'll always be different.
You know, what's your view on that?
I actually think, I think that when the vendors do have that shared core, they have a massive advantage, and that's certainly
something that I would be looking at, to say, hey, can I upgrade this? And one of the things that I
did like about even the very first Power BI that I ultimately left Microsoft over,
But it was being able to take the model,
and Donald Farmer, I think,
was one of the original folks involved in this design.
But being able, Tim and Julie Strauss,
to take that model inside of Excel
and be able to upgrade that
to the enterprise analysis services,
that's a really strong story.
When I look at things like Informatica Rev,
and you can take that little self-service data prep recipe
and now bring that into Informatica's enterprise ETL,
that's compelling to me.
I like that.
So let's get on the bit.
So there were three specific kind of vendors
and tools that you talked about recently that particularly resonated with me really. And,
and as it was you that actually introduced them to me, I thought it'd be interesting to kind of
get your take on, on why you thought they were interesting. And so the first one was a tool,
a company called, or a tool called, BeyondCore, which, uh, what struck me with that was a demo that I
saw that was from the Gartner BI, um, something, I think, where, um, it was a bake-off, wasn't it, between
different vendors. And, um, the idea was to take, I think at the time, the university, um, data, so it was
how much somebody earned at 10 years after leaving university, um, and it was US data.
And, um, I think in the end every other vendor put the data in and was coming
up with various things, and the BeyondCore one came out at the start and said, well, actually the
biggest predictor of someone's earnings in 10 years' time is their parents' earnings. And it kind
of flagged this early on, but the way it did this, the way it kind of went through the process, was
light years ahead of some of the kind of predictive analytics integration I've seen in some of the tools from the larger vendors, where it's just, you can run an R script, but that's
it. I mean, for you, Jen, you know, what was your view on BeyondCore and why did it interest you?
Yeah, so this is fascinating. I think this is where the industry is going. I really can say that
wholeheartedly i've been watching them for a few years. And one of the things I really liked,
it was a McKinsey. McKinsey, I'm not sure if they're worldwide, they probably are.
They're a very reputable consulting firm. And they had a couple different case studies where
a lot of the automation of kind of the busy work in predictive was done for them and their models were made so quickly and were very accurate.
So what BeyondCore does is it, once you take a data set, you load it in and it will run through
all these different iterations in the background and then find, you know, what is the most
predictive and identify different algorithms. But it also gives you diagnostic and prescriptive.
So you can do what-if analysis and ask it questions.
Well, what if I change this variable?
What would my lift be?
As well as prescriptive capabilities, which I very rarely see of do this, do this, do this,
and very specific instructions.
And I thought, oh my gosh, this is really fascinating.
There's pieces that I like on this direction,
and KXEN, SAP had bought them a few years ago.
I was also quite intrigued with that as well.
The aspect in Watson Analytics, right?
So that one has so much potential.
They just haven't gotten the UX quite right yet.
But the whole concept of this was interesting because having been in data science for a while,
you need to accurately get data that reflects the business process.
There's a whole art form to collecting and shaping the data before it even goes into the tool to do all the rest of the stuff.
And then, I must admit, I read your article on
that, actually, and I spent quite a bit of time over the summer taking my own data into a
data science kind of project, and your article was very good. I mean, digressing here,
what was the essence of that article? Why is it
difficult?
Ah, the art of predictive analytics.
So when I was working with BeyondCore, I did some freelancing for them.
Essentially, the first thing I said we needed to teach people to do was to be successful with their product.
It's very misleading to tell them to just load their data and go.
And their sales reps were having problems because folks would get bad results.
And I used to get the same types of phone calls when I used to manage Analysis Services.
There was a predictive engine in there, and people would say it didn't work.
No, it's not that it didn't work.
You have to prepare the data to be in the right shape to really describe that business
process in a predictive manner for the algorithms that you are using.
And that is an art in doing that.
There's a book called Moneyball, it's an analytics classic that talks about this art form. And
even talking to IBM Watson Analytics, one of the product managers, this summer,
I said, you know what, what's going on with that?
It was so awesome, but I haven't really seen uptake. And they said, well, they haven't really mastered the data prep yet.
And just, was it two weeks ago? All the weeks blend.
No, it was last week.
Last week I was at the IBM Watson Analytics Conference,
and certainly they're showing different things and really amazing.
I liked the Watson data platform that was a data science platform.
But the one project, and no one in any of the sessions highlights this,
you just have to walk in the expo hall
and be able to recognize something amazing.
It was Project Farcast.
And what Project Farcast is doing
is trying to optimize the data prep process,
making suggestions.
If you combine these two variables,
if you do this and if you do that,
and it's the very first time I've seen that anywhere. And I said, if they can master that,
then this would certainly be much more compelling and less error-prone. I mean, it's still going
to be prone to, hey, did you collect the right data to begin with? But at
least the prep automation, which is most of the work, would
certainly be much improved, so there'd be better success in the end.
Yeah. I mean, so Watson's had an interesting kind of press, really, hasn't it? Arguably maybe it
was oversold, arguably expectations are high. I mean, what's your take on
IBM Watson and so on, really? Is it a good product that's been oversold? Is it
misunderstood? What's your view on that, really?
I think it's misunderstood because,
you know, what aspect of Watson? There are so many Watson areas. You have the specific
applications for health care, for weather, for crime, and for regulatory compliance.
And then you have the more generic general analytics, IBM Watson with some automated predictive cooked in and some cool types of data quality type features in there.
But you also have Watson that are the cognitive APIs on Bluemix.
So there's all these different flavors of Watson.
What I will say is the experience level that IBM has in cognitive computing,
and that is certainly something that is coming together in the digital transformation,
is compelling, and I know they've gone through.
So when I look at, you know, folks, there's
usually an ebb and flow of when they're hot and when they're not. IBM's been, you know,
probably beat down a little bit the past few years. I went to the conference in 2014 and it was
depressing. And this year I said, you know what, I went again, and it was
definitely more upbeat.
And they have decent messaging now.
It's like it's kind of coming together.
They figured it out.
They're getting their game on.
And they had impressive case studies there.
What I saw was quite good.
Now it's just a matter of, I think, getting folks in there and making those decisions: does it make sense for them? So I think one of the last topics we'll be talking about is, hey, how do you pick one of these vendors? Because a lot of
them sure look the same. That part's going to be really interesting to see, because
they've got really good stuff. But will people pick them? Or will they just buy from somebody
they already have their data with?
Yeah, exactly. So, okay, let's move on a bit.
And the other thing that you talked about on your website,
which really interested me,
and I've more, I suppose I had quite a big,
I suppose, data integration ETL background and so on,
was Paxata.
So Paxata, a data preparation company.
I'd had some experience with that kind of market,
I suppose, with Oracle's data prep tool, data prep cloud service.
But what you showed in the review and what I've seen in demos since then was almost light years ahead of that.
I mean, so Jen, do you want to explain to me what Paxata is and why it interested you?
And we'll talk about kind of why you think it's going to be a significant thing in the market.
Oh, yes.
So this summer, and I'm still
reviewing different, you know, self-service data prep tools. It's just hot, so I've
reviewed quite a few of them. So someone was asking me about Paxata and saying, you can't get your
hands on it, you can't get your hands on it. And I said, You know what, I will make a deal with them.
And I'll write a blog, because it takes a while to write those blogs that I do.
And so I made a deal with them. I said, you know what,
I will showcase your solution on my site. If you will let me play with it. I was not expecting at
all, you know, so I was reviewing it. Informatica Red actually did very, very well. In my review,
was very user friendly, very simple. But a lot of these, you know, it's the same kind of thing.
Here's the steps, you connect to data, here's the recipe of steps. You know, there's the same kind of thing here's the steps you connect to data here's the recipe of steps um you know there's the same kinds of transforms so i'm almost seeing data prep
becoming commoditized and does it upgrade to to enterprise so i'm looking at how can i deploy it
uh but but for the most part i was just like you would say earlier i think just not very innovative
here it's the same stuff over and over. Maybe some machine learning.
I saw a little bit of machine learning in some of these tools,
and some of these tools have catalogs.
The catalogs are really hot right now with cataloging and annotation
and crowdsourcing and auto-discovery of schemas.
But what excited me about Paxata was a completely different approach to data prep,
a visual approach, an approach where I could,
just as if I felt like I was having that Tableau experience, but with a data prep tool.
It was fun. It was very visual and innovative. Also, I liked that you could add annotations.
I could go back and forth. It just, the UX was exquisite. I fell in love with it.
And I told the guys, the guys that knew me from the Alteryx world and let me in, I said,
you know, I'm jealous. I'm actually jealous, because I know this is going to be a massive hit,
and anybody in there right now will probably, you know, get whatever happens when you
get bought up or whatnot, because they're awesome. It's really, really neat.
Yeah, yeah. I mean, I suppose going back to, we talked earlier on about self-service in BI.
I noticed that some of the BI tools now are starting to add data preparation features into
them. Where do you see this market going, really, in the future?
Do you think that tools like Paxata will become, you know, less relevant because the features will be added into, say, Tableau and so on?
Or what's your view on it, really?
So I think that's a great question.
And I met with, there's an annual thought leader, sort of a, it's almost like a summer camp that happens.
In this past summer, the thought leaders that were in there had said,
this market's going to disappear, right?
It's just going to get gobbled up.
Either they're going to get acquired or put into those tools.
I still see that happening.
I don't know if it's really imminent or not.
I agree with that thought process.
I also think, though, this whole concept of hybrid data prep
and data virtualization,
there's still going to be a need
for these types of tools,
and you don't always have to visualize the data.
So it'll happen.
I still think there'll be the ability
to have some standalones,
but some of these, I think,
and even with Paxata,
I saw Microsoft Ventures now is in there. I don't know if they read my blog or not, but I'm like, oh yeah, they'll get bought. Someone's going to buy them. I would guess
somebody's going to buy them. That's my goal.
It's interesting. I mean, before we
did this call we had a conversation, and one of the things we talked about as well was, I suppose, in a way, how a DW project has changed now. And a
tool like Paxata, for example, data preparation is about, I suppose, a
domain expert, you know, someone who knows the data, getting involved in preparing the data,
maybe more, in my view, for a big data type situation where you've got lots of new data sources going in.
But I suppose it contrasts with a data warehouse project,
which is almost like kind of building,
you know, they're multi-year, multi kind of,
whatever project is put with mines.
But something you said to me on the conversation before
was that you can see DW projects being built in days now.
And really, I think that kind of,
and I agree with that in a way,
but it doesn't seem to be known within the industry.
I mean, just summarize what you're seeing here.
So you said to me before,
somebody could now build a data warehouse in days rather than months.
Tell us about that. What's that about?
Okay, so that's data warehouse automation.
There's a whole class of tools.
The first one I was made familiar with was with a fellow in the Tableau world,
Larry Keller, introduced me to WhereScape. Oh, it was years ago. And I played with it with a SQL Server backend.
And essentially what it does is it automates the building of facts and dimensions from
a data source, creates your ETL for you. And a lot of that, I mean, if you've done
Kimball-based data warehouses, you take the Kimball toolkit books and you have maybe
an ETL toolkit that you have some scripts in there and there's some framework that you use,
a lot of that is repetitive. So these tools in the market, WhereScape is one of them,
Attunity is one of them, I believe. TimeXtender is one. There's Ajilius, I believe, has one. They
have a comparison, I believe, on their site
of various different data warehouse automation tools. But what you do is you point it to a source,
to data sources in general, whether it might be a relational database, some files, etc.
It will automatically extract into a landing zone the dimensions and the facts.
And when you kind of look at it and say,
do I like these that were automatically generated?
Do I want to customize any of these?
And when you're ready to go on the schema,
and then it can also, one of the tools that I'm reviewing,
Attunity Compose, will import, say, an ERwin diagram if you have one, and then take that
as well as input. And it'll generate a dimensional data warehouse for you literally, you know,
in minutes. It's kind of scary. I mean, I suppose, I mean, I've seen accelerators,
I've seen kind of like macros and things where, and actually, my view of WhereScape RED was interesting in that, you know,
I worked on a big project where it was, it was the kind of the silver bullet really for the project.
It was going to do this delivery of things very quickly.
But it was in the end kind of, you know, got rid of because it lacked kind of,
I suppose the solution lacked depth or whatever.
And I mean, how do you, I mean, how do you deal with the fact,
I suppose the complexity often comes in, isn't it,
where you've got to do complex transformations and stuff like that.
Yeah.
How is that going to be kind of sped up?
Yeah, that's interesting.
So I started reviewing, I just started reviewing the first set of these.
I finished, for the most part, I'm finished with self-service data prep.
I still have a couple people that are knocking on the door.
But for the most part, now I'm starting and I started with Attunity. And one thing I noticed that they have
in theirs is customizing of the actual scripts, pre and post scripts and customizing the scripts
themselves. Because, yeah, when I looked at these data prep tools, I'm trying to remember, this
was years ago, in one of the briefings, and they said, well,
there's no script. What do you mean there's no script? It might have been Trifacta. What do you mean
there's no script? There's always scenarios. One retailer in the U.S. I worked with,
each time they bought another chain, you had different types of product numbers and structure.
Nothing straightforward, nothing. The real world's not straightforward. So I thought, that's going to be a deal breaker on that tool. So yeah,
I could see how you would say that. I will look into that. So I have to be transparent
that I've only started looking most recently deeply into Attunity Compose. Hopefully,
WhereScape is next. I've been pinging them and telling them
that I'm going to be looking at these tools
and I'll start to dig deeper.
I believe Ajilius, it's A-J-I-L-I-U-S.
They're based out of Australia,
has breakdowns, the strengths and weaknesses
of these tools.
Okay, okay.
So let's go from this wonderful world now of innovative BI tools and so on and
the almost pre-cambrian world of these things here to a completely different world which is
big data and cloud and so on. And something that I said on Twitter recently and it's been
conversations I think you've been involved in, and I have as well, is: as the big vendors move their big data solutions into the cloud,
so you've got Oracle doing that now and you've got Amazon there
and you've got Google and so on there,
and everything becomes almost elastic compute storage in the cloud and so on,
you start to think about how they're going to differentiate things.
And you must get customers coming to you saying,
well, Jen, or vendors coming to you and saying, you know, how do we differentiate these? As every vendor
starts to add, you know, elastic services to do Hadoop and so on, they'll become similar.
You know, first of all, what's your view, Jen, on this? What's your view on the big
data market in the cloud and the vendors out there and so on? Let's start with that, first of all.
Yeah, let's start with that. So at the Gartner,
was that a month ago, the keynote had talked about the digital transformation and the trends and what
we're going to see happening. And the whole concept, and I've been saying this for a while, data is gold,
and whoever has the most data, you know, essentially is going to win this war
with intelligence and automation and this, you know, automated customer experience and omni-channel
experiences. What concerns me, so there's some good things, and the good things are there'll be
a lot more standardization. When I learned Hadoop and learned HDFS, okay, well, now I can pretty
much play on any of these vendor systems.
Same thing with Spark.
Okay, great.
I know Spark and SQLContext and SparkR, and I can function on these different systems.
Maybe it's PySpark that I'm playing with
because everybody supports that
and it's not necessarily specific to the tools.
But what you're getting at is
from a differentiation standpoint,
I'm struggling to see some of that.
It's really what I see right now as I look at, when I looked at IBM Bluemix, it reminded me a lot of the Azure offerings and very user-friendly and very software.
You know, it was very, very simple to, say, set up an event hub to ingest zillions of events for an IoT sensor, for example. It was just a point and click, and then say, hey, I want to dump
it into this data source, whether it's dashDB,
maybe it's Azure Data Warehouse.
That's so simple. It's so much
the same. I think the things that I'm
looking at or thinking about
are, can I move
the data away if I get
upset? I think, was it Microsoft
in the UK just increased their prices
22 percent?
Yeah. But yeah, so one of the things is, I always want to be able to move my data. I think
that's really important. If I want to break up with a vendor, you know, sometimes you want to break up.
And it's going to be things like, you know, customer experience. They talk about customer experience, but
not all these vendors are really great at customer experience. And it will be that, I think:
when you have a commodity, it's those other things that you start to value a bit more.
Yeah, yeah. I think there's some quite distinct differences
there. I mean, let's take some vendors here.
So we take Amazon, Google, Oracle, and so on there.
So Amazon, for example, is the kind of big player, isn't it,
obviously in the cloud market and so on,
and you've got Elastic MapReduce and so on there.
Certainly, to my mind, like you say, the customer experience,
the maturity in that solution there and the vendors in that space
and so on is interesting.
I've also been looking at Google Cloud as well,
and you've got BigQuery in there.
But it's much more a developer tool, isn't it?
It's much more command line and so on there.
Yeah.
I mean, okay, those two platforms there,
Amazon and Google, what's your experience in those things, really?
So I'm having to ramp up on Google. I was working with,
I'm trying to think what the Amazon one was,
Redshift. Redshift and Google BigQuery, when I was in the Tableau world
and working for Tableau, we were working on those connectors and whatnot years ago.
So I'm ramping back up on those now. I just ramped up on Redshift again. And I did have
a little bit more trouble with, you know, just getting some of the tools in place and whatnot
than I did before. Google BigQuery, what is interesting to me is I'm seeing very large
accounts on the East Coast of the US.
I don't know if it's just they have a really strong sales force here. Or again, you know,
I happen to know that the groups on the East Coast of the US have been very big on compliance type sales approaches. It's not very salesy if you're just running in and scanning servers and then
saying, hey, here's your bill. But that certainly was a pattern for a few of the big vendors on the East Coast. So somebody
like Google comes in, and you talk about customer experience, and, you know, if these other people are
getting sued for their bills for how much, you know, database type they have, I'm not going to
name vendors. There's two,
there's at least two that I know do this. And, you know, when you have a contentious
relationship like that, now you have somebody else come in. And they could have
that because they were basically, you know, the only options anyone had. Well, now you have Google coming
in and they have a really, really amazingly performant database.
You know, maybe they're coming in with a different customer experience, but they're winning some really big accounts.
There's also one thing that I'm following.
I saw and I tweeted, oh, my gosh, this had to be at least six months ago, the biggest database project in the world.
And it was narrowed down to Amazon and Google.
So I want to know who wins that.
I've been following GDELT for a while.
I think it's an amazing Google project.
It's on the Google BigQuery in the back end.
And it'll take all the news,
you know, the unstructured text,
and you can query it and play with it.
And I was very impressed
with what that project could do.
So if they get their game together,
they make it a little bit more user friendly
because, yeah, it's all a bunch of command lines.
I just looked at it, it's a bunch of command lines
and blah.
Yeah, I was
using Google Dataflow this afternoon,
and I was trying to bring some data in from the various sources
I bring into my home IoT
project, and I just thought, oh, this is interesting.
This is more like Spark Streaming,
writing the code yourself.
And I think it's going to be interesting as,
so currently at the moment,
the state of the art, I suppose, really,
is Amazon, is Azure,
which I think is very interesting as well.
I think what will be interesting as well
is certainly what Google might end up doing,
I think, with,
at the moment, obviously,
they're building out the Google Cloud Platform and it is fairly kind of developer style. And whether they'll start to
build out some of the tools or, I suppose, partner, like the Amazon Marketplace, you know, to
bring in other vendors like Attunity and so on. But I think if you think about the access
that Google's got to other data, and their search kind of
background and so on as well, I think that's an angle Google can start to differentiate on. And I think with Oracle as well,
Oracle is interesting because it's obviously very late to the game. You know,
it's got its new version 2 IaaS that's now elastic and so on. But you think about what
access they've got just to deploy that stuff within applications. You know, there's
an announcement this week where there's a team under Jack Berkowitz that is adding kind of data lake stuff to all their
cloud apps, for example, and also the deals they can do with things like data as a service. I mean,
do you think they'll all do this, or do you think in time each of these vendors will start to form
its own kind of niche, and you'll go to Oracle if it's a business kind of
one, or you'll go to Google if it's a machine learning one? What do you think on that?
That's a great question. So when I looked at IBM's,
you know, IBM as well, it was blah, blah, blah, the same things that I hear from every
vendor. However, what was different was the industry models. And I said, oh, that's nice and that's different and that's deep and that's a value.
So industry models, I think, are quite interesting.
You'll have the AI if they're an expert, I think, in some of these other things where you can tap into some of the AI.
And we talked about Salesforce earlier, buying BeyondCore and making Einstein and all that.
I see more of bringing that value.
If you can bring an expert, you have the data.
Again, now we're going back to what Gartner was saying.
If you have the data, this is going to be who has the most data,
winner take all.
Who has that data to create the best artificial intelligence
automated into the business process
when the buyer's making a decision, when they want to have that interaction.
But what concerns me too is, then what are we talking about? I don't want to live in a world
with five vendors. I love variety and having innovation and having options.
So it's going to get interesting.
So one of the things that I'm looking at too,
I think Google has some kind of, is it not monopoly?
I'm trying to think of what it's called,
but certainly Microsoft is one,
and it's just kind of being brushed aside right now,
at least in our political infrastructure here.
But I don't want to see a bunch of monopolies
Yeah, interesting. I mean, I suppose the thing about the big vendors is they've got a kind
of amazingly consistent ability to completely
destroy any kind of acquisition they make, really, which I think is interesting. And
it will be interesting to see how BeyondCore get on.
I think it's a nice way of wrapping this up, really, in a way.
It'll be interesting to see how the likes of BeyondCore get on.
I think Platfora were bought more recently as well.
Yeah, they were.
A lot of these vendors were bought.
They were, right about the same time.
Yeah.
Yeah, I mean, it's interesting.
I think certainly looking at what BeyondCore are doing with adding that
decisioning and kind of analytics into the CRM applications and so on is good. But in a way the market
has got its own way, hasn't it, of kind of innovation. I don't know. I mean, how do you feel
about BeyondCore, for example, and those things going into the products there? Do you think it's
going to be the doom of them, or, I don't know, what's your view on that?
No, and I think one of the key factors there is Marc Benioff had said they're going to keep it a standalone.
They have the embedding into, you know, it's just transparent and seamless for that Salesforce user.
But there's also that, you know, it's the standalone.
You could put it in pharma, you could put it in manufacturing, you can let that be,
you know, non-specific, whatever you call it, generic or general. It doesn't have to be aligned to a
line of business. For them, I think that was really key, because if they just got embedded
into the CRM, you know, they would really be losing a lot of innovation for so many other
use cases.
Well, I think a good example of that is in Oracle's world there was Real-Time
Decisions. Have you heard of that? It was, um, Dynamics, I think it was, and it was a similar
sort of thing. It was a very fast decisioning engine that was added into the Siebel CRM tool
and never, I mean, I'm being pretty harsh here, but never really kind of innovated beyond that. And
I don't know, it's interesting, isn't it? I think that certainly the intentions are good, but it's,
yeah, it's interesting. So, Jen, I'm conscious of time now, and it was one of the longest
recordings I've done, actually, so it shows, I suppose, how interesting it is
talking to you. I mean, so Jen, it's been great to speak to you.
Oh, it's great.
I mean, I can hear so many stories we could talk about, actually,
which is great. So, Jen, just as a bit of a heads up, what's coming up
on the website soon? What are you writing on at the moment, what are you
working on and so on, just as a bit of a teaser there?
Well, I'm going to be writing
about the new RStudio that came out. There's been a couple of different announcements in the industry.
It seems everybody's making their announcements right now in advance
of the Tableau conference. So I'll
probably have an industry wrap up on
just some of the news announcements.
I do plan to cover
Watson Analytics. I can tell you that
they're not as far along as I would
have hoped. They still have a lot of potential
but just seem to be still
missing some key features. So I'm going to be
talking about that in the future.
And then Tableau Conference.
I want to – hopefully they've got a lot of innovation to show me.
I've seen the analyst agenda.
It certainly looks good.
And there's one bullet that they are not even saying what they're going to show.
So they're letting that be a surprise, and they put that at the end.
So I am hoping for something really good.
Excellent, excellent.
So Jen, well, that's absolutely brilliant speaking to you.
And thank you very much for coming on the show.
And it's been fantastic.
So thank you very much.
Oh, thank you for having me.
Thank you.