Drill to Detail - Drill to Detail Ep.45 'Tellius, YellowFin and the State of AI in Analytics Today' With Special Guest Jen Underwood
Episode Date: December 13, 2017
Mark Rittman is joined in this episode by returning special guest Jen Underwood to talk about what's new and innovative in the BI and analytics industry right now, and how AI and machine learning are ...this year's data discovery and data visualization.
Transcript
Welcome to Drill to Detail, the podcast series about analytics, big data, and the people and technologies driving our industry.
And I'm pleased to be joined once again by Jen Underwood, who came on the show first time around last November,
and joins us once again to share her knowledge of the tools and vendors in the BI and analytics space.
So welcome, Jen, back to the show. And it's great to have you on here.
Oh, thanks for having me again. It was fun last time.
So Jen, for anybody that doesn't know you, very few people, but anyone who doesn't know you, just tell us, I suppose, what you do now and your route into this industry.
Right now I help essentially evaluate many different solutions, and I keep a constant pulse on the market. I'm always looking at the news, following the trends, and exploring and playing with these tools. How I got into this was I implemented solutions as a consultant for data warehousing, reporting, and even advanced predictive analytics for 15 years. And then I went to the vendor side and was a data platform pre-sales engineer that got recruited into product management for Microsoft, and then went out on my own, and here I am.
Excellent. And your website is probably the thing you're most well known for at the moment, so just tell us about what you do there.
So my core website is jUnderwood.com, and essentially I will provide very honest, sometimes brutally honest, reviews of solutions and my take on the industry.
And a lot of times, some of my forecasts and whatnot, I forecast before Tableau was going to crash and a few other forecasts through the years that have been fairly dead on.
So now I've got a large financial portfolio following for that aspect. But I review solutions and talk about all these
different trends that I see in the market, and I keep it real.
Yeah, that's excellent. And I said to you, I think, last time we spoke that it was reading your blog that kind of got me back
into doing this again. I actually had lost interest a little bit in analytics and blogging, and I saw what you were doing and I thought that was really inspiring. So
it's good to have you on here again and to get some of your insights. So what I want to do is
talk to you about two things in this episode. So first of all, I suppose what's happening in
analytics at the moment within the industry. So which are some of the interesting tools and
vendors coming out that you've seen recently. And then I want to drill down a bit further into the topic of
AI and machine learning, which is very topical in this world of analytics at the moment, you know,
but I'm interested to understand from you which vendors are doing interesting things in this
space. And generally, whether you feel there's value in there and there's actual innovation
happening, or whether it's kind of hype and so on.
So let's start off really with analytics.
And what have you seen just in broad terms?
What are some of the trends you're seeing around analytics tools, analytics techniques in the industry at the moment?
And we'll get into some detail of products after that.
Well, aside from the move to the cloud, and I heard someone last week say even the banking institutions and financial institutions are moving to cloud, and that was very different than two years ago when I would have conversations and they were anti-cloud conversations. So that sentiment is certainly changing.
When I think about some of the things that excite me, that I've been really enjoying exploring, it has been this automation: what can truly be automated across the entire life cycle, from detecting insights to preparing data intelligently, repairing maybe pipelines of data, communicating the results in human language. All these different aspects of automation are getting very interesting to me, from acquiring the data and creating a model to, on the predictive analytics side, what we would call feature engineering, kind of the art and science of it. Usually a person is involved in that, and they should still be involved in that even if parts of it are automated. Just across the entire span, it's been very interesting.
Okay, so that actually is interesting because when we first spoke,
we talked about BeyondCore as one of the products
and one of the companies that were doing things innovative in this space.
And I noticed you've been talking on your website about Tellius data discovery.
And I think reading it, it struck me as a kind of almost like a combination of BeyondCore
and visual analytics and kind of ML workbench and so on.
Just tell us a bit about why you thought Tellius data discovery was interesting and what it does
that is new and maybe solves problems that haven't been solved before.
Well, what I really liked about it, and this group came to me, they'd seen my Domo exposé and somehow they reached out. And I get a lot of vendors that will reach out, and I don't expect much. I mean, I'm critical on a lot of these. I'll try to be nice to them, but I'll give them a balanced review on the phone. Well, he shows me his product, and it did have aspects of BeyondCore, where there's the automated analytics world. It'll automatically find the insights, and it describes them and it ranks them, tells you the difference between changes, and also lets you do some drag-and-drop visual discovery as well, and hook in your machine learning. So if you do have a data scientist, or even the algorithms that it's using and creating as you find those insights, you can use them and apply them to your own data. So it was a combination of BeyondCore plus Tableau-ish, those things together, and maybe a machine learning workbench, all bolted into one.
And I'm like, oh, this is really neat. So I've had a lot of fun with that group. I've watched them, they're growing. It's been fun because I remember when there were only like a couple of people, when he first reached out to me and he was literally just getting started. And then I went to Strata with them in February this year, and it was a little bit bigger, but it was a really good event for them. And then I went to Strata with them again in September and they'd already doubled in size, and they were looking for more people. They're still looking for more people. And I'm like, this is so exciting, because I knew when I saw it. You're like, okay, you've got the full package, this is good stuff. And it was all built on Spark as a platform for big data, and you can scale with it, and it can be on-prem or in the cloud. I'm like, fabulous. This is fabulous.
Yeah, yeah.
It sounds interesting.
So just to put it in,
for anybody who's not heard of that product,
I mean, just put it in some kind of context.
So what BeyondCore did,
to my mind that was interesting,
was it would, there's automation in there.
It'd automate maybe the creation of graphs and charts
and so on off of your data,
but it would try and solve that.
I think, you know, the founder called it the comprehension gap, you know, this issue where not only do we need analytics to be actionable, but we also need data to be, I suppose, put in context. So to make a decision on data, you need to have that context and the understanding of how this data set relates to that data set and so on. I mean, is that the sort of thing that Tellius does as well, and do you think that's one of the things that BeyondCore did well?
It is. Absolutely one of the things I loved about BeyondCore was that you could pick a metric and then in human language it would describe it, and you could visually experiment. And it wasn't even just that it would find the insights, it would also allow you to do what-if analysis and prescriptive analytics. The one thing that it was missing was the ability to kind of drag and drop and visually explore. Once you got it, it was kind of static, you know, these are the results and this is the output. It wasn't really a visualization tool, and the CEO made a big deal that it was not, that they never wanted to really be that.
This one, Tellius, is a little different, in that Tellius will do the same thing. You pick, whether it's, you know, I want to reduce discounts or I want to optimize profit margin or opportunity win rate, you pick the metric that you want to optimize and then it will find the different insights. Again, it will pre-populate a bunch of charts for you, but it allows you to look at the key influencers and interactively, visually explore and then create dashboards with it. And if you want to then look at the algorithms or even create your own algorithms, you can do that, which is a little bit different, right? So you could have a data scientist feed in and kind of say, well, we don't want it just to run off of the base ones, we want to customize our own, as folks are doing this. And the other thing that it also does, that again I liked with BeyondCore, was it was actionable. You can do the prescriptive analytics, you can do the what-if analytics. So it's not just predictive, it's prescriptive and making recommendations, and that to me is really cool.
Yeah, I mean, the example I think we were reading from your blog at the time,
you wrote, you know, there's a feature of smart insights,
which is slices of data calculated using machine learning algorithms.
And it talked about a segment,
might be people with a salary between X and Y,
and there's an insight on that.
I mean, quite a few BI tools now have some form of integration
with machine learning to do things like, I don't know,
in the Oracle example there's a feature that explains, you know, uses kind of various stats models on a segment and tells you this is what's interesting about them. But, you know, that isn't actionable really. I mean, do you think that's useful, that kind of thing, or is that very much kind of previous generation?
Well, when I think about what makes it actionable, it's that what-if type, where you can experiment with the data, or you can upload another data set and run it against there and see what the output would be.
That piece of it, and having the citizen data science aspect be actionable, that to me is interesting and a little bit newer than what we've seen in the past, where we've only seen maybe recommended charts or "here are the insights that we found,"
but you couldn't really do too much with it.
Yeah.
I mean, so you said there about what-if analysis.
And I always thought one of the kind of paradoxes of BI
is that when I first got into this area years and years ago,
it was always talked about as being what-if analysis.
But actually, that's something that's very lacking in most BI tools, isn't it?
You don't really get the ability to change the kind of levers and see what would happen if you change this variable.
I mean, do you see that now or is that still something that is more an idea than practice?
I see it as more of an idea than a practice.
I saw, I think, one of the most impressive demos that I saw at Tableau's conference.
It was a former customer of mine, Frontline Systems. They are experts in
optimization. And a lot of the what-if and prescriptive analytics are using some optimization
routine, simulation, and I can't remember what the other algorithm is that's used often.
But essentially, they went on stage and wanted to optimize a route, you know, the traveling salesman problem.
And I thought that was very interesting.
So we're starting to see it, but how Tableau was achieving it was they were going to use extensions and open their platform to folks that really do optimization and simulation
and, you know, some of the prescriptive analytics. They weren't necessarily going to build it themselves.
I don't see a lot of tools.
I'm definitely seeing some now that have a little bit of write-back.
I think it was spent, you know, the year where you can change the data.
I'm like, oh, my goodness.
I'm not sure how I feel about that.
But I'm seeing more of that in general.
But I still don't see a lot of what-if unless they're, you know,
some kind of cube, an old OLAP cube,
and some of the old stuff you think about from, oh my gosh, 20 years ago.
So you were at the Tableau conference, weren't you? What was your take on that? Do you think they've still got momentum? Was it good, or what?
I'm concerned about them. I wanted to see more, and I did blog about this. "Reading Between the Lines," I think, is what I called that blog. Because, if you went there and you hadn't been there the year before: the year before, they provided something like a five-year roadmap, or a longer roadmap, so you saw these really cool demos, and you saw automated insights, and you saw recommendations, and you saw a lot of really cool natural language type features in search. And you get there this year and you're like, okay, well, where is it? Let me play. I want the prep tool, Maestro. So we see these things, and then you come this year and they're showing things like, on a mobile app, being able to email the results to take action. And I'm like, did you really just show that demo on the stage? I just can't believe you're highlighting this. Oh, wow.
So it was very interesting. And I think even Cindi Howson at Gartner gave them a little ding on,
you know, basically showing old school BI stuff and the crowd cheering about it.
So for the most part, I had mixed emotions and I talked to the product team and asked them because one of the nice things is they do invite me to their analyst day, even though technically I may or may not be an analyst.
It depends on who you talk to. But I sat in there and I said, hey, I'm concerned about you guys.
And they said to me, what you're not seeing and certainly what we're not showing is all the architectural changes that we've made in the back that are really critical for a cloud future and will allow us to iterate much faster. Now, having been at Microsoft, when we went from Silverlight to HTML5, like a bunch
of times, and it took literally four or five years, however long it took, I'm impressed, you know,
looking at it. And I did test this back-end change. So they had the back-end database change.
The Hyper engine is now available in the solution. So that's a massive change.
And just looking at the architectural components, and I said, okay, I respect that.
I respect that a lot.
Actually, how fast then that has happened, having been through that with another product,
and how difficult that really is.
Because you think about when you re-engineer a platform, or you rebuild a platform, all the existing functionality, you can't take it away from your customer base. It just has to work. And one thing I was impressed with was the fact that you can
basically in-place upgrade to that new platform, to the Hyper platform.
And I've done it myself. I'm like, well, this is impressive, because usually, and even in Microsoft
when we did Power BI, we did Power BI v2.
I mean, it was, you know, you have these difficult conversations with the people that are early adopters.
Well, you're going to have to rebuild.
You know, it is what it is.
And, you know, maybe we'll give you some business investment funding.
You know, we'll give you $5,000 or something to help you out with this rebuilding effort.
And you still get very angry customers.
But yeah, so I'm impressed that they were very elegantly able to do that. But they certainly weren't showing that and they weren't saying that.
And I'm not sure if that was a mistake for them.
The other thing that I noticed is I think that they are much more guarded.
So when I used to work really closely with those guys,
they would not provide roadmaps to anybody.
They were very, very secretive.
And then they went to being more open.
And now I think they're back secretive again.
So they had all these NDA stations.
And I know they're not giving me even all the insight because they know I work with
multiple vendors.
And I said, you know what?
I think you're holding back on a few things.
And the reason why
is because, when you think about it, their biggest competitor is literally across a bridge,
the 520 bridge in Washington, from them.
Who's that then? Microsoft?
Oh, right, yes. Yes. Yeah, I think Power BI is definitely, definitely,
at least here, it's very prevalent in the United States now.
But yeah, so they're already cloud ready.
When you're on the cloud and you can make multiple releases per week,
it's another ballgame.
Yeah, totally.
I mean, I guess the fundamental issue they have to deal with
is that visualization has become commoditized now.
What only Tableau could do is now commoditized, and all the work they're putting
into re-architecting their platform to run in the cloud, well, that's actually not directly benefiting
customers, who actually now just expect you to be in the cloud, and the other vendors are already there
anyway. And Oracle was the same. It would spend huge amounts of time inwardly looking at
changing its architecture and so on, which gave no benefit at all. I mean, I'm being harsh, but very little benefit.
On that topic as well, I'll talk about Looker with you in a second, but you mentioned
Yellowfin as well, and Yellowfin was another vendor that I've noticed, actually, and I haven't
got around to looking at properly. But you were quite keen on them and quite
interested in what they were doing. So tell us about what Yellowfin is and, again, what market they serve and what they do well.
So Yellowfin is a cloud vendor.
It's probably one of the oldest cloud BI vendors.
If I think about GoodData, and there were a couple, you know, 10 years ago already,
that they were way early to the cloud.
I really rarely run into them.
I think they're either New Zealand or
Australia based. But I'm certainly I know the CEO there and I follow a few of their updates,
and they're nice enough to send me updates as well. And I'd seen one that was keen on this
automation and the augmented analytics. And I said, you know what, I'll sit on this session. And what I saw, I really liked. So the design experience, and I think a lot of folks don't
realize you can have the same functions and check the boxes off on the RFI for these solutions,
but it's really how it's designed that user interaction. And that's what made Tableau
magical. And that's where I looked at how augmented analytics has been implemented inside of Yellowfin that I was impressed with.
I said, oh, this is nice.
If I designed it myself, this is what I would have wanted.
The other thing that I liked that they had was, if you are working with some of these machine learning platforms,
machine learning is really the topic of the year. Without any question, it's just dominating.
Way back in the day, everything was about Hadoop.
And then it was data visualization,
and this year has definitely been machine learning.
What they've done is made it very easy to consume predictive analytics,
just calling the endpoints or the JSONs,
and being able to visualize those outputs within Yellowfin's UI very nicely and easily,
again, for maybe just the business user. And it's kind of bridging the data scientist that
usually sits in a silo by themselves with somebody on the front line making decisions.
And that's what I liked, some of the things I liked about their solution.
So you said they're consuming kind of predictive models and so on.
Where would these models come from?
Is this primarily a data scientist or a data engineer tool?
Or how does that work in practice, really?
So when I think about Yellowfin, it's traditional
BI. You know, here's dashboards, here's some reports, here's some ETL process.
Birst is another one that would have been in
their space, you know, an early adopter. What they've done with machine learning, essentially,
and they demonstrated it, is with a company called H2O.ai, very up and coming, hot in that space.
DataRobot's another one, hot and coming up in that space.
And I mean, there's a lot of different cloud-based machine.
You develop a machine learning algorithm and then you usually deploy it as a web service
and you're able to then consume or call it and get the results back.
And they've done a nice job of doing that.
So I would say integrating that feature is probably a data engineer,
somebody that's more advanced that would be doing that.
But the business user is able to consume it and say,
yeah, I want to use this model right in the UI in the tool.
And that's what they've done a little bit differently
that I haven't seen anybody else doing yet.
Okay. Okay.
So I suppose a meta topic on this whole area is
there's a lot of these vendors around.
Yellowfin, you said, Tellius there, and we mentioned Birst and so on.
Oh, there's a lot, yeah.
Yeah, and there's always more and so on.
And, you know, part of the work you do is around, you know,
advising vendors and so on.
How do they get noticed?
How does a vendor get noticed and raise itself above, for example, all the other
BI tool vendors and startups that are in this space as well? I mean, what
kind of advice do you give, and how do they, I suppose, have a strategy and a go-to-market and
so on that actually gets them noticed above everybody else?
Oh yeah, there's a lot of noise in the market. So I think when someone calls me, just about the first question that I'm going to have
is what makes you unique and beautiful.
And if you can't say that very crisply in two minutes, kind of your elevator pitch to
me, then I'm already concerned about you.
So from that perspective, I need to know what makes you unique and beautiful.
And then from there, you know, looking at the website itself,
I still think websites are very important.
You're either going to lose somebody or you're going to draw them in further.
It's having the right content, you know, personalizing
and having the right, you know, demonstration there
to get them kind of engaged and looking at,
just interested enough to at least ask for a demo.
From a standing-out perspective, I do think they need to do PR. I had a talk with QlikTech earlier this year, I guess Qlik, whatever you want to call them now, because that's another vendor I'm worried about, right? I said, you know what, I haven't heard anything really much from the PR perspective after you guys went private, after, is it Thoma Bravo? Did I say it right? So yeah, I told their PR team or their AR team, it was analyst relations, and I told them I was just concerned about them because I never see any news anymore.
Now Looker's the opposite. Oh my gosh. I literally told her, and they use a fantastic company, and I told her she's the best one I know, but stop emailing me. Only email me when there is something, you know, maybe a major release, not every single day, because I was like, you're just at this point making the spam folder. But I do think there needs to be a little bit of PR, there needs to be events, there needs to be things that resonate with the audience and not just a bunch of noise. So I think the mistake people make is they just throw a bunch of noise up, meaning crummy blogs. If you're going to do a blog, do a good blog, and if you're going to do an event, do a good event. Don't just do noise.
Yeah, exactly. There's a lot of startups now coming out of the open source teams that develop things like, I suppose, Drill and Druid and so on. And certainly I reached out to Imply, one of the ones talking about building a kind of front end on top of Druid, and they're not particularly bothered about engaging with analysts at the moment. There's a lot of stealth mode stuff going on. And yeah, it's an interesting time. There's a lot of innovation going on, but I guess how you make yourself known, and whether you think that's worthwhile doing or not, is an interesting thing really.
So you mentioned Looker there as well, and as you just said, Looker seems to be the recipient of a lot of good press at the moment, a lot of activity, and I hold myself up there as being part of that as well. I mean, I've been impressed with the company.
I'm impressed with their attitude
and their product fits very well
with the kind of the use cases
and the people and the industry I currently work in.
But largely it's more of a back-end play
than a front-end play.
I mean, what was your take
on the Looker Join event recently?
And just generally, your view on the product itself.
So I've been following them for a while and looking at it, and what I think they do very well, aside from getting the word out there and making a lot of noise, is that they have a very nice website, really crisp calls to action. So whoever designed that, they've done a really nice job. The colors are appealing. They're subjective things, but they really can make a difference in a first impression. You think about all the startups that have an ugly website. I know it sounds ridiculous, but there are a lot of them. And that first impression, BeyondCore, for instance, I told them at the time, I would not. We had to fix that website.
It's like, oh my gosh, don't even bother with anything else with me until you fix that website.
It was really quite interesting.
But so the Looker website's really quite compelling. Some of the things that I've heard from early customers is that their chat, that engages them for support.
So they get support, right? They just nail it.
I experienced that myself, actually. I had to get them to change the domain name of our instance of it, and they responded within a couple of minutes, and then they fed back what was going on, and it happened. And doing it via a chat window, you'd think would be a terrible way of doing it, but their service and, I suppose, demeanor with customers is really good.
People love it. So that was feedback, and I did some research earlier this year, and that was the feedback that kept coming up, that those guys really delighted customers. And again, when you think about who they're competing with, these big, big companies where you're just a number. I'm frustrated with Adobe, for instance, and I'm just like, okay, if you're just going to point me to a forum all the time, why am I bothering? The moment there's competition for Adobe, or an alternative, I'll probably go to them in a heartbeat, because they haven't cared about me as a customer.
And I totally get that. But if you think about that,
i like that part about Looker.
When I had the review with them,
it was interesting because normally I'm a little grumbly
and there were things that I liked
and there are things that I'm like,
ah, I don't want to look at LookML.
I'm like, that scares me.
It reminds me of the Qlik scripts
that scared me too back in the day.
And it does make a difference.
There's certain people that might like that.
There might be some job security,
but for a normal business user, they're not going to like that. It's going to scare them.
But then what they talked about with me, and my big takeaway from my briefing with them, was, and I can't remember, I had it on a little sticky piece of paper, how many native connectors they had to different back-end data sources that were written and co-designed with the vendors. And it reminded me of the conversation that I knew, and Tableau probably still does have this, a whole group of engineers dedicated to optimizing performance of connectivity for large data sets with these different drivers that made them very special. It wasn't an ODBC connection that's like, it may work, it may not work.
And that part with Looker, I'm like, holy cow, okay, well, that's your beauty. It's not something you can visually see, because the rest of it looks like everything else. But being able to have all those advanced functions, whether you're working with Teradata or Google BigQuery or whatever it is, maybe it's Amazon Redshift, to me, that was like, okay, that's your unique,
beautiful thing. You're not leading with it, but that's what I value.
Yeah, definitely. I think there's a few reasons why they're doing well at the moment. I think
they, I suppose, rode the wave of the cloud-based elastic analytic databases. So I think Looker was launched along with things like Athena and BigQuery and so on, and certainly within the customers that would use those platforms, engineering is very influential, and that's true of the company I work at at the moment. So engineering would have a big influence on what kind of analytic tool or BI tool might be chosen, both to run their business and perhaps to partner with them to deliver analytics for their customers. And so the fact that the connection through to BigQuery and so on is so good, and they are so aligned with those kind of sources, means I think it's a very easy sell. It's the obvious sell into all those kind of SaaS-based, data-driven retailers and so on.
But as a BI tool, it's not brilliant really, and that's being charitable. I mean, it chimes well with the audience that use it, because I think they've come from tools like, say, Google Data Studio or maybe things like that. But there's no support for things like drill-down in hierarchies, none of the basic stuff that we'd expect from any kind of BI tool, a GUI kind of metadata building tool, anything. It's very primitive, and it's just starting to catch up now.
Well, I feel better that it wasn't just me that was grumbly then, because yeah, I'm like, I'm not impressed, I'm not impressed, I'm not impressed. I'm like, okay, finally I'm impressed with the connectors. And I rally for a lot of these small companies, and they are winning customers, and I was assuming it was the Google relationship kind of opening the door for them in a lot of accounts. And I think you had written something, and I'd reached out and said, you nailed it,
that's what it is that they're doing.
They're embedding in the back end.
And so if you're an embedded BI tool, you don't have to be as savvy on the front end.
It's a different kind of audience often.
It's not necessarily the drag and drop data discovery audience.
It's, hey, we just need some reports and a little bit of visualizations.
And they can shine brightly there with the big data in the SaaS world.
Yeah, definitely.
I mean, even in basic terms, Oracle, for example, if they're trying to sell into these
companies, you'd never have Windows running there, and their metadata admin
tool runs on Windows. And I think to say
they're in the right place at the right time is slightly uncharitable, because I think
what they've done is they've been in the right place at the right time with a product that chimes with an audience
with a problem, and they've executed flawlessly.
And I think, you know, and actually in the dealings I've had with them,
they're such a nice bunch of people as well.
I mean, I deal with them quite a lot in my day job,
and they've been lucky, I think, in some respect,
and they're certainly, in my view, a back-end player
inasmuch as the core innovation with Looker
was the ability to connect to these cloud data sources and deal with them in a way that was
reflective of how they work. So they're column store, they're in the cloud, they're whatever.
You know, they just got that. Lloyd Tabb, or whoever wrote it at the start, he
got that right, right at the start. But my concern, I suppose, is that what they have now is not necessarily
defensible as a product. You know, connections: there are companies out there that do connections through to sources, you know, Progress and so on.
And the front end is not particularly kind of innovative.
So I wonder how defensible it is.
But certainly at the moment, they've been dealt a good hand and they're playing it flawlessly.
Well, yeah, and they definitely get the customer experience too. So they're
winning on that side. Yeah, exactly. So let's move on then to the other topic I want to talk
about, which was kind of ML and AI. Now, you said earlier on that last year it was
data visualization, and this year it's machine learning and AI. So in your view, where do you see examples of that being used well in BI products?
And where are the good use cases you're seeing around AI within analytics and BI?
So starting, let's start at the beginning of the life cycle of even just connecting to data or
creating, say, a data catalog of information that you might have in your organization for things like GDPR. It's the ability for machine learning and some of these,
you know, whether it's the scanning and catalog tools, or it's the ETL tools that you're connecting
to and the data quality tools, where you connect to a data source and it'll auto suggest
different corrections, it'll suggest metadata joins,
automatically join information,
maybe even populate into a model.
Datorama has a really impressive one,
and I still haven't seen anything quite like it yet,
where all you have to do, for a marketing-type person,
and it's marketing, line-of-industry specific,
but all you do is connect to a data source
and it just creates everything in the model for you
and says, does this look right?
And you kind of like, wow, okay, yep, this looks good.
And you just click next.
So that part's really fascinating to me that I'm seeing a lot of improvements there.
The other piece is once it gets into the ETL process, things like CLAIRE, Informatica's CLAIRE, supposedly, and I say supposedly because I have not hands-on tested
CLAIRE. If you have things like data drift in your big data pipeline, or there are errors in ETL,
usually, as an administrator, and having built ETL packages, you'd get alerts on your phone and you'd
have to go manually fix these, or, you know, whatever the situation would be. What
Informatica is telling me is that CLAIRE can self-repair.
My husband does self-healing grid in energy, so it's self-healing ETL:
the whole concept that it can handle some of these changes and auto-fix them based on some of the other
changes that have been made, or some of the intelligence that it has: these are the
joins that it should be, or these are the fixes that should be made. And I'm like, okay, that's impressive,
because that's painful for that particular piece.
Once we get past ETL and you think about visualization,
there's a lot in that space right now
of, this is recommending the best visualizations,
and here are the insights that we found,
and here's why these changed.
And you can drill down into the changes with things like Tellius.
And I think even Yellowfin has a little bit of that too, where you can look and pick two points and get the
analysis of the changes, the influencers there. That's way cool. You think about the natural
language description: Narrative Science, there's another one, Automated Insights, there's actually
a couple of these in this space that will look at the dashboard. And I was very skeptical of this at first, and then I read some case studies.
I'm like, okay, I get it. And it was actually what Arjun from BeyondCore probably talked to you about
when you had your coffee with him. It's this context: not everybody can decipher a
dashboard. And so having the natural language go ahead and look at the full context of the dashboard and just spit it out in a few sentences, yeah, this is what this means, for all the people that don't, you know, or maybe they misinterpret it.
And it was USAA.
In the United States, it's one of the big military insurance companies.
And my husband was military, so we've used them forever.
And I saw their case study and I said, well, this is really impressive and this makes a lot of sense to me.
That a lot of folks that probably work at USAA do not know how to decipher the data on a dashboard.
So the natural language has been an interesting aspect there too. And when I think about the capabilities, when I used to show Power BI, there was this one feature where you could,
with Cortana Analytics, ask a question and get an answer. I'm seeing more of that. And once
you make it so easy to search data and just ask a question and get an answer in human language, and
it does all the hard stuff in the background, that is impressive to me. And
that, I think, will finally get us moving from, okay, it's still a bunch
of power users, to now anybody really can use BI, if you can ask a question or
you know the language. That part's very interesting to me. And then, of course, you look at
machine learning. I went to IBM World of Watson last year, I think about the time we talked, and the most interesting project I saw was not in the Watson area. It
was actually in the research area. It was called Project Farsight. It was automated feature
engineering, which is kind of the art, the Moneyball part of machine learning. They had figured out
a way to do some of that automation
of those types of things and then recommend to users,
or to users, recommend to the data scientists
building these algorithms,
how they can improve and what types of things.
And I'd never really seen it done that well before.
We'd seen products like KXEN,
if you remember that back in the day,
that would do some of this stuff.
But what I'm seeing now with DataRobot,
I'm immensely enamored with right now.
They do a bit of that.
They have some blueprints and they put together all these pieces.
So now you guide it.
You still have to prepare the data before it gets there.
But once you have prepared data, wow, the feature engineering part,
the putting through the blueprints and making
sure you're not making mistakes, with guardrails and, you know, sampling or unsampling, just all sorts
of best practices that are now automated in these tools, is phenomenal. And then they self-learn.
So that's the other thing. Even going to the very far end, now we have gadgets out
there, and they may not even be
connected to the cloud. There's a little company, a startup in France that I did a paper for,
called TellMePlus. I guess I like all the Tells. But TellMePlus has automated embedded
artificial intelligence at the edge. It can self-train, so it starts learning from, you
know, these little devices
and whatnot, whether it's on the energy grid or wherever you decide to have it.
Maybe you have it deployed on a website somewhere and it's learning the behavior,
the normal patterns on your website. It has been really impressive to see: well, wow, this AI can
teach itself. And you don't even think about it. You hear all this about deep learning
and stuff and, oh, robots are going to take over the world. It is getting a little creepy when they can teach themselves
and, you know, they'll be smarter than us in no time.
Yeah. I mean, I suppose one of the things
that's interesting, maybe, for the small vendors coming into the space is that if they're going to be
involved in machine learning, actual volumes of data are quite important. And there
was a good article, I think it was in Wired or one of the magazines recently, saying that,
you know, for the big vendors like, I suppose, Google and Microsoft
and so on, what they build can be commoditized in terms of BI and analytics and so on, but the
thing that can't be commoditized is their machine learning skills and data sets, because you need
to have the kind of scale of data that, say, Google would have or IBM would have to be able
to have a data set big enough to actually do what you're trying to do. I suppose, to put it into
a question: do you find that these small vendors with kind of niche problem spaces struggle to
find the actual volume of data they need to make really predictive kinds of things out of it?
I think that's going to be the topic of the next five years or so.
There's going to have to probably be some legislation
and whatnot, too, around some of these data sources.
So, yeah, there's even a movie.
I don't know if it's made it over the pond yet.
The Circle, I think it was called.
Okay, I haven't seen it, no.
Oh, it's kind of creepy.
Essentially, I think it's a spoof on Google
having all your search data and everybody being transparent and, you know, them knowing everything about you and about
everybody.
And, ooh, it was a little creepy.
But, I mean, that is the reality of what we're dealing with right now.
But I do think, yes, it does make a difference who has all this data.
It's certainly garbage in, garbage out.
But think of it on a much grander scale.
And there are a few players that own, you know, if you think about Microsoft owning LinkedIn and all the business data and anything business related is pretty powerful.
And Amazon kind of dominating the retail world and now they're getting into pharmaceutical and they're taking over and disrupting all these industries.
It's the power of some of the data that they have that enables them in a digital disruption to disrupt.
Even to the point of actually,
for the tools themselves that you're building,
let's imagine you would use one of the tools
you mentioned before, like Yellowfin
or anything really where the customer themselves
is actually kind of running machine learning on their data.
Very few customers, to my mind,
have the volume of data they need to be able to do proper machine learning
and, I suppose, bring insights out of it. Their data sets are
too small. Not every customer has a big data set to work with, really,
but they all want to have the value of machine learning out of it.
Well, I can say, I mean,
certainly I've been doing this, it was called data mining, and
we called it predictive analytics or, if you want, advanced analytics, back in the day. You can
certainly create some of these algorithms with a thousand records. I mean, you don't need
to have massive data to find the patterns at all. Where this makes a difference is if you think about,
you know combining external data with internal data to really
create that magic of really accurately reflecting the business process and the decisions. That's
the key. And usually, you know, there's a demographic element to decision making,
there may be a financial element, there's going to be environmental as well as, you know, the
data that may be coming into a system that's just the person enters
or the customer data that you'll have.
And it's all that other information
that if you think about these companies have
on what you've been searching
and what you've been looking for
and, oh, this is your demographic profile
and this is how much you spent last year.
All this other information that say maybe Google
or a Microsoft or an Amazon has that these little companies don't have,
that's where they have the advantage. And where it's getting kind of interesting is
who owns the data. I think we're going to have questions on this, because now we're starting to
see, well, if you buy Office 365, now we can resell your data back as workforce analytics
or this and that. I was like, well, wasn't that our data to begin with? And now we're
paying you again to monetize our data. It's going to be very interesting, I think,
those kinds of topics. And hey, does anybody really have a fair shot to try
to compete with this? And how do they compete with somebody if it's a digital data world and
they have all the data? How do you break into that? I don't know.
Yeah, I mean, it's instructive what these places have, because I have this kind of
long-running personal project to gather in all my data and analyze it and so on. And so, you know,
LinkedIn, for example, is very hard to get data from automatically. There's no API you can call
to bring that in. You know, you can get things like tweets from Twitter and so on. I bring in
things like Facebook data by doing an export, and it brings back every interaction you've got there,
which is interesting. But yeah, I mean, I suppose there's a privacy element to it, but
I suppose LinkedIn was also an interesting purchase. You know, LinkedIn was a bit of a joke within the IT industry
in terms of, I suppose, as professionals,
we see it in the same way as recruiters,
but that work social graph that they have there
is very interesting.
And also, particularly as companies move into the cloud
and things like Active Directory are no longer relevant,
you can almost imagine LinkedIn being the Active Directory for businesses in the cloud in the future,
really.
Oh, yeah, yeah. You know, think about just the power of knowing all the
different people in an organization and understanding, you know, all their jobs, their
careers, their history. I mean, people are the most important asset in an organization, really.
It's people and data.
And there's a lot of knowledge that they have
by that acquisition.
It was, there's a reason they paid a lot of money for it.
Yeah, especially as you, I mean, you self-maintain that.
I mean, as you move from company to company,
you update it yourself.
And so not only does it have the networks
within your organization now, but it actually has how you link back to other companies you've worked
at, and it's a fantastically valuable resource, really. So I think it was an
interesting purchase, really, by Microsoft. I mean interesting in a good way, really.
Yeah. So think about one of the things: you go through your non-compete in a company,
but then you always have your NDA.
And part of the NDA that usually you will have, and I get this, and people will call
and they'll ask these financial folks that they want to do these briefings to get a pulse
on the industry.
And they'll ask inappropriate questions they're not allowed to ask.
And, you know, you just have to shut them down.
But one of the most popular ones they'll ask is, well, what was the org structure like in that company?
How did they design engineering, or how were decisions made on products?
You're like, I'm not allowed to answer that question.
But Microsoft has that information.
If a group is organizing and putting that in, they know exactly how companies have engineered and designed their org structures.
Yeah. And typically that's NDA information.
Yeah, interesting, isn't it? I mean, so you mentioned
DataRobot there. So what's DataRobot, and what are they doing in this space? It's interesting.
Yeah, so it's funny. Stalking is definitely not the right term, but when I really get super interested, I'm like, please, please, please brief me.
I really want to see this. I'm so excited about this tech.
I called them and I emailed them and I'd stop over at their booth at events, and they just would not talk to me. It wasn't until this year, when I ran into them at Strata, that we really connected,
because I actually have, you know, a couple of years of data science training, post-grad,
and we just connected. So what they do is they take machine learning, data mining, whatever
you want to call this, for creating algorithms. And you still need to feed it what I would call a flattened, properly prepared machine learning data set.
Same thing with Watson Analytics and a lot of these other tools that automate machine learning.
You have to feed it in the right information.
Nobody's solved that yet.
It'll be interesting.
I don't know if it'll ever completely be something that could be automated.
I think there'll always be a human involved in that process.
But once they get the data, they can run through thousands.
They can scale this massively parallel to generate and compare and run thousands of algorithms, and experiment with some feature engineering, some low-level feature engineering. But they
automate that machine learning process of finding the absolute best model and the best, you know,
lift for whatever the metric is you're trying to optimize. And they've built into
the solution, you know, what would be templates per se, if you're doing churn or if you're doing, you know, different types of ways to model.
They've built in these, they call them blueprints.
And then they've also done some, and they like to call them safeguards or guardrails, for, you know, common mistakes.
Maybe I have a very unbalanced data set, or I haven't treated my data, because normally you need to normalize a lot of the variables
because these things at the very core, the algorithms use statistical,
a foundation of statistics to generate predictions.
And they will do a lot of that for you.
And that's pretty time-consuming work.
So if you think about the process, maybe it
took three months before to design a good algorithm, you know, and be able to compare and
manually run and look at all these models. They can run thousands of models in a few minutes,
and it's just amazing. So they've grown leaps and bounds, I think.
It's been interesting. When I talked to my contact there,
he's funny, he's like, our hair's on fire.
I'm like, okay, I get it.
Just let me know, yay or nay, are we going with this
or not going with this?
But yeah, it's impressive, it's fun,
and it's good stuff.
It's fun stuff.
So you mentioned that whole kind of preparing data thing
at the start there, and I recently blogged about Google's new product in the area, Google
Cloud Dataprep, which is the Trifacta technology.
Yeah, yeah, yeah.
And I've just been
talking to them, actually, before I spoke to you about coming on the show. But
that's an interesting product, and Trifacta, presumably you've
known them for a while. So what's your take on them? And also, what's happened to our friends
at Paxata? They were the prettiest girl at the ball, you know,
a year ago. Everyone was talking about them.
Yes, they were.
I've heard nothing from them
since then.
I know, because they are not working with me. It's funny, yeah. I'm mad at them right now. It's funny.
So I did.
I fell in love with Paxata.
I thought it was so beautiful.
And I found out that, you know, some group invested in them.
And I'm like, and they wouldn't even do like a blog with me.
I'm like, come on, do at least like one little thing with me.
They were funny.
So I don't know.
They went out.
They fell off the radar.
They completely fell off the radar for the most part.
And who I have seen is, and maybe they're working in stealth mode.
You never know what's going on.
But they definitely got a bunch of money.
I know some people moved around because one of the people did work with me.
He went to Connecticut.
He's like, oh my gosh, I can't believe what happened after you blogged. I can't believe it either. But it's been funny. So I
know some people have moved around, but I haven't heard a lick from them. I haven't
seen them, really, at conferences. It's been very interesting, like, what happened to these guys? But
Trifacta I've seen everywhere.
And they've been around for a while, haven't they?
Yeah, they've been around for a while.
And I've reviewed them a few times.
And, you know, I had some recommendations here and there.
I didn't love them as much as I loved Paxata, but I liked what they've had with being able to recommend sources or recommend joins. And this is what other people are doing for a business.
These are preparing data.
It was very valuable.
And I think those guys,
they got in with Google,
and they've certainly gotten word out there, and I've seen them at the right conferences.
You asked earlier
what makes people magical: being at the
right conferences and having announcements
and getting that buzz. They've done a really nice job this year of doing that.
Yeah, exactly, exactly.
And the tool is very good. It solves that problem with BigQuery and all that. I suppose
moving data around on the Google platform suddenly gets much more complicated. It's
Cloud Dataflow, it's whatever, and there is
no obvious kind of ETL tool on there that you would use. So Cloud Dataprep came along,
it's got scheduling now, and I use it at home for stuff that I do and have used it at work a few times. And,
let's be frank, it's probably not the most innovative take on
data prep, and I'm sure Trifacta would say that as well. No, I mean that in a kind way; they've probably got other things they're doing that are kind of innovative. But it
did the job well, and it integrates with BigQuery, so you can read from BigQuery and write to BigQuery.
It just does the job really well.
Oh, that's nice. Yeah, I did like it. So yeah, they've
done some things well this year, and they've executed. I mean, there are
a lot of things when you talk about putting together a product. It's not just the product itself. It's everything from that website
to, you know, sales and marketing execution. And they've done that this year. They've executed,
whether it's on the sales side, the partner relationship side, whatever side it is,
they're certainly doing it. The other one in the space, you know, and I think about it and I look
at Amazon. So Amazon released Glue, right?
Glue, right, yeah. What's your take on that? What is that?
I was disappointed with it, although I have to
keep in mind the way Amazon will work is they do a minimum viable product and they just kind of throw
it out there. At first, to me, it just looked like a Spark Python scripting kind of engine, and
then all of a sudden now you're seeing a little bit of a data catalog eking out of it.
And you're like, oh, okay.
So it's kind of expanding now.
But certainly it wasn't when I think about,
when I think about Trifacta or something like that,
it was nowhere near the same, even user experience.
It even seemed like a completely different market.
And I asked them, I said,
is this just for like Python scripters
and people maybe building software that would be interested in this?
Because a business user isn't going to use this.
And it's interesting.
So when you think about them on the bigger scale, I think Microsoft has more businessy apps.
And that's what makes them, you know, really shine brightly.
And Amazon caters more to people building software, kind of the devs, the app devs.
And Glue is more app dev-ish. I wouldn't
point Mr. Underwood to it.
Yeah. I mean, the other thing that's interesting with Amazon is it's
impossible to get them to come on here. I've reached out to various people I've known vaguely,
and it gets closed down every time. I'd love to have someone come on the show and talk about Glue
or talk about QuickSight or anything at all, but they don't kind of, I suppose, reach out. I mean, I guess maybe they would reach out
to some people, but certainly not to me. And it'd be good to understand a bit more detail
behind some of their products, you know, what's the take on some of the things
they've brought out recently, and what's the vision on this, and so on. But it's quite hard to
get anybody to talk about it outside of Amazon, really.
I was going to say, what I do know that's different, and I hope this changes with them,
and I would say this is the mistake that IBM makes, it's the mistake that Amazon makes,
maybe even the mistake Google makes, it's hard to say, I don't know. But one thing that Microsoft does
really well is that everybody in the whole company is an evangelist. You blog and you
put up videos and you talk and you go to events. Amazon is very structured, and I've had people
talk to me and tell me before, oh no, everything has to go through the right channel. And I even
saw some people on the event thing this week saying, you know, hey, they wanted to talk to
an amazon employee on a camera at the event, and they absolutely shut that down immediately. And thinking, you know, it's the same thing with
with IBM, it's like, you can't seem to get these folks to engage with you. And it's just the way
that Salesforce and some of these companies operate, where they're very, very strict. And
at some point, when they get out-marketed by, you know, some of the competitors,
customers that are saying, hey, everybody evangelize,
then maybe they'll have to scratch their heads
and be like, well, maybe we can't control the message exactly. We just need to use
some of our folks to help spread the message.
Well, just give us a recap again: website
URL, what's on there, and what are some of the things you plan to talk about in the near future? Give us a flavor of what's coming down the line, really,
from Jen and the website in the future.
Yeah, it's that time of year where you self-reflect
and you think about the next year.
So my website is jenunderwood.com,
and each month I do a recap of the industry,
you know, updates and my take on it,
and what I think is marketing fluff and what
I think is really cool stuff. Now, when I start looking to next year, I'll probably be doing more
interactive reviews. One of the things that I've done a little bit too much of is sponsored blogs
this year. I'm going to actually reduce a bit of that so that I can get back to writing about what
I want to write about. So expect more natural reviews and
probably a little less sponsored rah-rahs. I had a lot of that this year, and next year is
going to be a little bit different. I do want to dig more into products next year, so I'll probably
dig more in, and I'll be doing a lot around machine learning again, this machine learning, AI
space, maybe even some Internet of Things. I've got a Raspberry Pi here
sitting now almost two years, so it's probably already outdated, but I have to figure out what
the latest Raspberry Pi or whatever that would be so that I'm not falling behind on gadgets.
My advice to you is don't get into IoT at home. I literally spend all of my spare time on that.
And, I'll make you laugh, I've got about four Raspberry Pis, I've got literally about eight
Alexas around the house, plus Google Home, plus all those things linked together. Everything
is IoT, and everything runs into a big data lake and I do analytics on it. The problem is
two things. One is it is endlessly interesting, you know, and the second one is nothing ever works in
the house, because everything is controlled by kind of ML, analytics and so on, and the lights
come on during the night and the heating goes off
and all that as it makes decisions and so on.
So, it's hilarious.
I know, I know, I know.
But it's a great kind of, I suppose, I mean, you know, you mentioned, like, TellMePlus, Predictive Objects and GE Predix, the things that you talked about.
Yeah.
So there's interesting stuff happening there as well.
But anyway, look,
I'll let you go now, but it's been fantastic speaking to you. What I'll do is
I'll put links to your articles that we've talked about here in the show notes, and,
you know, other than that, it's been great to catch up with you, Jen. We'll speak,
hopefully, in the new year, and best of luck and best wishes to you and your family.
Oh, thank you. It's always enjoyable chatting with you.
Love your perspective.
And I love your blogs too.
Cheers.
Okay, thanks, Jen.
Thanks. Thank you.