Drill to Detail - Drill to Detail Ep.19 'Elasticsearch, Elastic Stack & Elastic Cloud' With Special Guest Mark Walkom
Episode Date: February 21, 2017
Mark Rittman is joined by Elastic's Mark Walkom to talk about Elasticsearch, Kibana, Logstash and the Elastic Stack; business models built around an open-source software core; and their move into cloud services with Elastic Cloud
Transcript
So hello and welcome to another episode of Drill to Detail and I'm your host Mark Rittman.
So today I'm very pleased to be joined by Mark Walkom from Elastic who's speaking to us all the way from Australia.
So Mark, welcome to the show and why don't you introduce yourself and tell us how you got involved with Elastic.
Yeah, good day. Thanks Mark.
So I'm a solution architect with Elastic.
My story goes back nearly five years now when I started using Elasticsearch for analysing mail logs
and pulling business insights out of those.
It was the first time I'd ever come across Elasticsearch
and I was completely enamored by this piece of software that was so simple to use but
it was so powerful once you started playing around with things. And it was the first distributed,
the first proper distributed system I'd ever used and as an infrastructure engineer at the time
I was just blown away and it was an awesome introduction into distributed systems and the
problems that they pose and I guess I could see a bit of a you know the future this is where we
need to go to be able to scale to cope with the data I mean at the time I was dealing with around
a billion events a month so it was pretty high scale and that was really interesting.
From there, yeah, so from there I
joined, well actually I started the Elasticsearch as it was then, a user group in Sydney
because it was just an amazing piece of software and I was really enjoying it and I'm a big open
source fan so I wanted to give back to the community in a way that I could and then
you know I think it was about a year after that Elasticsearch as the organization was known then
started to expand and they started to move into Australia. Previously there was
I think there's only about 50 and so I started interviewing with them and I
originally joined as a support engineer because they needed the resources down
here for our Follow the World support, sorry Follow the Sun support and so I
jumped at the chance.
I've been keeping an eye on the organization.
They were doing some awesome things with the products, so technically.
But they also had a really awesome culture, and that really appealed to me.
And it was an opportunity for me to work directly with open source,
whereas previously I'd sort of – I used a lot of open source in my job.
For me, it seemed like an amazing opportunity to join a purely open source organization.
From there, I started.
I was the first hire in Australia and New Zealand.
I think I was the first hire south of the equator, from what I understand, which is kind of crazy to think about.
And when I joined, there was about 100 people.
Now we have something like 460, and we're known as Elastic now, we had a name change.
And that's sort of a brief view of where we've gone. And I've moved from a support
engineer into a solution architect as we've gained more resources within the region and
we needed the technical skills to help our sales people.
It's an interesting company, isn't it? I mean, the
whole story behind Elastic is interesting, and the product itself. I mean, the reason I wanted to speak
to you guys was it's a product you see used all the time. It's almost viral in the way that
you see it being used in organizations, I guess because it's so easy to actually use and install.
And I use it myself to capture IoT events and so on, but actually it's very good technology as well.
I mean, tell us a bit about the story. How was Elastic formed, and what was
the founding story really of the company?
It's a funny story, actually. So I
think something like 10 or 12 years ago our technical founder started this
product called Compass.
Well, actually, he didn't start that. He was in between jobs and his wife was a chef.
And he wanted to make her a search engine for recipes and ingredients
and all these different sorts of things.
So he thought, well, I'm going to build a search engine.
So he built this.
He just sat down and started building.
He started learning Java.
He started learning Lucene.
And he came out with this thing called Compass.
And that was pretty good.
At the time, it worked pretty well.
And, you know, he sort of things ebbed and flowed.
And he sort of started, played with the product a bit more.
And I guess it's interesting.
I think he sort of saw where things were going even 10 years ago,
the need for search on a distributed scale.
And I think actually it was the seven-year anniversary
of the very first release of Elasticsearch last week.
So 10 years ago, he started the original idea of where
he wanted to take it. And he said, well, I want to take this to the next level. He wanted to turn
it into a new product and essentially a different concept. And so Shay created Elasticsearch and
released that as open source. And so he was playing around with that for
about three years and he decided it's time to go to the next level again and actually form an
organization around this. So Shay Banon, Steven Schuurman, Uri Boness and Simon, I can't remember
Simon's last name, sorry. But they all joined together.
They joined forces and they formed the Elasticsearch organisation.
And they started with just four people.
And it was a typical start-up thing where for a while there,
everyone was working out of a single house, including the garage.
So, you know, they'd have stand-ups in the garage every day.
And it's exploded since
then. We're in Amsterdam, we're a Dutch company, and
Amsterdam is essentially where our primary office is. We've got a large
presence in San Francisco, and we've got offices in New York, Japan, Sydney, Berlin and
London. There's about 30 countries that we're in at the moment.
Yeah, yeah, definitely. I mean, I'd like to kind of go on to a bit about how you've managed to,
I suppose, commercialize, but keep it open source as well. And that's an interesting kind of model
there. But I mean, something that I think is worth us understanding at the start is, you know,
you mentioned Lucene, and obviously Hadoop had a kind of element of search there as well.
How did what was done with Elasticsearch, I suppose, how does it differ
from Hadoop and Lucene and so on? What was the particular problem that you solved
in an innovative way, really?
Yeah, okay. So I don't know where Hadoop was ten
years ago, to be honest with you. But what we're seeing now is Elasticsearch is a fantastic real-time search engine and
it's a real-time analytics engine. So the key word there is real-time. So
it's the ability to spin up a cluster that can cope with, you know, we have
customers doing terabytes a day, so billions and billions of events. And the ability to take that data
and run real-time analytics on it,
so instead of having this data sitting there
and having to build Excel spreadsheets with pivot tables
and all sorts of crazy graphs,
instead of having to run some sort of analysis job
and then come back the next day,
so something like what would traditionally be a SQL query
with a SQL job that you have to wait overnight for.
You can do that in real time or near real time.
So we're talking seconds as opposed to hours or minutes.
And Hadoop is a fantastic engine for storing data.
It doesn't have that real-time aspect.
Even with things like Spark with micro-batching, there's still delays in there. So we see ourselves as complementary.
We have integrations with Hadoop. So we're seeing a lot of deployments where Elasticsearch
is that real-time engine. And then things go into Hadoop as a long-term store and for
more complicated jobs. So you can't do everything with Elasticsearch. It doesn't have everything in there. You know, there's a lot of
really, really advanced stuff that we're seeing people doing
that it's just not a fit for Elasticsearch. So
these people are doing that in Hadoop, which is fantastic. So they're
writing our jobs with MapReduce and Spark and
all this sort of thing. So we see ourselves sort of that real-time
engine and then pushing it into Hadoop for that the long-term archival storage or complex jobs
that need to run overnight, for example.
Yeah, definitely. I mean, I know on projects I've worked
on in the past we actually replaced, say, using Hive with regex, for example,
with Elasticsearch. I think in that particular instance, the
integration with Hadoop and the complementary nature has been useful.
Yeah, definitely. And
there are some search engines or some aspects of what we do within the Hadoop ecosystem. I guess
we're finding a lot of adoption just in general. And you mentioned this before, the crazy adoption
growth that we're seeing,
is because it's easy to get started.
So the barrier to entry is, well, it's crazy low.
I mean, we've done a few changes recently which make it not as easy,
but that's because we care about the stability,
the long-term of people's data.
So there's a little bit of upfront pain now, I suppose,
but it's in the best interest of where you're going to after this because we see people roll out Elasticsearch and they go,
well, I'll just start with a single node, and then it turns into three nodes,
and then it turns into, you know, honestly,
we've got deployments of hundreds of nodes out there.
So it's just that natural growth because people are like, oh, wow,
look what we can do with real-time access.
You know, look at the business decisions, look at the operational decisions,
all that sort of stuff.
And so that's sort of why I think we see that growth and that, well,
I hate to use the word, but that synergy with Hadoop.
Yeah.
Definitely.
I think the other aspect that's been useful for me when I've looked at it
is the kind of faceted nature of the search.
I mean, the fact that you can, is it the case that you can search across all attributes in the domain, that sort of thing?
I think that particularly, that ability to search across everything is something that you, I suppose, coming from a SQL background, you wouldn't necessarily kind of appreciate.
But it's quite, I suppose, revolutionary in how it works. Is that a key part as well?
Yeah, I think so.
I think that ability to, you know, by default when you search, you search everything.
So if you search for just a word like Australia, then you search every document.
And the whole thing about it is the relevance, you know.
So you're not just getting back that literally, well, here's every document.
It's doing some calculations in the engine itself to give you things that are relevant.
So I think that's one of the key features.
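As an illustration of the point about relevance, here is a toy sketch of "search every field, then rank by relevance". It uses a simple TF-IDF-style score written from scratch; it is not Lucene's or Elasticsearch's actual scoring formula, and the documents and the `search` helper are hypothetical.

```python
import math
from collections import Counter

# Hypothetical documents; every field is searched, mirroring "search everything".
DOCS = [
    {"title": "Sydney user group", "body": "Elasticsearch meetup in Sydney Australia"},
    {"title": "Australia travel", "body": "Australia hotels and Australia flights"},
    {"title": "Recipe search", "body": "Ingredients and recipes"},
]


def tokens(doc):
    """Lower-case and split the text of every field in the document."""
    return " ".join(doc.values()).lower().split()


def search(term, docs):
    """Return (score, title) pairs for matching docs, most relevant first."""
    term = term.lower()
    tokenized = [tokens(d) for d in docs]
    matching = sum(1 for t in tokenized if term in t)
    if matching == 0:
        return []
    # Rarer terms get a higher weight (inverse document frequency).
    idf = math.log(1 + len(docs) / matching)
    results = []
    for doc, toks in zip(docs, tokenized):
        tf = Counter(toks)[term] / len(toks)  # term frequency, length-normalised
        if tf > 0:
            results.append((tf * idf, doc["title"]))
    return sorted(results, reverse=True)


for score, title in search("Australia", DOCS):
    print(f"{score:.3f}  {title}")
```

The document that mentions the term more densely ranks first, and documents that do not mention it are left out entirely, which is the difference between relevance ranking and simply returning every match.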
But yeah, not only can you do that, you can do faceted search.
So you can start breaking things up into different categories.
And we're seeing people, various different travel agencies or car sales or house sales, and they're using it to provide very simple to use
but highly sophisticated bucketing or faceting of information, so that when people want to use
a website they don't have to start building a massive filter. They just go, oh, I'll
tick a box. Oh, I only want hotels, five star, within four kilometers of where I'm at, that sort of thing.
And it's amazing, that sort of functionality. I mean, you could probably do it with big but without search it's just a couple of API calls. And I think, you know,
again, that sort of drives that adoption and that ease.
It's interesting, when I said that to you, you
almost kind of sound like it was just obvious, really. But I think certainly, you know,
faceted search, when you come across it,
when you've not seen it before, is very powerful really.
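To make the "five star hotels within four kilometers" example concrete, here is a minimal sketch of what such a request body might look like in the Elasticsearch query DSL, built as a plain Python dictionary. The field names (`stars`, `location`, `suburb`) and the helper function are assumptions for illustration; a real index would need a matching mapping (for example a `geo_point` type for `location`), and nothing here is sent to a cluster.

```python
import json


def hotel_search_body(stars, distance_km, lat, lon):
    """Build a search request body: filter the hotels, then bucket the hits."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match on the (hypothetical) star rating field.
                    {"term": {"stars": stars}},
                    # Keep only documents within the given radius of a point.
                    {"geo_distance": {
                        "distance": f"{distance_km}km",
                        "location": {"lat": lat, "lon": lon},
                    }},
                ]
            }
        },
        # A terms aggregation gives the facet counts shown next to each tick box.
        "aggs": {
            "by_suburb": {"terms": {"field": "suburb"}},
        },
    }


body = hotel_search_body(stars=5, distance_km=4, lat=-33.87, lon=151.21)
print(json.dumps(body, indent=2))
```

The filters narrow the result set and the aggregation returns a count per bucket in the same response, which is what lets a website render tick boxes with numbers beside them from a single call.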
And then you guys added like Kibana and Logstash.
And so what was the story around that really?
How did that come about and what role did they play?
Yeah, so Kibana, I'm just trying to remember my history.
I think Logstash was the first product.
So Logstash was originally just a tool for ingesting specifically logs,
so hence the name Logstash.
And it was built by a frustrated system administrator, Jordan Sissel,
and he just wanted a tool that he could do multiple things with,
so he could parse different types of logs
and do regex-type translations and enhancements
and all sorts of really good stuff,
but without having to know the syntax for regex
because it's pretty painful even if you do know it.
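The idea of parsing logs without writing raw regex can be sketched as follows. This is a toy imitation of the `%{PATTERN:field}` placeholder style used by Logstash's grok filter, written in plain Python; the pattern table, the `parse` helper and the sample log line are all hypothetical, and the patterns are far simpler than the ones Logstash ships with.

```python
import re

# Hypothetical named patterns, loosely in the spirit of grok's pattern library.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+",
    "PATH": r"/\S*",
}


def compile_template(template):
    """Turn '%{NAME:field}' placeholders into named regex groups."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        return f"(?P<{field}>{PATTERNS[name]})"
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, template))


def parse(template, line):
    """Return a dict of extracted fields, or None if the line does not match."""
    m = compile_template(template).match(line)
    return m.groupdict() if m else None


TEMPLATE = "%{IP:client} %{WORD:method} %{PATH:path} %{NUMBER:status}"
print(parse(TEMPLATE, "203.0.113.7 GET /index.html 200"))
```

The person writing the template only names the pieces they want; the regex details stay hidden in the pattern table, which is the convenience being described.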
And so he started that, again, as an open-source thing,
and he was just like, well, I've got all this data.
I can process it, but where do I put it? And he came across Elasticsearch and it just fit. It
was just a natural fit. And so we approached Jordan, the company approached him,
and we said, hey, we want to hire you, we want to bring Logstash on as a fully supported and feature-rich application.
And so Jordan joined probably, I'm guessing here,
but it's a good three years ago.
And then once this data was in Elasticsearch,
people started putting more and more in Elasticsearch.
And so this guy named Rashid Khan, he said,
well, I've got data in Elasticsearch.
I want to view it because Elasticsearch, it's all API driven and it's all JSON.
But it's hard to analyze JSON text, essentially.
So he started building this thing called Kibana.
And it went through a few iterations.
And again, we saw adoption go crazy because, again,
it just made things even easier to use on top of Elasticsearch.
And so we said, hey, Rashid, come and join us, and so that happened as well. So we've ended up with this
search engine with the ability to return results within microseconds,
an ingest tool that lets you take data from various different sources and translate that and enhance that and add value to
it, and a graphical front end that lets you build out all sorts of amazing visualizations
literally just through your browser. So it's turned into a pretty amazing, well, we call it the
Elastic Stack now, so it's a pretty amazing stack.
Yeah, and I use it myself as well. I mean, it's very good.
So obviously all this was open source, but
you're a company now that's got employees and revenue and so on. So how does that work?
How does the commercial open source model work, really, for you?
Yeah, so we have four core products that are open source.
People are starting to refer to this sort of a model as actually open core as opposed to open source,
which is probably a whole other discussion we can have.
So we have Elasticsearch, which is the original product that we had.
Then we had Logstash, Kibana,
and we also have a family of functionality
that we've collectively referred to as Beats.
And they're just like the shippers for different sorts of information.
Yeah.
So that's our open source platform there.
Not to mention, you know, that we've got a whole bunch
of different integrations like the Hadoop connector
and then a whole bunch of other clients and things like that.
And so on top of that, we have the commercial offerings,
primarily what we refer to as X-Pack.
So it is extensions to Elasticsearch and Kibana at this stage.
And they add in, I guess you could call them enterprise features.
And additionally to that, we also have hosted Elasticsearch and Kibana.
So originally, we started with a monitoring plugin
that was called Marvel.
And that was for monitoring your Elasticsearch cluster.
So the ability for you to sort of see what's going on,
you know, my search is slow,
have I seen a big increase in queries,
that sort of thing.
And then we added a security plugin to that
and that had things like TLS encryption between nodes in the
cluster and to the cluster, so the request you were making was secured, and then added things like
Active Directory and LDAP integration.
So Mark, I noticed there's a product on
the website called Prelert, about predictive analytics. What's that about, then? How
did that come about?
Yeah, sure. So we've been adding commercial products into what we've called our X-Pack, which is a collection
of functionality.
And we've sort of been looking at, well, what's next for Elasticsearch and what's next for
the data that people are holding?
And one of the things that we've noticed is that, I mean, data never gets smaller.
The big data, people have started talking about data lakes,
and it's getting to the point where now people are going,
well, it's data oceans, and I don't know what comes after an ocean
because maybe it's an ocean world, something like that.
I don't know.
But we saw the need to automate even further.
So instead of someone sitting there and watching dashboards,
we need a way for the system to tell us when there's something wrong
and learn from the data that's coming through.
So I think it was the very first Elastic{ON},
or the second Elastic{ON}, I'm not really sure on which one,
but there was a vendor there called Prelert,
and Shay Banon, our technical founder, and he's now our CEO,
walked over and just started a conversation.
And this company had built a way to run mathematical models on data
to then build thresholds and further models of what you're doing to be
able to discover when things are anomalous and to build behaviors.
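As a rough illustration of "build thresholds from the data and flag what is anomalous", here is a minimal sketch using a rolling mean and standard deviation. This is a generic z-score technique chosen for illustration, not the vendor's actual models, which the conversation does not describe; the function name, window size, threshold and sample data are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Flag points more than `threshold` standard deviations away from
    the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip perfectly flat history, where a z-score is undefined.
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append((i, values[i]))
    return anomalies


# Hypothetical event counts per minute, with one obvious spike.
events_per_minute = [100, 102, 98, 101, 99, 100, 103, 97, 250, 101]
print(find_anomalies(events_per_minute))
```

The threshold is learned from recent history rather than set by hand, so nobody has to sit watching a dashboard to notice the spike.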
And they had an integration into Elasticsearch, which was obviously the key point.
And so there's obviously a whole bunch of talks behind closed doors.
And so Prelert joined us, I think it was in October or November last year.
And since then, we've been pretty crazy busy trying to rebrand that.
So it was previously Prelert.
It's going to be called Machine Learning.
And it's actually going to be integrated into our X-Pack functionality.
Okay.
I mean, you mentioned Elastic{ON} there and so on.
I mean, I guess
Elastic has done very well. It's very successful and it seems to have kept its, I
suppose, its roots and its credibility and so on. I mean, is the community model you've got, the way
that you work with the community and the open source community, is that key to it,
really?
Yeah, I think so. I think there's definitely been a change in the market where businesses, I mean, pretty much there's been reports done where it's 90 plus percent of businesses use open source.
And I think the successful open source company has to be able to engage their community and their users and make sure that their open source products
are compelling enough to use,
but then also balance, and this is the tricky thing,
balance existing.
And I think we've definitely, it is a tough road
and we're always constantly adjusting to different bits and pieces and the community and the market and all these sorts of things and
even internally and but yeah look I think ultimately the community has
really driven the growth of Elasticsearch and Logstash and Kibana and and the
the functionality that we're constantly putting back into the open source.
Because we don't, we've basically stated that we will never release something into open
source and then pull it back out.
So we're not, you know, we don't have enterprise versions where, you know, if the open source
has this version and then, oh well well, if you pay us money,
you'll get a different version with some added features
on top of Elasticsearch.
We will put features into Elasticsearch consistently.
And then if you want some of those other features,
then that's part of the X-Pack,
but there's a clear delineation.
And I think that's also pretty important
for us as an organization,
but also ultimately so that the open source users
and our community know where they stand.
And that's also...
So tell us about cloud.
I mean, everybody coming onto this at some point
always wants to talk about cloud
because it's the interesting thing.
And you guys have done Elastic Cloud,
which I think is, you know, your stuff on AWS
and so on there. I mean, tell us, to the extent you can, what is the strategy
around cloud, and what's the benefit to developers, really, of running it on there?
Yeah, sure.
So historically you would build and manage your own Elasticsearch cluster,
and we've seen some other entities start up their own hosted Elasticsearch.
And you know, there's a push these days, like everything's in the cloud and the latest thing
is serverless.
So, you know, it's essentially services everywhere to handle everything.
And we saw that that was a gap in our commercial offering
and the ability to make sure that we can offer a service
that customers have faith in
because we're the creators of these products
so we definitely know how to host them
and provide these sorts of services.
And so a couple of years ago,
there was a bunch of Norwegian,
I think there was, I can't remember how many there was, but there was a bunch of Norwegians that had this service called Found.
And we spent a lot of time checking the market and we found them to be up to our own standards.
And so they were also brought on into the Elastic team and hence Elastic Cloud was formed and so
from there we've
obviously rebranded and we've grown the team from a handful to, I think there's probably 20 or 30 in the team now.
It's pretty large. So we're investing a lot of time back into that service. And also, interestingly, we're actually taking the
underlying mechanisms for managing that cluster, the automation, the deployment models, all
the upgrades and even things like snapshots, and we're turning that into what we're calling Elastic
Cloud Enterprise. So the ability to take all that automation, deploy it in your own cloud or in the public cloud, in your own VPC or virtual
network, and then get the same functionality.
So API-driven, web-based, and you don't need to worry about, oh, if this Elasticsearch
node has had a problem because the underlying host, you know, the AWS host has gone down,
do I need to start up a new one?
Elastic Cloud Enterprise and Elastic
Cloud handles that. It's all automated. And so we're sort of seeing that the server, the
infrastructure as a service or platform as a service really...
So where would somebody... So if somebody hadn't... If there's one person in the world
that hadn't heard of Elasticsearch and used it and so on, how would somebody get started
with the technology? Where would they go to and where would they download things from
and so on, really?
Yeah, sure.
So the best place to start is to head to our website,
which is just elastic.co.
And the joke there is that we couldn't afford the M for elastic.com,
but it's actually a nice URL.
And so you can head to our documentation, which is pretty comprehensive.
So if you want to start looking at something for logging using
the Elastic Stack, then we have a rundown on how to install everything, how to
integrate everything. If you're more of a search focus, then we have a thing
called the Elasticsearch Definitive Guide, and that basically is, from
first principles, how to install Elasticsearch, how to put data in there, how to do searching and everything in between there.
And that's a phenomenal piece of resource that people can leverage.
And again, it's actually part of our open source offering because it's open source docs.
From there, if you've got questions, we have community forums, which is at discuss.elastic.co.
And the engineers that write the products hang out there as well. So you can get answers
from the community or also from us. On top of that we have training and so
again we have the engineers that are writing the product that are providing
this training, so we try to keep as close as possible to our users, if that's not
obvious.
And so that's worldwide.
So there's always training going on somewhere.
And if you still need help,
then we have commercial support where we can, again, we can have our own engineers helping you roll out your platforms
and make sure everything's in the best possible way.
Yeah, fantastic. And I guess the good thing is it's so easy to get started with, really. But then I
suppose the fact that you've got people like yourself there as well that can help with
the more complex integrations and so on, I mean, the fact you've got this open source model
linked to the commercial one there, and you guys are doing well, I think that's
a great testament to what you're doing, really.
Yeah, definitely. I don't think we're
going anywhere other than onwards and upwards. We're not just going to disappear in a
puff of smoke like some other products may have in the past. So yeah, definitely.
Excellent. Well look, Mark, thank you very much for doing the call anyway. I'm conscious you're
working at the moment over in Australia, but thank you very much for that, and it's been great to have you on.
Yeah, cheers. Thank you very much. Thanks for having me, Mark. It was a great time.