Drill to Detail - Drill to Detail Ep.19 'Elasticsearch, Elastic Stack & Elastic Cloud' With Special Guest Mark Walkom

Episode Date: February 21, 2017

Mark Rittman is joined by Elastic's Mark Walkom to talk about Elasticsearch, Kibana, Logstash and the Elastic Stack; business models built around an open-source software core; and their move into cloud services with Elastic Cloud.

Transcript
Starting point is 00:00:00 So hello and welcome to another episode of Drill to Detail, and I'm your host, Mark Rittman. So today I'm very pleased to be joined by Mark Walkom from Elastic, who's speaking to us all the way from Australia. So Mark, welcome to the show, and why don't you introduce yourself and tell us how you got involved with Elastic? Yeah, good day. Thanks, Mark. So I'm a solution architect with Elastic. My story goes back nearly five years now, when I started using Elasticsearch for analysing mail logs and pulling business insights out of those. It was the first time I'd ever come across Elasticsearch
Starting point is 00:00:46 and I was completely enamored by this piece of software that was so simple to use but so powerful once you started playing around with things. It was the first proper distributed system I'd ever used, and as an infrastructure engineer at the time I was just blown away. It was an awesome introduction to distributed systems and the problems that they pose, and I guess I could see a bit of the future: this is where we need to go to be able to scale to cope with the data. I mean, at the time I was dealing with around a billion events a month, so it was pretty high scale, and that was really interesting. From there, yeah, so from there I
Starting point is 00:01:32 joined, well, actually I started the Elasticsearch, as it was then, user group in Sydney, because it was just an amazing piece of software and I was really enjoying it. I'm a big open source fan, so I wanted to give back to the community in a way that I could. And then, you know, I think it was about a year after that, Elasticsearch, as the organization was known then, started to expand and they started to move into Australia. Previously, I think, there were only about 50. And so I started interviewing with them, and I originally joined as a support engineer because they needed the resources down here for our Follow the World support, sorry, Follow the Sun support, and so I
Starting point is 00:02:24 jumped at the chance. I'd been keeping an eye on the organization. They were doing some awesome things with the products, technically, but they also had a really awesome culture, and that really appealed to me. And it was an opportunity for me to work directly with open source, whereas previously I'd just used a lot of open source in my job. For me, it seemed like an amazing opportunity to join a purely open source organization. From there, I started.
Starting point is 00:02:52 I was the first hire in Australia and New Zealand. I think I was the first hire south of the equator, from what I understand, which is kind of crazy to think about. And when I joined, there were about 100 people. Now we have something like 460, and we're known as Elastic now; we had a name change. So that's a brief view of where we've gone, and I've moved from a support engineer into a solution architect as we've gained more resources within the region and we needed the technical skills to help our sales people. So it's an interesting company, isn't it? I mean, the
Starting point is 00:03:29 whole story behind Elastic is interesting, and the product itself. I mean, the reason I wanted to speak to you guys was it's a product you see used all the time. It's almost viral in the way that you see it being used in organizations, I guess because it's so easy to actually use and install. I use it myself to capture IoT events and so on, but actually it's very good technology as well. So tell us a bit about the story. How was Elastic formed, and what was the founding story of the company? It's a funny story, actually. So I think something like 10 or 12 years ago our technical founder started this product called Compass.
Starting point is 00:04:07 Well, actually, he didn't start that. He was in between jobs and his wife was a chef. And he wanted to make her a search engine for recipes and ingredients and all these different sorts of things. So he thought, well, I'm going to build a search engine. So he built this. He just sat down and started building. He started learning Java. He started learning Lucene.
Starting point is 00:04:31 And he came out with this thing called Compass. And that was pretty good. At the time, it worked pretty well. And, you know, things sort of ebbed and flowed, and he played with the product a bit more. And I guess it's interesting. I think he sort of saw where things were going even 10 years ago, the need for search on a distributed scale.
Starting point is 00:04:56 And I think actually it was the seven-year anniversary of the very first release of Elasticsearch last week. So 10 years ago, he started the original idea of where he wanted to take it. And he said, well, I want to take this to the next level. He wanted to turn it into a new product and essentially a different concept. And so Shay created Elasticsearch and released that as open source. He was playing around with that for about three years, and then he decided it was time to go to the next level again and actually form an organization around this. So Shay Banon, Steven Schuurman, Uri Boness and Simon, I can't remember
Starting point is 00:05:42 Simon's last name, sorry. But they all joined together. They joined forces and they formed the Elasticsearch organisation. And they started with just four people. And it was a typical start-up thing where, for a while there, everyone was working out of a single house, including the garage. So, you know, they'd have stand-ups in the garage every day. And it's exploded since then. We've got, well, we're in Amsterdam. We're a Dutch company, and
Starting point is 00:06:11 Amsterdam is essentially where our primary office is. We've got a large presence in San Francisco, and we've got offices in New York, Japan, Sydney, Berlin, London. There's about 30 countries that we're in at the moment. Yeah, definitely. I mean, I'd like to go on to a bit about how you've managed to, I suppose, commercialize but keep it open source as well, and that's an interesting model there. But something that I think is worth us understanding at the start is, you know, you mentioned Lucene, and obviously Hadoop had a kind of element of search there as well. What was done with Elasticsearch, I suppose, how does it differ
Starting point is 00:06:48 from Hadoop and Lucene and so on? What was the particular problem that you solved in an innovative way, really? Yeah, okay. So I don't know where Hadoop was ten years ago, to be honest with you. But what we're seeing now is that Elasticsearch is a fantastic real-time search engine and a real-time analytics engine. So the keyword there is real-time. It's the ability to spin up a cluster that can cope with, you know, we have customers doing terabytes a day, so billions and billions of events. And the ability to take that data and run real-time analytics on it, so instead of having this data sitting there
Starting point is 00:07:33 and having to build Excel spreadsheets with pivot tables and all sorts of crazy graphs, instead of having to run some sort of analysis job and then come back the next day, so something like what would traditionally be a SQL query with a SQL job that you have to wait overnight for. You can do that in real time or near real time. So we're talking seconds as opposed to hours or minutes.
Starting point is 00:07:56 And Hadoop is a fantastic engine for storing data. It doesn't have that real-time aspect. Even with things like Spark with micro-batching, there's still delays in there. So we see ourselves as complementary. We have integrations with Hadoop. So we're seeing a lot of deployments where Elasticsearch is that real-time engine. And then things go into Hadoop as a long-term store and for more complicated jobs. So you can't do everything with Elasticsearch. It doesn't have everything in there. You know, there's a lot of really, really advanced stuff that we're seeing people doing that it's just not a fit for Elasticsearch. So
Starting point is 00:08:34 these people are doing that in Hadoop, which is fantastic. So they're writing our jobs with MapReduce and Spark and all this sort of thing. So we see ourselves as sort of that real-time engine, and then pushing it into Hadoop for that long-term archival storage or complex jobs that need to run overnight, for example. Yeah, definitely. I mean, I know on projects I've worked on in the past, we actually replaced, say, using Hive with regex, for example, with Elasticsearch. I think in that particular instance, the integration with Hadoop and the complementary nature has been useful. Yeah, definitely. And
Starting point is 00:09:11 there are some search engines or some aspects of what we do within the Hadoop ecosystem. I guess we're finding a lot of adoption just in general. And you mentioned this before, the crazy adoption growth that we're seeing, is because it's easy to get started. So the barrier to entry is, well, it's crazy low. I mean, we've done a few changes recently which make it not as easy, but that's because we care about the stability, the long-term of people's data.
Starting point is 00:09:43 So there's a little bit of upfront pain now, I suppose, but it's in the best interest of where you're going to after this because we see people roll out Elasticsearch and they go, well, I'll just start with a single node, and then it turns into three nodes, and then it turns into, you know, honestly, we've got deployments of hundreds of nodes out there. So it's just that natural growth because people are like, oh, wow, look what we can do with real-time access. You know, look at the business decisions, look at the operational decisions,
Starting point is 00:10:08 all that sort of stuff. And so that's sort of why I think we see that growth and that, well, I hate to use the word, but that synergy with Hadoop. Yeah. Definitely. I think the other aspect that's been useful for me when I've looked at it is the kind of faceted nature of the search. I mean, the fact that you can, is it the case that you can search across all attributes in the domain, that sort of thing?
Starting point is 00:10:28 I think that particularly, that ability to search across everything is something that you, I suppose, coming from a SQL background, you wouldn't necessarily kind of appreciate. But it's quite, I suppose, revolutionary in how it works. Is that a key part as well? Yeah, I think so. I think that ability to, you know, by default when you search, you search everything. So if you search for just a word like Australia, then you search every document. And the whole thing about it is the relevance, you know. So you're not just getting back that literally, well, here's every document. It's doing some calculations in the engine itself to give you things that are relevant.
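The "search everything, ranked by relevance" behaviour described here, together with the faceting mentioned in the question, can be sketched as a request body for Elasticsearch's search API. This is only an illustrative sketch: the helper name `build_search_body` and the field name `category` are hypothetical and not from the episode, and which fields a `query_string` query covers by default depends on the Elasticsearch version and index settings.

```python
import json


def build_search_body(text, facet_field=None, size=10):
    """Build a JSON-serialisable body for an Elasticsearch _search request.

    `facet_field` is a hypothetical field name used only for illustration.
    """
    body = {
        "size": size,
        # A query_string query with no explicit field list is matched against
        # the index's default search fields; hits come back ordered by
        # relevance score unless a sort is specified.
        "query": {"query_string": {"query": text}},
    }
    if facet_field is not None:
        # A terms aggregation buckets the matching documents by field value,
        # which is what a faceted navigation panel is usually built from.
        body["aggs"] = {"by_" + facet_field: {"terms": {"field": facet_field}}}
    return body


body = build_search_body("Australia", facet_field="category")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a `_search` request against an index; the sketch stops at building the body so that it runs without a cluster.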
Starting point is 00:11:07 So I think that's one of the key features. But yeah, not only can you do that, you can do faceted search. So you can start breaking things up into different categories. And we're seeing people, various different travel agencies or car sales or house sales, using it to provide very simple to use but highly sophisticated bucketing or faceting of information, so that when people want to use a website they don't have to, you know, start building a massive filter. They just go, oh, I'll tick a box: I only want hotels, five star, within four kilometers of where I'm at, that sort of thing. And it's amazing, you know, that sort of functionality. I mean, you could probably do it with big... but with Elasticsearch it's just a couple of API calls. And I think, you know,
Starting point is 00:11:55 again, that sort of drives that adoption and that ease. It's interesting, when I said that to you, you almost sounded like it was just obvious, really. But I think certainly faceted search, when you come across it and you've not seen it before, is very powerful. And then you guys added Kibana and Logstash. So what was the story around that, really? How did that come about, and what role did they play? Yeah, so Kibana, I'm just trying to remember my history.
Starting point is 00:12:23 I think Logstash was the first product. So Logstash was originally just a tool for ingesting specifically logs, hence the name Logstash. And it was built by a frustrated system administrator, Jordan Sissel, and he just wanted a tool that he could do multiple things with, so he could parse different types of logs and do regex-type translations and enhancements and all sorts of really good stuff,
Starting point is 00:12:51 but without having to know the syntax for regex, because it's pretty painful even if you do know it. And so he started that, again, as an open-source thing, and he was just like, well, I've got all this data. I can process it, but where do I put it? And he came across Elasticsearch and it was just a natural fit. And so the company approached Jordan and said, hey, we want to hire you, we want to bring Logstash on as a fully supported and feature-rich application. And so Jordan joined probably, I'm guessing here,
Starting point is 00:13:29 but it's a good three years ago. And then once this data was in Elasticsearch, people started putting more and more in Elasticsearch. And so this guy named Rashid Khan, he said, well, I've got data in Elasticsearch. I want to view it, because Elasticsearch is all API driven and it's all JSON, but it's hard to analyze JSON text, essentially. So he started building this thing called Kibana.
Starting point is 00:13:55 And it went through a few iterations. And again, we saw adoption go crazy because it just made things even easier to use on top of Elasticsearch. And so we said, hey, Rashid, come and join us, and so that happened as well. So we've ended up with this search engine with the ability to return results within microseconds, an ingest tool that lets you take data from various different sources and translate that and enhance that and add value to it, and a graphical front end that lets you build out all sorts of amazing visualizations literally just through your browser. So it's turned into a pretty amazing, well, we call it the
Starting point is 00:14:39 Elastic Stack now, so it's a pretty amazing stack. Yeah, and I use it myself as well. I mean, it's very good. So obviously all this was open source, but you're a company now that's got employees and revenue and so on. So how does the commercial open source model work for you? Yeah, so we have four core products that are open source. People are starting to refer to this sort of model as open core as opposed to open source, which is probably a whole other discussion we can have.
Starting point is 00:15:13 So we have Elasticsearch, which is the original product that we had. Then we had Logstash, Kibana, and we also have a family of functionality that we've collectively referred to as Beats. And they're just like the shippers for different sorts of information. Yeah. So that's our open source platform there. Not to mention, you know, that we've got a whole bunch
Starting point is 00:15:33 of different integrations like the Hadoop connector and then a whole bunch of other clients and things like that. And so on top of that, the commercial offering is primarily what we refer to as X-Pack. So it is extensions to Elasticsearch and Kibana at this stage, and they add in, I guess you could call them, enterprise features. And in addition to that, we also have hosted Elasticsearch and Kibana. So originally, we started with a monitoring plugin
Starting point is 00:16:06 that was called Marvel. And that was for monitoring your Elasticsearch cluster. So the ability for you to sort of see what's going on, you know, my search is slow, have I seen a big increase in queries, that sort of thing. And then we added a security plugin to that and that had things like TLS encryption between nodes in the
Starting point is 00:16:26 cluster and to the cluster, so the request you were making was secured, and then added things like Active Directory and LDAP integration. So Mark, I noticed there's a product on your website called Prelert, about predictive analytics. What's that about, then? How did that come about? Yeah, sure. So we've been adding commercial products into what we've called our X-Pack, which is a collection of functionality. And we've been looking at, well, what's next for Elasticsearch and what's next for the data that people are holding? And one of the things that we've noticed is that, I mean, data never gets smaller.
Starting point is 00:17:04 The big data, people have started talking about data lakes, and it's getting to the point where now people are going, well, it's data oceans, and I don't know what comes after an ocean because maybe it's an ocean world, something like that. I don't know. But we saw the need to automate even further. So instead of someone sitting there and watching dashboards, we need a way for the system to tell us when there's something wrong
Starting point is 00:17:27 and learn from the data that's coming through. So I think it was the very first Elastic{ON}, or the second Elastic{ON}, I'm not really sure which one, but there was a vendor there called Prelert, and Shay Banon, our technical founder, who's now our CEO, walked over and just started a conversation. And this company had built a way to run mathematical models on data to then build thresholds and further models of what you're doing to be
Starting point is 00:18:07 able to discover when things are anomalous and to build behaviors. And they had an integration into Elasticsearch, which was obviously the key point. And so there was obviously a whole bunch of talks behind closed doors, and Prelert joined us, I think it was in October or November last year. And since then, we've been pretty crazy busy trying to rebrand that. So it was previously Prelert. It's going to be called Machine Learning, and it's actually going to be integrated into our X-Pack functionality.
Starting point is 00:18:41 Okay. I mean, you mentioned Elastic{ON} there and so on. I guess Elastic has done very well. It's very successful, and it seems to have kept, I suppose, its roots and its credibility and so on. Is the community model you've got, the way that you work with the community and the open source community, is that key to it, really? Yeah, I think so. I think, you know, there's definitely been a change in the market where businesses, I mean, there's been reports done where it's 90-plus percent of businesses use open source. And I think the successful open source company has to be able to engage their community and their users and make sure that their open source products
Starting point is 00:19:26 are compelling enough to use, but then also balance, and this is the tricky thing, balance existing. And it is a tough road, and we're constantly adjusting to different bits and pieces, the community and the market and all these sorts of things, and even internally. But yeah, look, I think ultimately the community has really driven the growth of Elasticsearch and Logstash and Kibana, and the functionality that we're constantly putting back into the open source.
Starting point is 00:20:07 Because we've basically stated that we will never release something into open source and then pull it back out. So, you know, we don't have enterprise versions where the open source has this version and then, oh well, if you pay us money, you'll get a different version with some added features on top of Elasticsearch. We will put features into Elasticsearch consistently. And then if you want some of those other features,
Starting point is 00:20:35 then that's part of the X-Pack, but there's a clear delineation. And I think that's also pretty important for us as an organization, but also ultimately so that the open source users and our community know where they stand. And that's also... So tell us about cloud.
Starting point is 00:20:52 I mean, everybody coming onto this at some point always wants to talk about cloud because it's the interesting thing. And you guys have done Elastic Cloud, which I think is your stuff on AWS and so on. Tell us, to the extent you can, what is the strategy around cloud, and what's the benefit to developers of running it on there? Yeah, sure. So historically you would build and manage your own Elasticsearch cluster,
Starting point is 00:21:20 and, you know, we've seen some other entities start up their own hosted Elasticsearch. And you know, there's a push these days, like everything's in the cloud and the latest thing is serverless. So, you know, it's essentially services everywhere to handle everything. And we saw that that was a gap in our commercial offering, the ability to make sure that we can offer a service that customers have faith in, because we're the creators of these products
Starting point is 00:21:55 so we definitely know how to host them and provide these sorts of services. And so a couple of years ago, there was a bunch of Norwegians, I can't remember how many there were, that had this service called Found. And we spent a lot of time checking the market and we found them to be up to our own standards. And so they were also brought on into the Elastic team, and hence Elastic Cloud was formed. And so from there we've
Starting point is 00:22:30 obviously rebranded and we've grown the team from a handful to, I think, probably 20 or 30 in the team now. It's pretty large. So we're investing a lot of time back into that service. And also, interestingly, we're actually taking the underlying mechanisms for managing that cluster, the automation, the deployment models, all the upgrades and even things like snapshots, and we're turning that into what we're calling Elastic Cloud Enterprise. So it's the ability to take all that automation, deploy it in your own cloud or in the public cloud, in your own VPC or virtual network, and then get the same functionality. So API-driven, web-based, and you don't need to worry about, oh, if this Elasticsearch node has had a problem because the underlying host, you know, the AWS host has gone down,
Starting point is 00:23:21 do I need to start up a new one? Elastic Cloud Enterprise and Elastic Cloud handle that. It's all automated. And so we're sort of seeing that the server, the infrastructure as a service or platform as a service really... So where would somebody... So if there's one person in the world that hadn't heard of Elasticsearch and used it and so on, how would somebody get started with the technology? Where would they go to and where would they download things from and so on, really?
Starting point is 00:23:48 Yeah, sure. So the best place to start is to head to our website, which is just elastic.co. And the joke there is that we couldn't afford the M for elastic.com, but it's actually a nice URL. And our documentation is pretty comprehensive. So if you want to start looking at something for logging using the Elastic Stack, then we have a rundown on how to install everything, how to
Starting point is 00:24:13 integrate everything. If you're more search focused, then we have a thing called Elasticsearch: The Definitive Guide, and that basically covers, from first principles, how to install Elasticsearch, how to put data in there, how to do searching and everything in between. And that's a phenomenal resource that people can leverage. And again, it's actually part of our open source offering because it's open source docs. From there, if you've got questions, we have community forums, which is at discuss.elastic.co. And the engineers that write the products hang out there as well. So you can get answers
Starting point is 00:24:50 from the community or also from us. On top of that we have training, and again we have the engineers that are writing the product providing this training, so we try to keep as close as possible to our users, if that's not obvious. And that's worldwide, so there's always training going on around somewhere. And if you still need help, then we have commercial support where, again, we can have our own engineers helping you roll out your platforms
Starting point is 00:25:20 and make sure everything's in the best possible way. Yeah, fantastic. And I guess the good thing is it's so easy to get started with, really. But then I suppose there's the fact that you've got people like yourself there as well that can help with the more complex integrations and so on. I mean, the fact you've got this open source model linked to the commercial one there, and you guys are doing well, I think that's a great testament to what you're doing, really. Yeah, definitely. I don't think we're going anywhere other than, you know, onwards and upwards. We're not just going to disappear in a
Starting point is 00:25:50 puff of smoke like some other products may have in the past. So yeah, definitely. Excellent. Well look, Mark, thank you very much for doing the call anyway. I'm conscious you're working at the moment over in Australia, but thank you very much for that, and it's been great to have you on. Yeah, cheers. Thank you very much. Thanks for having me, Mark. It was a great time.
