Orchestrate all the Things - Open source observability marches on: New Relic and Grafana Labs partnership brings benefits to developers. Featuring Grafana CEO Raj Dutt, New Relic CPO Bill Staples

Episode Date: August 10, 2020

The perfect observability storm with open source leading the way, and a partnership that makes sense. Open source is eating the world, and observability is no exception. New Relic and Grafana Labs just announced a partnership, and we discuss the specifics as well as the broader open source and observability landscape with Grafana Labs CEO Raj Dutt and New Relic Chief Product Officer Bill Staples. We cover everything from the rationale of the partnership, what it brings for users and how the integration was done, to open source, de facto and de jure standards for telemetry, and the use of AI and machine learning. Article published on ZDNet

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. Today's episode is about open source and observability. New Relic and Grafana Labs just announced a partnership, and we discuss the specifics as well as the broader open source and observability landscape with Grafana Labs CEO Raj Dutt and New Relic Chief Product Officer Bill Staples. We cover everything from the rationale for the partnership, what it brings for users
Starting point is 00:00:30 and how the integration was done, to open source, de facto and de jure standards for telemetry, and the use of AI and machine learning. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn and Facebook. And well, congratulations on the partnership. And, you know, even though, I'll be honest with you, it's not like I can say that I saw it coming, in retrospect it all kind of makes sense. So on the one hand we have New Relic, which seems to have been on a reinvention course for a while now. So embracing AI and open source, it's like two key themes, I would
Starting point is 00:01:14 say, in this kind of reinvention. So for example, I recall there was a mention of OpenTelemetry last time I spoke to a New Relic spokesperson, the Cloud Native Computing Foundation project for standardizing collection and sending of telemetry, and that seems to have been adopted now even though it's in beta. And Grafana Labs on the other hand already has this Big Tent philosophy, which embraces data ingestion from a variety of data sources, and there's an emphasis on open source data sources, but, you know, that doesn't mean that you leave out other data sources. And obviously New Relic is a key player in the observability space, so I would like to start by asking you both to say a few words about how this partnership came
Starting point is 00:02:04 about and what you're hoping to get out of it. Yeah, I'll maybe go first, Raj. I'm super excited by the partnership because, as you said, it feels so natural. And I think those are the best kinds of partnerships when everybody looks at it and says it makes total sense. We, as you know, just last week announced a series of changes to New Relic as we reimagined our platform. One of those major changes is we believe observability should be ubiquitous, that every developer should have access, and it should be
Starting point is 00:02:46 part of their basic tool chest. And so we've eliminated both the functional and the economic barriers for that, including a really generous free tier so that every developer can build in instrumentation into their code and use observability as part of their standard engineering practice. As we thought about that, we realized that probably the most popular and prolific visualization and dashboarding platform in the world is Grafana. And a lot of our customers use it in conjunction with New Relic. And we wanted to have a great story there as well as developers embrace our observability platform and the great economics. We want them to have great visualization on top of
Starting point is 00:03:38 that. And so in addition to our own dashboarding, we wanted to open it up to Grafana. So we reached out and the rest, I guess, is history. Okay. Thank you, Bill. And what about you, Raj? What's your viewpoint? What was your reaction? I mean, obviously, judging from the outcome, I know you said yes,
Starting point is 00:03:59 but what was your initial reaction, let's say, when New Relic reached out? Yeah, I think natural is the operative word, as Bill mentioned. Like you mentioned, George, Grafana is all about our Big Tent philosophy, which is bringing data together wherever it lives to provide additional context and understanding for our users and our customers. But I think the interesting thing within the partnership is New Relic is now offering Prometheus
Starting point is 00:04:36 and PromQL as a method of querying data on the New Relic telemetry platform. And Prometheus is one of the most interesting and fastest growing data sources and communities within the Grafana Big Tent ecosystem. In fact, as you know, George, Grafana Labs is one of the main contributors to the Prometheus project. And we are really excited about Prometheus in general. Like I mentioned, Prometheus is now the most popular metrics backend within the Grafana community.
Starting point is 00:05:15 And so, as part of this partnership and as part of the announcement, New Relic is basically providing a PromQL-like interface to the metrics on their telemetry data platform. And so you can now connect Grafana, which speaks native PromQL, to New Relic and utilize Grafana to bring New Relic data together with data wherever it lives,
Starting point is 00:05:43 whether that's in open source Prometheus, in, you know, tools like Graphite, you know, other commercial vendors like, you know, maybe Datadog or Stackdriver or Azure Monitor. It doesn't matter where your data lives. You know, Grafana allows you to bring it together. And we're really excited that, you know, New Relic has chosen Prometheus as a way of kind of, I think,
Starting point is 00:06:07 recognizing the growth within the Prometheus community and is now essentially a Prometheus-compatible backend, similar to other data sources like obviously the Prometheus open source project, our own cloud offering, Timescale, InfluxData, and others. We're really seeing Prometheus become the de facto standard within the observability community as far as metrics goes. And we're really excited that New Relic has chosen to offer a Prometheus-compatible query layer on their telemetry data platform. So New Relic now works out of the box with open-source Grafana because of its support for PromQL. And in addition to that, we offer a free 30-day trial for Grafana Enterprise for those New Relic customers that want to combine PromQL
Starting point is 00:07:07 with NRQL, the native New Relic query language. Okay, thanks. Thanks, Raj. You actually touched on a number of points that I wanted to ask about, so you're saving me some time. And one of the first ones that I wanted to ask was, well, just looking at the points mentioned in the press release.
Starting point is 00:07:26 It wasn't entirely, well, the issue of query language actually wasn't entirely clear to me. So the press release mentions that Grafana open source users can add New Relic's new Telemetry Data Platform as a Grafana data source using Prometheus, basically as you just mentioned. And then it also says that for the enterprise customers they can still use the platform and enjoy updates designed to support New Relic's query language. And you also mentioned at some point Prometheus' own query language. So can you clarify what exactly are users going to be able to use? Is it any
Starting point is 00:08:10 of those three languages that they wish? Because I know that you also have your own home-bred query language that you use in Grafana, which as we spoke about last time, at least at the time, wasn't directly exposed to end users, but it was used under the hood by your own graphical user interface. So, are all of those ways going to be available for users to query their data? Yeah, so basically Grafana open source ships with native support for PromQL, which is the Prometheus query language. And New Relic's new telemetry data platform
Starting point is 00:08:54 has a PromQL-like interface, which works with the open source Grafana Prometheus data source. So open source Grafana users will be able to use open source Grafana and query and visualize data that they store on New Relic through the PromQL data source, similar to the data that they query from other data sources. And then Grafana also supports transformations that allow you to kind of do calculations and math between data sources. And you can certainly do transformations on any PromQL data within Grafana itself, so that remains true. And for those customers that want to use NRQL, which is New Relic's native query language that transcends
Starting point is 00:09:47 beyond metrics, they can spin up a free trial of Grafana Enterprise, which is our commercial product, and enjoy the combination of being able to use NRQL, PromQL, and all the other data sources that Grafana supports, such as Datadog, Splunk, and dozens of other commercial data sources. Okay, thanks. That's a bit clearer now. I also have a question for Bill. So the question is, what was the motivation for adding this Prometheus compatibility that
Starting point is 00:10:26 Raj mentioned? Yeah, good question. You know, the primary reason is for Grafana, actually. Again, we see Grafana so prolifically used inside our customers' data centers, we wanted to provide them native support for PromQL, given a lot of the pre-built templates for Grafana have PromQL as part of them, and we wanted as seamless, low-friction support for Grafana as possible.
Starting point is 00:11:05 So we built that translator layer in and have been testing its compatibility, primarily using Grafana dashboards and community support. We also are really excited by the partnership and the fact that Grafana Enterprise has built-in support for New Relic's NRQL query language, the native query language New Relic has that spans data types, metrics, but also events and traces and logs, as Raj said. I'm really excited to provide the best of both worlds through Grafana and really grateful for Grafana Enterprise
Starting point is 00:11:46 to extend that 30-day free trial to all New Relic customers and we hope they take advantage of it. Okay. Could you perhaps give an example of how using those two different, how using NRQL and Prometheus or even Grafana's own query language in combination would work. I'm kind of presuming that potentially your own query language would be able to
Starting point is 00:12:19 do more sophisticated queries or span the different applications that connect to your own backend and perhaps using Prometheus should be able to work on a higher abstraction level? It's a bit apples and oranges. In terms of querying metrics, there's a lot of similarity. PromQL is obviously built for querying metrics as the Prometheus data type, whereas NRQL is a bit more broad
Starting point is 00:12:54 and capable of querying other data types as well. And really, the primary motivation, I guess, for providing both through Grafana is really just to meet developers where they are. Take advantage of the knowledge they have. And if they prefer PromQL, they can use that to query metrics. If they already know NRQL, they can use that and decide what works for them.
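[Editor's note: a minimal sketch of what querying metrics with PromQL looks like against any Prometheus-compatible backend. It builds the request URL for the instant-query endpoint of the Prometheus HTTP API (/api/v1/query). The base URL, the function name and the metric name are placeholders assumed for illustration, not New Relic or Grafana specifics, and authentication is left out.]

```python
from urllib.parse import urlencode

# Hypothetical base URL of a Prometheus-compatible backend; replace with your own.
BASE_URL = "https://prometheus.example.com"


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for a Prometheus HTTP API instant query (/api/v1/query)."""
    # urlencode percent-encodes the PromQL expression so it is safe in a query string.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Illustrative PromQL: total per-second request rate over the last 5 minutes.
# The metric name http_requests_total is an assumption made for this example.
print(build_instant_query_url(BASE_URL, "sum(rate(http_requests_total[5m]))"))
```

[A dashboarding tool that speaks PromQL sends requests of broadly this kind on the user's behalf, which is one way to read the claim that a backend answering PromQL needs little extra integration work.]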
Starting point is 00:13:22 OK, OK, thanks. Another question I have which I guess you kind of touched upon previously. I was wondering how exactly did the integration work because I know that in the previous version of the previous release of Grafana, Grafana 7.0, one of the things that was updated was actually the framework for adding integrations. So I'm kind of guessing it was probably used. And I'm wondering if you mentioned community involvement at some point. And I wonder if there was also involvement from engineers from both sides? Well, I guess the integration was kind of very easy
Starting point is 00:14:09 and very seamless because of New Relic's decision to support the PromQL query language on the New Relic telemetry data platform. And so we've seen PromQL kind of become the de facto query language, along with the rise of Prometheus in the observability world for metrics over the last three or four years. So Grafana already has very mature support for PromQL. And there's a variety of backend data sources that natively speak PromQL, including obviously open source Prometheus, Grafana Labs' own cloud offering,
Starting point is 00:14:50 other open source tools like Timescale and InfluxDB. So Prometheus has in a way become a de facto query language for metrics within the open source observability world. And so adding support for New Relic's new telemetry data platform was actually extremely easy because their new platform speaks native PromQL. So there really was hardly any technical work that needed to be done to welcome New Relic into the Prometheus fold because New Relic chose to standardize on PromQL, which has enjoyed tremendous adoption,
Starting point is 00:15:32 tremendous adoption curve, if you will, over the last three or four years. Prometheus was originally created by SoundCloud several years ago. It's now a CNCF project, and it's really become the default way that people interact and query their metrics, particularly in the cloud-native world and particularly for Kubernetes users. So we really, you know, kind of welcome New Relic into the Prometheus fold, and we think it's a great move on their part by standardizing on a query language that already enjoys such wide adoption and developer mindshare.
Starting point is 00:16:11 And if I could just add on to that, maybe one question that readers may have is, does a developer need to choose between Prometheus and Telemetry Data Platform as the back-end store for metrics. And actually, it's not an either-or at all. It's actually an and opportunity because, as Raj said, Prometheus is so popular and prolifically used. It's often deployed as part of Kubernetes and used in that environment and so many others. And what Telemetry Data Platform offers on top of that is the ability to extend retention, provide additional scale, and we offer private key encryption of data, a fully managed solution. So actually what we're seeing customers do is actually just use the remote write capability
Starting point is 00:17:18 built into Prometheus to forward their metrics to the telemetry data platform where they then can enjoy those benefits and correlate that data with their logs or events and traces so they get a full stack view of their digital enterprise. So it's very much a complementary situation both on the data backend, as well as the query
Starting point is 00:17:46 language and visualization now through Grafana. We'll see customers use all three of those. Yep, absolutely. And the remote write functionality that Bill mentioned is something that Grafana Labs actually developed about three years ago as part of our participation within the Prometheus community. And at this point, there are several databases and vendors that are using the remote write functionality in Prometheus, including, like I mentioned, Timescale, Grafana Cloud's own offering. So there's a variety of ways that now exist within the ecosystem to kind of achieve scalable Prometheus metrics with long-term retention.
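[Editor's note: the remote write forwarding described here is set up in the Prometheus configuration file. As a rough sketch, the helper below renders a minimal remote_write section; the function name and the endpoint URL are hypothetical placeholders, and a real setup would add whatever authentication the receiving backend requires.]

```python
def render_remote_write(endpoint_url: str) -> str:
    """Render a minimal remote_write section for a prometheus.yml file."""
    # remote_write takes a list of targets; each entry needs at least a url.
    return f'remote_write:\n  - url: "{endpoint_url}"\n'


# Placeholder endpoint, assumed for illustration only.
print(render_remote_write("https://metrics.example.com/api/v1/write"))
```

[With a section like this in place, the local Prometheus keeps scraping as before and also forwards samples to the remote store, which fits the "and, not either-or" point made above.]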
Starting point is 00:18:29 And it's really great that New Relic's part of that ecosystem. And it's all for Grafana. As you mentioned, George, we've got a big tent philosophy, and it's all about offering developers choice in terms of different vendors, different open source projects, and different providers within the Prometheus ecosystem. In that vein, I also wanted to ask your take on OpenTelemetry, because I've seen that New Relic has been quite involved in that and just recently announced the support, even though it's still
Starting point is 00:19:02 in beta. And I think that it's also supported in Grafana in some way. And it's also a Cloud Native Computing Foundation project. And so I wonder what your take is on that. Was that a question for me or Raj? Both, actually. Oh, okay. Well, maybe I'll start. Yeah, we announced, I guess, two weeks ago now that we believe the future of instrumentation
Starting point is 00:19:32 is open and open standard. And we're the number three contributor today to open telemetry projects right after Microsoft and Splunk. And we're committing engineers and resources to further that project and want to provide a fully supported, out-of-the-box package for it as it's released.
Starting point is 00:20:08 We also did open source our existing agents and integrations, you know, to share the decade plus of IP that we have in instrumentation with the community. We're running those projects now fully in the open. And we'll support those for many years to come with the community hand in hand. So, yeah, we fully embrace the Open Standard and Open Telemetry project
Starting point is 00:20:34 and very excited to see it coming to GA later this year. Yeah. So we're kind of watching OpenTelemetry pretty closely, but we actually believe as a company more in open source than we do in open standards. And so, you know, as Grafana Labs, we're one of the top contributors to the Prometheus project. And within OpenTelemetry, it seems that Prometheus is actually becoming the de facto standard for metrics. These kind of committee-driven standards, such as OpenTelemetry, tend to be a little bit of a political or vendor love fest, if you will, not to be overly sarcastic, you know, but we really believe in open source over open standards and, you know, that adoption and mindshare win over, you know, kind of committee-driven standards that end up kind of in some ways fracturing the community. OpenTelemetry is interesting
Starting point is 00:21:45 because it's tried to combine things like tracing and metrics together. You know, Grafana now supports both metrics, logs, and traces as kind of first-party data sources and first-party telemetry types, if you will. But we really believe in pure open source and actually shipping code rather than committee driven standards. And we're closely
Starting point is 00:22:13 watching OpenTelemetry. But, you know, Grafana Labs is really an open source first company. So not only do we open source our, you know, similar to New Relic, but we also open source all our backends, including Cortex, which is a CNCF project that's a scale-out Prometheus backend that does long-term storage and is kind of a scalable Prometheus offering that we run on a cloud. So I guess I'd describe our attitude as open source first rather than open standards. Thank you. Okay, and since we're almost out of time, just a quick wrap-up question. So a few months ago, I got to look at a survey that was done on the topic of the future of observability.
Starting point is 00:23:05 And there were a few key themes that emerged out of that. One was open source, and there's no need to say much about that because it's a key theme for your partnership as well. The other two were AI and using machine learning for AI ops and automating event notifications and this kind of thing, and cloud and serverless.
Starting point is 00:23:25 So if you, and then it's a question for both of you, just your quick take on that and whether you see that playing out in the field. You wanna go first, Bill? Should I go first? Yeah, go for it. Okay. So I guess, George, you mentioned open source, and that's kind of needless to say,
Starting point is 00:23:51 but I'll just reiterate it in that our view is that a lot of vendors are starting to open source parts of their stack or play in open standards. But Grafana Labs is the only observability vendor that is a pure play open source vendor that believes that our customers' observability strategy should be owned by them and not a particular vendor. So we're all about this big tent philosophy, and we think that that's the future of observability, which is not controlling where customers store their data and allowing them to use their choice of metrics, logs, and tracing open source projects, commercial vendors, SaaS options, on-prem options, whatever they want. And we think that that's a very big trend within the observability world. We also believe that open source projects
Starting point is 00:24:45 are the future of the underlying telemetry engines. And that's also why we're excited about the partnership, by the way, because obviously it's, I think, a recognition by New Relic of the power of Prometheus. But everything we do is open source and we think that's the way of the future. Regarding AI ML, there's a lot of interesting things going on in that arena, but we really
Starting point is 00:25:10 don't think that – we think that there's a lot of hype also, and we don't believe that it's going to be a replacement for talented SREs anytime soon. So we have a joke at Grafana Labs where anytime someone mentions AI and ML, that person has to do a shot of vodka because there's so much hype. But we are investing in those areas and we do think that there's some low-hanging fruit in that space. But as far as the future of observability goes, in addition to open source, we really believe in the idea of optimizing an experience and a workflow for SREs to seamlessly debug and troubleshoot applications under a high pressure environment and transition from metrics to logs to traces,
Starting point is 00:26:06 giving them complete choice of where that underlying data lives and really optimizing for the experience of allowing them to debug an issue, transitioning from metrics to logs to traces within a single leading visualization tool that is Grafana and kind of stopping this workflow that exists today of having to have five different tools, five different tabs, and constantly having to transpose and redefine queries between systems.
Starting point is 00:26:39 And so we're really laser focused over the next few years on creating this seamless and magical experience that works for SREs regardless of where their data lives. Because we really believe that the future of observability is not a single vendor and is really across many, many vendors that all do different things really, really well depending on the vendor. Thank you, Raj. Yeah, just to add a few thoughts to that.
Starting point is 00:27:10 You know, what we announced last week is a pretty fundamental reimagination of New Relic. In the past, we've offered customers dozens of different products, really targeted at very specific layers of the stack or specific use cases. And what we did last week is really condense that or standardize that to three layers that we believe hold incredible value for customers. And they can be used independently, but it dramatically simplifies the way that any company
Starting point is 00:27:42 needs to think about their observability strategy. The first, the telemetry data platform, has been the focus today because it's meant to be this petabyte-scale store for all types of telemetry: metrics, events, logs, traces. The multi-tenant architecture that we've hardened over the last decade enables us to offer incredible scale and economics so that it's very inexpensive for developers and enterprises to adopt. And the partnership with Grafana makes it incredibly flexible and appealing to get the value out of that data. Our full stack observability product on top adds to that then all of the analytics and troubleshooting tools that we've
Starting point is 00:28:26 built up from infrastructure to digital experience monitoring to distributed tracing and application monitoring all in one integrated experience and we think offering that as one package for enterprises makes it that much easier to get a full-stack view and to standardize on New Relic for those tools so that engineers don't have to deal with the complexity of jumping between tools in the middle of an incident when they're trying to figure out what's going on and all of the data is in one place. So there's a single source of truth for what's going on. Our third product, and this
Starting point is 00:29:05 gets to your question around AI, is our applied intelligence product. It's the place where we're focused on AI and ML-based approaches to discovering anomalies, to taking automated actions based on the data. Some of our customers send us gigabytes of data every month. And if you think about the developer workflow, increasingly with that scale of data, it becomes really complex to anticipate and respond to all of the kinds of events that can happen, especially in the
Starting point is 00:29:47 distributed computing world. And while we provide the out-of-box dashboards, and we're obviously embracing Grafana for visualization as well, with that scale of data, we believe it's important to have machines looking at that data in real time and assessing where there may be anomalies, where data can be correlated and aggregated automatically on behalf of the user, so that when the engineer does get involved and they start diving in, they have all the information they need to react quickly and bring the services back online. So I do agree with Raj, there's a lot of hype in this particular space. And obviously, it's new and emerging as a set of capabilities. But we believe it really very much augments what developers are doing today and can make them more productive. To your question around serverless and cloud,
Starting point is 00:30:50 as you know, New Relic embraces all the public cloud providers. We've got integrations with Amazon, with Microsoft, with Google and others, you know, dozens of integrations. And we also have out-of-box support for serverless and AWS Lambda as our premier serverless offering. And we've seen a good adoption of it. Again, I agree with Raj. I don't think serverless fully replaces other approaches to building services, but I think it's a pretty compelling development model for certain
Starting point is 00:31:35 situations, and a lot of customers enjoy using it, so we want to be there and support their choice and provide monitoring and observability for those kinds of use cases as well. So New Relic takes very much a broad look at both public cloud providers as well as developer choices and strives to meet them all where they are. I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
