Grey Beards on Systems - 67: GreyBeards talk infrastructure monitoring with James Holden, Sr. Prod. Mgr. NetApp

Episode Date: July 26, 2018

Sponsored by: Howard and I first talked with James Holden, NetApp Senior Product Manager for OnCommand Insight and Cloud Insights, last month at Storage Field Day 16 (SFD16) in Waltham, MA. At the time, we thought it would be great to also have him on the show. James has been with the NetApp OnCommand Insight (OCI) …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with... Welcome to another sponsored episode of the GreyBeards on Storage podcast, a show where we get GreyBeard bloggers together with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. This GreyBeards on Storage podcast is brought to you today by NetApp. It was recorded on July 20th, 2018. We have with us here today James Holden, Senior Manager of Product Management at NetApp. So James, why don't you tell us a little bit about yourself
Starting point is 00:00:40 and what's new in OnCommand Insight and Cloud Insights? Sure. Thanks for having me. So yeah, my name is James Holden. I've been with NetApp the last five years as part of the product management team for OnCommand Insight. And in the last 12 months or so, we've also been building out a new product called Cloud Insights. It's a very exciting time for us. Cloud Insights is going to be a SaaS-only offering for performance monitoring of cloud and on-premises infrastructure technologies. It's based on a lot of the IP that we've evolved over the last few years on OnCommand Insight, but really built for modern infrastructures and modern architectures. So, yeah, thank you for having me.
Starting point is 00:01:20 Can you tell us a little bit about OnCommand Insight? I'm just going to call it OCI for now, if that works. Sure, yeah. Yeah, so OCI has been a very popular tool within the NetApp estates and for a lot of very big organizations globally. It's used for a variety of reasons, monitoring, troubleshooting, cost and performance optimization, and for really tying all the kind of infrastructure, inventory and performance data into the wider business processes within organizations. So the largest banks use the tool set, but nine out of the top 10 Fortune 500 companies use it really to kind of get an
Starting point is 00:02:05 understanding of how their storage estates are behaving and performing, and making sure that they're running as optimally as they can be. So is OCI what Ray and I, back before our beards turned gray, called SRM, Storage Resource Management? Yeah, we fit into that kind of category of tool set with OCI. Yes. It's a little different from that. It's not management in the sense of creating volumes or LUNs out there. It was the reporting and monitoring and troubleshooting that we really did or do.
Starting point is 00:02:39 And does it work with other storage vendor products as well as NetApp? Absolutely. So we've got coverage of all the major storage platforms. Dell EMC: we've got everything from Symmetrix to XtremIO and Isilon. And we even go back to the older product sets, the DMXs, the CLARiiONs of old. Oh, Howard, you'd have support. It's a reason to fire up my CX500. I can manage it now. There you go.
Starting point is 00:03:07 Absolutely. Yeah. So all the big storage vendors, IBM, HP, we've got coverage of those. And some of the smaller, more recent introductions to the market, the Pure Storages, the Infinidats, we've got support for those guys as well. And then it also looks after the hypervisors. Too frequently, I hear storage vendors say, we support everything.
Starting point is 00:03:28 We support NetApp and EMC. Yeah, there you go. There are a couple other vendors out there. Yeah, and the way that we've built OnCommand Insight is that adding a new collector is just a patch. So we prioritize patches on a customer request basis. Customers ask for support of Infinidat, we give them support of Infinidat. You haven't gotten so far as to allow vendors to create their
Starting point is 00:03:50 own connectors, have you? We haven't gone that far, but we do work with them. So obviously, some of them are more advanced in their APIs and their access to the guts of their equipment than others. And we work with them on a case-by-case basis to maintain support, especially currency, if they're proxies. What about the server operating systems and stuff like that? I assume you support things like vSphere and things of that nature. Yeah, so kind of the critical piece about the way
Starting point is 00:04:22 that OCI operates is that we see the end-to-end relationship. End-to-end? Application to the storage? So hyperscale, so Amazon, I don't understand. We're trying to understand. So when you say support for hyperscalers like Amazon, you're going to support like EC2 instances and things of that nature? Yep. So we'll discover the EC2 instances. We'll discover the EBS storage at the back end,
Starting point is 00:05:05 the S3 storage. We'll map the relationship between those. Even if it's not NetApp storage in that environment? Yeah. Oh, my God. So do I have to install a collector someplace as an AMI? So the way that we operate, we're completely agentless in OCI. We have data sources that reach out over whatever APIs or CLIs they can use to
Starting point is 00:05:27 communicate with the end device that we're talking to. So we just need an IP address, username, and password. So AWS has APIs that will let me see what latency on an individual EBS instance is? You can run the OCI server in your on-premises environment, or you could run an OCI server on an EC2 instance. All we need is network connectivity to it. So in the situation where it's completely firewalled off, what we can also do is put what we call a remote acquisition unit out there that will then allow an HTTPS connection between the two and allow us to take pushes of that data into OCI. I like this a lot. Oh God, yeah. Especially when you start thinking about an EC2, excuse me, an EBS insured IOPS instance where you pay six cents an IOPS per month for asking them to provision that many IOPS for you. application's actually using so that I can change that provisioned rate to be just a little bit more
Starting point is 00:06:45 than it asked for at the peak. And I'm not spending huge amounts of money to provide performance that I'm not using. And it works like that with Azure as well? It does. Yeah. And carrying on with that use case, it's a great one. We've got other ones that really help organizations understand where they've got waste in their infrastructure because, like you say, you're paying for this equipment, you want to get the money's worth out of it. Yeah, but even worse so in the public cloud, where I haven't paid for it yet. It's one thing in the data center where I wasted money because I bought too big an array, but that money is already wasted, and you telling me that I'm wasting that money just
Starting point is 00:07:23 makes me feel bad. But in AWS, when you're telling me you're spending too much on provisioned IOPS, next month I can save money. Almost in real time. Yeah. What we see is EC2 instances get spun up. The EBS volumes that get attached to them are created and they're incurring quite a substantial amount of cost. Storage costs in AWS are high. It's a bigger part of the AWS bill at the end of the day. Especially if you want performance.
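The right-sizing idea discussed above (set the provisioned rate just a little above the observed peak) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 10% headroom, and the sample numbers are hypothetical, and the six-cents-per-IOPS-per-month price is simply the figure quoted in the conversation, not a current AWS price. It does not describe how OCI itself produces its recommendations.

```python
def recommend_provisioned_iops(observed_iops, headroom_pct=10):
    """Return a provisioned-IOPS target slightly above the observed peak.

    observed_iops: peak IOPS samples gathered by some monitoring tool.
    headroom_pct:  hypothetical safety margin, in whole percent.
    """
    if not observed_iops:
        raise ValueError("need at least one IOPS sample")
    peak = max(observed_iops)
    # Integer ceiling of peak * (100 + headroom_pct) / 100, avoiding float error.
    return -((-peak * (100 + headroom_pct)) // 100)


def monthly_saving_cents(current_provisioned, recommended, cents_per_iops_month=6):
    """Monthly saving, in cents, from lowering the provisioned rate."""
    return max(0, current_provisioned - recommended) * cents_per_iops_month


# Made-up samples for one volume that is currently provisioned at 10,000 IOPS.
samples = [1200, 1850, 2400, 2100, 900]
target = recommend_provisioned_iops(samples)
saving = monthly_saving_cents(10_000, target)
print(f"provision {target} IOPS, saving ${saving / 100:.2f} per month")
```

Working in whole cents and whole percent keeps the arithmetic exact; a real tool would also want to look at a long enough window of samples to catch month-end or batch peaks.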
Starting point is 00:07:50 Yeah, those instances don't come cheap. But when the VM then gets terminated, any EBS volumes other than the default one get stranded, just get left out there. So AWS doesn't tell you they're out there. It just charges you for them. We can see those and eliminate those from the infrastructure.
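As a rough illustration of spotting the stranded volumes just described, the Python sketch below filters a list of volume records for ones that exist but are attached to nothing. The record shape (VolumeId, State, Size, Attachments) mirrors what the EC2 DescribeVolumes API returns, but the sample data and function names are made up, and this is an assumption about one way to do it rather than a description of OCI's own logic.

```python
def find_stranded_volumes(volumes):
    """Return volume records that are not attached to any instance.

    A volume in the "available" state with an empty Attachments list
    still exists (and is still billed) but no instance is using it.
    """
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def stranded_gib(volumes):
    """Total size, in GiB, of all stranded volumes."""
    return sum(v["Size"] for v in find_stranded_volumes(volumes))


# Hypothetical inventory; in practice this would come from the cloud API.
inventory = [
    {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100,
     "Attachments": [{"InstanceId": "i-111"}]},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 500, "Attachments": []},
    {"VolumeId": "vol-ccc", "State": "available", "Size": 250, "Attachments": []},
]

for vol in find_stranded_volumes(inventory):
    print(vol["VolumeId"], vol["Size"], "GiB")
print("total stranded:", stranded_gib(inventory), "GiB")
```

Whether a stranded volume should actually be deleted is a separate decision; as the next exchange notes, some people keep storage around on purpose.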
Starting point is 00:08:08 And I know in my AWS infrastructure, I rarely delete VMs. I just shut them down because I might need them later. And the storage stays forever. I understand that logic. That's the nice thing about storage. So how does OCI compare with Cloud Insights and what is Cloud Insights?
Starting point is 00:08:31 Yeah, so Cloud Insights addresses a slightly different market space than OnCommand Insight. OnCommand Insight was built for some of the largest organizations. It's still applicable to the smaller estates, but what we've seen is that if you have got a smaller environment, sometimes you just don't want the effort of having to maintain on-premises infrastructure to monitor your environment. A lot of organizations are moving to the cloud, cloud-first policies, so consuming a SaaS offering is something that they really want to move to. They can handle it, and it's a lot less pressure for them. So we've come up with Cloud Insights to fill that space. And where we're also slightly changing the,
Starting point is 00:09:21 differentiating between the two products is that Cloud Insights is built for monitoring the real modern infrastructures where it's not just a virtual machine out there and backend storage. It's a microservices architecture that's really highly changing. Mesos, Kubernetes, all that stuff. Exactly, exactly. And there's different challenges that come in those environments.
Starting point is 00:09:43 These microservices come into existence and disappear very quickly. Cloud instances come into existence and disappear very quickly. And the performance metrics that you need to capture and gather and all the interconnecting relationships, because it's so transient and because there's now… Yes, sampling every 15 minutes doesn't work when the average life of a process is 15 seconds, does it? Exactly, exactly.
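The sampling problem raised here can be put in numbers with a small Python sketch. It assumes a poller that samples at fixed instants (t = 0, interval, 2 x interval, ...) and a process that starts at a uniformly random time and lives for a fixed duration; both function names are hypothetical. Under those assumptions the chance of ever seeing the process is roughly lifetime divided by interval.

```python
def fraction_observed(lifetime_s, interval_s):
    """Probability that at least one poll lands inside the process lifetime."""
    return min(1.0, lifetime_s / interval_s)


def count_observed(lifetime_s, interval_s):
    """Brute-force check: try every whole-second start offset in one interval.

    Returns how many of those start offsets lead to the process being
    alive at a polling instant.
    """
    seen = 0
    for start in range(interval_s):
        # First polling instant at or after the process start.
        next_poll = -(-start // interval_s) * interval_s
        if start <= next_poll < start + lifetime_s:
            seen += 1
    return seen


# A 15-second process against a 15-minute (900-second) polling interval.
print(count_observed(15, 900), "of 900 start offsets are ever sampled")
print(f"{fraction_observed(15, 900):.1%} chance of being observed")
```

With these numbers only about one process in sixty is ever sampled, which is the gap the speakers are pointing at when they say interval polling does not fit short-lived containers.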
Starting point is 00:10:11 So it's a new architecture in the back end of Cloud Insights that's going to cope with that sort of situation. But that runs in the cloud where you're running these container environments as well as on-prem? I know it's a SaaS service, so I assume it runs in the cloud, but I don't know how you gather this information. Yeah, so it is a SaaS service. It's gathering from the on-premises and it gathers from the cloud environment as well.
Starting point is 00:10:39 It works in the same way that we have the remote acquisition unit for OnCommand Insight. We have an acquisition unit for Cloud Insights. That only makes sense. A little bit of lightweight code, you put it on a virtual machine. That then gives you the access and the control and security to push only the data that customers want to push. Yeah, because otherwise I'd have to open 4 million holes in my firewall to let you see everything you need to see.
Starting point is 00:11:03 Yeah, and that's just not practical. There's other monitoring tools out there that go down that route, and you find that you are just making Swiss cheese of your network and your security. So with the simple acquisition unit where it's a controlled, secure connection, and all the credentials are stored in your own, the customer's own, environment, it makes it far more palatable for the security team to allow this. And it's a SaaS offering. So what's the minimum commitment? I run a very small data center. Yeah.
Starting point is 00:11:38 Besides the CLARiiON and a few other systems. But we don't have a, I suppose there's a minimum commitment. A managed unit is our smallest number that we can go down to. A managed unit is a host or five terabytes of storage, either one of those. So that's how we charge. I think I got that much on my desktop here. Not quite, but close. So the VMware container services and stuff like that, I mean, VMware has got a couple of different solutions for containers. Do you support all of them?
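The managed-unit definition given a moment ago (one unit is a host or five terabytes of storage) can be turned into a quick back-of-the-envelope calculation. The Python sketch below assumes that hosts and storage are counted additively and that a partial five-terabyte block rounds up to a whole unit; neither rule is stated in the conversation, and the function name is hypothetical.

```python
import math


def managed_units(hosts, storage_tb, tb_per_unit=5):
    """Estimate managed units for an environment.

    Assumption (not confirmed by the speaker): each host is one unit,
    each started block of tb_per_unit terabytes is one unit, and the
    two counts are simply added together.
    """
    if hosts < 0 or storage_tb < 0:
        raise ValueError("hosts and storage_tb must be non-negative")
    return hosts + math.ceil(storage_tb / tb_per_unit)


# A very small data center: 3 hosts and 12 TB of storage.
print(managed_units(3, 12))
```

Under these assumptions the small environment above comes to six managed units; the real licensing rules may count differently.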
Starting point is 00:12:11 I guess it's Docker, it's PKS, Kubernetes, and there's the VM container services. Yeah, so we're building out the support matrix and kind of our data collectors at the moment. So again, in a similar way to how we do it for OnCommand Insight, we're looking at the market demand and looking at what customers are asking for, and we build accordingly.
Starting point is 00:12:35 Right, right, right. So at this point you have like the big guys, Kubernetes, Mesosphere, and those sorts of things. We have the Kubernetes.
Starting point is 00:12:44 Okay. Okay. So it's a piece of... Yeah, it's not out yet. We're in a preview fashion at the moment. So, okay. We're actually publicly releasing in October this year. Yeah, I can imagine interesting problems that we have to address and things like that. Today, you talk about an application and it's a collection of VMs that access a collection of data repositories. In a containerized world, how you know that this container that only appeared for 30 seconds belongs to that application is an interesting problem. Yeah. Yeah, so that's, well, as a product manager, the coolest thing about working on a SaaS offering is how fast you can actually develop.
Starting point is 00:14:11 OCI, OnCommand Insight, we did three to four releases per year, and that was our case. With Cloud Insights, it's just a continual development cycle with new content drops happening weekly, daily, sometimes even hourly. In the solution? In the solution, yeah. There's no upgrade anymore. Well, in the SaaS offering, Ray. Well, I understand, but it still is in my words, suicidal from a perspective.
Starting point is 00:14:47 If you can do it, it's great, but there's risk there. I guess. The world is changing. Ah. Yeah. The real key for me is that if a new feature arrives kind of seamlessly. I don't see any downtime with that feature appearing.
Starting point is 00:15:30 I've not done any upgrades. I run a Cloud Insights instance for some of the NetApp field folk to actually see and play around with. And I'm continually surprised and pleased to see new pieces being dropped into the product set. Sometimes it's just little simple things like the way you can operate maybe a widget, a visualization in the tool. Other things are fundamental changes, the way that user management works or the way that new data collectors can be added to the system. So obviously the moment is very, very fast because we're ramping up to the public release. Right, right. There's no reason that those new features can kind of slow down as we hit that.
Starting point is 00:16:21 It is a brave new world of new cool stuff every day. Is it a public preview? Is that the right statement and right term? I'm trying to understand the way this all works here. We call it preview. The way that people can actually register for a preview is visit cloud.netapp.com and there's all the cloud services that NetApp offers out there, from cloud volumes to SaaS backup. And Cloud Insights is one of those services.
Starting point is 00:16:54 At the moment, it's a registration form that people can fill in. And then someone from either the product team or from our engineering group will be in contact and we'll help people set their environment up and get them running on Cloud Insights. So Cloud Insights is more targeted to more modern application environments and OCI is more of a classic traditional enterprise application with Hypervisor.
Starting point is 00:17:20 Is that how you'd state the two different solutions there? Yeah, and I'd just kind of add to that as well that Cloud Insights is still applicable for all your on-premises environment and the cloud infrastructure. It's aimed at the person or the IT or the operations team that doesn't want to have to maintain their own on-premises infrastructure, that are looking for something that isn't as capable as OCI.
Starting point is 00:17:56 There's a humongous amount of features that have gone into OCI. It's 12 years in development. So all the capacity reporting, chargeback, all the kind of integrations connecting to ServiceNow, maybe those aren't as applicable. Obviously, there's a difference in price point. Cloud Insights is where managing,
Starting point is 00:18:18 maintaining, and it's not got all the features that OnCommand Insight has got. So it's a cheaper price point as well. And if you've actually got NetApp storage, and this is any NetApp storage, there's going to be an edition that's purely for those that is actually free.
Starting point is 00:18:37 Oh, excellent. So a Cloud Insights solution, which is free? For NetApp storage, yes. So if you've got an ONTAP device, even if it's 7-Mode, or it's an E-Series or Flash FAS, HCI solution, whatever it may be, they will have the ability to feed that data into Cloud Insights and get seven days of performance. I must admit that one of my big takeaways from the session at Storage Field Day was that I need to learn more about ServiceNow because you talked a lot about ServiceNow integration. It's been a long time since I've installed a system like that.
Starting point is 00:19:20 Well, James was talking a lot about how OCI integrates into ServiceNow, and I keep hearing about ServiceNow, so it's just hit my have-to-think-about-it level. And then we talked about chargeback, which I have always hated, just as a concept. As a concept, I've always... Well, I mean, there's certain places where you need to have something like chargeback, right? If you're a service provider, you need it. You are. In the past, if you're an IT department, it's led to more resentment than actual usefulness, in my experience. Advantage.
Starting point is 00:20:05 However, when we're now running IT departments all competing with public cloud providers, then we need to be able to do that. We need to be able to, even if the billing actually doesn't happen that way, we need to be able to say to the CMO, we provided this service and you could have gotten that as SaaS and it would have cost you this much more.
Starting point is 00:20:29 Right, right. As a comparison. The impact of public cloud, which of course is billed back, has got me rethinking the whole process. Yeah. It used to be, and given its bad name, it was shameback. It was a shameback report that people used to try and produce. But it's now more of a case of a proof point. These are my costs running on-premises.
Starting point is 00:20:54 This is how much it's costing in the cloud. There's an often substantial difference. Yes. Yeah. All right, gents. So, Howard, any last questions for James? No, I think we got it. James, is there anything you'd like to say to our listening audience before we sign off?
Starting point is 00:21:09 Thank you for listening. And please do visit cloud.netapp.com and register for a preview of Cloud Insights. It'd be great to see you on board. Is there a handle where our listeners can abuse you on Twitter or other social services? Yeah. I'll put that in the post, if you will. How's that? If you can send it to me, James. All right. Well, this has been great. Thank you very much, James, for being on our show today. Yeah, thank you for having me. And thanks to NetApp for sponsoring this podcast. Next month, we'll talk to another systems storage technology person. Any questions you want us to ask, please let us know. And if you enjoy our podcast,
Starting point is 00:21:45 tell your friends about it. Please review us on iTunes as this will also help get the word out. That's it for now. Bye, Howard. Bye, Ray. Bye, James. Bye, guys.
Starting point is 00:21:54 Until next time.
