PurePerformance - 042 101 Series: Serverless
Episode Date: August 14, 2017
Spoiler Alert: Serverless doesn’t mean that we got rid of servers. We just don’t have to think about them anymore, as we can focus on coding functions that get executed when triggered through certain events. Daniel Khan (@dkhan) tells us more about use cases of Serverless, or as he likes to call it, “Function as a Service” (FaaS). We also chat a lot about monitoring and the challenges of actually monitoring and debugging serverless code. It is still a young technology but constantly evolving.
Transcript
It's time for Pure Performance.
Get your stopwatches ready.
It's time for Pure Performance with Andy Grabner and Brian Wilson.
Welcome to another episode of Pure Performance.
This time, unfortunately, without Brian, who couldn't make it today,
but I wanted to take the opportunity to record another one of our 101 sessions.
And I was lucky enough to actually catch my colleague, Daniel Kahn, in our new Detroit office. And that's where we are sitting, actually, right now.
And hello, Daniel.
Hello, Andy.
Hey.
It's really, you know, I think a coincidence that we are both here at the same place.
And I would really have loved to have Brian on the call as well.
But a couple of weeks ago, we recorded a session with you on Node.js,
like a Node.js 101, which was very enlightening.
Yes, and I went on too long with that, right?
And now we are doing serverless in a new session.
Exactly.
So this brings me directly to the topic.
So this session is a 101 on serverless.
And you have been working on the topic serverless.
You are the technical lead within the Innovation Lab on serverless technology.
And so I guess, of the people that I know, you probably have the most experience.
And so I would like to actually get started with enlightening people about what serverless actually means,
what are the main use cases, and then later on we want to talk about the monitoring challenges.
But let's start with what is actually serverless.
Yeah, right. So serverless is currently still evolving.
So everything I say today might be wrong a few days later
because it's just a very kind of fast-moving field.
It's very hyped currently.
I think also it will kind of find its spot at some point.
Currently, everyone talks about serverless. Let's look a little bit at the
history of all those services. First, we had Platform as a Service, for instance,
which more or less provided you a full platform that ran your applications.
And a little later we had something like Backend as a Service, for instance,
where you could do authentication or session handling or state management.
So that was already kind of a fraction of an application
that was handled by some service you consume.
And now, and this is what serverless actually is,
we have something we would call function as a service.
Function as a Service means that we really kind of
make it even more fine-granular,
raise the granularity, or what do we call it?
Yeah, we make it even smaller.
Yeah, we make it smaller.
I understand what you mean.
So the trend is always to make everything one step smaller again.
So we went from a full application to Platform as a Service,
then Backend as a Service,
and now we have Function as a Service.
So this is really like microservices on steroids,
most probably.
You can call it like that.
And the thing with Function as a Service
is that you really don't have to care
about the platform or anything
this thing is running on.
All you care about is one single function
that is executed and that does some business logic work for you.
And this is, I guess, the finest granularity you can reach by now when building applications.
So if I reiterate what you just said,
I think I like that. You said Backend as a Service, right?
These were services that I would assume
were typically provided by some provider, like authentication,
where you basically made a REST call
to a service that could authenticate,
or state management and all that stuff.
And now we make it very easy for developers
to focus on a particular function,
keep it small, keep it simple, focus on one particular task.
Exactly.
Which also means that this is kind of falling into this paradigm of writing microservices
that are really small, dedicated to one purpose,
and then you will be able, because they're kind of independent from each other without state,
this is a perfect candidate for scaling up and scaling down,
the classical 12-factor app that we talked about.
Exactly.
So this 12-factor idea very much plays into this serverless topic as well, especially aspects like being stateless.
So I would say let's start at the beginning.
Let's talk about what a function technically is.
And I'm talking about Node.js mainly here because it's a very popular platform.
It's also a very good fit for serverless.
And I'm mostly talking also about Lambda here.
It's provided by AWS.
There is also Microsoft's Azure Functions, for instance.
So technically everything kind of works the same, but let's focus on that.
So in Lambda, you create a function that's nothing else than a Node module that exports one single function.
And this is run on a container.
This also means, naturally, there is no serverless as such.
So this container runs a server of some kind.
And this server reacts to incoming events.
We will cover that a little bit later.
So there are incoming events, and an incoming event triggers this function, and the function gets the data that is passed in and then does something with the data.
So there is a server. You just don't have to care about the server. You just don't see the server as such.
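To make that concrete, here is a minimal sketch of what such a module could look like in the Node.js Lambda runtime of the time; the file name and payload fields are just illustrative assumptions:

```js
// handler.js - a minimal AWS Lambda function in Node.js.
// The module exports one single function that Lambda invokes
// whenever a configured event source triggers it.
exports.handler = function (event, context, callback) {
  // "event" carries the data passed in by the trigger
  // (an HTTP request body, a queue message, etc.).
  console.log('Received event:', JSON.stringify(event));

  // Do some business logic with the payload...
  const result = { greeting: 'Hello, ' + (event.name || 'world') };

  // Signal completion; the first argument would be an Error on failure.
  callback(null, result);
};
```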
One important property of those Lambda functions is that they are ephemeral.
This means they are really short-lived.
Here the stateless kind of characteristic plays into that again.
Very short-lived.
The container really starts and stops as needed and can be disposed of at any time, so you don't know how long this container
will live. And platforms like AWS also really limit how long you can keep a container alive.
You could, for instance, have a timer task that forces this function to
live on and on, but that's not the purpose of it.
A typical use case would be, for instance, I'm working with the Dynatrace Davis team.
We are doing natural language processing, and one part of that is date parsing.
With Amazon Lex, which we use for our natural language processing,
you can have one intent, or one phrase, that comes in,
which you can then send off to AWS Lambda, and that does the processing there.
For instance, we use that for date processing in a way that if someone says,
"What happened on Monday?", we will translate this back to last Monday,
because when someone asks about Monday, in usual natural language processing
implementations, this always means last Monday.
So we get Monday, and then we do the date transformation for that.
And you can set this up directly in Amazon Lex. You can say, when a weekday comes in,
send this whole phrase off to AWS Lambda
and work with what you get back.
So that's a typical use case, and that's very small.
It's a very specific method, very specific function.
Very specific function.
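As a rough illustration, such a date-normalizing function might look like the sketch below. The event and response shapes here are simplified placeholders, not the real Lex request/response format:

```js
// normalizeWeekday.js - hypothetical sketch of a Lambda that maps a
// weekday name ("Monday") to the date of the most recent such weekday.
// The event/response shapes are simplified, not the real Lex format.
const WEEKDAYS = ['sunday', 'monday', 'tuesday', 'wednesday',
                  'thursday', 'friday', 'saturday'];

exports.handler = function (event, context, callback) {
  const target = WEEKDAYS.indexOf((event.weekday || '').toLowerCase());
  if (target < 0) {
    return callback(new Error('Unknown weekday: ' + event.weekday));
  }
  const now = new Date();
  // Days to go back to reach the last occurrence of that weekday
  // (use 7 instead of 0 when the named day is today).
  const delta = ((now.getDay() - target) + 7) % 7 || 7;
  const date = new Date(now.getTime() - delta * 24 * 60 * 60 * 1000);
  callback(null, { resolvedDate: date.toISOString().slice(0, 10) });
};
```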
Or, in times of all those single-page applications
that are just sent to the browser
and then run in the browser,
you maybe just need a function to put something in the basket.
Like when you have a shopping system, you have this add-to-basket action,
and this calls a Lambda function that gets an item ID or item number
and, most probably, a user ID, and puts that into some queue or database
that then gets processed asynchronously.
So that's the thing, yeah.
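A sketch of that add-to-basket function, under the assumption that it forwards to an SQS queue (the queue URL environment variable is a hypothetical name), might look like this:

```js
// addToBasket.js - sketch of the add-to-basket example: the function
// receives an item and user ID and drops them onto a queue for
// asynchronous processing by something else.
const AWS = require('aws-sdk'); // bundled with the Lambda Node.js runtime
const sqs = new AWS.SQS();

exports.handler = function (event, context, callback) {
  const message = {
    itemId: event.itemId,
    userId: event.userId,
    addedAt: new Date().toISOString()
  };
  sqs.sendMessage({
    QueueUrl: process.env.BASKET_QUEUE_URL, // hypothetical config value
    MessageBody: JSON.stringify(message)
  }, function (err) {
    callback(err, err ? null : { status: 'queued' });
  });
};
```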
Okay.
So I think I understand better now: keep it small, keep it simple. And a function typically has a short lifetime, right?
I think this is also the way we are charged.
Exactly, yeah.
So Amazon, and I'm sure Microsoft as well, use the same model.
Yeah, I would say it's a very economic way of using resources.
After all, it's resources.
Let's put it in an environmental context.
It makes a difference if I have a server running 24-7, doing nothing for maybe, I don't know, 30% of the time.
Or you have a server that only starts on demand.
And this is what functions actually then provide.
I know that this is already kind of provided by some Platform
as a Service systems as well, for sure.
With Lambda functions, for instance, this is already part of the DNA, because you cannot
rely on this thing running at all. But it will start up very quickly, so you won't have
much of a problem when it has to start fresh.
Yeah. So, again, to reiterate: as a developer, if I use serverless, or Function as a Service,
which I believe is a much better term, because there are actually, obviously,
servers there which are managed by somebody else.
As a developer, I can write a function.
The most prominent language is Node.js.
Java is available as well.
Java is available as well, yeah.
Anything else? Python?
I think currently I just saw Java and Node.js.
I would assume Microsoft is probably doing something .NET-specific, which makes sense.
Yeah, Microsoft also has something there, but I have to confess that I didn't look so much into this.
That's okay.
But I write a function, and as a developer I should write efficient code,
because I will be charged by, let's say, Amazon, when we talk about Lambdas,
by the number of times the function is invoked,
but also by how long the function runs.
And that's a key, that's a critical thing.
And it kind of reminds me of,
even though I've not lived through that period,
I'm not sure if you have, but through the mainframe period,
where mainframe usage was charged by the number of MIPS,
the millions of instructions per second.
So a very inefficient program
also means more MIPS, meaning you have to pay more money.
Exactly, yeah.
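To put rough numbers on that: as of 2017, AWS's published Lambda pricing was about $0.20 per million requests plus $0.00001667 per GB-second, billed in 100 ms increments, on top of a monthly free tier. So a function allocated 512 MB that runs 200 ms per invocation, invoked one million times, consumes 1,000,000 × 0.2 s × 0.5 GB = 100,000 GB-seconds, or roughly $1.67 in compute plus $0.20 in requests. Halve the runtime and the compute part of the bill halves too, which is exactly the MIPS effect.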
And this is also, I think we covered that or we can cover that now,
also a challenge for everything that is monitoring, for instance.
Because when we do monitoring on a Lambda function,
we really have to be careful that we don't let the Lambda function run too long
just because of monitoring,
for instance to send off our data to the server.
So we have to be also, from a monitoring perspective,
very quick in kind of releasing that function again.
Of course, because otherwise you hold on to it,
which means that in the end the developer, the company,
whoever runs the function, is being charged extra because of monitoring.
Exactly, because the monitoring solution does its housekeeping, et cetera.
So we are really working towards that topic currently
and really working on doing that right
because it's really something you have to find a way to flush out
really, really quickly before this Lambda function dies or freezes.
Otherwise, you can do one of two things.
You can keep this function alive artificially, which means, yeah, that costs.
Or you let it run and then it freezes,
but then you don't have the time to flush out the data in time to the server,
which means that you may lose transactions here.
So that's something that has to be tackled from a monitoring perspective.
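A hypothetical pattern for this, with the agent API invented purely for illustration and not a real vendor interface, is to flush buffered monitoring data with a hard time budget before completing the invocation:

```js
// Sketch: flush monitoring data before the container may freeze.
// The "agent" object and its API are invented for illustration only.
const agent = {
  flush: function (opts, done) {
    // Pretend to send buffered data, capped by a time budget.
    setTimeout(done, Math.min(opts.timeoutMs, 50));
  }
};

function doBusinessLogic(event, done) { done(null, { ok: true }); }

exports.handler = function (event, context, callback) {
  doBusinessLogic(event, function (err, result) {
    // Flush with a hard cap so monitoring never keeps the function
    // (and the bill) running long.
    agent.flush({ timeoutMs: 50 }, function () {
      callback(err, result); // after this, the container may freeze
    });
  });
};
```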
And there we also see how young this whole topic is because that's something that is
still in development also from the vendors.
That's something that has to be covered, but is currently not here yet.
Yeah. So that means the only official monitoring that exists right now is through
what the vendor provides, like, in Amazon's case, through CloudWatch.
Yeah, exactly, that's what's there. But we already managed to monitor a Lambda function.
We see a Node application, we see the running application. That's not really a problem at all.
The only thing we are currently looking at is how to be really quick
in finishing up our tasks, so as not to keep this function alive
for an extremely long time.
So not to get too technical,
but how do we get into these Lambda functions?
Because obviously you don't control the container,
you don't control the server,
you don't have access to that server.
How do we do this?
So the good thing is that our OneAgent approach already helps here.
So usually when you have Dynatrace, you install
one agent on a server and it instruments all your applications. And then you also get process
metrics and everything else around that through this one agent. Because behind the scenes,
there is a host agent that gets all those host metrics. When an application runs and it has an agent
deployed to it (in Node.js, for instance, you do this by requiring an npm module),
this running agent detects that there is no host
agent present and will then start to collect everything
that can be collected as the user
that owns this Node process.
So we will then collect host metrics as they come,
as they are available for us,
and there is plenty you can get,
because we have a JavaScript part and a native part, and with the native part, even as a rather unprivileged user, you can collect some metrics.
Yeah.
And there you get all those host metrics.
So with this one require, you have host metrics.
What you don't get automatically is the incoming transaction: the transaction has to become visible for us at the beginning of the Lambda invocation,
so that it gets stitched into our whole PurePath,
so that this is initiated.
And then you also have to end this at some point and say,
okay, this was the Lambda function.
So you have the timings and also this whole transactional tracing
through this Lambda function.
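In code, that wiring might look something like the sketch below. The module name and wrapper API are placeholders for illustration, not the actual Dynatrace npm package:

```js
// Hypothetical wiring of a monitoring agent into a Lambda handler.
// "some-monitoring-agent" and agent.wrap() are invented placeholders.
const agent = require('some-monitoring-agent');

exports.handler = agent.wrap(function (event, context, callback) {
  // The wrapper would mark the start of the transaction here, trace
  // the business logic, and close the transaction when callback fires.
  callback(null, { status: 'ok' });
});
```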
Cool.
So I remember two weeks ago I was in Boston at a serverless meetup,
and I think I told you about this.
It was more like a hackathon where we deployed what's called a zombie chat application.
It was one of their workshop applications, and it was heavily using Lambdas.
And what you tell me right now is interesting, because I didn't do that part.
So I deployed this app that was using Lambdas.
I hooked up Dynatrace to the regular AWS integration, so pulling metrics in through CloudWatch.
But additionally, the only thing I would have needed to do in order to get more visibility into these Node.js Lambdas is to simply include our agent module,
and then, when AWS internally launches the Node.js container
and loads my code, it will automatically also load our OneAgent.
Exactly.
That's right.
Okay.
So the good thing is that those containers are not started fresh
every time you invoke the Lambda function.
A Node application always runs in kind of two stages.
The first stage is this whole parsing, when everything,
including the require lines, is resolved, and the code is loaded and started.
And the other part is reacting to events.
This means that when you have a Lambda function that is continuously called
and the container is not disposed of during this time but just frozen,
you don't have this agent startup latency every time you invoke the Lambda function,
but only when the container is really newly created.
But then you also have other latencies there as well, and this does not happen all too frequently.
So I tried this: I created a Lambda function
and accessed it over an hour, again and again.
And there, the whole agent startup, or the overhead we are adding,
is really so minimal that it's hardly measurable.
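This two-stage lifecycle is also why a common Lambda pattern is to do expensive initialization at module load time, outside the handler, so it runs once per container rather than on every invocation. A minimal sketch:

```js
// Stage 1: module load. This runs once per container (cold start)
// and is reused across invocations while the container stays warm.
const AWS = require('aws-sdk');               // resolved once per container
const db = new AWS.DynamoDB.DocumentClient(); // reused across invocations

let coldStart = true;

// Stage 2: reacting to events. This runs on every invocation.
exports.handler = function (event, context, callback) {
  console.log(coldStart ? 'cold start' : 'warm container');
  coldStart = false;
  callback(null, { ok: true });
};
```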
Cool.
Wow, I just learned.
I mean, I didn't want to go too technical, too deep,
but this is perfect.
So I wasn't aware that we're already at the stage where we can actually do tracing of the Node.js Lambdas.
Yeah, we have to.
We see it so much at customers that it's really a topic that we prioritize and that we are covering.
Can you give me a couple more use cases, for which types of applications people use serverless or Lambdas?
Because some people that I talk to still have a hard time understanding
what type of apps they can build with functions,
because we're still in this old development model.
You don't build apps as such with functions.
Building whole applications with functions, I think, is already too big a scale.
After all, it's about doing individual tasks with functions: adding something to a basket,
writing something into the database, pulling some data out of the database, reacting to events.
And that's, I think, one thing we have to cover: a Lambda function can be triggered by an HTTP call,
but it can also be triggered by some event on a queue,
or something coming onto a queue,
and it can be triggered by a database event, for instance,
acting a little bit like a stored procedure.
So everything where you want to do one thing
in reaction to some event is a good fit for a Lambda function.
So, yeah, queuing stuff, for instance, of course.
Transformations, or, let's say, IoT, for instance.
IoT devices send arbitrary data to somewhere,
and you need something to collect the data and do something with it,
but in a fire-and-forget way. Lambda functions are a perfect fit, because you can let the
device send something to a Lambda function, and this Lambda function transforms it
and puts it into a queue, and this queue then gets processed by something
else.
And also classical batch processing, right?
It's perfect.
Batch processing, yeah, something like that.
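As an example of the database-trigger case, a DynamoDB Streams handler receives batches of change records, a bit like a stored procedure; the item contents are whatever your table stream delivers:

```js
// Sketch of a database-triggered Lambda: DynamoDB Streams invokes the
// function with a batch of change records from the table.
exports.handler = function (event, context, callback) {
  event.Records.forEach(function (record) {
    if (record.eventName === 'INSERT') {
      // React to the newly written item (in DynamoDB's attribute format).
      console.log('New item:', JSON.stringify(record.dynamodb.NewImage));
    }
  });
  callback(null, 'processed ' + event.Records.length + ' records');
};
```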
But the problem here, as a developer,
I have to add now,
is of course that deployment and testing
of such things are difficult.
So deployment you can most probably automate in a way,
but testing is a thing
because there is currently no way to have a Lambda environment locally
in a way to kind of run local tests.
So you always have to have this endpoint available when you do testing,
and that's always a problem when you have a microservices platform
that really depends on many microservices and REST API calls it has to make.
Creating a unit test, for instance,
or integration tests,
is really difficult, because you then
have to either
have this endpoint available
or mock this whole
endpoint, and that also means
a lot of redundant work.
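One partial workaround people use: since the Lambda is just an exported function, you can require it and invoke it directly with a hand-built event. That exercises your logic, though not the Lambda environment itself. The file name and payload here echo the earlier hypothetical handler:

```js
// test.js - invoke the exported handler directly with a fake event.
const handler = require('./handler').handler;

handler(
  { name: 'test-user' },   // hand-built event payload
  {},                      // stubbed context object
  function (err, result) { // callback doubles as the assertion point
    if (err) throw err;
    console.log('Result:', JSON.stringify(result));
  }
);
```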
But maybe I don't understand
this correctly then.
I assume Amazon gives me a command line interface to deploy a Lambda, right?
So I can automate the deployment.
You can automate it, but it's different.
So let's go back to the monolith.
As bad as they are, you deploy one thing and it runs.
With Lambda functions, you really have to,
if you depend on 25 Lambda functions,
you have to keep track of all of them.
Of course, yeah.
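For reference, the AWS CLI does cover this; deployment of a single function can be scripted roughly like so, with the function name and zip file as placeholders:

```
# Package the code and push it to an existing Lambda function.
zip function.zip handler.js
aws lambda update-function-code \
    --function-name my-function \
    --zip-file fileb://function.zip
```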
But isn't this where the API Gateway in Amazon also comes in,
assuming that your Lambda functions
can be accessed through an HTTP endpoint, for instance, right?
Then you can mock certain services away,
and then you could write something like a JMeter test that just tests the endpoints of this function.
Yeah, of course, but that's not really easy.
No, I know. I don't say it's easy.
Yeah, it's one more obstacle you have while developing.
And coming back to what you said earlier, it's a very young technology still,
and there's a lot of stuff that is still missing or that it's not convenient yet.
Yeah, there is a lot of tooling missing.
Like debugging, right?
Debugging is really a pain currently, because if your Node function fails,
you have some log messages,
but it's really hard to figure out what's really going on.
But isn't that where monitoring tools come in?
Yeah, sure, because we catch those exceptions. But if you don't have that, you're
really lost. When it works on my machine but doesn't work on Lambda, it can
really be a problem to figure out what's going on there.
Also, maybe a little time reference, just so people listening know when this was recorded.
Right now it's May 2017.
It's a young technology.
A lot of things are happening.
So maybe in a couple of months from now, in a year from now, things will change.
That's also an interesting field to get started in with something,
or to build tooling around.
So I would say it's exciting times for that.
Yeah. And I assume we're working closely with the vendors, who help us build better software,
like defining interfaces and new things. Is there something there?
Of course. So we have a very close relationship with AWS, and even lately we were on a call with a product manager of AWS Lambda.
So we are in touch, and we know what's planned and see things coming.
Perfect.
Good.
So if I can sum this up, serverless doesn't mean no servers.
It's just basically better explained as Function as a Service.
Yes.
Building something that is small, that has one sole purpose,
perfect examples would be simple things like batch processing,
putting something to a database,
something that can be used from a single page app,
or IoT was another great example where this is used.
And functions should run fast,
because you also get charged by that.
Is there anything else that I missed?
You shouldn't worry about the container.
Obviously, underneath the hood,
there's a server, there's a container that runs for a while.
It's very close to this 12-factor idea.
It's stateless.
Even smaller than microservices.
And, as we said, there's the tooling for everything.
Tooling and orchestration are a challenge.
It's a challenge, yeah. Cool.
Do you have any other final thoughts, anything that we missed?
No, just as I said, it's an exciting topic
and I'm really eager to find out what's coming next here
and how this whole thing develops over the next few months.
And just as a reminder for the audience,
if they want to find more out about it,
especially following you,
what's coming out of the work that you're doing,
what's the best way to get in touch with you?
Most probably via Twitter.
It always works.
My Twitter handle is dkhan, for Daniel Kahn.
And follow me on Twitter and contact me anytime.
Cool, and I guess people can also try out the stuff that we built for monitoring.
So for monitoring Lambda, is this generally available yet?
Almost, let's say almost.
We are working on that.
So it's currently in lab stage, but I expect it also to be GA in a few months.
And by the time this airs and people listen to it,
go to dynatrace.com and sign up for the free trial.
That gets you a SaaS instance of Dynatrace,
and then just follow stuff that you find, obviously, on our online portal
about how to monitor Lambda functions.
Exactly.
Cool.
Thank you so much. Let's go back to the
cool office here, which is actually
really cool. It's like post-industrial
in Detroit.
It's awesome. Detroit in transformation.
Yeah, absolutely. I like Detroit, by
the way. Cool city.
Come by if you've never been here. Alright.
Bye-bye. Bye.