PurePerformance - 069 Four Serverless Patterns everyone should know with Justin Donohoo
Episode Date: August 27, 2018. Serverless has been a hot topic for quite a while, but we are still in the early stages when it comes to best practices and tooling. Justin Donohoo, co-founder of observian.com, gives us the pros and cons of 4 architectural patterns that he calls the "Microservice / Nano Pattern", "Service Pattern", "Monolithic Pattern", and "GraphQL Pattern". Besides these patterns, we also learn about common cost traps and how to "architect around them". For more information on serverless, Justin also shared his recent Serverless Meetup presentation. And stay tuned: there will be more from Justin around secrets, containers, and anything else there is to know about cloud native applications.
* https://twitter.com/justindonohoo
* https://www.observian.com/
* https://docs.google.com/presentation/d/1ffx9wWioFahxQJbhpPhJ3FC-csNXrdEGdufrj5LwV6w/edit#slide=id.p
* https://observian.com/tools/secret-awsome/
* https://www.puresec.io/blog
Transcript
It's time for Pure Performance.
Get your stopwatches ready.
It's time for Pure Performance with Andy Grabner and Brian Wilson.
Hello everybody and welcome to another episode of Pure Performance.
My name is Brian Wilson and as always, I always say as always, my co-host Andy Grabner is here.
Hello Andy.
Hello, this is Andy Grabner.
Hi Andy Grabner.
I don't know why we're doing goofy voices.
It's goofy.
Today is actually International Goofy Voice Day.
I just made that up.
I just declared it.
And since my alias is Emperor, it is true now.
You're back over in Austria now for a little while, right, Andy?
A couple of days.
Well, officially I live here again.
Well, I mean, yeah, you live there, but you're never there.
Yeah, I'll try to be here a little more often, because obviously Gabi and I made the choice, the decision, to move to Austria, so we should stay here a little bit more. My parents will enjoy it as well, if they actually see me more often. But I'm still, you know, flying around, educating the world and trying to help folks with the stuff that we do, right?
Talking about performance.
And speaking of your mom, everybody remember to call your mom.
I don't know why I said that, but it's something people should always do.
Right.
Cause I'm sure they're anyway.
Yes.
Speaking, running around, speaking of performance.
Right.
And, uh, we've got some... are we going to talk about performance today, Andy?
I would believe so, right?
Because basically everything we really do, not only us, but also our guest today,
I assume that Justin is his name, and I will give him in a second a chance to introduce himself.
I'm pretty sure everything we do in some places somehow should have performance in mind,
whatever that means.
So, Justin, are you there with us?
And would you mind introducing yourself?
Yeah, I'm Justin Donohoo.
I'm one of the co-founders at a company called Observian.
We're an AWS partner and a software delivery company
where we help people focus on just delivering software better.
Been doing serverless for quite a bit.
I've run the serverless framework meetup here in Salt Lake City.
So thanks for having me on today.
Cool. Actually, on your website, if you go to observian.com and then on the About page,
I think you mentioned it, but you actually do cloud better.
You not only help people to deploy better software, but you do cloud better.
And you have some cool things on there.
And AWS, obviously, one of the big cloud vendors and their Lambda service, is this stuff that you do exclusively with AWS?
Or do you help people in general, whatever public or enterprise cloud environment they choose?
So when it comes to serverless right now, from a tooling aspect, AWS has the most tooling, if you will,
and some other cool things that you can do to get around some limitations that I can talk about later.
We do dabble in the Azure and Google space.
But the bigger shift that I'm seeing is there's a lot of customers that are still trying to do on-prem VMs,
and we're helping them transition into more of a container architecture.
So most of our serverless projects end up being cloud native. Anybody running in the cloud,
we're not running any of the open function platforms or anything on-prem with customers.
If they want to go a more serverless route, we kind of nudge them into the cloud like,
hey, here's why you should go here and here's the benefits.
So do you see this as interesting? I mean, obviously, serverless is a hot topic, and we've been talking about it for a while. And if anybody wants to kind of get a little, let's call it maybe a 101, then, Brian, we have an episode with Daniel. We did a 101. What was that episode again?
Yeah, let me take a look in this chat here.
That was episode number 42.
Oh, 42. What an apropos number. Episode number 42, Serverless 101. It's funny, every time you say 101, I think you say "one-on-one" and it reminds me of the Hall & Oates song. But anyone can go listen to that. But yes, episode 42, Serverless 101. There's also several other episodes where we covered some other serverless topics.
But I think we're going to be looking at some new things today.
Yeah, so Justin, what I've heard, and I think you just mentioned it as well, right? People are excited about this technology, and then you realize that what they are actually doing right now is a couple of steps away from where maybe this new technology can bring them. So how do you approach projects, and what are the lessons learned in getting people to actually leverage a new technology, whether it's going to be serverless in the end, or, as you call it, cloud native using microservices, or whatever it is? What are the lessons learned, and how can people get started? What are the things that you can tell the audience?
Yeah. So one of the first things I recommend people do is, like, let's save yourself a ton of time and heartache and just go straight to serverless.com and start using that framework. The framework takes a lot of the complexity away and makes it easy to organize projects and deployments.
When you're talking serverless, we're mixing a lot of technologies together.
Like we're doing CloudFront.
We're doing static hosted S3.
We're doing API gateway talking to Lambda functions.
We might be doing Cognito for authentication.
Like there's just a lot of AWS services. Managing that in your own repo and doing it by hand is atrocious.
So I just kind of route people to the framework.
It's based on CloudFormation.
If people aren't really that familiar with AWS, think of it as their infrastructure as code.
And what you have is a nice, elegant YAML file where you can define your functions and your API endpoints.
And when you just do serverless deploy, it'll create a CloudFormation stack and send it out for you.
Versus trying to be like, okay, I have five lambdas.
I need five API gateways.
Do I do five separate ones or is it all one?
All that management is a nightmare.
It goes away.
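To make that concrete, here's a minimal sketch of what such a serverless.yml might look like for AWS; the service name, handler paths, routes, and runtime are illustrative assumptions, not taken from the episode:

```yaml
# Hypothetical serverless.yml: one service declaring two functions and
# their API Gateway routes. Running `serverless deploy` compiles this
# into a CloudFormation stack and ships it.
service: user-service

provider:
  name: aws
  runtime: nodejs8.10   # era-appropriate runtime; pick whatever you target

functions:
  createUser:
    handler: src/users.create
    events:
      - http:
          path: users
          method: post
  getUser:
    handler: src/users.get
    events:
      - http:
          path: users/{id}
          method: get
```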
Also, like repo structure, I've seen different projects where some people try to do, hey, we're going to have one function per repo.
It seems good on paper, but it becomes a management nightmare.
You have what I call repo sprawl. There's a couple different patterns we can talk about, but a good repo per service or a repo per logical grouping,
like a user service or a customer, not customer,
like a product service where you're doing logical grouping.
So you might have N plus one lambdas,
but they're all related to this one thing
and they deploy as a unit.
It's a little bit better.
So can I ask a quick question? So, the serverless framework: I actually just recorded a YouTube tutorial last week with Daniel Khan, one of our guys who is kind of our technical expert and advocate for Node.js and serverless. And he also used the serverless framework. So I was actually very excited to see, and completely agree with, what you said, right? You write your YAML file, and then serverless automatically creates all the boilerplating for you, or the CloudFormation template. The only question that I had, and I don't think I got Daniel to ask: if I use serverless and I create my first version, I create my CloudFormation template, and then if I make changes, is serverless, I assume, smart enough to just update my CloudFormation template, so it just updates the stack? Or do I have to completely recreate that stack every time I deploy a change? Or how does this work?
So when you are doing your serverless deployments, you have an S3 bucket which basically keeps the metadata about that deployed unit. When I do serverless deploy again, all it's doing is a CloudFormation change set. So yeah, it'll just update the stack and redeploy the changes.
That's perfect. Cool. And yeah, so I think serverless.com, is that the place to find out more?
Yeah, I was just checking that out. They also list, I'm sure it's probably still not as heavily used, but they do list the full suite of Google Cloud Platform, Azure, IBM Cloud, and Kubernetes, along with AWS, obviously. So it looks like they're expanding as well in some aspects.
They definitely offer touch points with all of your major cloud providers. It's pretty powerful. You have one framework where, in your YAML, you just define the provider, and it's literally switching between cloud providers. So you can kind of play with it.
And so to some of the architectural patterns: because you said, you know, on paper it may look interesting to start with individual projects, which are individual serverless functions, but then, I think as you said, it makes much more sense to kind of group them together. What additional best practices do you have around here, especially when it comes to, quote unquote, refactoring? What if you realize at some point that you need to move things around between different projects? So how do you structure a project then? Is it just from an organizational perspective that you structure certain functions together, or is there another framework on top that kind of aggregates functions that logically belong together into a single entity? How would that work?
So one of the things that I try to do is I specifically only do my serverless in .NET Core and Java. I'm trying to show people that you can do serverless in the enterprise.
I also do it in Node, but the main reason I focus on the more compiled runtimes
is because there's not so many functions and, let's say, examples out there.
But if I want a Node or a Python example, you can find thousands of them on your first Google hit. So for the different design patterns, one of the things that I found, that keeps it just logically grouped in my head, is I architect my projects almost like it's an MVC project. So your standard model-view-controller, where the handler, I treat it like the controller. So you make sure that the controller slash handler is dumb. It's almost a pass-through. All it should be doing is initializing business logic classes and functions that are calling into your model to interact with databases and other third-party REST APIs.
And so that's how I structure the project.
So you have, let's say, a user class. It might have a create user method, it might have an update user method, and then you can have these different handlers just pass their logic through to these methods.
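As a rough illustration of that "dumb handler" idea, here's a minimal sketch in TypeScript for AWS Lambda; the UserService class and the file layout are hypothetical names invented for the example:

```ts
// Sketch: the handler (the "controller") only parses input and delegates.
// All business logic lives in UserService (the "model" side of MVC).
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { UserService } from './services/userService'; // hypothetical module

export const createUser = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const body = JSON.parse(event.body ?? '{}');
  const user = await new UserService().createUser(body); // real work happens here
  return { statusCode: 201, body: JSON.stringify(user) };
};
```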
As far as actual design patterns, this is where it's going to be kind of a long-winded answer. So chime in.
I've got, what I've categorized is four different design patterns that people are doing. It doesn't
matter what language you're in. This is purely just serverless topics. So the first one is,
I'd say your traditional microservice, but I kind of coined it a nano service. And what it is is a single lambda per function with a single API endpoint per function.
So the pros to this approach is you've got separation of concerns down to like the functional unit.
It's easy to debug because it's an individual function.
It's pretty easy to test because you just have one input, one output.
And it's really easy to deploy changes to the system to just a single
function. So that's all the pros. Now, if you want to talk about the cons with this pattern,
one of the dirty secrets of serverless is cold starts. When you're dealing with a cold start,
it's pretty noticeable. Like I've seen cold starts take several seconds, but once the container
is warm, if you will, it takes a couple milliseconds.
So when you're running an API that's in this nano architecture, you have a lot of functions as you grow.
So you can get function sprawl.
You have risk of hitting cloud formation limits because there's a file size as well as resource count.
Your full environment from scratch, if you want to deploy the whole thing,
takes longer because you have to deploy each Lambda one at a time.
And this architecture pattern has the highest probability of cold starts
because it has the most endpoints and the most functions.
Does that kind of make sense there?
It does make sense.
And how did you call it again?
So that's a microservice nano pattern.
That's where you do a pure one-to-one.
So I have one RESTful endpoint that maps to one Lambda function.
Yeah. And then the limits that you mentioned on CloudFormation, is this because you have so many CloudFormation scripts, since for every single function you have to create or update a CloudFormation template, and the overhead alone of orchestrating all those templates is where the limit is? Or is there actually a hard limit on the number of CloudFormation stacks you can create and manage within an availability zone?
Well, it's also the fact that if you're using a serverless framework, there's a way around it, right?
So if I'm using a serverless framework where I've mapped out my whole API and it's one-to-one, one function each, the problem is it's actually a single CloudFormation file and you can hit file size limits and resource count limits.
A workaround, I guess, would be one serverless project per function.
And now that you have like 50 projects floating around,
it becomes a management nightmare.
Yeah.
Hey, let me ask a question on it, because you're
talking about the one-to-one, right?
Now, if we take a step back to containers, right?
I mean, not containers, sorry, back to microservices,
one thing we know in microservice design
is that if we have one-to-one service calls,
we really don't want to split those
into separate microservices
because from a network point of view, from a resources point of view,
if every single call hitting service one always goes to service two,
it makes a lot more sense to keep those together.
How do you determine, or what are the pros and cons?
You're saying you're doing a one-to-one, one endpoint.
What are the pros and cons of that, of going all out? It sounds like you're taking an opposite approach in the serverless world, where you are splitting them out. I'm just trying to wrap my head around, you know, why that works in serverless but not in microservices. Obviously, the shorter-running your services, the less you get charged, but you're also then spinning them both up. So does it really balance out? Do you understand where I'm going with that?
Or Andy probably at least does.
So the main benefit for getting this granular
is as you're kind of doing a sandbox architecture
where you're building serverless for the first time,
there's some different learning curves
that you can't really describe without going through it.
And this allows you to optimize each function to the optimal settings for that function for the
best price and performance. So if I lump, say, five lambdas together, or not lambdas, I mean I have one function that handles five inputs, I have to optimize to the slowest or the highest-memory-usage function. So something that's really lightweight might just be carrying this unnecessary overhead, because another function that's inside it possibly takes a lot more RAM.
So we have to optimize to our slowest
or worst offender, if you will.
So it's kind of like the noisy neighbor idea comes into serverless, whereas in microservices, it wouldn't necessarily.
But when you move back over to serverless,
you can actually gain from having the quick ones go on their own.
That's interesting. Yeah, I like that.
Because you can control the layer,
the different aspects of concurrency,
as well as memory, timeout execution.
There's a lot of little levers that you can pull on an individual Lambda function that when I'm in a more traditional microservice, I have to pull a lever and it usually impacts the whole service.
Cool.
So the microservice nano pattern, what's the next pattern?
So the next one is more of, it's called a service pattern.
This is what I actually call a microservice.
These are patterns that are referenced in documentation on serverless.com.
I'm just going through kind of one of the talks I did at one of our meetups.
So the service pattern, it's really more of a microservice.
What it is, is a single Lambda that can handle
multiple scoped functions. So you have multiple API endpoints that point to a single function.
So the pros are basically separation of concerns down to the service level. It's easy to deploy
to a single service and it reduces the number of cold start possibilities because we have more endpoints talking to a function.
The cons is that now you have to write your own custom router.
So your handler, the first thing it's got to do is look at the request and say,
which function is this person actually trying to call?
Which leads to increased debugging complexity
because you have one input that can have multiple outputs.
It's a little bit harder to test because, again, I have one function that has four or five different results. And the
function's now grown in size. So there's more physical code. So when you make changes, there's
a higher risk because, you know, more code equals more risk when we deploy. But this is more of your
traditional microservice. So I might have a Lambda that's like "user", and the endpoints that are pointed at user are create user, update user, get user, or list users. Those handlers are all inside one Lambda function. Those endpoints all point to just the user function, and based off, you know, is it a POST, a GET, a PUT, I have a router that's routing to the right function, if that makes sense.
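A minimal sketch of what such a hand-rolled router might look like, assuming API Gateway's Lambda proxy integration; the routes and the imported user functions are illustrative, not from the episode:

```ts
// Sketch: the service pattern. Several API Gateway routes point at this
// one Lambda; the handler dispatches on HTTP method plus resource path.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { createUser, updateUser, getUser, listUsers } from './users'; // hypothetical

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const route = `${event.httpMethod} ${event.resource}`;
  switch (route) {
    case 'POST /users':       return createUser(event);
    case 'PUT /users/{id}':   return updateUser(event);
    case 'GET /users/{id}':   return getUser(event);
    case 'GET /users':        return listUsers(event);
    default:
      return { statusCode: 404, body: JSON.stringify({ error: `no route: ${route}` }) };
  }
};
```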
Yeah, makes sense.
And so technically, though, I mean, I fully understand the, you know, you would probably use an API gateway, as you said,
API gateway configuration to define all your different endpoints and then route it to, let's say, the same function handler.
But technically, and I'm now just talking about Node.js because that's the most recent Lambda function that I've implemented, can you tell me if I would be able to export, in a single Lambda project, multiple handlers? Or does it always have to be just one handler? Does it always have to go through the same handler, and then the handler has to decide how to route it based on the request parameters? Or could I expose and export all of my different handlers, the create, the update, the delete, as you said earlier, as individual exported handlers, and then use the API gateway and say, here's this handler, and here's another handler, even though technically they're in the same serverless function? But I guess it's not possible, right?
Well, so what you just described is the nano pattern from before, right, where you have one endpoint that talks to one handler. But to answer your question, it is possible. We run observian.com 100% on serverless, and it's a Node.js backend. And we've got like a hybrid, where some functions are declared in our serverless project as a one-to-one. And like our blog that we're just rolling out is more of a service pattern. We have a blog that has all of the endpoints go to this one function.
So the lambdas themselves have their entry point,
which is called the handler.
And you can make it whatever you want.
But this design pattern is specifically calling for we route everything to one handler because one handler in this case means one Lambda function.
So the first call, it has a cold start.
It's up and running because we listed a user.
Now we want to add the user. Well, as long as we do it within 15 minutes, because that's about how long you've got a container inside Lambda going for, it's going to hit that same pre-warmed container. It's going to be much faster. So if I'm adding and editing users, that first initial hit was a little slow, but each one after that is much faster.
And how many parallel requests, how does this work from a scalability perspective? Let's assume I have a hundred logical functions
or endpoints that are all calling into the same handler.
Does this mean, how would a runtime like AWS
be able to correctly scale this?
It's probably pretty hard.
You can just scale it based on-
There's literally a number for concurrency. Just say 10, right? So now it'll keep 10 of your Lambda containers warm, if you will. So if 10 people hit it at the exact same time, they're going to hit their own instance of a container. The 11th guy might queue up for a second; when another one's finished, he grabs the next one on the stack. Or if you have it set to, say, 15 and you have 10 people using it, that 11th person gets a cold start, and then they're going.
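In Serverless Framework terms, that lever is exposed per function; a minimal sketch, assuming the framework's reservedConcurrency setting (the function name, handler, and route are illustrative):

```yaml
# Hypothetical snippet: capping one function's concurrency.
functions:
  getUser:
    handler: src/users.get
    reservedConcurrency: 10   # at most 10 containers of this function at once
    events:
      - http:
          path: users/{id}
          method: get
```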
It's pretty crazy. You have to be careful when you start doing serverless because of how easy it is to scale. You can DDoS people if you're not careful. If you've got a high concurrency rating talking to a third-party API that doesn't throttle, you can literally take them down.
Yeah. I was thinking too, besides taking them down, that's a great way to make them lose a lot of money as well, right? If you're constantly making them spin up more and more long-running lambdas, that's going to cost a lot as well. So it also becomes a security concern besides the DDoS, the financial impact as well.
Oh, yeah.
It gets wild. So the mystery of, oh, where did the servers go?
It means it's super secure, right?
It's like, no, you still have to deal with SQL injections.
You still have to authenticate your APIs.
You still have to do smart throttling.
Or, yeah, you can spin up thousands of dollars, but nobody's watching it.
So you still have to use your brain, basically.
That's what you're saying.
If you architect something.
That's what I was saying in a nice way, I guess.
Yeah, exactly.
Like, don't be stupid, people.
One of the other things that I've ran into is,
hey, all of these permissions in AWS are complicated.
I'm just going to give my function admin access.
What could go wrong?
It's like, well, a lot.
So there's some interesting things going on in serverless.
PureSec has got some cool stuff in the space where they've come up with a runtime that's doing inspection.
It's a cool landscape.
I'm kind of watching it as it emerges.
One of the former Akamai execs, he's the CTO over there,
he does a web series every once in a while.
It's pretty cool.
I would definitely check out PureSec if you're interested in security around serverless.
It's not my forte.
How do you spell that? PureSec?
Yeah, like Pure, P-U-R-E, and then Sec, S-E-C.
Look at that, Brian.
This is like PurePath and PureStack. Now we have PureSec.
Right.
And pure performance.
And pure performance.
Look at that.
Hey, I got one more question on this. So does this mean that the container, the physical container that is actually a virtual container, whatever it is,
that is actually spawned up by, let's say, AWS, can only handle one request at a time
and then spins up another container in case I have concurrent requests?
Or depending on the runtime, Node.js, Java, or whatever it is,
there are multiple worker threads that can handle requests?
If you've dealt with like an IIS web server
or like an Nginx, you know, like when you're doing Node,
you have your different child processes
that are basically, you know,
instances of your service that you pass through
and that's how you can have multiple pools, right?
Think of a virtual container.
That's how I describe it to people.
It's like a managed virtual container.
It's really what it is.
You're not doing anything with it.
There's servers behind the scenes, whatever.
That's the nitty gritty.
It's the same thing.
If you have one of those virtual containers running, yeah, it can only handle one request, but it's not that big of a deal.
You have your concurrency set to 10.
It's like you can have 10 concurrent users.
You can crank it up to 100.
You have 100 concurrent users.
The beauty of it is you're only paying for what you use.
So it's like, if you have tens of users coming to your website and your concurrency is 10, you have what, one or two people at any given time. If you start getting into the thousands of users, okay, you might have a couple hundred concurrent connections at once. You can just keep growing. I have one architecture that grows with me. The cost grows with it, but if I don't have traffic, I don't pay that much. So as I get more traffic, I'm making more money. Yeah, it costs me a little more, but yeah, I'm making money.
It kind of delivers on the promise that cloud pretended to be for a long time,
which was pay for what you use,
and it's really always been pay for what you provision.
So this finally brings pay for what you use to a reality
for virtualized computing in the cloud.
I love the way you just said that; that's brilliant. Pay for what you use versus pay for what you provision. I just wanted to repeat that, because that's really, really amazing.
Yeah. And isn't that the same as in the mainframe times?
Yeah.
Going back to the mainframe, right? Yeah, I mean, in the end, I'm not familiar, I didn't grow up in that time, but based on what I know, you obviously get the hardware and then you pay by MIPS, right? Whatever you use.
Yeah, it seems like a similar model.
Cool. Hey, so we had the microservice nano pattern, we had the service pattern.
Yeah, I got two more.
Come on, shoot over.
Okay, so the one that is called monolithic,
it probably turns people off just because of the name monolithic, right?
So what is it?
It's a single lambda that basically handles everything.
And that might sound crazy, but it's really not if you think about it. So it's all of your API endpoints point to one function.
And if you think about how your traditional REST MVC frameworks work, this is literally what they're doing.
You have one endpoint that's all funneling to one service, and you have routing that passes off to the different controllers, which does the logic, right?
So this is the fastest way to deploy an entire system because it's one Lambda.
So if I go from zero to hero or I switch regions, I'm deploying one giant Lambda that my website points at.
It also has the lowest probability of a cold start because all of my API functions are going through the same Lambda.
But where it starts to have cons, while it sounds great, is your custom router is getting pretty nasty, because not only do you have to look at, like, your HTTP methods to see is this a PUT, POST, GET, DELETE, but is the URL user, is it product, is it login? You definitely risk hitting all the CloudFormation limits that we talked about before. Debugging gets much more complex because of all the different inputs coming into one system. Testing becomes a lot of mocking that you have to do to make sure that you can handle every use case. And performance monitoring is all over the place, and this is probably the main reason I don't ever do this one, but it's an interesting concept that people do. I can't tell what's, you know,
good performance in this pattern
because a login method that takes a few milliseconds
versus something that's, you know,
running a database query,
it's all in the same function.
So you're going to have peaks and valleys
and you can't really tune your performance.
Again, it goes back to tuning to my slowest running one.
I mean, that's where I want to interject quickly here,
but that's obviously where companies like Dynatrace,
we are investing in automated tracing of your Lambda code as well
so that you can actually install our, what we call, OneAgent on your Lambdas,
and then we automatically tell you which execution paths,
so which traces consume how much time
and which trace is actually executing a database statement
versus not doing a whole lot.
But I think we had a conversation at the beginning
before we started recording.
You also said that the tooling,
there's a lot of tooling around serverless
that is still in the very early and immature phases.
Obviously, we try on our side to do our best to fill that gap.
But I agree with you.
If you have not seen some of the stuff we've done in the past,
and I'm sure we're not doing yet the perfect job on every runtime that is out there.
But I think just to let you know, there's stuff coming on our side to address this.
So there's stuff out there already.
Tell me what you're looking at.
Yeah, in this space, I mean, the APM players,
AppD and Dynatrace,
they both are doing a pretty good job at this.
But the ecosystem as a whole, it's just lacking.
It's just such an emerging tech.
It's just like when containers were the new hotness and everybody was trying to do it, and it required complex sysadmins. And then Kubernetes comes out, right? Google's like, hey, here you go, just use this. And then it took off like wildfire, because all I need is a Kubernetes endpoint and I can start shipping containers.
Yeah, so I think the serverless technology is probably like two years out from where you really see the tight integrations in the suites. I mean, there are, you know, random companies popping up in the space, like Dashbird, that do a pretty good job of giving you dashboards of what's going on in your Lambda functions. Because there's got to be a balance, right? I don't want to have 50 tools in my system just to monitor what I'm doing.
But I also don't want to be digging through cloud trails and event logs to try to figure out what's going on.
It's like we need a happy medium.
So like central logging, how do we do that?
How do we get something deeper than X-ray?
Is it Dynatrace?
I'm leaning more towards yes, something like that.
And APM would be nice if I'm in a hybrid architecture where I've got some serverless apps, I've got some container apps, I've got some traditional internal apps.
I've got one pane of glass.
I've got one agent that can handle it all.
I think that would be amazing.
We just got to get there.
So it's just time, right?
Cool.
Hey, what's the last pattern?
We've got three so far.
The last one is called Graph. So it's using GraphQL.
It's basically the same thing as we just talked about with the monolithic,
but it's one or two endpoints that are your graph query endpoints.
So you have two endpoints pointed to one function,
but because you're using GraphQL, it gives you that query syntax
so you're actually passing in queries
and telling the API what you want.
The pros is low cost of ownership
with a scalable Graph API.
It's pretty fast to deploy.
It's another low probability of a cold start.
But the cons are if you don't know GraphQL,
that's your first barrier of entry.
You got to go learn GraphQL and then you got to figure out how to apply it into a serverless architecture.
And again, because it's a single function, you can hit CloudFormation limits.
Debugging and testing become complex, and with performance tuning and monitoring, again, you've basically got to tweak towards your slowest invocation times.
Can I ask a question on this?
Because that means your Lambda functions basically just return the data and then use GraphQL on the front end.
Oh, do I get this incorrect?
I use GraphQL on the front end to then extract the type of data that the backend provides to me.
Or is this, am I getting this wrong?
It would be like your front end is talking to a single API endpoint and it's using like GraphQL queries to pass in a request.
So it's still executed server side.
It's not like giving you all the data and say filter at client side.
It filters at server side and returns the results of your query to you.
But GraphQL, obviously, then the framework, you know, translates that query, and then I can, I don't know, call certain functions that then provide that data set for that particular query. And GraphQL hopefully takes some of the burden off me of doing the filtering and returning the right pieces of data, right?
Yeah.
I basically set up, you know, here's my query endpoints.
I basically map these functions or class methods to this type.
And then it does kind of the routing for you.
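For a rough idea of how those pieces fit, here's a minimal sketch using the public graphql package inside one Lambda; the schema, resolvers, and data are invented for the example:

```ts
// Sketch: the graph pattern. One Lambda serves a GraphQL endpoint;
// the query syntax replaces the hand-rolled HTTP router.
import { graphql, buildSchema } from 'graphql';
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

const schema = buildSchema(`
  type User { id: ID! name: String }
  type Query { user(id: ID!): User users: [User] }
`);

// Root resolvers play the role of the mapped class methods.
const rootValue = {
  user: ({ id }: { id: string }) => ({ id, name: 'example' }), // placeholder data
  users: () => [],
};

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const { query, variables } = JSON.parse(event.body ?? '{}');
  const result = await graphql({ schema, source: query, rootValue, variableValues: variables });
  return { statusCode: 200, body: JSON.stringify(result) };
};
```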
It gets pretty complex though.
You're mixing two frameworks and an emerging tech.
I feel like if you're wearing a fedora and you got a scarf,
this is for you because it's super niche
and nobody's going to know how it works but you.
So you might take pride in that.
So I don't want to bash GraphQL too hard.
I actually think it's a really cool technology.
It's just, it's easy to be stupid when you do something
like this. It gets really
messy if you're not, you know, controlled
in your software architecture.
High level of discipline is what we should say
in a nicer way, I guess.
Cool.
Yeah. Go on, Andy.
No, I wanted to say, I think this was extremely valuable, because I think we never, in any of the episodes that we had on serverless, went into that much detail on the different design patterns, and especially discussing the pros and cons of every approach.
Brian, do you have something on this? Because I have another topic I want to bring up. Well, no, there was a topic I wanted to go to.
So it was number three about how they get you.
But I'm not sure if you have a different topic.
Let's go there.
Yeah, I wanted to...
Do you want to talk about how they get you
in performance instead of going serverless?
Well, yeah, you had mentioned a bullet point
in serverless cost optimization
and how they get you and how you can get around it.
Basically, knowing that every business model somebody creates is going to be hopefully to the benefit of their customers, but it's also going to help the business make money.
Right. And in serverless, there's a lot of pitfalls and a lot of ways you can really get taken.
Well, I shouldn't say get taken advantage of, but a lot of things you can do to really not be smart and spend a ton of money, right? So that's kind of the "how they can get you," and I think you were going to address a little bit of how you can get around it, how you can make sure you don't do those things, maybe from some standard points, but also, after you do a lot of serverless, what are some tricks of the trade in order to help reduce costs? And that's where you might've been going with this. That's how I'm interpreting that bullet point, at least.
Okay. And then the other one that you guys wanted to bring up? 'Cause I can shoot two off real quick, but let's see what the other one is.
Well, the other one was the fourth bullet point that we had in our preparation list,
which was talking a little bit about secrets. Um, but I know you also said this might be a little longer topic on its own.
So maybe we'll take this aside for another episode.
Let's go on the cost and monitoring, because I think that's obviously very important.
You know, do we shoot ourselves in the foot if we're doing something wrong
and then wake up with a big bill from Amazon or whatever other cloud provider?
So maybe let's talk about this first.
Yeah, I'll do two quick points.
And then maybe if we have time, just a high-level secrets management stuff.
And then maybe we come back and do that as an episode, like you said.
So the two points I was really talking about is the first one is API gateway.
It looks cheap on paper, but you start running the numbers and it disproportionately grows your API costs. I'm not sure what the limit is where it's like, hey, you might be better off just running this in containers. Obviously, it's going to require a very high load. But let's just compare costs real quick. Lambda, I get a million invocations a month for free. After that, it's 20 cents per million invocations. API Gateway is three dollars and 50 cents per million. You start getting into the billions of API events a month, and it starts to get pretty expensive. What once was a super cheap REST API running completely serverless, it can scale, but it starts adding up.
So you got to really kind of figure out, you know, if it's running 24-7,
so my containers are never really turning off because I've got such high load.
I'm hoping I'm making enough revenue that I don't care at this point.
But these are just, you know, things to think about. What is the point at which a 24-7 Lambda service running through API Gateway
becomes cheaper to just host yourself in, say, Fargate containers? Obviously, you're adding
more complexity and you're at the nth degree of optimization. It's just something to think about.
If you have a very API-heavy application, you're going to get some extra costs that you might not have thought of.
Everybody sees the 20 cents and thinks it's going to be super cheap.
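A quick back-of-the-envelope using the rates quoted here: at one billion requests a month, Lambda's per-invocation charge at 20 cents per million comes to roughly $200, while API Gateway at $3.50 per million comes to roughly $3,500, more than 17 times as much, and that's before Lambda's separate compute (GB-second) charge and data transfer, which this rough math ignores.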
That's interesting.
Cause what you're saying,
it sounds like is that it's not the serverless component that's going to cost
you in the end.
It's the API gateway and that's a necessary part.
You can't have one without the other.
Yeah.
It's kind of my plea to Amazon to fix their pricing. Like, can we make it a dollar? Like, come on, let's make it a dollar per million.
What is it now? You said $3?
$3.50 per million as of July 24th, 2018 for anybody listening to this in the future.
What if they make it $3.49?
That'll sound cheaper.
I mean, when you're dealing with billions, one cent adds up pretty quick.
Yeah, I know.
It's like a billion pennies any day.
So, but if you would move, I like that, you know, you say basically, let's figure out which Lambdas are running constantly, and then put them in a container and run it in Fargate. But wouldn't you still kind of expose your Fargate APIs through an API gateway, or how would you then do it?
Yeah, that's why there's two levers, right?
There's the Lambda runtime.
If it's invoking literally all the time,
that's going to be an up and to the right,
and there's got to be a crossing point
where hosting it gets cheaper.
Then really my beef right now
is just the API gateway pricing.
It's a powerful tool, don't get me wrong,
but I have virtualized on-demand compute for $0.20 per million,
but an HTTP proxy is $3.50 per million.
It just seems a little skewed.
Yeah.
No, I think my point was, and maybe I get this wrong,
my point was even if you move from Lambda to, let's say, Fargate,
wouldn't you still have as a front door gateway the API gateway?
Yeah.
Why not?
So that means you're not saving the cost on the API gateway,
even if you move from Lambda to Fargate.
Not necessarily, because you could do a traditional ALB or ELB in front of the container.
And you could do like, to really be optimizing the cost,
you'd be taking your logic out of your lambdas
and sticking it into more of a containerized microservice
and throwing it up there.
So it's handling all of the routing for you.
Does that make sense?
Yeah, that makes sense.
Yeah, I completely, I understand.
So basically what you're saying is you would replace,
and this was like what I got wrong,
you would replace the API gateway with your own routing
that you may run in a container
and then expose the container through some easy mechanisms.
And with that, you can save on the API gateway cost,
yeah, I understand that.
Cool.
I'm curious to see if people are actually hitting these problems in the real world. I mean, there's a lot of people doing serverless. Like, you know, A Cloud Guru, for example. If you're looking for AWS training, you probably go to A Cloud Guru. They're 100% serverless. They even run ServerlessConf. So these guys are pretty good at it. I don't see them, you know, calling "the sky is falling, here's the dirty secrets of serverless, let's get off it." No, they promote it. These are just things to think about. I'm not trying to, you know, be a naysayer, a doomsday person, but you might get yourself in a situation where this is something to consider. But it's not a red flag that you're going to run into this and it's going to cost you a fortune.
Correct. Keep an eye on what you're doing, just like anything. Keep an eye on what you're doing and what those costs are, so that you can figure out, hey, do we need to optimize and rethink so that we can bring our costs down?
Yeah. It's really just, you know,
understand how it works before you just do it because it sounds cool, right? If you understand
how it works and how you're being billed, you don't have surprises because having, you know,
a billion requests a month going through a serverless architecture
is really cool.
It's, you know, it's not going to be that expensive
relative to the amount of revenue you have
and the complexity that you lost from having to deal
with all the orchestration of servers and patch management.
You're gaining, right?
It's just, you know, things to be, you know, cognizant of.
The other one that I wanted to talk about is there are some limits in Lambda.
I can't run longer than five minutes.
So there's a couple different approaches to how I get around that.
The first one is, here's a good example of a product I was working on where I have basically a Bitcoin address. You give me, I go pull your
transactions. I look it up. I'm basically just trying to do an accounting app for crypto miners
so that we could, you know, keep our taxes so that the government doesn't come after us.
Basically, when I would pull your crypto wallet, if you had a ton of transactions,
if my next thing, my lambda was,
I'm going to go for each one of these and unpack the transaction and figure out how much you gained
or spent. Well, that very easily hits a five minute runtime. So a couple of things you can do
is, first, let's just break out a function that handles the unpacking and inspection into its own function. We can do a simple for-each with parallelism, concurrency controlled by Lambda, and call each one of our transactions into our function. But if you remember back to where I said be careful with concurrency, you can topple people over. So that's not, you know, the most elegant. We could also implement SQS.
We could queue something.
We could have a scheduled Lambda that it starts up.
Is there anything in the queue?
Grab the first one, do it, go back to sleep.
And the more elegant solution that I've landed on, that I'm really liking, and it's a good hybrid between serverless technologies, is I queue the record, but then I can invoke, using the AWS API, a Fargate container that comes online, and its one job in life is to watch this queue, process anything in the queue, and then kill itself when it's done. And so now you've got serverless containers with a serverless API, and long-running background jobs you're doing in Fargate. It's awesome.
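A minimal sketch of that queue-plus-Fargate hybrid, using the Node AWS SDK v2; the queue URL, cluster, task definition, and network settings are placeholder assumptions, not from the episode:

```ts
// Sketch: enqueue the record, then start a one-shot Fargate task whose
// only job is to drain the queue and exit.
import { SQS, ECS } from 'aws-sdk';

const sqs = new SQS();
const ecs = new ECS();

export const enqueueAndProcess = async (record: object): Promise<void> => {
  await sqs.sendMessage({
    QueueUrl: process.env.QUEUE_URL!,   // assumed environment variable
    MessageBody: JSON.stringify(record),
  }).promise();

  await ecs.runTask({
    cluster: 'background-jobs',         // illustrative cluster name
    launchType: 'FARGATE',
    taskDefinition: 'queue-worker',     // container drains the queue, then exits
    networkConfiguration: {
      awsvpcConfiguration: {
        subnets: ['subnet-xxxxxxxx'],   // placeholder subnet
        assignPublicIp: 'ENABLED',
      },
    },
  }).promise();
};
```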
Yeah.
And basically what you're saying, and I've mentioned this in previous podcasts: when I started building my first Lambda function, I basically made the mistake of applying my 90s mindset. I learned coding in the 90s, and I did not learn how to do event-driven programming and how to break things apart and chain them together, as you said, through SNS or any other means, or queues.
I basically ended up with functions doing something very stupid, calling other resources, waiting.
And then basically I built a string of synchronous functions that I called and ended up paying too much money and it took too long.
And basically what you're saying is there's a lot of smart ways of breaking a big problem into smaller pieces.
And then if the smallest piece is still too big, you can still outsource that into a container and let the container do the heavy lifting for these, let's say, rare occasions or certain occasions where you actually need to execute longer than five minutes.
Yeah, exactly. Could we call that "container as a function"?
Hey, you know, Andy, I've heard that story so many times now that I've actually started using it in front of people, and I'm doing so well with it.
It sounds like it's my own story.
So I appreciate that.
Yeah.
Now I say, back when I created my first function, which, secret, I've never done any. I actually just taught myself, just so you all know where I stand, I just taught myself Python, which is the first language I really delved into since BASIC in, like, the mid-to-late eighties, whatever.
So yeah, I'm a bit further behind on the functions, but, uh, I can sound like I'm smart with it, thanks to you.
I mean, I feel like I just need him to follow me around. He's got the sweet accent, and he just recapped what I said in, like, a more elegant way. I'm like, oh, I speak cyborg. He's like, oh, for you other people, here's what he actually said.
That's a talent of Andy's. All right, these have both been really awesome. And just in the interest of time, I want to push on to giving kind of an intro to the next topic, if you can give us a high level of it. But those are, I think, some great tips on that. It's amazing how serverless is still in its infancy, but really quickly getting more complex and more adopted.
Yeah, the traction's been insane.
Yeah.
So there was one last topic, right?
These serverless secrets, right?
Do you want to give us a high level what this is?
I'll just talk about secrets management in general.
It doesn't have to be serverless,
but where serverless but where
serverless made it exposed you know some opportunities to me it's like i'm sick of
people checking secrets into github repos and then they mess up and they check into a public
repo and wonder why they got hacked it's like the most common way people are getting credentials
lifted is they're just giving them to people for free or they've got system files that are storing secrets on disk and somebody gets on the server it's like stop and what is a secret and
joe can you just even say define what a secret is because you kind of gave some examples there
but just i mean before you mentioned it i had no idea like i'm like secrets what are you talking
about can you just even give us like a high super high level starting with what is a secret in this
context to be fair i had the same like what the
hell are these people talking about when i saw it i didn't realize the industry was calling it
secrets management i've always called it like application configurations right so basically
it's you know your database connection strings your you know api keys anything that's a secret
that you don't want somebody that you don't know to have access
to. You want to keep it a secret, right? So there's a couple of different approaches. This is more of, I think, almost a 101 conversation for another day. But the three things I want to touch on: there's, inside AWS, what's called SSM Parameter Store. It's like a hidden gem in AWS that has become one of my favorite things in AWS. I don't think Amazon realized what they built until it was too late. And the pricing model is free. If you have an AWS account, you have access to SSM Parameter Store for free. The only thing you get charged for is the API request, and it's like fractions of a cent.
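For a rough idea of what using it looks like from a Lambda, here's a minimal sketch with the Node AWS SDK v2; the parameter path is a placeholder:

```ts
// Sketch: read a SecureString secret from SSM Parameter Store at runtime
// instead of parking it in an environment variable or a file on disk.
import { SSM } from 'aws-sdk';

const ssm = new SSM();

export const getDbConnectionString = async (): Promise<string> => {
  const result = await ssm.getParameter({
    Name: '/myapp/prod/db-connection-string', // placeholder parameter path
    WithDecryption: true,                     // decrypt the SecureString value
  }).promise();
  return result.Parameter!.Value!;
};
```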
I think Amazon finally figured out what they did, and they tried to commercialize it and make it more of a product. They just announced recently AWS Secrets Manager, which takes kind of what you can do in SSM to an extreme. It's a lot more expensive, but it features things like auto-rotation of secrets for RDS instances.
And then the third thing I want to talk about is one of the things I like to do in serverless is multi-region architectures.
And when I'm doing secrets management inside of SSM Parameter Store, the UI is really clunky.
And for the console in AWS, it's region-specific.
And doing everything from the command line can get tedious.
So we built an Electron app at Observian.
One of our engineers, his name's Jeremy Zercy, put together an Electron app.
We call it AWS Secret Awesome.
It's an Observian app.
It's free.
It's on our GitHub.
It's just github.com slash Observian.
Anybody can use it. And what it allows you to do
is easily copy and paste secrets between regions with just a UI. You click, I want this one in
Virginia, hit save. It's there. We're also, you know, working on it as we go. It's an internal tool that we use for our own dev and we give to customers; we want to be able to do secrets promotion and some other stuff. It's a free app, so, I mean, people can't complain too much, they're not paying for it. But if anybody likes it, feel free to use it and hit us up with some feedback. I really want to get the community using SSM Parameter Store and evangelize: stop doing this stuff in files. How I got to this was, in Lambda you can use environment variables. That's the short version. I didn't like that the environment variables are sitting right there in the console, and you can scrape secrets out of them if you have access to somebody's AWS console. So that's the intro, if you will. We can dive into it deeper another day.
Definitely.
We should definitely do this.
Definitely want to welcome you back
because I think this was a wealth of information for many people that are still not yet sure how to get started, maybe before making mistakes or going down the wrong path. You know, I think it's very valuable information, which obviously came out of your experience. Brian. Yes. Shall we?
Shall we do it?
Go on.
As always, right?
I'll try.
So I think what I've learned is, while serverless is still a hot technology and has been for a while, it still feels like we're lacking a lot of the best practices, the stories, and also the tooling around it.
But thanks to people like you, and obviously others in the space, that go out there, run the meetups, do the conferences, actually do implementations, and share the things that went well and the things that didn't go well. What we learned today are four design patterns: I really liked the microservice nano pattern, the service pattern, we had the monolithic pattern, and we have the GraphQL pattern, where each of them obviously has its own pros and cons.
Thanks for giving us the details.
So there's a lot of information out there for people that want to get started.
So please have a look at what people have already done and what they've experienced.
I really liked, in the end, when we touched on the costs, where I always thought, you know, from a cost perspective, it is the number of Lambda invocations that potentially will kill you if you grow, but it's actually the API gateway that is charged at more than a factor of 10, a factor of 15, more than what the individual Lambda invocation is per million, obviously. And you gave some great advice on how to offload maybe some of these constantly running Lambda functions behind an API gateway to a container running in Fargate, or building your own router in a container and then basically replacing some of the API gateway functionality with your own implementation to save costs there.
And yeah, I think, you know, for me, this was amazing, even though I also have to say
that everything we learned so far from our other guests was pretty phenomenal too.
But this was very, very practical, something that we can apply.
And some of the websites
I think we want to highlight
is serverless.com.
Then obviously you,
Justin, maybe you can remind us again
about the GitHub page
of the secret tool.
What was that again?
Yeah, so one of the things you can just do is go to observian.com, and in the right-hand corner just click on Tools. It's got a Secret Awesome page right there that links up to our GitHub page.
It also has your blog there as well, your Observian blog, which has some cool-looking articles.
Yeah, and you can also follow me on Twitter at @justindonohoo. My name is pretty unique: D-O-N-O-H-O-O. Just put that in there. You should find me.
Cool.
Yeah, that's it from my side.
I definitely want to have you back.
There's more to talk about on secrets,
and I'm sure there's a lot of other stuff
that you know that will be great information
for our listeners.
Yeah, for sure.
And thanks for having me.
If you guys want to talk about secrets,
you want to get into containers,
there's a vast Wikipedia of topics in my head that we can talk about when it comes to tech.
I like to tell people I'm like a technology therapist.
Tell me what your problems are and I'll help you make them better.
Or maybe I'll make them worse and you find somebody else.
I've got some secrets I want to talk about, but I don't know if I'll get to stay living in my house if I do.
So you want to protect those secrets.
Yes, I do. Yeah, and I want to thank you as well, because I think every serverless show we have is completely different. And as you said, Andy, this is going into a whole new era of IT. Everything is so brand new still with serverless, even though, I guess at today's pace, it seems like it's been around a while and people are starting to really talk about it.
There's still so much knowledge to share.
And thank you for sharing that as well.
In terms of the tooling, I think we can all agree that there's got to be a lot more. For fans of Dynatrace, you know, we've got some stuff going on, and we've got more coming all the time, so keep an eye on that. And here's me, Andy, the first time ever, I think, maybe promoting Dynatrace on the podcast. You know, as we said before, you can have a million tools, but if you can have it all integrated into maybe one or two platforms, it's going to be much better. So keep an eye on what we've got coming out.
But in the meantime, everyone's got to get up and running
and as efficient as possible.
And hopefully we can have more and more people sharing little tips and tricks
and we all get better as an industry and as all individuals we share
and be happy.
Yeah, and I think it was awesome.
And that single pane of glass from Dynatrace is kind of like the dream scenario. And, you know, when we say serverless, it's not always just Lambda functions. There's containers, web, storage, compute, database, analytics. There's a lot of things to talk about. So people have just got to be careful when they say serverless; you could mean, like, 50 things. So make sure that people are communicating clear ideas.
Awesome.
All right.
Well, thank you for coming on. We look forward to having you back on, and we're going to discuss some secrets. Maybe we'll have to get some sort of dear-diary music for that one. Andy?
Yes. All right. Thank you very much. We'll all talk to you soon.
If anybody has any questions or feedback, please give a shout-out on Twitter at Pure_DT, or you can be old-fashioned and send an email to pureperformance@dynatrace.com. And you'll be the first one to do it if you did, because I don't count our old friend Rick's incorrect question way back in the first few episodes, when we had some trivia questions and he guessed wrong. But any feedback or topic ideas, we'd love to hear them. And thanks everyone for listening. Bye-bye.
Bye.