PurePerformance - Perform 2020: AI Assisted instance tuning with Akamas
Episode Date: February 5, 2020...
Transcript
Hold on, I'll give the intro.
Coming to you from Dynatrace Perform in Las Vegas,
it's Pure Performance.
Hola amigos, here your friend Leandro Melendez,
aka Señor Performo,
broadcasting from Dynatrace Perform in Las Vegas 2020.
Is that good?
Yes, and all PerfBytes and Pure Performance.
And together, multitasking, broadcasting with PerfBytes.
I'm really excited for the intro.
Yes, I like the intro.
So we'd like to say ciao, ciao to our guests today.
We have, what was it again?
Sorry.
Oh, Andrea and Luca from Akamas.
Yeah.
And we've had, who was the one Andy and I had on the show in the past?
I think Stefano Doni.
Yes, Stefano.
Stefano.
Yes.
And so that was a really cool episode.
I really loved hearing what you all do.
So for any listeners in the past, you already know what we're talking about,
but why don't you all explain what it is that Akamas does initially,
and then we'll go into exploring where you're at now.
Yeah, so if you already followed Stefano's talk,
basically the idea is that Akamas helps to close the loop
of continuous optimization.
Our goal is to help companies
make sure that they're running their applications,
on any IT stack,
configured in the best way possible.
So we work in a space where many companies
run their technology stack
using default values, and keep increasing
the technology depth of that stack
while still using the defaults, because that's what the market says.
Right, right.
Until something crashes.
Okay?
What we do is basically help companies to optimize that, making sure that whatever
values they are using are the best for their specific workload, for their specific technology
stack.
Right.
And I believe we spoke with Stefano about the JVM tunings,
and there's like what, maybe 300 or 700 different settings,
somewhere in that range.
The last count was 770, close to 800.
Yeah, different settings you can make on a JVM.
Most people know about their GC, right?
Very few of them.
But the idea here was that it's going to run,
and it's going to start tweaking those based on how you're running,
see what's going to perform best,
and leverage AI to optimize that.
Because most people, you know,
I had no idea that there were that many options.
But this extends a lot further than JVMs, right?
Yeah, exactly.
So the main concept is that
even with the 800 parameters that we just mentioned for the JVM,
most people don't even know that there are 800.
And even experts may not know everything about what every single flag does.
So the problem is not only with a single component like a JVM. If you then work at the same level to tune, say, the JVM and the operating system at the same time,
you have to interlace different technologies.
At that point, the question is: who are you going to talk to,
your Oracle expert or your operating system expert?
And generally, that leads to a lot of interesting conversations.
So what Akamas does is
go beyond this type of distinction
and optimize whatever parts of the stack at the same time.
So we can work optimizing the JVM
and the operating system at the same time.
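To make the scale of that combined space concrete, here is a hedged sketch in Python: the JVM flag and Linux sysctl names below are real, but this tiny subset and these value ranges are purely illustrative, not what Akamas actually tunes.

```python
# Illustrative slice of the combined JVM + OS tuning space -- not Akamas itself.
jvm_params = {
    "MaxHeapSize": ["1g", "2g", "4g"],       # -XX:MaxHeapSize
    "gc_collector": ["G1", "Parallel"],      # -XX:+UseG1GC / -XX:+UseParallelGC
    "MaxGCPauseMillis": [50, 100, 200],      # G1 pause-time target
}
os_params = {
    "vm.swappiness": [0, 10, 60],            # Linux sysctl: swap eagerness
    "net.core.somaxconn": [128, 1024, 4096], # listen-queue backlog
}

def space_size(*param_groups):
    """Count the distinct configurations across every tuning layer."""
    size = 1
    for group in param_groups:
        for values in group.values():
            size *= len(values)
    return size

# Even this toy subset already yields 162 combinations to explore.
print(space_size(jvm_params, os_params))  # 162
```

With the full ~800 JVM flags plus operating-system knobs, the space is far too large to search by hand, which is the point of the discussion above.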
And if I may, how does this optimization work?
Like, you have this 800 list of settings
that you can go through and optimize, check.
What do you do?
Just, like, tweak it up a little,
tweak it down a little, and start to...
How does it...
So it's actually much smarter.
So what we do is that we use a loop of configuration, measurement, test.
Sorry: a configuration, test, measurement, and reconfiguration loop.
So we use, for example, a technology like Neotys for running performance tests on the application.
We use a technology like Dynatrace to give us an understanding of how the application reacts and performs.
And then based on this information
and deciding what type of goal,
like where do I want to go,
do I want to increase the throughput,
minimize the CPU utilization,
minimize the memory footprint,
the AI engine behind the scenes
decides what is the best next set of parameters.
And we explore the entire space
of possible configurations.
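The loop just described, configure, test, measure, reconfigure, can be sketched in a few lines. This is not Akamas's engine: a toy random search stands in for the AI step, and a synthetic scoring function stands in for the Neotys load test plus the Dynatrace measurement.

```python
import random

random.seed(42)

# Toy parameter space: heap size and GC thread count.
SPACE = {"heap_mb": range(256, 4097, 256), "gc_threads": range(1, 9)}

def run_test_and_measure(cfg):
    # Stand-in for load test + monitoring: a synthetic throughput score
    # that peaks at 2048 MB of heap and 4 GC threads.
    return 1000 - abs(cfg["heap_mb"] - 2048) / 4 - abs(cfg["gc_threads"] - 4) * 30

def optimize(iterations=50):
    best_cfg, best_score = None, float("-inf")
    for _ in range(iterations):
        # "AI" step (here just random choice): propose the next configuration.
        cfg = {k: random.choice(list(v)) for k, v in SPACE.items()}
        score = run_test_and_measure(cfg)   # configure, test, measure
        if score > best_score:              # keep the best so far
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

best_cfg, best_score = optimize()
print(best_cfg, best_score)
```

A real engine replaces the random proposal with a model that learns from past trials, so it converges in far fewer test runs.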
Right.
And I think it's really interesting, too,
because you give, when you talk about the 800 parameters
or thinking about tuning your JVM
between the JVM settings and the OS settings,
these are probably things that people were never doing to begin with.
Exactly.
It was: we think we've maxed out its performance,
or this is what it is, so we'll just start a new instance.
Where now there's this option of: we might be able to squeeze another 30% of performance
out of this if you use a tool like Akamas to improve the settings, and suddenly you're
going to get that boost of performance.
Most really just take the JVM out of the box and run with the defaults.
Yeah, or maybe they'll say, I'll take a different GC strategy, maybe, right?
A couple little things they'll try, and then they throw up their hands.
Like, that's it.
We need more machines.
Yeah.
So this opens a whole new world of, yeah, really, really awesome stuff.
And actually, it's completely aligned with the new ideas of DevOps and AIOps.
So think about even solutions like Keptn.
You get to the point of deploying the new application,
do the check on the quality gates,
and then you can even run an optimization step,
so that you know that you're running
your very latest code release
with the best configuration possible.
Getting the very, very best bang for the buck
that they have in settings.
So think about doing the same thing
that you just said, changing the garbage collection,
for every code release that you're shipping,
because you're changing the code
and you're going to change the workload,
what the application does.
Nobody does that.
And also, those settings should not be static either,
because your utilization patterns,
the way that everything comes into your application,
might be changing.
You need to always be...
Yeah, actually, customers are asking us: how many studies should I run?
And the answer is, well, it depends how many configurations you want to have.
Maybe overnight there are less people going to your service.
You can decide to use a more conservative way to reduce, for example, costs running on your AWS instances.
And during peak, you change the configuration because you know that you want the most performance possible.
It would be a little bit like your thermostat settings.
Exactly.
If you're not home, just let it coast,
let it not work so much, and if you're home
and it's summer, you want it cooling things down.
Exactly.
I'm from Boston, so I have the opposite problem
when I have to start the heating.
But it's important that you have something monitoring.
Again, with the thermostat example, you have something saying: hey, my motion sensors see that there are people at home, or you're coming home in a couple of hours, so I need to heat it up a little bit.
Same with the system: you will know that you have, like, a Black Friday event coming, and you can create a dedicated configuration exactly for that specific case.
That's really cool. So what's been going on since last we spoke with Stefano?
I'll let Luca reply, because he has the news.
Certainly news, for sure. We extended the scope of technologies that we are able to address with Akamas. So at the beginning we started with Linux, the OS, let's say, and Java, the most common ones across our customer base.
But nowadays, we work also on databases, on application servers, on Spark, on big data platforms, on Elasticsearch, several technologies, and also now on AWS, which is really interesting.
Because with AWS, we were able to pick
the right instance and reduce
the cost for our customers,
while guaranteeing the same service level to
the end user. So,
at the end of the day,
each instance can
be a parameter because
you have to pick up the amount of
CPU, the amount of RAM, the kind of disk.
So, all of them are parameters.
You can have a nice Ansible playbook
able to engage the AWS APIs and spawn an instance.
So Akamas was able to integrate with all those aspects
and do a study in order to pick the right instance,
pick the right amount of CPU and RAM
and reduce the cost at the end of the day for our customers.
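A minimal sketch of treating the instance type itself as a parameter: pick the cheapest instance that still meets a latency goal. The EC2 type names are real, but the hourly prices and measured latencies below are invented illustration values, not benchmarks.

```python
# Candidate instances: (EC2 type, $/hour, measured p95 latency in ms).
# Prices and latencies are hypothetical numbers for illustration only.
candidates = [
    ("m5.2xlarge", 0.384, 180),
    ("m5.xlarge",  0.192, 210),
    ("c5.xlarge",  0.170, 190),
    ("t3.large",   0.083, 420),
]

def cheapest_meeting_slo(candidates, p95_budget_ms):
    """Cheapest instance whose measured p95 latency fits the budget."""
    ok = [c for c in candidates if c[2] <= p95_budget_ms]
    return min(ok, key=lambda c: c[1]) if ok else None

choice = cheapest_meeting_slo(candidates, p95_budget_ms=250)
print(choice)  # c5.xlarge: cheapest option within the 250 ms budget
```

In practice each dimension (vCPUs, RAM, disk type) is a parameter, and the measurements come from real test runs rather than a static table.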
That was really nice.
It's a kind of new goal in Akamas
with respect to the traditional performance throughput
or response time-based approach.
The interesting piece that we are finding,
the more we are using Akamas,
is that originally we started with the idea
of a transactional type of workload,
like a number of users connected to the same service
and running performance tests.
But the more we are using it, the more we realize
how broad is the scope.
So Luca mentioned, for example, the EC2
instances. He mentioned Spark. But,
for example, there are also all the batch jobs.
There is a huge amount of
requests around: how can I
squeeze the jobs
into a shorter amount of time by
changing the order in which I'm executing those
jobs? And we run
a scenario and basically
keep changing
the parameters and the execution order to make sure
that you are running in the best way possible.
So the more we use it, the more
we find out there are interesting new things that we can do
with it. Yeah, you keep,
no pun intended, AI-learning
how you can
expand, what other things you can start to tweak.
Exactly. And who knows
probably you'll get to a thousand settings
that you can start playing with.
So there are a lot of interesting
new features. And it's great that
you talked about the AWS side because
one of the goals of performance in the
cloud isn't just to get a better
performing application.
It's to reduce your cost, because most people, when they go to the cloud, they just throw everything up.
And developers have unlimited resources, so they just do whatever they want.
What do you mean I have to do tuning?
It's just a cloud.
I don't need to do that anymore. We don't pay for it.
But the finance is getting the bill, and it's like, oh, my gosh, what are we all doing here?
So that's really great that you can look at it and see we can reduce these and pull them in and even just automate that process.
I mean, we've seen in GCP, even though if you have a Google instance, they'll kind of say you're oversubscribed on this VM, but they're not going to do anything, right?
So you're taking that extra step of saying we're going to tweak it and change it down.
Now, earlier when I came by the booth to say hi to you all, hopefully I heard this right.
I thought you said something about Kubernetes as well.
Yeah.
It's one of the brand new.
We just had someone from Red Hat over here.
We were talking a lot about Kubernetes.
And I wish he could be here to hear what you're about to say.
So let's go ahead.
No, Kubernetes, I forgot it, but it's one of the new technologies
that we are able to address.
And we were able, with a customer,
even to increase the throughput of the application
while reducing the footprint on the data center,
working with Kubernetes.
From Akamas, it's really easy to work with Kubernetes,
because it has a CLI, it has APIs;
it's a really nice technology to integrate with.
And it was a really good experience
because at the beginning, the idea was just to increase the throughput.
But working with the customer on their Kubernetes,
we found out that there was a really nice configuration
reducing the size of the pods, the memory requested per pod.
But at the same time, working on the JVM side,
we were able to increase the throughput.
So the two layers combined bring us really great results.
And they were seeing, it seems to me, a decrease of 20% in the memory footprint of a single pod.
But they were running 100 pods. So at the end of the day, something like 300 gigabytes of memory freed in the data center, which was really a nice outcome from our perspective.
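As a back-of-the-envelope check on those numbers, the per-pod saving multiplied across the fleet. The 15 GiB starting size per pod is an assumption, chosen because it makes a 20% cut over 100 pods line up with the 300 gigabytes mentioned.

```python
# Fleet-level memory saving from a per-pod reduction.
pods = 100
pod_memory_gib = 15.0   # assumed original memory request per pod
reduction_pct = 20      # pods tuned to be 20% smaller

saved_gib = pods * pod_memory_gib * reduction_pct / 100
print(saved_gib)  # 300.0 GiB freed across the cluster
```

The leverage comes from the replica count: a modest per-pod win becomes a large data-center win when multiplied by a hundred pods.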
And that works as well, like on-prem Kubernetes as well as the cloud solutions like EKS and all those?
Yeah.
Akamas just needs to be able to apply the parameters, generate workload, and measure how the test goes.
And we can even work in production.
This is a new feature that we are working on. In our vision,
let's say that it's a sort of canary
deployment in which the canary
is not a new release, but it's a new configuration.
So we can just run maybe one
microservice with a specific configuration,
compare how it behaves
with respect to the old
configuration running for most of the service.
And then Akamas applies the
same approach, but in production.
And you can decide to roll it out.
Yeah, and without a dedicated test.
So you can measure how users actually experience it.
That's great.
You start to look at the deviation before you fully step onto the next configuration.
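A hedged sketch of that "configuration as canary" promotion decision: one replica runs the candidate settings, the rest stay on the baseline, and the candidate is promoted only if its live metrics are no worse within a tolerance. The metric values and the 5% tolerance are invented for illustration.

```python
def promote_canary(baseline_p95_ms, canary_p95_ms, tolerance=0.05):
    """Promote when the canary's p95 latency is within `tolerance`
    of the baseline, or better; otherwise roll the config back."""
    return canary_p95_ms <= baseline_p95_ms * (1 + tolerance)

print(promote_canary(200.0, 185.0))  # True: candidate config is faster
print(promote_canary(200.0, 240.0))  # False: 20% regression, roll back
```

The same gate generalizes to any goal metric (throughput, error rate, cost), matching the goal-driven approach described earlier.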
Everything, so one of the things
we were just talking about with Justin at Red Hat
was one of the changes in CoreOS
is that you can now
manage the deployment of it as if it was code.
You're just pushing it. You're not actually going in, logging in,
making updates. You have an OS update
that's being pushed as code.
So this idea of
everything as code, where everything has a push
event, treating your
settings as canary releases. This all
fits in together and it's just
trending everywhere. From OS updates to your code updates, obviously.
We had on, I forget who it was a while back,
talking about network as code,
putting out your network configurations as code,
which you can then, again, in a Canary or blue-green style,
do all this stuff with.
And it's amazing how this idea has taken off,
and it's great to hear that you're right on top of it
with all this.
Anything else?
Yeah, any news or things that you see coming or happening to
Akama soon that you're
excited about?
We were saying earlier, like,
if you cannot give spoilers, but...
Well, there are some things that we
can say, some things that we cannot say.
Yeah, understandable. But of the ones that you can,
what are you excited to see happening soon for Akamas?
So one of the things that I'm
most excited about, and then Luca, you can go with yours, is that we are expanding the ecosystem
of partners that we are working with. Akamas is investing a lot of
time to work with Dynatrace, both from a monitoring perspective as well as with Keptn. We're working a
lot with Neotys to have an integration in place,
So one of the things that we have is that Akamas is actually on the Dynatrace marketplace now.
So there is this continuous involvement of being part of this closed-loop methodology
that we keep preaching to our customers,
both from a performance engineering perspective and beyond Akamas,
and that's amazing.
So the ecosystem of people that we keep working with
helps us find new ideas and see where we can go.
And the partnerships that we have, they're amazing.
Yeah, I do believe it's a great mission that you're
aiming at, because in the same way as monitoring, which I personally, though probably not many share my
thought, believe should be everywhere, so that when you have a system you know what is going on, from what
you're describing, Akamas should also be on every system, so that you are able to tweak and
tune and be in the best shape possible.
Everything.
It's a good purpose that you're trying to serve.
It's pretty cool.
Hopefully you'll reach it.
Before we get to Luca's answer, I just wanted to also mention last year the trend was we
were talking a lot about API integrations with different tools where you're getting
tools to work and do things that they weren't initially designed to do.
But because that API exists, someone comes along and says, ah, I can use that and do this.
And it's like, wow, what a great idea.
And when you talked about the trio of Neotys, Dynatrace, and Akamas, and how there's all these API connections,
and, you know, you have your observability piece,
you have your load piece,
you have your tweaking,
the tuning piece all working together.
Behind the scenes,
once you have it all connected up,
you don't have to have a human being in there
tweaking things and doing stuff.
It's just like an endless loop of robots
talking to each other.
Exactly.
Suddenly becoming aware and alive
and then the Terminator comes and destroys us all.
Becoming Skynet.
That's not a good point.
The most optimal Skynet.
Yes, exactly.
But Luca, what were your thoughts?
For the future, okay.
For sure, next week there will be a big announcement:
the official integration with Neotys will be made public.
Right now it's running mostly in our labs and with, let's say, beta customers,
but it will be officially public next week.
So we're quite happy about that.
And another really
interesting point is the
approach that we are starting to have with our
customers. The idea
is to build a community where everyone can
contribute and add new
technologies to the scope.
So we have this concept of an optimization
pack, which is a
piece of knowledge where we put everything we know about a given technology, for
instance all the parameters that are relevant for a technology, all the
metrics. And the idea is to make this a sort of public space where anybody can add
new technologies, in order to help Akamas improve, but also to help other customers work with the same technology.
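A hypothetical shape for such an optimization pack, expressed as plain data: the parameters and metrics that describe one technology. The field names below are illustrative, not Akamas's actual schema.

```python
# Hypothetical "optimization pack" for a JVM: the knowledge about one
# technology (tunable parameters + observable metrics) bundled as data.
jvm_pack = {
    "technology": "java-openjdk-11",
    "parameters": [
        {"name": "jvm_maxHeapSize", "type": "megabytes", "range": [256, 16384]},
        {"name": "jvm_gcType", "type": "categorical",
         "values": ["G1", "Parallel", "Serial"]},
    ],
    "metrics": [
        {"name": "throughput", "unit": "req/s", "direction": "maximize"},
        {"name": "heap_used", "unit": "MiB", "direction": "minimize"},
    ],
}

# An optimizer only needs this declarative description to know what it
# may change and what it should measure for this technology.
param_names = [p["name"] for p in jvm_pack["parameters"]]
print(param_names)
```

Packaging the knowledge this way is what lets a community contribute new technologies: adding support for, say, a new database is a matter of describing its parameters and metrics, not changing the engine.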
And another piece of news could be, I'm not sure
when it will be available, but it is to have a way to...
Pay attention to what you are saying!
Yeah, because the dev team is listening.
I know that the dev team is listening,
so I know that when I come back,
they will say: no, no, no, you just revealed our secrets.
It's a way to simplify a sort of trial, or adoption,
because right now we don't have a trial.
And customers keep asking us for it.
It's your initial integration, or the hookup.
Here's a taste.
Yes, exactly.
You pay for the next one.
It's almost like a drug dealer does, right?
You give your little taste, you get hooked on it.
The first one's always free.
Yes, exactly.
Pretty cool.
Anything else on the general technology side,
things that are happening in the tech world,
outside of what you specifically do, that you see going on
and that you're excited about?
And smile for the camera.
Which camera?
That camera.
There's also a live stream.
Just in general, obviously, outside of what you're specifically doing,
when you look at what else is going on,
what do you see that makes you go:
that's really, really awesome, and I really want to see where that goes in the world?
So one thing that I'm most interested in,
and I see a lot of customers going in that direction,
is basically what was also mentioned today during the main stage.
It's all the concept of automation, which Akamas fits into.
And it's not just a matter of promoting the concept even more through Akamas;
it's something for every company,
every one of our customers. Akamas is part of a broader group
that does performance engineering consulting services. So we work
with customers to help them optimize their performance engineering
practice, whether APM or load testing, capacity optimization,
and so on. And everywhere that we go, the concept of automation, and thinking about what's going to happen
in the next five years, is becoming the leitmotif, where everybody keeps asking: how do
I make it so that I automate as much as possible and have fewer people involved
in my operations?
Which means, and that is important, I see this shift.
A couple of years ago, having this type of conversation led to nothing, because people
were scared: what am I supposed to do when everything is automated? Right now,
instead, everyone starts thinking: it's not that I'm losing my activities, I will shift to something at a
higher level. So I don't spend time doing performance analysis and tests,
but I educate other teams.
You can do more of those activities, or better.
Or even managing all the automation.
Someone's got to do that too, right?
And educate: how do I make sure that the application developers,
who are actually doing the job of pushing new code,
are actually aligned to the practices that I am suggesting they use?
And this shift is becoming more and more prominent
above all here because applications
are starting to get pushed much, much more frequently
and people cannot keep hiring new people
to increase the team.
So the answer is automation,
and trying to make that as easy as possible
to embrace for anyone in the company.
So automation and visibility,
those are the two main things I've seen change. I've seen even Dynatrace go from being a performance
engineering tool to: everybody in the company has to see everything.
Yeah. Would you add anything on your side, Luca?
Yeah, let's say that I'm based in Europe, so my perception of the market, of the IT market, could be a bit different. I am the lucky one.
But what I noticed over the last months,
I'd say that finally, even in Europe,
and even in the enterprise company,
those approaches that Andrea just mentioned,
even Kubernetes and microservices are becoming a reality.
Maybe it's just a pilot project,
some experiment.
They're putting their foot in the water, right?
Yeah.
Attending conferences like this one,
I've been learning about microservices and
Kubernetes for years.
But in my daily job,
I never had the chance to work with them at my customers.
In our lab, we have them, because with Akamas
we need to work with these kinds of technologies,
but I never had the chance
to work with them in reality. But now it's
happening, and I'm quite happy with that, because
they'll enable new kinds of approaches
to testing, performance
optimization, and so on, that
were not possible with the old approaches:
monolithic applications, long
deployment processes, and so on.
So I'm really glad that also in Europe we are getting there.
When you started to speak about it with some customers, like the early adopters,
did you get expressions from the people you were presenting to,
like it was science fiction or something, what you were doing?
Some of them, yeah.
I have to admit that for them, it seemed like: okay...
You can automate all that?
Yeah.
Yeah, in some regions, as you mentioned,
everything seems to be flowing,
and in some others people are like: that's way ahead of us,
and I don't know if I can adopt it and customize it.
Yeah, it's true.
It's also true that a lot of companies are now starting to move
with the idea that they'll be disrupted
if they don't make that type of change.
So what basically happens is that
they are planning for the transformation to the cloud
and to a Kubernetes infrastructure,
because if they don't do that,
they won't stay in business.
So entire organizations are restructuring
to move from a monolithic approach,
with the database team and the application team
and the network team, to a fully horizontal one,
so there is one team that supports the entire business process.
I see those as well.
Again, probably in the U.S. it's much easier than in Europe,
given the way of understanding business, of running
IT and running a business. But I've seen
those types of changes, and it's happening more
and more frequently.
Great. And how long are you
out here for? Sorry? How long
are you in Vegas?
I've been here since Friday, and before
I fly back I will be attending
the CMG Impact event next week.
We'll be at the Westin.
So it will be two weeks here, and I'm not ready for that.
Oh, wow.
Is this your first time in Vegas?
Yeah, first time.
Okay.
Are you enjoying it?
Yeah.
Let's say that I'm starting to get rid of the jet lag today,
so I'm going to understand where I am, actually.
Do you have to head back right after the show,
or are you going to spend some time to check some things out?
No, I will be back at the end of next week.
Oh, you're going to CMG as well.
Okay.
So you're two together.
Well, we're based in Boston, so if anybody
is working in the Boston area anytime soon and wants to visit us, please come on by.
We're in Waltham, yeah.
You've got to say Waltham.
Waltham.
As they say it up there.
It's not Waltham, it's Waltham.
Two A's.
But, yeah, that's awesome.
Boston accent.
You have to apply to that.
You know, I like to do accents,
and I like to do other, you know, be silly and all,
but the Boston accent is a very tough one for me to do.
I am refusing to attempt any possible sentence
that comes close to that.
Yes.
I don't have a Boston accent.
All right, excellent.
Well, really, thank you guys for coming by today,
and awesome to hear all the updates.
So, you know,
knowing that we have all these integrations,
I look forward to seeing more of what
we'll be doing together.
And also, would you like to give a heads-up
where can people find out more about
Akamas or hear from you?
So for people here at Perform,
they can come to the booth of
Moviri and Akamas, where we can actually show the tool and how those results can be achieved.
And otherwise, I invite them to visit akamas.io, Akamas spelled with a K.
Okay.
There you can find resources and all the information about how Akamas works and how to get in contact with us.
Excellent.
All right.
Thank you.
Thank you very much, guys.
Thank you so much.