PurePerformance - The Future of Ops is Sleep with Amit Chiba from Nedbank
Episode Date: September 25, 2023
I was fortunate to travel to South Africa and meet many tech leaders in Johannesburg and Cape Town to talk about Observability, Security, Automation, Platform Engineering, DevOps and FinOps. One of those leaders is Amit Chiba, Multi Product Specialist at Nedbank. I sat down with Amit to discuss his personal journey and his projects at Nedbank, one of the leading financial institutions in South Africa. Tune in and hear from Amit how self-service platform engineering helps them to scale observability, how they tackle cloud costs and why he thinks that the future of IT Ops is more Sleep!
Transcript
It's time for Pure Performance!
Get your stopwatches ready, it's time for Pure Performance Cafe.
It's a special episode because I'm traveling, finally traveling again.
I'm actually here in South Africa, in Cape Town.
Spent the last couple of days here, went to Johannesburg first,
and now I'm here, but I'm sitting with one of the speakers at the Dynatrace event, Amit.
Amit, thank you so much for doing this interview with me. How are you?
I'm very well, thank you. And thanks for having me. It's an honor and a privilege to speak
to you as well as the audience.
It's really great because I think for many of our users or listeners of the podcast,
South Africa, or Africa in general, is just unknown territory. And also for
me, it's only my third time here, but I know there's a lot happening, especially in the observability space.
Now, could you tell me a little bit more about your background and what
observability means in the organization that you're working in?
Absolutely. So I've been in the IT operations and IT service management space for the past 23
years, and worked with a multitude of products and tools.
Initially, my background or the first immediate task was, well, let's just do infrastructure
type of monitoring because that's the only thing that we knew at that particular stage.
So, I started off my career at Nedbank itself, spent a couple of years at an international company, IBM, but
also servicing multiple customers, both locally as well as internationally.
Carl's been one of those accounts as well.
So, yeah, I mean, I've had a good understanding and exposure in terms of what other companies
are doing. I rejoined Nedbank in 2014,
also in the observability space.
So it's been an interesting journey for me up until now.
And I've had the privilege to understand
and see the evolution of monitoring into observability.
So if I look at observability at an organizational level, what it means to Nedbank,
well, Nedbank has a centralized IT structure.
What that means and the benefits that we've seen out of having a centralized IT structure
is that when it comes to implementing
observability and monitoring, it's not done in specific silos.
It's done across the entire organization itself.
So Dynatrace is one of the tools that we're actually using in our observability space.
It is deployed across our entire environment.
And really, one of the big benefits that we've seen
is that it's been able to provide us with that single pane of glass.
One question that I think I also asked you in the group of your colleagues.
Now, the beauty of centralizing things is that you can enforce standards,
you can provide templates.
But some people say, you know what, everything is centralized.
If I need something from that central organization,
I'm just one of many that have to then put in a ticket
and then have to wait.
How do you balance the being central
versus giving people the freedom and autonomy?
Well, that's an interesting question
because that was definitely a challenge that we had in the past.
We came from a background where
even when you wanted to deploy an application, you'd
often have to request for services from the server operations team, which would typically
take weeks, if not months, to actually get deployed.
So obviously from that perspective, it was a big challenge for us.
One of the ways that we started looking at it was, well, how can we make it easier for
our customers to actually consume these particular products?
How can we become more agile as an organization and start delivering so that we can reduce
the time to actually go live into production as a subset of even specific services.
So in that particular regard, what we've done was,
because of the centralized model,
having those guardrails in place from a platform engineering perspective
really allowed us to start, firstly,
making the capability available
so that the months that it used to take to provide a service started getting
broken down into weeks, right? Weeks and days. It's gotten to a point now where that was all
good and well, to have your turnaround time in days, but then we said, well, how could we further
optimize everything and potentially bring it down to hours?
So we looked at sort of having a one-click button, sort of being able to go into a specific catalog and pretty much request those particular services.
So as a result, one of the strategies within Nedbank was to provide a hybrid cloud environment.
The name for our hybrid cloud journey is called NetVana.
So that's our NetVana area.
So we created a NetVana marketplace, which is essentially a service catalog hosted on a specific platform,
accessible by all of our staff, our group technology staff itself,
where they can pretty much go select a particular server
or deployment requirement,
and after specifying a couple of parameters,
go and have the deployment happen.
Our server deployment, or VM deployment,
is now less than an
hour itself.
That's cool. I mean, I love this so much when you showed me
your platform, right? We talked about platform engineering and then I asked
you, so what do you use and how does this look like? And then you showed it to me
and basically you phrase it very well because platform engineering, the goal is
really to, while you have a centralized unit, you really want to make sure that developers or your customers, as you
call them, that they can do things in self-service so that they don't feel like they have to
be waiting in line to get things done from a central unit, but you're providing things
as a self-service, everything fully automated as possible and making in the end them more
productive.
Exactly. And just in that particular regard, whenever any resource is requested through the marketplace, the necessary permissions are provided so
that they automatically get the access that they need not only to deploy
directly to the infrastructure or the services that they've just requested,
but it provides them with the necessary permissions,
as well as having all of those particular guardrails in place.
So our monitoring agent, the Dynatrace OneAgent,
is embedded as part of the entire deployment,
as well as some other day-two operational type activities as well.
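The marketplace flow described here (pick a catalog item, supply a few parameters, and receive a deployment with permissions and monitoring already attached) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Nedbank's actual implementation: the size catalog, the component names, and the `request_vm` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical guardrails: every name and value here is an illustrative
# assumption, not the real catalog described in the interview.
ALLOWED_SIZES = {"small": 2, "medium": 4, "large": 8}  # size -> vCPUs
BASELINE_COMPONENTS = ("monitoring-agent", "patching", "backup")  # day-two items


@dataclass
class DeploymentPlan:
    owner: str
    name: str
    vcpus: int
    components: tuple
    permissions: dict = field(default_factory=dict)


def request_vm(owner: str, name: str, size: str) -> DeploymentPlan:
    """Validate a catalog request and return a plan with guardrails baked in."""
    if size not in ALLOWED_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; choose from {sorted(ALLOWED_SIZES)}"
        )
    return DeploymentPlan(
        owner=owner,
        name=name,
        vcpus=ALLOWED_SIZES[size],
        # The requester cannot opt out of monitoring or other day-two items.
        components=BASELINE_COMPONENTS,
        # The requester automatically gets deploy access to what they asked for.
        permissions={owner: "deploy"},
    )
```

The point of the sketch is that the requester only chooses from pre-approved options, while monitoring and access are added by the platform rather than requested separately.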
If I may, one of the other sort of things that I've seen across organizations is the fact that day-two operations is always seen as an afterthought.
It's always, well, let's slap on a monitoring agent after the deployment actually
happens. But really, if you look at the entire cycle, your day two operations is at the heart
of everything because that is the cycle that takes the longest to actually complete. So it needs to
be planned. And from day zero, the actual design, that's when things should actually start happening.
So you have to design for operations,
you have to design for resiliency.
And I really liked, actually, in your slide,
you showed the four steps in the end
and you said you call it actually
site reliability operations, which I really like.
I mean, I always talk about site reliability engineering,
but really in the end, you obviously
need to encourage your engineers to think about how to operate
or how to operationalize reliability and think about it in the very early days.
Exactly.
And I also like your slogan where you said,
the future of IT is sleep.
Because if you do the job right, they can sleep more.
Not on the job, as you said.
Not on the job.
But not getting woken up.
Hey, the last topic I would like to cover,
because you also brought it up as one of your projects.
If you're allowing a lot of people to, you know,
get a server here, get a server here,
Kubernetes cluster here,
the whole cost topic, right?
I mean, it will cost a lot potentially if they do things.
FinOps, obviously, is a big thing.
Like, how can we make sure that we are financially sane
in what we're doing?
What are your projects there?
How do you see FinOps?
Yeah, so FinOps is a very topical discussion
within the bank itself.
We also place a lot of emphasis on it.
One of the things, apart from just trying to report
on the current usage within our cloud environments is to try and make the
visibility available to our users itself.
In that particular regard, Dynatrace with its carbon footprint dashboards provides us
with the capability to actually understand and see which particular resources are overutilized.
What we've actually done was, using the power of Grail,
we've adapted some of those dashboards to actually provide it at an application level so that at any given time we can see what the consumption has been like over a period of time, as well as which particular VMs or servers are actually contributing
most and which particular resources can actually be scaled down.
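As a rough illustration of the application-level view described here, the sketch below groups per-host utilization samples by application, finds the host contributing most, and lists scale-down candidates. It is plain Python over made-up tuples, not a Grail query or a Dynatrace API; the `summarize` function, the input shape, and the 20% threshold are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def summarize(samples, scale_down_below=20.0):
    """Roll up (application, host, avg_cpu_percent) samples per application.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    by_app = defaultdict(list)
    for app, host, cpu in samples:
        by_app[app].append((host, cpu))

    report = {}
    for app, hosts in by_app.items():
        report[app] = {
            "mean_cpu": mean(cpu for _, cpu in hosts),
            # Host with the highest average utilization in this application.
            "top_contributor": max(hosts, key=lambda h: h[1])[0],
            # Hosts whose utilization is low enough to consider shrinking.
            "scale_down_candidates": sorted(
                host for host, cpu in hosts if cpu < scale_down_below
            ),
        }
    return report
```

Fed with utilization figures exported over a chosen time window, a roll-up like this gives each application team the visibility into its own consumption that the answer above describes.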
One of the other projects that we're looking at is to see, well, how can we actually bring
more of that cost analysis within Dynatrace itself?
So that's currently something that we're also working on and looking at at the moment.
And I'm also very much
looking forward to,
because you promised me,
now we have you on record,
that you will present
the FinOps topic
on one of our
Automation Guild meetings
in the future.
That's really awesome.
Awesome.
Hey, thank you so much, Amit,
for sitting down with me,
for traveling from Johannesburg,
where you also did
the presentation
down here to Cape Town
for enlightening people
with your stories.
And I think now it's time to get some food.
Awesome.
Thanks, Andy.
Thank you.