The Pragmatic Engineer - Developer Experience at Uber with Gautam Korlam
Episode Date: March 12, 2025

Supported by Our Partners
• Sentry: Error and performance monitoring for developers.
• The Software Engineer’s Guidebook: Written by me (Gergely) – now out in audio form as well.

In today’s episode of The Pragmatic Engineer, I am joined by former Uber colleague, Gautam Korlam. Gautam is the Co-Founder of Gitar, an agentic AI startup that automates code maintenance. Gautam was mobile engineer no. 9 at Uber and founding engineer for the mobile platform team – and so he learned a few things about scaling up engineering teams.

We talk about:
• How Gautam accidentally deleted Uber’s Java monorepo – really!
• Uber's unique engineering stack and why custom solutions like SubmitQueue were built in-house
• Monorepo: the benefits and downsides of this approach
• From Engineer II to Principal Engineer at Uber: Gautam’s career trajectory
• Practical strategies for building trust and gaining social capital
• How the platform team at Uber operated with a product-focused mindset
• Vibe coding: why it helps with quick prototyping
• How AI tools are changing developer experience and productivity
• Important skills for devs to pick up to remain valuable as AI tools spread
• And more!

Timestamps
(00:00) Intro
(02:11) How Gautam accidentally deleted Uber’s Java Monorepo
(05:40) The impact of Gautam’s mistake
(06:35) Uber’s unique engineering stack
(10:15) Uber’s SubmitQueue
(12:44) Why Uber moved to a monorepo
(16:30) The downsides of a monorepo
(18:35) Measurement products built in-house
(20:20) Measuring developer productivity and happiness
(22:52) How Devpods improved developer productivity
(27:37) The challenges with cloud development environments
(29:10) Gautam’s journey from Eng II to Principal Engineer
(32:00) Building trust and gaining social capital
(36:17) An explanation of Principal Engineer at Uber, and the archetypes at Uber
(45:07) The platform and program split at Uber
(48:15) How Gautam and his team supported their internal users
(52:50) Gautam’s thoughts on developer productivity
(59:10) How AI enhances productivity, its limitations, and the rise of agentic AI
(1:04:00) An explanation of Vibe coding
(1:07:34) An overview of Gitar and all it can help developers with
(1:10:44) Top skills to cultivate to add value and stay relevant
(1:17:00) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• The Platform and Program split at Uber
• How Uber is measuring engineering productivity
• Inside Uber’s move to the Cloud
• How Uber built its observability platform
• Software Architect Archetypes

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Transcript
Vibe coding.
This is a brand new term.
What do you think it stands for?
Vibe coding is just you trying to figure out how the system should behave when you're prototyping.
The good thing about vibe coding is you're able to like basically iterate much faster with
these agentic loops.
That now you're just focusing on how would I actually achieve my outcome rather than the exact way
to do it because you can always change those details later.
If you have the right abstractions, you could swap layers out.
So I think that's changing the way people think about software because if you're someone who has
taste and knows, this is the outcome I want to see in the UX,
you may not even need to know a framework.
You should experiment with it till it feels right.
And of course, you need to massage it to make sure it doesn't make it harder
to evolve later on.
I think a lot of the software today gets you to the initial version.
It looks great, but then how you maintain it is basically sort of left for a later time.
And I think a lot of people who are coding with AI are still early in the prototype phase.
Gautam Korlam worked at Uber for almost 10 years, joining as Android engineer number
8, and was the founding engineer for the mobile platform team. He grew from the Software Engineer
II level all the way to Principal Engineer and worked on developer experience and scalability
challenges throughout his decade at Uber. Gautam is now the co-founder of Gitar, an agentic AI
startup that automates code maintenance. In today's conversation, we cover how Gautam accidentally
deleted Uber's Java monorepo, which had thousands of developers committing to it; Uber's in-house
engineering tools like the monorepo, SubmitQueue, Local Developer Analytics, and DevPods;
AI's impact on software development;
why vibe coding will spread;
and why Gautam believes junior developers will thrive with AI tools.
Uber's engineering culture is very unique,
even across tech companies,
and this episode is a perfect inside look at its early days.
If you enjoy this show,
please subscribe to the podcast on any podcast platform and on YouTube.
Welcome to the podcast. It's great to have you.
Oh, thanks so much, Gergely.
Glad to be here. Thanks for having me.
It's good to reconnect,
because we talked a lot while you were at Uber.
You know, you were on the platform team and your work touched a lot of the systems that we worked on.
Every engineer causes an outage at Uber.
I did it.
People on my team did.
What was a memorable outage that you might have caused accidentally or deliberately?
So this is a funny story.
Towards the end of my career there, I accidentally deleted the Java monorepo at Uber.
And there were thousands of devs committing there.
Yeah, the Java monorepo, that was one of the... we had two big backend monorepos, right?
Go and Java.
Go and Java, yeah.
So this was after we did the migration, with almost everyone using the Java monorepo.
Yeah, it was pretty funny. What the story is, essentially,
and I don't think many people know about this because
I recovered pretty quickly, is that
I was trying to test something out in a test
repo, and I was like, hey, I need to copy the git push command, because we created the
repos on push.
So I copied the URL from the Java monorepo.
I forgot to change it.
And then when I pushed, usually at Uber, they prevent you from force pushing.
And it said, hey, I can't force push.
And I didn't really read the message.
Then I was like, why is this?
I should be able to force push.
And I did with some special flags.
And there's only a few people, maybe like three people
at Uber, who can force push with that flag, because I was part of the platform team and we
didn't want anyone else to do it. And I pushed, and then someone pings me on Slack: hey, so
the repo says "initial commit", what's going on?
So you were a principal engineer at this point,
right? Like, you were one of the most senior people and you were there for a long time, right? This is
not some, you know, case where you didn't know what's going on.
So, I mean, that's why I had the permission
to push.
So the whole repo was gone until you reverted minutes later?
It was. It was.
It was tricky because we had to get it back from a backup.
Yeah, it was not too bad.
It was like a few minutes.
And I think we had an incident review.
Luckily, at the time we had a submit queue, so the queue would pause automatically, and it said, hey, no work is lost, right?
We had really good backups in place.
So we reverted quickly.
And I was like, ah, got to be very careful with those like copy paste commands.
Wow.
Well, but I guess you tested the backup actually worked.
You should have just said that.
It was just a test.
I just wanted to see if the...
Stress test.
...actually worked.
Yeah, chaos engineering.
This episode was brought to you by Sentry.
Buggy lines of code and long API calls are impossible to debug,
and random app crashes are things no software engineer is a fan of.
This is why over 4 million developers use Sentry
to fix errors and crashes and solve hidden or tricky performance issues.
Sentry cuts debugging time in half,
no more soul-crushing log sifting,
or vague user reports like, it broke,
fix it.
Get the context you need to know what happened, when it happened, and the impact, down to
the device, browser and even a replay of what the user did before the error.
Sentry will alert the right dev on your team with the exact broken line of code so they can
push a fix fast, or let Autofix handle the repetitive fixes so your team can focus on the real
problems.
Sentry helped Monday.com reduce their errors by 60% and sped up time to resolution for Nextdoor
by 45 minutes per dev per issue.
Get your whole team on Sentry in seconds by heading to sentry.io/pragmatic.
That is S-E-N-T-R-Y dot I-O slash pragmatic.
Or use the code Pragmatic on sign-up for three months on the team plan and 50,000 errors per month for free.
You reverted, thankfully, and nothing crazy happened, but this is a pretty big deal.
How, you know, like, how can we imagine it?
Like, what was the response of, like, your manager, the incident review?
I mean, yes, sure, you were an experienced engineer, but come on.
This was a big deal, right?
So the good thing was, since we had the submit queue,
and we can talk about it a little bit more,
the only thing it affected was the amount of time
it would take for a commit to merge onto Main.
So usually everything would get serialized.
So all that it looked like was there was a little 20-minute delay
where we recovered everything,
and then everything just flowed naturally.
So it was just like a random CI failure you would see,
but we had so much recovery automated in place
that it was almost a non-event.
and the only SLO we broke was the latency SLO,
no reliability SLOs were broken.
So that was pretty good.
Like all the systems we put into place
in the many years actually worked to save me that day.
Okay.
And this is a good way to segue into the next one,
Uber's unique engineering stack.
I worked at Uber.
I worked for a lot shorter time than you,
but we overlap.
You know, you were there when I started
and you were still there when I left.
Uber had a lot of unique systems.
Can you talk about some of them
and how they came about, and why?
Why did Uber build so much of its stack?
Well, Uber, and also you built a lot of that stack, right?
Yeah, absolutely.
So when I joined Uber in 2014, I was like Android engineer number eight, I believe,
if I remember correctly.
And there were no unit tests.
So I wrote the first unit test for the Android codebase.
Then I set up Artifactory.
Yeah, it was pretty funny.
We were shipping so fast that people were just, you know,
integration testing using their mobile phones,
because it was just very, very fast.
growth.
This was in 2014, 2015?
2014 is when I joined, yeah.
So then I brought in Artifactory so we could share stuff.
And over time, what ended up happening was a lot of the stuff you take for granted
today, the cloud native SaaS products for observability or hosting source code were
just not built for our scale back in the day.
So we had to build a lot of stuff in-house to solve some very crazy problems.
So we had commits going in almost once every single
minute, I think, and that might not seem like a lot, but when you have thousands of engineers
working on codebases where you can easily run into merge conflicts, that was a big deal.
I hear a lot of companies saying it, or at the time it was popular to say: it's not built
for our scale. What was that scale? I mean, you mentioned the one commit per minute, and
I can actually attest that build systems could not handle that, at least mobile build systems
could not. But what other scale was considered just way bigger than what
commercial vendors supported?
Yeah, absolutely.
So I think the build system part was definitely one, because builds were taking just way too
long.
I think we embarked on like a big app rewrite, I think, when you were there.
I think you talked about it before.
It was funny.
I was actually in Amsterdam and that happened in between.
And then my boss calls me up, hey, this build time is like getting crazy.
We need to somehow bring it under control.
Oh, that's you.
Yeah, I was there.
And it was almost an hour.
something plus. So if we actually serialized all the commits one by one to make sure there were no
merge conflicts, it would just take forever. And if you did not do that, our main branch would be
red. We had this graph where it was like red, green, red, green, red, green, and it was impossible
to work because there were so many teams trying to hit their deadlines. And you were depending
on so much work from the other teams, like the networking library or the platform experimentation
or a feature library from some other team. It was just not possible to do this without a green main,
essentially. So it's not just the
build system time, but also the
fact that when you push something, if you don't run
tests against the other part of the code, then you're just not
going to have a good time because you will just be stuck
reverting stuff. Someone would be on call all the time.
So it was a combination of factors. It made it very hard to work
with.
Yeah, and also, like, we did do that on that project
specifically. We turned off end-to-end tests, I think, for two or three days
to speed things up, and it sped things up.
And then we had so many regressions.
It took more time to, I think, you know, for two or three days, we turned it off.
And for the next week, everyone was fixing the regressions that kind of sneaked in underneath.
Yeah, I think we got a request at some point, can you just turn off CI?
And I was like, you know what's going to happen if we actually did that and I had to really talk people down saying you cannot just turn things off.
It's going to make things worse.
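To make the scale concrete, here is a back-of-the-envelope sketch of why testing commits strictly one at a time could not keep up. The inputs are the rough figures mentioned in the conversation (about one commit per minute, builds of about an hour); the function name and the assumption that only one build runs at a time are illustrative, not a description of Uber's actual setup.

```python
def backlog_after(hours: float, commits_per_minute: float, build_minutes: float) -> float:
    """Commits still waiting after `hours` if builds run strictly one at a time."""
    arrived = hours * 60 * commits_per_minute
    # A fully serialized queue can finish one build every `build_minutes`.
    merged = hours * 60 / build_minutes
    return max(arrived - merged, 0.0)


# Rough figures from the conversation: ~1 commit per minute, ~60-minute builds.
waiting = backlog_after(hours=8, commits_per_minute=1, build_minutes=60)
print(f"{waiting:.0f} commits still queued after an 8-hour day")
```

Under these assumptions the queue grows far faster than it drains, which is why a naive serial queue was not an option.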
So there was the submit queue.
Can we talk about what SubmitQueue was and why it was unique, at least initially, to Uber?
Even to this date, I'm pretty sure only a few companies have anything like that.
Yeah, I think SubmitQueue is essentially a way for you to guarantee a green main.
So it's a way to kind of serialize your commits coming in and make sure that they play nicely with each other.
There are some more open solutions now, but when we did it at the time, we had actually a paper that we published.
It was fairly novel because we had
very low tolerance for main being red.
So we'd always want to guarantee a green main,
which means we had to figure out how do we actually test changes coming in
and make sure that the cross-dependencies between different commits were considered.
So it's just a way to kind of handle commits at scale
and make sure that when everything merges, everything still builds with each other.
And that's a trickier problem than it sounds because we had to test things in combination,
discard some paths, and do a lot of processing on the back end.
But that was just one of the systems.
I think we had to build a ton more,
which I can talk about as well.
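The core idea described here, landing changes one at a time and only when main plus the change still builds, can be sketched as follows. This is a toy model under stated assumptions: the repository is a plain dictionary, `build_passes` stands in for a real CI run, and all names are hypothetical. It does not reproduce the actual SubmitQueue implementation.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

Repo = Dict[str, str]      # file path -> contents (toy model of a repository)
Change = Tuple[str, Repo]  # (change id, files the change writes)


def run_submit_queue(
    main: Repo,
    changes: List[Change],
    build_passes: Callable[[Repo], bool],
) -> Tuple[Repo, List[str], List[str]]:
    """Land changes one at a time; a change merges only if main + change builds."""
    queue = deque(changes)
    merged: List[str] = []
    rejected: List[str] = []
    while queue:
        change_id, patch = queue.popleft()
        # Test against the *current* main, not the base the author started from.
        candidate = {**main, **patch}
        if build_passes(candidate):
            main = candidate            # main only advances to states that passed
            merged.append(change_id)
        else:
            rejected.append(change_id)  # main stays green; author must rebase and fix
    return main, merged, rejected


# Toy build check: the app must call the API version the library exposes.
def build_passes(repo: Repo) -> bool:
    return repo["app"] == repo["lib"]


main = {"lib": "fetch_v1", "app": "fetch_v1"}
changes = [
    ("A", {"lib": "fetch_v2", "app": "fetch_v2"}),  # migrates the API
    ("B", {"app": "fetch_v1"}),                     # written against the old API
]
new_main, merged, rejected = run_submit_queue(main, changes, build_passes)
print(merged, rejected)
```

Both changes pass when tested alone against the original main, but B breaks once A has landed. Testing each change on top of whatever has already merged is what catches that kind of cross-commit conflict.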
I was the consumer on this side.
I was just using it.
It was just a build system for me.
But I didn't realize for quite a while that,
you know,
because my team,
the 10 of us who were working on it,
we didn't really like have merge conflicts with each other
because we were working in different parts of the code.
But it took me a while to appreciate that
when I was pushing something,
like at any given time,
there would be maybe 10, 20, 30 people
on the same code base in mobile working, and each build would take 20, 30, 40 minutes.
And then behind the scenes, I actually, I understood only when I read the paper that you were
looking, checking, like, is this code path in conflict?
If so, which one should we build beforehand?
There were like all sorts of probability models.
There's a bunch of math in that paper as well, right?
There's a lot of ML models to, like, I think we had to, like, estimate what would potentially
cause a failure and speculatively try to go on paths that might
be green to try to optimize, and then backtrack if we didn't make that goal, and we would make
that model better over time.
So it's quite interesting.
I think if you didn't notice, that's a good thing.
It means that it worked as it's supposed to do.
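The speculation described above can be illustrated with a small sketch: given an estimated probability that each change ahead in the queue will pass, rank the possible bases a later change could be pre-built on, and spend a limited build budget on the most likely ones. The independence assumption, the function name, and the example probabilities are all illustrative; the published system used learned models and a more elaborate search, which this does not attempt to reproduce.

```python
from itertools import product
from typing import Dict, List, Tuple


def rank_speculative_bases(
    pass_prob: Dict[str, float],
    ahead: List[str],
    budget: int,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Return the `budget` most likely bases to pre-build the next change on.

    A base is the subset of changes ahead in the queue that are assumed to pass.
    Outcomes are treated as independent, which is a simplifying assumption.
    """
    candidates = []
    for outcome in product([True, False], repeat=len(ahead)):
        likelihood = 1.0
        for change, passed in zip(ahead, outcome):
            likelihood *= pass_prob[change] if passed else 1.0 - pass_prob[change]
        base = tuple(c for c, passed in zip(ahead, outcome) if passed)
        candidates.append((base, likelihood))
    # Most likely outcomes first; builds for the others are skipped unless needed.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:budget]


# Hypothetical estimates: A is very likely to pass, B a little less so.
ranked = rank_speculative_bases({"A": 0.9, "B": 0.8}, ahead=["A", "B"], budget=2)
for base, likelihood in ranked:
    print(base, round(likelihood, 2))
```

If the chosen guesses turn out wrong (for example, A fails), the speculative builds are discarded and the change is rebuilt on the base that actually materialized, which is the backtracking mentioned in the conversation.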
One thing that Uber did as an engineering practice was monorepos.
Uber had multiple monorepos, but it all started with the iOS monorepo.
And I recall you were there.
Can you tell us how the decision was made to go from having a bunch of separate repos, initially on iOS and later on Android, to saying, all right, let's just do one big repository?
And why did this happen and what happened next?
So it's funny.
I mentioned, I think early on we didn't have tests.
At that point, very, very early on, we had two apps, rider and driver on both iOS and Android.
And we had another repository called library, which was a common component between the rider and driver apps, which was just a
submodule you would pull in.
And it was really painful to pull in as a submodule.
So we said, hey, let's set up some sort of CocoaPods, Artifactory thing.
You could download stuff from somewhere.
But as we grew, we realized we couldn't put everything in one place.
We had a networking team, an experimentation team, an analytics-focused team.
They all wanted their own sort of nice little playground so they could experiment.
So we went off and had hundreds of repos for a while.
But everyone was just, you know, unaffected
by everyone else and could focus on their priorities and deliver.
But then when you wanted to pull in these changes,
let's imagine the networking library changed.
You had to update networking,
then the analytics library that relied on it,
then maybe the experimentation,
maybe some other feature library,
it was really painful to upgrade and bump.
And we were sort of on the hook
because anything that the product teams don't do,
the platform team ends up being the kind of like catch-all.
And you're like, hey, this is not going to scale.
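The cascade of version bumps described here can be sketched with a dependency graph. The repository names and edges below are a hypothetical layout loosely based on the libraries mentioned in the conversation, not Uber's actual dependency structure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical multi-repo layout: each library lives in its own repo and
# lists the libraries it depends on.
depends_on = {
    "networking": set(),
    "analytics": {"networking"},
    "experimentation": {"networking", "analytics"},
    "feature-lib": {"experimentation"},
    "rider-app": {"feature-lib", "analytics"},
}


def bump_order(changed: str) -> list:
    """Repos that need a version bump and release, in order, after `changed` ships."""
    # Collect everything that depends on `changed`, directly or transitively.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for repo, deps in depends_on.items():
            if repo not in affected and deps & affected:
                affected.add(repo)
                grew = True
    # Keep only the edges between affected repos and sort them topologically,
    # so each repo is bumped after the repos it depends on.
    graph = {repo: depends_on[repo] & affected for repo in affected}
    return list(TopologicalSorter(graph).static_order())


print(bump_order("networking"))
```

In this toy graph, one networking change means five sequential releases across five repositories. In a monorepo the same change and all of its consumer updates can land as a single commit.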
I think someone on the iOS team was like,
hey, we should just go to a monorepo.
We'll just take a weekend.
Famous last words.
I know, and it took a while.
It didn't take a weekend.
But the iOS team did it and it was very effective for them.
It was a lot of pain initially.
And I think it happened at the same time
they were moving to a new language, Swift, as well.
So that was like double whammy.
They had to do both.
They pulled it off and the gains in productivity
were massive because you don't have to go bump libraries. Because we were not an open source project, we did not have to,
you know, make sure the interfaces are nice for external consumption.
It's one company, one code base.
Why can't I update my library's public API and just update all consumers in one shot?
It should be possible.
So mobile did it.
And then Android quickly followed suit.
We actually froze the code for a weekend and
we moved everyone over.
And it was painful at the start because the build systems, as I mentioned, were not
meant for hundreds of modules in the repository.
We sent some changes to Google to make it better.
And then after a while, we migrated to a new build system, Buck, from Facebook at the time.
And we wrote some auto conversion from the existing system to the new system.
So we had both at the same time.
That was really successful because once things were fast, the NPS improved again: monorepo is good.
So the gist is: monorepo or non-monorepo, if you don't have tooling, it's going to suck.
At a bigger scale, you have to invest in tooling.
As your company grows, typically
this need becomes more and more obvious
as things start slowing down.
And then later we had the Java monorepo,
and we migrated over to Bazel,
and we had the Go monorepo.
And I think what we realized with the monorepo
was that the main thing it helped us do
was standardize, which meant that
to make a big change,
one team could go do it.
Like, a centralized team could just take the burden
of updating this one networking
library for everyone or updating Google Play services for everyone. You just cannot do that in like
hundreds of repositories. It's just a very wasteful amount of work.
And what was the biggest pushback
that the teams initially had? Because I remember the mobile ones, they actually happened, and it was,
you know, like very practical. But then when the Java and the Go ones were happening, I remember a lot
of teams were dragging their feet. Like I'm not sure we want to do this. It's going to, you know,
slow things down. Do you remember what the biggest, you know, like,
skepticism was and then how it turned out?
Yeah, the biggest skepticism is usually: right now, before the monorepo, I could break an API
and not worry about it.
The tax is paid by someone eventually.
It's just deferred.
It still needs to get paid.
But in the monorepo, if I do it, I have to think about my interface design more.
And my builds are going to be slower, was the argument.
With the new build system, we kind of said, hey, the builds are not going to be slower.
But people would always come up with reasons, saying, hey,
if I did this upgrade,
then I had to do it for everyone.
So we said, okay, you know what,
we will make it possible for you to just
upgrade your part of the code in some scenarios,
but otherwise we want to have everyone be standard.
It does make things a little slower,
but then as a business it makes a ton of sense,
because you might not see the macro picture on your team.
You might be slowing down a little bit,
but everyone else gets like a huge productivity boost
because they didn't have to spend the time you did
on each and every one of those teams
to like get up to speed.
Yeah.
Is it safe to say that monorepos often, not always,
they,
they bring kind of a almost necessary friction
where like, you know,
like making certain changes will be painful
because they should be painful.
Like because when they were not painful,
they were masking something down the road.
That's a, you know,
I update a dependency to something else.
And then it has to trickle down.
Otherwise, you'll have these problems,
but they'll only come out in weeks.
or months later.
Yeah, to be fair, like, you can also do stuff without the monorepo.
Like, I know companies like Amazon, Netflix, do multi-repo setups.
But they build tooling, especially to handle those cases where you have golden version
sets that work with each other.
It's a little bit like a monorepo without being a monorepo.
So you can make both work, but a monorepo is just generally easier, in my opinion.
Just having considered both.
But if you want to go the other route, you absolutely can.
You still have to invest either way as your codebase grows.
And what were some other kind of novel slash innovative things that,
I mean, it sounds like you had to invent some of these things at Uber
because there was just not a solution that you could buy
or it was just really expensive or not built for this use case?
Absolutely.
So the other thing I think which is interesting for us is, as I mentioned,
build times were a problem.
We only knew because we first tried to measure it.
I think back in the day, a lot of the measurement products for developer observability
were just not built for the entire SDLC.
So what I mean by that is, like,
you might have like a CI Jenkins back in the day
that would tell you how long your builds would take,
but you had no idea how much your indexing time would be in your IDE,
and we had very complex projects.
Or you had no idea what was the time between pushing and code review.
So we built a system at Uber called Local Developer Analytics.
We call it LDA for short,
which was like a little daemon that used to run on your machine,
collect information about your system,
like what's your CPU usage, memory usage, but also like integrate deeply into a lot of the
CLI tools, the IDE.
So we would, for example, know when you opened your project, which file you went to,
which part of the file you were actually like coding the most, which files would have the most
bugs.
And then you could also see like a funnel, like in a traditional product funnel because we
would care about developers as like an end user.
You would see analytics saying, hey, there was a drop in the funnel.
People could not create PRs because this
part of the build process would error out more often than the others.
So you would go and know what to target.
And it was fairly novel, because I gave a few talks about it at a few different conferences,
and not even big companies had something like this back in the day.
And I think we still use that today to power a lot of the dashboards and the deep analytics
that we get from the developer workflows.
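The funnel view described above can be sketched as a simple aggregation over workflow events. The stage names and event shape here are hypothetical; the conversation does not describe the real LDA event schema.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical workflow stages, in the order a developer moves through them.
STAGES = ["open_project", "edit", "local_build", "create_pr"]


def funnel(events: Iterable[Tuple[str, str]]) -> List[Tuple[str, int, float]]:
    """Summarize (developer_id, stage) events as (stage, developers, share kept)."""
    reached: Counter = Counter()
    seen = set()
    for dev, stage in events:
        if (dev, stage) not in seen:  # count each developer once per stage
            seen.add((dev, stage))
            reached[stage] += 1
    rows = []
    previous = None
    for stage in STAGES:
        count = reached[stage]
        if previous is None:
            kept = 1.0                # first stage is the top of the funnel
        else:
            kept = count / previous if previous else 0.0
        rows.append((stage, count, kept))
        previous = count
    return rows


# Four developers get through a local build, but only one manages to open a PR.
events = [(d, s) for d in ("d1", "d2", "d3", "d4") for s in STAGES[:3]]
events.append(("d1", "create_pr"))
rows = funnel(events)
for stage, count, kept in rows:
    print(f"{stage:<13} {count} developers, {kept:.0%} of previous stage")
```

A sharp drop at one stage, like the last row here, is the signal that tells a platform team which step of the workflow to investigate first.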
Yeah.
Like what do most, like you've been in the developer productivity, you know, you talk with a lot
of people and know a lot of people, but what do most companies do?
Do they just ask developers, like, how does it feel?
I think that's a very important thing to do for sure,
because even if you have all the measurement,
if the development doesn't feel right to you,
you still want to ask that.
Like, you want to have a survey, which we did.
In fact, I remember when we started doing the surveys,
I think we had like an NPS score of negative 50 something.
And when I left, it was like positive 8 or something like that.
So I call that a win; we put in a lot of effort.
So you do your survey as the very first step, which we did.
But then you measure the easier things,
like your commit time to,
like, review code, or time to build code.
Stuff that's out of the developer's control
you want to minimize, and also time spent in
meetings. Those are things that you could probably get easily.
But the deeper stuff is only
worth it if you see that
those bottlenecks are adding up,
like if you're making changes but you're not seeing
that people are happier.
Because you'll always see developers complain that
build times are slow or IDE indexing is slow. That's when
you know, okay, you need to measure things.
But now I think products
like VS Code and JetBrains have better analytics.
I think we've worked with JetBrains, for example,
in trying to make them understand,
hey, these are some of our problems.
It'll be great to have solutions out of the box with these.
And some of those did make it upstream into products that everyone uses today.
So we kind of had to forge our own path,
and eventually all the vendors would catch up based on what the industry was doing.
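For context on the NPS numbers mentioned earlier in this answer: Net Promoter Score is conventionally the percentage of promoters (scores of 9 or 10) minus the percentage of detractors (scores of 0 to 6), so it ranges from -100 to +100. The sketch below assumes that standard 0 to 10 survey scale; the conversation does not say exactly how the internal survey was scored, and the sample responses are made up.

```python
from typing import Sequence


def nps(scores: Sequence[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Made-up responses: 2 promoters, 1 passive, 7 detractors out of 10.
unhappy_team = [10, 9, 7, 6, 5, 4, 3, 2, 1, 0]
print(nps(unhappy_team))
```

Because passives (7 or 8) count toward the total but toward neither group, a score can move a long way simply by converting detractors into passives.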
Yeah, but it's interesting how back then there was nothing.
and you kind of like pinpointed.
I guess one advantage that Uber always had is,
A, we had a lot of developers,
and there was actually a dedicated, you know,
what was your team's name?
Was it developer platform?
We were called developer experience.
They had multiple names.
I think we were mobile infrastructure first,
then mobile developer platform,
then developer platform.
That was the final name we had.
But funnily, actually, the team was not that big.
We had my team was about 10 people,
and we supported about 1,000 engineers,
and that's roughly the breakdown.
So you would see that speaking to peers in the industry, the teams are about the same.
You have a very small, quote, centralized team, and then you have a much bigger product
organization that they support.
So it's very common.
And that means that you have to find ways to scale yourself and be effective.
And then there was another project that I remember that was just taking off just in my last
year.
It was called DevPods.
What was that?
So I think a lot of people might have heard the term cloud developer environments, or developer environments in the cloud.
So we call these DevPods, and what it is, essentially, is a container with your code, your build system, artifacts, and your IDE indices in the cloud.
And what we did at Uber was fairly unique in the sense that we figured out some ways to make it boot really, really fast and have pretty much everything pre-indexed.
And it is one of the most loved products there, because you could essentially multiplex yourself now.
Rather than trying to switch branches, reindex your code, or context switch, you could have 10 different DevPods working on different features.
And they could have everything warm.
You could context switch quickly.
You could kick off a build and go do something else.
It was actually great.
I think we put a lot of engineering effort to make that scale really well.
And it was a fairly small team as well.
Actually, a bunch of those folks are still in Amsterdam now working on the project.
And can you just explain?
Because I think most companies just do not have anything like that.
Like as a developer, what can I imagine?
Like, okay, I joined Uber.
DevPods is there.
What does that mean for me?
So when you onboard, previously, we used to have this bootstrapping script you would run.
It's funny because I remember this now.
We have a bootstrap shell script that everyone just keeps updating.
There was a shell script.
And then which usually worked well, but sometimes it crashed.
And then we would have to, you know, ping the on-call.
of your team, but it was a pretty involved script.
Like it would take sometimes minutes to run.
Yeah, a lot of magic under the hood.
And the problem with that is it's hard to keep it forward and backward compatible
with your system updates on Mac.
So it was definitely like not worth maintaining over time.
But DevPods is interesting because I think just when we started seeing all these solutions
like VS Code Remote or JetBrains Projector become a little bit more mainstream,
we said, hey, why don't we move this development stuff?
into a container, it should be easy, right?
But it's actually much more than that.
These containers can be huge, multiple gigabytes, right?
Because some of these artifacts and indices are huge.
We had to make sure we also use compute efficiently.
So we would have to have multi-tenancy, because developer workloads are very spiky.
They don't always have like a stable pattern like a server workload,
which means that you have to like understand when to scale up and down.
We have developers in different time zones.
So we also have to provision machines
closer to them, so the build caches would be warm closer to where they're working.
Otherwise, you download everything from a data center in the US, but you might be working
in India and the latency is really bad. It's a lot more nuanced, I think, than just spinning
up a container. A lot of the solutions out there now are just focusing on containerizing stuff,
but not thinking through a lot of these other factors that are important. So we went through
and made a lot of improvements here. Some of that stuff actually, one of the funny things I
remember is I think JetBrains had this one shared index thing that they had a while back
where you could download an index of the project. But I think the way they were doing it previously,
you would have the full path of the system in the index. So you would download a denormalized path
and then you would normalize it to your system. Just populating all the absolute paths would take
a really long time. We measured it: it would take like an hour to actually index that shared index
for JetBrains.
But then on DevPods,
what we did,
we just said,
hey, everything's in our control.
Let's just put everyone at /home/user.
So it'll be like closing the laptop
and opening it again.
So we could just throw the cache there.
Everyone was at /home/user.
So now no more time spent
de-normalizing the cache,
which meant we had like a six-second boot-up,
which is really insane for a dev environment.
So you would come in,
you would say devpod start something and you're done.
You don't need to do any bootstrapping
or learn something.
And you would have one flavor per monorepo
for you to work on.
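The path trick described above can be sketched in a few lines. The index format, function name, and example paths are all hypothetical (real IDE indices are far more complex); the point is only to show why a fixed checkout location removes the per-machine rewrite step.

```python
from pathlib import PurePosixPath
from typing import Dict, List

Index = Dict[str, List[str]]  # absolute file path -> symbols in that file (toy index)


def relocate_index(index: Index, built_root: str, local_root: str) -> Index:
    """Rewrite every absolute path in a shared index to match the local checkout."""
    if built_root == local_root:
        # Same checkout path everywhere: the prebuilt index is usable as-is.
        return index
    relocated: Index = {}
    for path, symbols in index.items():
        relative = PurePosixPath(path).relative_to(built_root)
        relocated[str(PurePosixPath(local_root) / relative)] = symbols
    return relocated


# Hypothetical index built by a background job under /home/user/repo.
shared = {"/home/user/repo/src/App.java": ["App", "App.main"]}

# On a laptop with its own checkout path, every entry has to be rewritten.
laptop = relocate_index(shared, "/home/user/repo", "/Users/alice/code/repo")
print(list(laptop))

# In an environment that always uses /home/user/repo, nothing is rewritten.
devpod = relocate_index(shared, "/home/user/repo", "/home/user/repo")
print(devpod is shared)
```

With millions of entries, the rewrite loop is the expensive part. Pinning every environment to the same root path turns it into a no-op, which is consistent with the fast boot times described in the conversation.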
So a data engineer or like a backend engineer or mobile engineer would have their own
flavor.
And that's very optimized for that.
So it's basically you go on and say like that pot start.
And then in a few seconds, you have your web, sorry, your local editor.
You're using your local editor, but it's kind of like connected magically from your perspective.
So you feel like you're working locally, but it's just all set up.
It works.
You can build immediately.
You can run your tests, et cetera.
Yeah, you'd have lots of cores as well,
so you could run a lot faster than running it locally.
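The path problem described above can be sketched in a few lines. This is a minimal illustration, not the JetBrains shared index format or Uber's Devpod tooling: `rebase_index_paths` and the example paths are hypothetical. It shows that a prebuilt index recorded under one root must have every entry rewritten for a machine with a different root, and that the rewrite disappears entirely when every environment uses the same root.

```python
from pathlib import PurePosixPath


def rebase_index_paths(index_entries, recorded_root, local_root):
    """Rewrite absolute paths recorded under recorded_root to local_root.

    Returns (entries, number_of_entries_rewritten). Hypothetical helper,
    for illustration only.
    """
    recorded = PurePosixPath(recorded_root)
    local = PurePosixPath(local_root)

    # Same root on every machine: the downloaded index is usable as-is.
    if recorded == local:
        return list(index_entries), 0

    rewritten = []
    changed = 0
    for entry in index_entries:
        path = PurePosixPath(entry)
        try:
            relative = path.relative_to(recorded)
        except ValueError:
            # Entry lives outside the recorded root; leave it untouched.
            rewritten.append(entry)
            continue
        rewritten.append(str(local / relative))
        changed += 1
    return rewritten, changed


# An index built under one user's home needs every entry rewritten...
entries = ["/home/alice/repo/src/A.java", "/home/alice/repo/src/B.java"]
rebased, work = rebase_index_paths(entries, "/home/alice", "/home/bob")

# ...while a fixed root shared by all environments needs no work at all.
shared = ["/home/user/repo/src/A.java", "/home/user/repo/src/B.java"]
same, no_work = rebase_index_paths(shared, "/home/user", "/home/user")
```

With millions of entries, the per-entry rewrite is the cost being avoided; pinning the root turns it into a no-op.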
That's pretty awesome.
I previously did a deep dive on cloud-based development environments, I think, two years ago.
One thing I'm noticing is there's not much talk about these things, even though they're just so darn efficient.
Do you see kind of excitement in this space or is it just AI right now, you know, blurring everything out?
Because to me, this was a really big win.
Yeah, I think the environments are still good.
I think a lot of companies do want to use this for efficiency sake.
The challenge usually comes, I think, from the understanding that I have
talking to peers in the industry, if you start with the solution rather than the developer
workflow: hey, we have a container environment, make it work for your developer workflow,
versus, you have a particular way to work,
let's make it work in a container or VM.
So we took the other approach.
And that is why we were so effective.
Hey, these caches have to be super fast to load.
Other stuff can come in async.
And not everyone would have that insight.
If left to their own devices,
teams might devise a Dockerfile that has stuff,
but then how are you getting it up to date?
As your team churns out more code,
are your indices actually getting up to date?
Otherwise, you're spending a bunch of time
after bootstrapping to re-index stuff
or redownload stuff.
So we'd push these updates every few hours,
completely transparently, in the background.
So it was very much a golden path
that we had to kind of happen upon.
And that's why I think if you have that knowledge
of the entire developer workflow, you can adapt it to each company's individual working style.
If you give a general purpose thing, it'll work, but it may not actually work so well out of the
box and you may not have the investment you may want to make to make it work.
Yeah, it sounds like there's no free lunch. You need to put in the work.
And Uber had so many years of understanding, investing, being there.
Speaking of years, you spent nine years at Uber. And when you joined, you were an Eng II,
right?
Or an Eng II,
I joined...
Or one, I guess.
Yeah
What is the lowest level
Whatever the lowest level was
Yeah
And then when you left
You were a principal engineer
Which is, there's not many principal engineers.
Like several,
like a few dozen or something like that.
So you were promoted...
Actually, yeah,
maybe a couple dozen.
Yeah, not that many,
out of closer to two,
three thousand people working in tech.
And so you were promoted
I think I counted four or five times
In nine years
Can we
talk about your journey.
Like, how did you do it?
And what did you learn on the way?
Yeah, absolutely. I think one of the cheat codes is obviously that joining early
helps a lot, especially for getting on a rocket ship.
But it's hard, right? Because when I joined, we didn't have as many people.
The platform teams were tiny.
It was so small.
It was funny because I jumped straight from entry level to senior.
So I skipped a level because the first year was just put through the pieces of
we need to get a lot of stuff done.
and I was shipping really fast, so that helped, obviously, one jump up.
But it gets harder over time.
I think your roles also change a lot over time.
I think one of my good friends early on in the platform team said, hey, if you can get into a niche and go deep, it can really help over long term,
especially stuff that people may not want to do, like developer platform, developer tooling; not everyone likes to do it.
But if I said, hey, I really like this because I'm passionate about it, that helps because then you become the go-to person for that
aspect. So as the company grew, it was very obvious that, hey, I had to scale myself to
larger and larger efforts. So starting on mobile, then I started working on backend stuff,
and then on more holistic entire organization stuff. And that takes like kind of, you know,
breaking your own comfort zone a lot, like challenging yourself every year or two. So every
two years, I kind of do some introspection and say, hey, am I doing what I want, do I enjoy
what I'm doing? What can challenge me more? And then,
I try to go for that.
And usually the path opens up if you go that way rather than trying to chase like a particular level.
Because then when you're pushing the boundaries, the next level kind of becomes obvious.
And then past a particular point, it just also helps to have a lot of connections,
like social capital, mentorship.
It's very important at like a big company.
So I definitely went and got some mentorship with some more senior folks.
That helped a lot to understand how these principal engineers operate, and that what's required is more than just the engineering;
it's also the business side of things.
How do we actually solve problems and focus on the business metrics and bring that to engineering, rather than start from the bottom up?
And when you say, you know, you need to get some social capital and mentoring, it can sound a little bit hand-wavy to people.
But how did you get that?
I'm not going to ask how you went about it.
It sounds like you didn't say, oh, I want to get social capital.
But what were things that you did that you think actually helped
build it up, with people saying, oh, you know, Gautam, I really trust this person. I know him.
He's a go-to person, et cetera.
Yeah.
So one thing that helps is I have this habit of just helping people a lot.
So when people come to me with a problem with their dev environment, I would just drop everything and say, hey, let's try to figure it out.
I'll spend a little bit of time, time box it.
And if it fixes the problem, then I have some social capital.
It kind of accumulates over time because they would send someone else to you.
And over time, you can automate some of this stuff.
And I think I was one of the early people on the platform team to have office hours.
So I would say, hey, anyone can come in and just talk about your problems.
And we would, you know, just empathize a lot with everyone.
Say, hey, I know this sucks.
We're working on it.
It helps a lot to humanize the problem, right?
Because developers usually ping you on Slack.
And over time, what happens is, as you solve more and more problems, it's always
good to attempt.
And even if you can't meet the expectation, just say, hey, I can't solve this right now,
rather than try to say, okay, that's not in my area, right?
Let me just try to help you out.
And that really helps build trust with the other side.
And then when you go for mentorship, it's like, hey, I clearly know you have the right
sort of like mindset, you have the right goals in mind or like the right intentions.
So I would love to help you kind of go to the next level.
So that's usually what helps get very strong mentorships.
Yeah.
And I can just plus-one a little bit on the helping people because, you know, I was in
Amsterdam, which is, we say it's a distributed site, but we were just not at HQ. We were a
relatively small office, you know, compared to SF or even Palo Alto in headcount. And we felt
a little bit isolated. And sometimes, a lot of times, we would feel blocked by SF in reviews
and in anything. And after a while, I knew that, you know, you were one of the many
people on the platform team, but I think someone told me, there was this thing, like, if
you cannot reach anyone on the platform team, try Gautam, he usually
responds. And just thinking back, it was just wild to me that I did get responses from you
sometimes, unexpectedly. I didn't think I would get anything, I was just trying.
But you did spend time on this, and I just imagine that you had a lot of these things. And I'm not
saying you necessarily did it all the time, and obviously people like me were doing it sparingly, but
it builds so much goodwill and also trust. I just felt that whenever
you did go on a problem, you actually took it seriously. You kind of did it. And as
you said, often you would tell like, I cannot do this right now or it's not an on issue,
etc. So I didn't see everyone doing it. In fact, I saw pretty few people doing it. So
it's interesting how it feels like something that would not scale. But I guess to some extent it did. Is this
the startup mentality of, you know, do things that don't scale?
Yeah, I think it definitely doesn't scale in all cases.
But you can make it work.
I think it's funny because some people won't even try because it won't scale.
My suggestion is like, hey, let's meet the person and see it may not even be an issue, right?
Sometimes what you might discover is people are just finding it hard to find the right documentation
because they just looked at the wrong guide.
And that's where they're coming to you.
Then you can just go make that fix and then you have much larger impact.
But if you don't talk to people, you have these broken windows.
Like, you have the big picture, but then you lose touch of what's actually happening on the ground.
It's hard.
So I think my policy was always like, hey, I can always talk to someone, spend a few minutes.
And if it's like a very esoteric thing, I'll say, hey, I can get back to you.
But if it is not, I would rather just point you the right way.
People joked when I left, because they were like, hey, we should go build a Gautam LLM, because he would know the answers to, you know, pointed things at least, so that we could find the right information.
So I was like an encyclopedia of where to find stuff.
And you were a principal engineer towards the end of your tenure at Uber.
What is a principal engineer at Uber?
And also previously we had an episode at Meta.
Meta has archetypes above staff engineer.
Do you think you fit into an archetype or did you see kind of archetypes at the principal level at Uber?
Yeah, I did listen to that episode.
So that was very, it's a very good episode, by the way.
I think there is some resemblance to an archetype, although we don't actually correct
that.
So either you have depth in a particular area or you have a lot of breadth.
And sometimes it's a lot of internal influence or you might have a lot of external influence.
So it really depends on where you want to take your career.
I focus a lot on depth.
Like, as you can see, my specialization is like productivity and have some amount of breadth,
because of all the social capital and talking to a lot of people usually makes you understand
what are some of the business problems that Uber is going through. So that was actually a good
bonus for me because if I never talk to people, I won't even know, oh, these three teams are
trying to solve the same problem. Maybe we can centralize it or do it more efficiently and I
could come up with solutions and people would be like, hey, how did you know that would work? Because
you just talk to people. And there were definitely fewer archetypes probably than Meta, I would
say, but usually it's a bunch of breadth and depth, and having people to sponsor you helps a lot.
So the mentorship angle is also part of that.
If you're working with more senior engineers and they see you actually grow, they can give you
feedback where you might be lacking and you might be able to kind of work towards that goal.
So I think that's one of the things I would recommend anyone who wants to go in that path.
With principal engineering, it's a lot more you have to enjoy like understanding how engineering
meets business than just pure coding or just pure, what do you call it, not soft skills stuff.
A lot more soft skills need to be honed at that point.
It's like going and giving talks helps.
And it's funny because when I was growing up, I was not very good at like, you know, public speaking,
but I made myself over time more comfortable.
That helps also like build external influence.
And a lot of the early folks that are using our product right now are people I met in the
community and then they kind of have that trust that they built over time.
So this is a lot of people who are using your product right now at Gitar, right?
Yeah.
Yeah.
So I guess that's a good reminder of like how, you know, I guess it's when you're doing
something, it might, hopefully it'll help you outside of when you're at your current
company and also help you while you're at the company.
Yeah.
My tip is generally, if you really focus on the people side of things, you'll do
really well in these higher levels because it's a lot of relationship management, both managing up
and down and shielding your team from some of the noise. And if you don't enjoy talking to people,
it's harder. I mean, you can still do it. There are people who do it, but it's a very unique path
and not everyone can do it. But the most common path is usually understanding and spending time
talking to people and figuring out solutions that are more creative over time. And that also
kind of makes you a problem solver, not just someone who can just, you know, code and
finish things up. But of course, you can have different archetypes, as you
mentioned. So it really depends on where you want to take things. Yeah, but I guess at the principal
engineer level, it's kind of a given that you can code efficiently whenever and however;
it's just that you might not do it all the time by choice, right? It's funny because there was a graph
someone sent me of number of years at the company and number of commits
per year and number of reviews.
And there was like a little graph,
like an asymptotic graph. And there were
two or three dots at the very extremes.
And I said, hey, one of them
is me. The other one is you.
Then I was like, damn, I should
ask my boss for a raise.
So you were still writing a lot of code and
doing a lot of reviews? I do write a lot
of code. Even here at Gitar, I write a ton of
code. I love to review stuff because
it really opens up your
perspective of what else is happening. I like
to comment on documents. And
understand where there are inefficiencies potentially.
But it just over time, I think you have a muscle to just do it.
But it does not mean that you write tons of code.
It just means that you have consistent higher impact code.
So a really interesting topic is, given that you were a principal engineer,
which was either the highest level or maybe there was one level above it,
but there weren't many levels above that.
How was it like for your manager to manage you?
And I know you also talk with a lot of principal engineers and you have some thoughts on what are some tips to manage these very, very senior engineers?
Like, you know, can you help explain your relationship with your manager?
Was it really a kind of a boss or more of a partnership?
And, you know, what worked at this level?
I think that's a really good question because I think as you grow in levels, you're more and more sort of like a peer to your manager.
You help your manager get stuff done too because you have broader influence.
Sometimes it can be hard as an EM to, you know, push for a particular priority because
there's not enough senior people backing it, right? It might be just your director
that's sponsoring that. But having someone on the team that's senior enough can
completely change the odds, for example, to get something prioritized. One tip I will
give managers is, if someone's asking you, hey, I'm at senior staff or I want to
become principal, after the senior levels, if they come ask you, it's hard.
You should probably not ask that question to a manager because at some point you will have to
figure things out. There's very few managers who probably can help you after a particular point
because they've maybe done it before or have a lot of experience at the company. But most managers
tend to work at senior or senior-plus level. And past that, it's more like
the principal engineer on the team is load balancing a lot of stuff technically, so the
manager can focus on uplifting everyone else.
So the relationship is very much like a peer, like more than like, hey, just a boss.
But it does help to get feedback.
I think managers are a very good asset when you want to do non-engineering
stuff.
For example, you want to get stuff unblocked and the engineering angle is not
working because you tried to reason with some team
and they won't listen.
Sometimes you just have to take the, hey, this is the priority, we need to get this done,
to unblock yourself.
So that's what I would say for EMs
who might be looking at
higher level engineers.
Basically give them
agency,
but of course,
check in often and make sure that they're unblocked,
because they can do wonderful things
as long as your org is not getting in the way
of them trying to actually have impact.
Yeah,
this is always interesting because
I don't think we really talk about this too much
just because there's not many
of these engineers, but even when I was a manager,
I actually promoted someone,
actually two engineers,
a level above where I was, right?
I mean,
it was a bit easier for me because I saw examples,
but it's always interesting.
And I feel it kind of goes hand in hand.
Like for these things,
those promotions really help usually the manager as well.
And as you said,
for a lot of managers,
it's a first.
And there's no,
there's no playbook after senior or staff.
It's all unique.
It's all based on the business needs
for the company.
in that specific area.
Yeah.
But if you're able to keep up with the engineer,
then they can pull you up as well with them,
because then you can become higher level
and you can focus on bigger picture stuff.
Because you hear a lot of stuff from an org
or a company perspective from the engineers
because they just talk to so many people,
then you might have opportunities that you may not have
if you're solely focused on your team.
Before we jump back into the episode,
I want to let you know that the audiobook version
of my best-selling book,
The Software Engineer's Guidebook, is out now.
You can get it on all major
audiobook platforms like Spotify, Audible, Libro.fm, Apple Books, Google Play, and also as DRM-free MP3
files. I started writing this book after a decade of working as a software engineer, when I had
already been working as an engineering manager for a few years at Uber. Here's a review of the book
from senior principal engineer Tanya Reilly, who is the author of The Staff Engineer's Path.
From performance reviews to P95 latency, from team dynamics to testing,
Gergely demystifies all aspects of a software career. This book is well named. It
really does feel like the missing guidebook for the whole industry.
You can get the book at EngGuidebook.com
or search for The Software Engineer's Guidebook on your favorite audiobook platform.
I hope you'll enjoy it.
One thing we just kind of mentioned as natural is you were working on a platform team.
But when you started, I'm pretty sure there was not yet a platform team.
And then there was a split called the platform/program split.
It was a very famous email that I think later we saw.
I only read it.
Can you talk about what was before there was platform and program teams and how did this
split actually play out from your perspective?
Yeah.
So I remember when I was interviewing to join Uber, they asked me, hey, what do you want to do?
You want to do some platform stuff.
I said, I want to build tools for developers.
And they're like, okay, you're hired because no one wants to build tools for developers.
The longer version is the platform team sort of existed.
We just used to call it.
I think mobile had it first, I think, but then the other ones.
We just had unofficial platform teams.
They were working on core infrastructure pieces.
The platform teams were essentially made so that the other teams could move faster and not duplicate stuff.
There's a lot of stuff that goes into building features.
And mobile was just very new back in the day.
So things were just not as well figured out as they are today.
So you had to build a ton of stuff internally.
So if you're doing this on every team, then it's a lot of waste.
So I think when we decided on the platform/program split, we said program teams don't need to worry about a lot of the underlying layers, similar to a cloud native environment.
You worry about your business logic, your product metrics, and we'll give you all the building blocks, which could be a mix of open source wrapped in our own layer and our own architecture that makes it so that you don't mess with the other features.
Because if everyone has their own way of building banners or their own way of
styling things, it's not at all consistent for an app like Uber.
You need to have a common design language, so you'd use common components.
Stuff you would do on a website or take for granted today just didn't exist on mobile.
So it had to be done that way to kind of make things move faster.
So you don't have to rebuild like another widget or rebuild like another analytics processing thing and do something in the back end.
So one pain point, or I wonder how big a pain point it was, but one thing that looked challenging to me:
I was not on the platform team.
I was your customer, but we always pinged you with questions.
And as you mentioned, like a year ago or so, there were 10 engineers for about 40,000
engineers.
What did support look like, and what practices did you figure out?
Because I saw a lot of practices like office hours, on-call rotations, where to ping,
et cetera, that I have not seen anywhere else.
Like the platform teams were advertising these things.
It was almost like, I don't know, having a vendor almost.
Like there were contracts.
There were actually SLAs.
It was published how fast you can expect things, where to escalate, how to tag something with high priority, etc.
I've never seen anything like this.
And how did you figure out like support?
And how important was it?
Sure.
So the angle that I think we thought about was like, hey, for everyone else, the end user is a customer.
For us, the developer is a customer.
Let's not treat it as just another internal thing.
Let's really be customer obsessed.
That was the value we had at Uber,
that we were extremely focused on the customer experience.
Developer experience versus the customer experience,
which means, imagine if your Uber didn't show up,
you'd be very upset.
So if your build doesn't finish,
exact same feeling: okay, I cannot ship my code.
So we need to make sure that there are reliability
and latency guarantees for that exact same reason.
So all the product-minded folks, I think, were very
much honed in on product analytics.
We said, hey, let's think about what is important for the developers.
We asked them things like build time.
They said, hey, I want to make sure my CI is reliable, has no flaky tests.
Can you make sure the flaky tests are not blocking me?
Because it's expensive with the build cut.
You have to chase the release train.
And that's not fun.
I'm sure you've seen your team complain about, hey, I had this flaky test.
I had to resubmit this diff.
And that's not going to merge in time with the release train.
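One common way to spot the flaky tests mentioned above is to look for tests that both passed and failed on the same commit, since the code did not change between those runs. The sketch below assumes CI run records are available as simple tuples; `find_flaky_tests` and the record shape are hypothetical and do not describe Uber's tooling.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return names of tests that both passed and failed on one commit.

    runs: iterable of (test_name, commit_sha, passed) tuples.
    Hypothetical record shape, for illustration only.
    """
    outcomes = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        outcomes[(test_name, commit_sha)].add(bool(passed))

    # Mixed outcomes on identical code point at the test or the
    # infrastructure, not at the change under review.
    flaky = {name for (name, _sha), seen in outcomes.items() if len(seen) == 2}
    return sorted(flaky)


ci_runs = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),   # same commit, different result
    ("test_payment", "abc123", False),
    ("test_payment", "abc123", False), # consistently failing, not flaky
    ("test_map", "abc123", True),
    ("test_map", "def456", False),     # different commit, may be a real break
]
flaky_tests = find_flaky_tests(ci_runs)
```

A list like this can then feed a quarantine step so that a known-flaky test does not block a diff from merging.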
Yeah.
And just like staying later, I'm not sure.
I think the release train left on Wednesday
night. So Wednesday night was the cutoff, or morning, I'm not sure. But I know people stayed longer
on those days to catch the release train, meaning getting their PR, or as we called it, the diff, merged
on there. Yeah. As you say, it was a big deal. I didn't actually realize that
you were like obsessing about it, but it kind of makes sense because after a while it wasn't an issue.
Yeah, because that's when people complain a lot that, hey, I tried to catch the train, but then
something was out of my control. The infrastructure
was flaky.
So we missed this deadline.
Now, I'm on the hook as a platform team to ensure a hotfix happens, which means a
bunch of other teams are going to get disturbed because they have to retest all their
workflows.
It has a lot of ripple effects, right?
And these are not obvious unless you think about, oh, we cared about testing because
the end user will see the impact if you don't test all these features again.
Yeah.
So for us, actually, the metrics are very important to have. We would have SLOs,
we would publish them, we would try to keep them, we would review if we missed
them and then do incident management and retrospectives so we could avoid those.
A lot of the good tooling that came out of that was to ensure high reliability and low latency,
which means you could ship product really fast as a company and the infrastructure would not
get in the way.
And that is why some of those metrics you see matter is because once we actually talk to the
developers, we knew what actually mattered to them.
That's what we focused on.
So is it just, I'm just thinking about this, but is it safe to say that, I mean,
I think we can agree that the platform team or the developer experience team at Uber did a great job.
But they did a great job because you ran it like a product team.
Like, as you said, you talked with the customers.
You figured out how to map how they feel or what they care about into metrics.
You then monitor those metrics.
And then you kept iterating on these things.
Like if I replace this with, you know, like a paying customer is the same thing, right?
Like that's what a great product team does.
absolutely you have to run like a product
because then the incentives are good on both sides
because if the developer doesn't like the product
then you should
you should kind of be like okay what do I do to make it better
so if the company is shipping high quality product
then we should also ship high quality product for our developers
and that's how we took pride in making sure that our developers
could deliver to the end user so yeah you're absolutely right
The metrics, I think, that a lot of people might focus on at a high level
may not be enough. You're right, you have to talk
to people, you have to look at the funnel, where people are dropping off, things like
the onboarding, for example, for new engineers. Where would they have difficulty? With Devpods,
for example, it was like, hey, people would go look at these outdated docs and they would try to
set up an environment and then just spend a lot of time. So having a Devpod just eliminates that. And then
later you might have a situation where a build failed and there's a very easy quick fix; you should
automate that. You should not ask the user to go do it. If there's a linter failure,
in my opinion, we failed, because a linter should either fix it completely or not;
just telling them, hey, this is a problem, is not going to do anything.
You're under so much pressure to ship product that if it's not blocking me, I will merge anyway.
So if it's blocking me, if it can be auto-fixed, why can't you have auto-fix?
That's kind of how we took things.
I like that.
What is your take on measuring developer productivity?
Your team was called developer experience.
I'm sure developer productivity came in.
This is a very heated topic these days.
It seems like for the past few years,
and there's a lot of back and forth of what can you measure,
what can you not measure?
Should you measure diffs per engineer or per team?
You probably thought a lot about this or have been in the middle of this.
What worked and what didn't?
Yeah.
So I think the very first thing we measured was just sentiment.
So talking to developers, as I mentioned, the NPS survey was very easy
because it was a very quick signal to see if things are in the right spot.
If everyone's super happy, then there's no problem, but that's rarely the case.
There's always problems that would pop up.
I think as you mature your measurement, I think a lot of the frameworks we see out in the open
are sort of inspired by what Uber did early on and took some of the good bits out.
I think when people talk about diffs per engineer, they're not really saying,
hey, you should commit more diffs per engineer. It's:
is something preventing you from moving as fast as you could potentially?
Because I can tell you that engineers, if basically nothing blocked them, would love to write code.
If you're an engineer, you want to ship code, you want to feel proud about what you shipped.
There's no one who will come in and say, hey, I don't want to write this many diffs.
I want to write code because it makes me happier.
And what we were looking at those signals for was, hey, do you have too many meetings?
Do you not have focus time?
Because diffs were usually a matter of focus time.
So when we improved that, we saw focus time improved.
I think when we had COVID and everyone was working from home,
the diffs per engineer shot up,
and everyone thought that, hey, people are working so hard, it is amazing.
It's just because people just didn't have too many meetings initially.
And over time, when everything went to Zoom,
it kind of came back down again a little bit.
It's usually a way to make sure that your infrastructure
is letting people move as fast as they can.
And as you remove roadblocks, you're focusing on the right things.
And in fact, actually, one of the things we measured is time for code review, more than anything else, the time to review was the biggest bottleneck.
Because as you mentioned, you would put up a PR in Amsterdam, might wait for review from SF.
That's already like a 10-hour delay.
Yeah.
And I think the P90s were pretty crazy.
Like, it would take days to get code reviewed.
And that would just be the biggest blocker more than anything to ship product.
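To make the P90 figure concrete: it is the review time that 90% of diffs come in under, so it surfaces the long tail that an average hides. A minimal sketch using the nearest-rank method follows; the sample hours are made up for illustration and are not Uber data.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the
    data at or below it. pct is a number in (0, 100]."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical hours from "review requested" to "approved" for ten diffs.
review_hours = [1, 2, 2, 3, 4, 5, 8, 10, 30, 72]

p50 = percentile(review_hours, 50)  # the typical diff
p90 = percentile(review_hours, 90)  # the slow tail
```

In this made-up sample the median review takes a few hours while the P90 is more than a day, which is the kind of gap that points at cross-time-zone waits rather than CI time.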
And honestly, CI time is like a blip compared to the amount of time it takes to review,
right? So in the big picture, I think we worked to improve turnaround time by nudging people,
and by auto-approving PRs that had minor changes that semantically would not change the meaning
or the behavior of the PR. That helped a lot because you didn't have to get another approval
for making a comment fix or doing a small stylistic change, for example. So that helped
kind of unblock a lot of those scenarios. We had auto-approve or auto-land, which helped because
if your review was approved, then it would merge immediately.
So you don't have to rebase and worry about conflicts.
A lot of these kind of small things can actually have a really big impact.
But if you don't measure, you wouldn't know.
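The auto-approval idea, where a follow-up change that only touches comments or formatting does not need a second review, can be illustrated by comparing syntax trees instead of text. This is a sketch of the general technique, not Uber's implementation: it uses Python's standard `ast` module as a stand-in, and `is_semantically_unchanged` is a hypothetical name.

```python
import ast


def is_semantically_unchanged(before_src, after_src):
    """True if two Python sources parse to the same syntax tree.

    Comments and whitespace are not part of the tree, so edits limited
    to those compare equal. Anything that fails to parse is treated as
    a real change and sent to a human reviewer.
    """
    try:
        before_tree = ast.parse(before_src)
        after_tree = ast.parse(after_src)
    except SyntaxError:
        return False
    # ast.dump() omits line and column attributes by default, so moving
    # code around by adding comment lines does not affect the result.
    return ast.dump(before_tree) == ast.dump(after_tree)


approved = "def fare(km):\n    return km * 2\n"
comment_fix = "def fare(km):\n    # base rate per km\n    return km*2\n"
logic_change = "def fare(km):\n    return km * 3\n"

can_skip_rereview = is_semantically_unchanged(approved, comment_fix)
needs_rereview = not is_semantically_unchanged(approved, logic_change)
```

Note that docstrings are part of the tree, so a docstring edit would still count as a change under this check; a production system would have to decide how strict to be.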
So I think in terms of developer productivity, when you talk to your engineers,
they might say, hey, I feel less productive because I can't do something.
Then your metric should flow from that rather than just blindly taking a framework
and saying, hey, this is a framework in the industry, this should be it, right?
It may not actually work.
I mean, it's not also the same for all kinds of developers.
Is it safe to say that at Uber, you know, developer productivity was measured in the sense
that there were these, you know, charts metrics.
It showed the diffs per engineer on your team.
You could see it and how it changed over time, the time to code review, focus time, et cetera.
But is it safe to say that these were measured with the intention of finding bottlenecks and
what to improve?
and they were not measured with the intention of using them for performance reviews, promotion,
or figuring out who, you know, who's the low performer.
Like, is that a fair assessment?
Yeah, that's the usual controversy.
People tend to misuse it.
The guidance is always, like, you know, look at them at a high level.
And that is where a lot of these were aggregated, right?
And you can find ways to, like, disincentivize people from bringing this up, right?
Because you can easily catch bias.
We had, for example, folks on the performance committee who would be there just to catch bias: hey, are you just looking at diffs and not looking at the quality?
And sometimes, like, people may not have as much output because they have other contributions they're making.
They might be writing docs or they might be like influencing strategy.
They might be working on like unblocking teams or like figuring out product and business goals and distilling them to like requirements and engineering.
So I think initially people definitely thought of it that way, but that was never the intention and that was never the case.
In my experience, at least at all the companies I've been at,
we never looked at it as like anything more than,
hey, it's just good to know that,
hey, if you haven't done any PRs, that's a red flag potentially.
If someone had zero PRs in the entire year,
we would not say, explain yourself;
we'd clarify with the manager: hey, what's going on?
What are they doing?
But we never say that as like a blanket, okay, that means nothing.
So we always want to use this as a way to say,
if there are red flags, we catch it,
that there might be something else going on,
but it was mostly used for unblocking
teams so they could ship fast.
Yeah.
So it sounds like it's just like if you have data and you can gather it and that data can
actually help you build the right things.
I mean,
it will be silly not to do it.
And as you said,
start with what the pain points are,
what people complain about.
I mean,
we always complained about,
well,
that was there.
It was always about code review and how long code review took.
And I don't think we ever measured it.
I think I left by the time this was there.
But if we would have measured it,
it would have just really shown that
our median was probably more than a day,
because we were waiting on a team on the other side of the Atlantic,
and then we figured out how to cut those ties.
Code is rarely a bottleneck, especially now with like AI tools. You could just generate a bunch of code and throw up a PR.
That's rarely the bottleneck.
Bringing alignment or figuring out product requirements or making sure you're not breaking the experience is the hardest part.
So just measuring output in terms of PRs is probably going to be harder to justify in this new
age of AI tools because people are just shipping so much more.
Like we use a lot of AI tools at work, including our own, and we ship a ton more than
we used to back at Uber.
So you're now in the business of AI tools to help engineering productivity and engineers
work better.
How do you see these tools changing how we do software development?
I'm talking, obviously, about the usual suspects: the coding autocomplete, the copilots,
also the agents and also everything else that's coming.
Yeah, absolutely.
So we're very much in the thick of this space.
Obviously, autocomplete is a huge boost because, in many cases,
it helps you type less, but think more.
So I think you probably use a bunch of these tools.
I do too.
It removes a lot of the grunt work, but it won't remove your thought process because
the taste in what you want to do is still there, just not the how sometimes.
It's like, okay, this makes sense, right?
But I feel like a lot of the tools are either focused too much on code completion or they're only on the code review.
I think back to what we learned at Uber: we had a lot of inefficiencies, things we wished we could solve with some sort of intelligent layer.
Like I was, for example, going around unblocking things.
Like, hey, I knew exactly what would need to happen to like unblock a build.
So having some sort of like an agentic understanding of things that are happening in the IDE, why did you type this character?
That affects what's happening in code review, and that affects what happens on deploy, and that affects what happens in your incident, for example.
There is nothing that's cross-cutting.
So that's kind of what we're trying to solve at Gitar: have the whole SDLC under our purview and not just focus on the point solution of, I'm just an IDE code completion, or I'm just going to be on your CI reviewing stuff.
A lot of those solutions currently are, I believe, good starting points. In my experience of how I've seen technology shifts happen,
the much nicer or more important unlocks happen a little bit later into the cycle.
Since it's very early in the cycle with AI, at this point, people are just experimenting with stuff.
But I believe that there's a huge value add for agents, just because, as you mentioned,
there are some things that are so obvious when you think about it that, hey, we wish we wrote
like a script to automate this, right, for support or for documentation.
Those can be potentially driven by agents, but then still have the human in the loop when things
need to be approved. That's kind of how we think about it.
And like as you're using these tools, like how does it change your development?
Like are like obviously we're writing less code because it's now like there's just more
suggestions or there's also agents. But you know, like what does it allow you to do?
Are you, you know, just moving faster or are you actually having deeper thoughts?
I'm just interested in like what it means for you, the current version of these agents.
And we know, you know, it's the start of a cycle.
We never know how long it'll go, but clearly it's only been like no more than two years.
So. Yeah.
I think it allows you to explore more paths that you previously didn't have time to explore
because you could experiment much faster and discard ideas much quicker.
Because when you're prototyping, you don't need the full rigor.
You can put them behind a flag and then say, hey, this looks good.
I think people are calling it vibe coding these days.
Like, hey, this has the right vibe.
We can experiment with this.
Of course, it's harder when you have a lot of other constraints.
Like, you know, you have to make sure the software is, you know, secure, compliant.
A lot of people are not thinking about that when prototyping. So it lets you prototype faster,
which is great. And I think what's also happening is I feel now there's a lot of talk about,
hey, junior engineers are going to get replaced. I think it's the opposite. My take is that
junior engineers are going to thrive because they are coming with new knowledge, new ways of
working with these tools and are going to be much more effective because they don't have
the bias of working a particular way.
They're able to like achieve things that, for example, it would take a long time to
like even ramp up on.
It doesn't mean that you shouldn't know the fundamentals.
In fact, like, if you think about 20 years ago, they had like a DBA and a HTML and like
a Java application engineer.
There was no data engineer.
And a webmaster sometimes.
And there was no like SRE.
No.
There's no like front-end person, there's no mobile person,
there's no like data engineer, ML engineer.
There's so many more archetypes now or like, you know, sub-categories.
I think with AI, the generalist engineer is going to see a rise again,
which means I'm able to jump into, for example, front-end
without having done much front-end.
Because I can understand that, okay, this is what I expect to happen.
and I would like to have this outcome with my software,
which means I can then focus on the other parts of the problem.
You mentioned vibe coding, and this is a brand new term.
What do you think it stands for?
Is it just prototyping?
Is it just like throwing out ideas, or is it a bit more than that?
I think vibe coding is just you trying to figure out how the system should behave
when you're prototyping.
The good thing about vibe coding is like,
you're able to basically iterate much faster
with these agentic loops, so that now you're just focusing on
how would I actually achieve my outcome
rather than the exact way to do it
because you can always change those details later.
If you have the right abstractions,
you could swap layers out.
So I think that's changing the way people think about software
because if you're someone who has taste
and knows like this is the outcome I want,
to see on the UX, you may not even need to know a framework. You can experiment with it till
it feels right. And of course, like, you need to massage it to make sure it doesn't make things hard
down the road. I think a lot of the software today gets you to the initial "it looks great," but then
how do you maintain it is basically sort of left for a later time. And I think a lot of people who
are coding with AI are still early in the prototype phase. I don't think at enterprise scale that would
work, because I can imagine it will probably mess with the entire abstraction layer you already put in.
So you have to really constrain it to work well.
So vibe coding is great for prototyping at this point.
But that's also why I think you can move a lot faster with less constraints
because AI has a lot more freedom to explore.
One of the problems that we're seeing,
and if you're using it, you'll also see it, is like there's this,
Addy Osmani called it the 70% problem, that, you know,
like vibe coding and these things, they get you started.
But then it can get just really confused and stuck.
And people see this.
Like people who are not really technical,
they kind of get stuck.
And that's where experienced engineers really shine.
You said previously that you think junior engineers will do great there.
How do you think less experienced engineers can deal with this?
That's one.
And then the other is like, is this where we might see, you know, senior engineers actually,
you know, thrive and spend more of their time?
Yeah, I think the main differentiator there is like when things go wrong,
understanding why they went wrong, which means strong CS fundamentals,
strong sort of system knowledge of,
hey, how does the entire thing work end to end?
So senior engineers usually tend to have a better, bigger picture.
So if they adopt these new tools, actually,
they can get a lot more productive.
I can tell from my own experience that adopting these tools,
I'm able to get a lot more stuff done
because I know, for example,
this is how I expect my system to work,
which means I can delegate to,
I used to give it to a more junior person
to, like, you know, understand the system.
Now I can just give it to AI.
I can check the work and I can commit it.
But if you're very stuck in your ways and don't want to use these tools, you can still do it.
It's going to be a lot slower and that's totally fine too.
It's just a matter of how quickly you want to ship.
And for things that are not critical, you may want to still experiment to see what's out there.
Because the things, the way they are going, the tools are getting better every day,
which means over time, a lot of the mistakes that are happening today will start to disappear.
There'll be like other agents that will check the work of the first agent,
so there are not going to be as many mistakes.
You still want to have a human in the loop before you check in.
Otherwise, you will not have a full guarantee that whatever you're shipping to the user is going to be good.
But you definitely want senior engineers to focus on the high-level system-level knowledge,
and the agentic loops can be more nuanced and more focused on smaller parts of the puzzle.
After nine years at Uber working on developer experience,
you've now co-founded a company called Gitar.
How are you thinking about continuing this developer experience?
And also, what's your approach to AI?
What are your bets on where the space is going to go?
Yeah, that's a good question.
So when I co-founded Gitar with my co-founders, Sully and Raj,
we basically saw that for a lot of the inefficiencies we had,
the tools were not available back in the day at Uber.
We saw because we had a golden path for developers.
Now with agents, I think, we talked about Devpods
previously, right?
Like, hey, there's a golden path for a developer
to get stuff done.
So what if an agent had a golden path
to get things done for a particular enterprise?
So that's how we're thinking of Gitar,
like the agents we're building at Gitar.
We are going to launch a new product pretty soon.
And you can check us out at gitar.ai.
The agentic AI, we build,
is actually inside the IDE on the code review system,
inside your deployment, it's across the board.
So there's no reason why you can't auto-complete
like a deployment shop and not just be stuck inside IDE.
And having an understanding of things like,
hey, how are things going in production?
Which part of the code is slow?
These dots are usually what engineers connect, right?
The context is what flows through your brain.
And if there's a standardized path for an agent
to evolve software and maintain it,
it becomes much easier to focus on the business value.
And one of the things that people dread
is, hey, maintaining stuff like updating libraries or fixing tech debt or adding coverage.
We feel like a lot of that grunt work can be done very efficiently with agents.
So that's where our focus is: we don't want to be just like a code completion tool or a
point solution.
We want to have it across the entire stack of software development.
And we have that experience working on this for many years, talking to peers in the industry.
There's a lot of potential for disrupting the space.
and the stuff we wish we had,
hey, I wish someone was on call during this time
to fix a small thing,
we can make that happen now
because we can give the agent powers
to unblock people when you might be asleep,
which means you'll be on call less often.
So we're thinking of it from a holistic developer experience standpoint
and producing more code is just one part of it.
It's about doing it reliably,
maintaining it and making it understandable
so newer developers can understand what's going on.
Because if we just let AI run amok on your code base,
and not have an idea what's going on, then that's a recipe for disaster.
So we feel like there's a lot of opportunity for making the entire experience end-to-end
much more standardized so that the golden path for developers and agents is like really well laid out.
Now, like, you know, when I'm listening to this, like one part of me is like, you know,
if I'm a business owner, I'm like, oh, that sounds great because over time,
I might just have less developers because I'll have a really capable agent.
And as a developer, like, obviously one thing that comes
into your head is like, wow, you know, like we might have a smaller team.
Hopefully I'll still be there, but I might have fewer colleagues or, you know, I might
be the unfortunate one to get the cut.
Like, let's just run with this, that this future is coming and we will have a lot more
capable agents.
In this case, what are, what do you think developers will do?
And what are, what are skill sets that are worth investing in so that you will still be
an efficient, a great developer in, in the future, you know,
seen as a highly efficient developer 10 years from now,
you know, your managers will be like, oh, my God.
Like, we cannot do work without this person.
We need this person here.
And we should probably give them the raise because they're underpaid.
Yeah, I think, as you said, there might be less people in a business,
but there'll be more businesses, right?
That's typically how a lot of these technologies go.
If building software is easy, then the value unlock will be, like, actually the taste
and the experience for the end user.
Like, what do we actually provide?
If every website looks the same because you prototyped
it in like a website builder, then that's not going to be your differentiator.
Like even I think during the mobile era, a lot of apps came up to do things like to-do lists
or, you know, like reminders or like, you know, just scrolling social media.
But then eventually some apps with great UX
won out because they're just a joy to use.
So knowing what matters to your end customer is going to be a differentiator for engineers to like
hone in on.
So that means understanding the business really well.
Right? And then the other thing is like, how do you build something for scale, essentially? As you
grow the business, you don't need as many engineers probably, and that's okay. And that might be
fine, because if you're at a smaller company with fewer engineers, if you have equity, for
example, you might have more of the pie, which means that's a good way to kind of incentivize people
to experiment with more ideas. And usually competition is good in that way, right? So you may not be
bound by those restrictions that you have to always be like a big company. I think we'll start
seeing smaller, more efficient teams who are going to be more productive and will deeply understand
the product. And that's going to be, I think, a good way to compete with other products in that
space. Like the taste, the art that goes into crafting great UX or great end user experiences
is going to matter more than how you wrote the code.
Now, if I think back, for example, even 10, 15 years ago, we already had this,
right? Like WhatsApp, they were barely at 50 people when Facebook acquired them for $19 billion,
which is crazy. This was with no AI, nothing, but clearly they were a very efficient team.
Instagram, I think 12 or 13 people. Also, they built an app that already had 30 million users
with a team that small. Even today, that would be a big deal. So it's about getting that skill set of
you, as an engineer, being able to do a lot with these tools with a small
team. You mentioned taste. You mentioned art. What other things are worth focusing on
that could probably help you, you know, do a lot more with less?
Yeah, I think having an
understanding of the system layer is good. So I think the scalable part we didn't cover. So how does
your app or software scale to more users? What parts of the system are less efficient, more
efficient because all code might do the same things, but less or more efficiently.
And how do you maintain it?
So that's the challenge a lot of people run into.
As software systems grow, is it easy to upgrade because you have less external dependencies?
Is it harder because you have so many versions?
The software might work the same today, but okay, three years down the line, if you can't
upgrade and move faster, your competition is going to basically move faster than you because
it started on a clean slate.
We see that all the time.
I think there's a lot of legacy technology or database systems.
people are trying to move to modern systems.
It's a big cost, right?
And that's very expensive, a lot of engineering time wasted.
So building for basically maintainability is going to be important as well.
Building for like either AI agent maintainability or human understanding so that you don't
have to like, you know, look at outdated docs or do an expensive migration.
I mean, for example, if I could wave a magic wand and move all my micro-repos to a monorepo
and everything just worked, that would be awesome.
I don't think any system can do that today yet.
Maybe it would in the future.
But if that's a reality because you've designed your software to move that way, that's great.
So having that knowledge matters, because AI agents are fundamentally based on expert knowledge.
So your knowledge is like the understanding of your entire business, because they can't really tell you how you should write your code for a user.
What matters to one business may not matter to the other.
some parts of the software might work well for one segment but not work so well for the other segment.
So really understanding those business values will differentiate just AI-driven code versus how do you actually use that code to make your business like actually thrive?
So again, it might not happen, right?
Or it might happen slower than we expect, et cetera.
But what I'm hearing is like even if it does happen, having expert knowledge helps, having taste helps, and you kind of get taste from seeing things,
failing at things. So is it safe to say that it's probably a pretty good strategy to try your best
to go and work at companies that are either building stuff at scale or they just have kind of taste.
You know, there are startups that are known to do these things so that you can build up one of
the muscles of like scale, taste, if you can switch between them? Because in the future, I assume
that the successful people
will probably have the pattern of like, oh, they worked at companies that kind of did pretty
groundbreaking stuff at their time.
And now that we have these tools, well, yeah, they're doing even more groundbreaking stuff.
And they probably built up a bunch of their experience.
Yeah, the experience matters a lot.
Like, knowing a lot of the stack helps.
Like, when I was at Uber, I didn't do much work on front end, but now I'm at Gitar.
I'm doing like front end.
I'm doing infra.
I'm doing a bunch of stuff.
And I'm using agents to help me.
Like, we use our own agent to like rewrite a bunch of our stuff.
And I was like, okay, now I understand why this works
that way. I'm a generalist engineer now. I have strong fundamentals. Let's say I can see the system.
I know this is how it's supposed to work. Now I uplevel myself without spending. It's kind of like
that Matrix movie where Neo gets that, you know, upgrade to like know kung fu. But it doesn't mean
that he can beat the master yet, you know, like he still has to train up and understand it.
Yeah, I like that. So with that, let's go with some rapid questions. I'll just shoot out
some things and then you go.
what is your favorite programming language
and why? Rust.
Rust. Since when?
Since Gitar, actually. I was like initially like,
oh my God, Rust has a bad reputation for being hard,
but it makes things so safe.
It's hard to have I think like outages if you design it right.
I mean you can still have them but like you can really understand
the type safety is there to help you out
and you start treating it like your friend
then it unlocks a lot of stuff.
So I'm a big of...
So someone who's...
You've spent a lot in Java and Go, right?
Like, you've probably spent like half your career in those...
Yeah, it's funny because when I graduated college,
I told my friends, I would never work in a company that does Java,
and then I did Java for nine years.
So, you know, never say never.
What's an AI-powered dev tool that you use and like?
We use our own agentic thing quite a bit.
Oh, nice.
Jimmy, which is a new agentic.
Jimmy, yeah, because guitar, you know.
We use Cursor as well for like, you know, the autocomplete,
but we feel like Jimmy gives a lot more end-to-end stuff for us.
So we've been using it pretty heavily, but you can also try it out hopefully soon.
And then I use a bunch of other smaller ones, like Claude directly, obviously, for question
and answer stuff.
I do want to play with like some of the newer DeepSeek research models and stuff like that.
Because I've been hearing good things about them for deeper work, but I haven't had a chance yet.
And what's a book that you'd recommend?
My favorite book, I don't read too much, but I read this book early in my career called Head First Design Patterns, which is a little old at this point, but the patterns are really good because it really makes you understand how you can layer things.
A lot of software today is just abstraction over abstraction over abstraction if you think about it.
So it really helps you understand what pattern makes sense when and when to use it, how to transition.
between them. I would highly recommend that. If you're trying to go into a more systems-level
approach and want to design software for maintainability, that's very crucial. Making it pluggable,
composable and replaceable under the hood. A lot of the migrations we did at Uber were zero downtime
with like no code freeze. When we did the Java migration, there was no code freeze. So that was
a crowning achievement. Yeah. Yeah. And banks these days still, you know, like they will shut
down their thing for some migration for like a whole day on the weekend.
in 2025.
It's a big difference, as you say.
Well, Gautam,
I really enjoyed this conversation.
It was good to reconnect and just see how, with developer
productivity and empowering developers,
you're still like just running with it.
Yeah, I think this is the area.
I love working in this area.
I can't see myself doing anything else.
I've been doing it for 10 years.
I hope to do it for 10 more.
We'll see.
But I hope I can make more developers productive.
I hope you also found this episode about Uber's internal tools,
developer productivity and AI's impact on software engineering as interesting as I did.
You can find Gautam on social media as linked in the show notes below
and check out his company, Gitar, at gitar.ai.
For more deep dives on Uber's engineering culture,
check out The Pragmatic Engineer articles linked in the show notes below.
If you've enjoyed this podcast,
please do subscribe on your favorite podcast platform and on YouTube.
This helps more people discover the podcast and a special thank you if you leave a rating.
Thanks and see you in the next one.
