CoRecursive: Coding Stories - Story: Platform Takes The Pain
Episode Date: November 2, 2023
How did Spotify scale from 10 engineers to 100s to 1000s... without slowing down? Without becoming corporate? Facing an IPO deadline, Pia Nilsson worked with 300 teams to transform how Spotify built software. She spearheaded a movement that led them from working in silos to a unified developer platform. Hear the inside story of how Spotify's Platform teams embraced transparency and customer focus to create Backstage, now used by companies worldwide. It's an amazing tale of ingenuity and perseverance. Hear Spotify's secret to scaling engineering without losing speed and independence. Don't miss it!
Transcript
Hi, this is CoRecursive, and I'm Adam Gordon Bell.
Each episode is the story of a piece of software being built.
Today, we're peeling back the layers on Spotify,
and a monumental challenge they faced several years ago.
Here's the problem.
It was 2016, and Spotify needed to IPO.
But IPOs, they take years to plan. You have to hit the market timing right. You've got to have
everything in order internally. And you've got to look exciting to potential investors.
To get ready, you have to pay investment bankers. You have to pay auditors to get feedback on the
gaps in your business.
And then they found a problem.
We had learned from the external auditors that we had too many vulnerabilities in our CI infrastructure, i.e.,
we had over 200 bespoke Jenkinses running at various stages of cleanliness, sort of.
Because, of course, that was not only 200 teams maintaining each of these,
but actually it was 300 teams on these 200-plus Jenkinses.
So these Jenkinses were maintained at best to some kind of degree.
So yeah, each team was responsible for getting its own code built
and shipped to prod. Each one owned their own CI instance. You could use the tool that best suited
how your team worked, and you would set it up and you would manage it yourself.
Brilliant, right? And very empowering and motivating for teams. The challenge, of course,
with this is this led to a lot of complexity and duplication.
That's Pia Nilsson. She's fresh from a smaller company, and she's a force of nature. She's an engineer turned passionate manager, and she thrives on technical challenges. Spotify's situation, though, wasn't just a technical challenge. It was about culture and it was about business stakes.
Imagine Ernst & Young, the IPO auditor, sifting through Spotify's operations,
you know, looking for secure practices. They find this maze, 300 individual paths to production,
all in varying states. Alarm bells went off, right? How can you guarantee security with 300 distinct paths?
The risk was evident.
So Spotify needed to address this.
If not, it would have to show up in the audit,
which is an official IPO SEC document.
And it would hurt the stock price potentially.
It might hurt the IPO.
That's a big problem.
So when I joined, the CI project was to sort of consolidate all of this,
build one CI solution for everyone, and do that rather quickly.
We had a timeline of six months.
That seems fast, right?
Join an organization and lead an organization-wide migration
at a company 10 times the size of the one you just came from.
Migrate across 300 teams, across 2,000 engineers,
and do it in six months with the IPO on the line.
But Pia signed up.
And really, when it was all over,
she and her teams at Spotify didn't just consolidate CIs.
They led a transformation in the way Spotify did things.
A change that is now spreading to other large engineering orgs.
That's today's story.
And it all starts on day one when Pia joined Spotify.
Because on her first day, she stepped into a world unlike any she had
ever seen. It was super cool. Like, you entered into this building, and for each floor, you walked
from one room to the next, and they were so unlike each other. Like, it was like a different universe,
every room. It was real. The identity of the team was on the walls, basically.
Pia's desk was in the squad room of a team that was called Pipe Dream.
It still is, actually.
And they were very happy with that name, as they are the CI team.
So Pipe Dream sounded fantastic.
Good name.
It was just a mess in their squad room. A lovely mess, sort of. Because over the years, there were posters upon posters on like very music oriented and sort of...
Yeah, it was very alive.
What music posters?
Oh my God.
Everything from Bob Marley to Daft Punk to anything else, sort of.
It was sort of their love for music and expression.
It wasn't a certain, like, "Oh, we are into rock" or "we are into R&B." Not at all. It's like
love of music, I think, was the overall theme. And very many musical instruments. You could
run into guitars just everywhere, and
lots of music enthusiasts.
If each squad room was unique to the team,
each desk was distinct to the individual. You could never sit at the wrong desk, basically,
because that was like living in someone else's house almost. It was so, yeah, a huge identity physically.
That was very, very clear.
And of course, super exciting.
And people were very sort of proud of their team
and their team names and belonged to the team.
Pia took it all in on that first day.
People are tied to their squads
and the squads and the people have strong
identities. It's great. It's cool. It's amazing. Each team is unique. But Pia's there for a reason,
right? She's there to consolidate things, which is sort of removing uniqueness, at least when it
comes to CI. So this might be a struggle. And also part of this culture just didn't fit with who Pia was.
I never loved to sort of try to look cool because these rooms looked really cool.
And that has never sort of appealed to me for good or worse.
So I've never been one of the cool kids.
And I guess it's something that does not attract me to try to look cool, sort of.
I feel I'm hiding and I feel like insincere.
What do you mean you're hiding?
I'm not the customizer. No, I have never sort of, I've never been a homey person like that.
I kind of feel stuck almost, actually, when I sort of try to
identify with some physical thing. It makes me feel stuck rather than expressing myself.
So Pia's desk stayed bare. In this personalized squad room, with everybody's identity on display
with trinkets and musical instruments, she just had an empty desk. And the others customized
things for good reason, because the most hardcore Spotify engineers, they were at their desks for
12 hours a day. It practically was their home. But Pia, you'd never find her at her desk,
because she was a chapter lead, Spotify's version of an engineering manager.
Which meant that I didn't have one team, I had five. But I didn't have five teams either. I actually had a few people here and there in five teams. And that seems like a very strange setup. It was. However, there was a very practical reason for it. of entered into after the fact, hyper growth as in quadrupling their size every year.
Then having people, leaders like that, focusing on people growth and people health, while
new teams were constantly forming, it made more sense to have chapter leads because the
people could keep their manager, which meant they would be more likely to be retained while moving to the next team,
to the next team, to the next team. So that was sort of the real reason why we had chapter leads,
and that had been working out really well. It's a smart idea in a way. If a company is
growing so fast that new squads are sprouting constantly, maybe you can minimize the chaos a
bit by having your engineering manager stay constant.
That gives the ICs, the individual contributors, it gives them some stability.
But imagine being a new engineering manager, a new chapter lead.
The company is growing fast and you're trying to connect with people,
but they're in five different rooms.
It's chaos.
Pia was confused and so were many of her reports.
Imagine working at a place that quadruples in
size every year. A team of four becomes 16 and then 64. Those original four are now spread across
eight teams where no one but those original four had more than two years with the code base.
And meanwhile, the code base itself is changing at 16 times the rate it was two years earlier.
How do you handle that? How do you keep up?
You could take those original four. You could have them get the other 60 up to speed,
right? Full-time, just work on that. But then short-term velocity would go way down,
and those original four wouldn't be doing the work that they loved anymore.
They might even leave, and plus you're getting slower. And Spotify couldn't get slower.
This is when they needed to be moving faster. So Spotify had its own way. It was all about
independence and speed. Let the old hands keep shipping and the newbies will catch up.
Almost a growth ache of a teenager, you know. The company had been so small and we had had
these wonderful, brilliant engineers that had coded all day, all night, basically, which had been incredibly important for the company and successful.
And now we were so big, several people who were like this employee number 10 coding all day long, all night long, obviously knew everything there was ever to know about the infrastructure at Spotify. Brilliant thinkers, writing code faster than I ever saw before, often during the nights.
It was challenging to just keep up with the code that had been written when I had been
home eating dinner.
And there were like hundreds of lines of code that I needed to
understand in order to even understand what is going on here. 60% of the team were not up to
speed on the last pull requests that these two people maybe had created during a 14-hour workday.
And that just kept going. You can imagine like it's just challenging for people who work normal hours to ever catch up.
This is another problem. Pia has her six month CI project, but now she also has to unite these groups.
She's the chapter lead both for employee 10 and employee 1100. She needs to find a way to cross this divide.
That was something I was struggling with because it was my responsibility to have team health,
to have pull request reviews that actually were meaningful.
But then to discuss tech debt, one has to understand,
well, what does this system actually look like?
And also love of speed.
I mean, who doesn't love the speed of iteration in development?
Do you slow down in order to get everyone on board,
to have a conversation?
And then, you know, some arrogance might come in
for people who feel like they are way ahead of everyone else.
Are we going to spend their hours discussing with the team
and slowing down
and pair programming? So I'm leaning always towards, like, we need to get everyone on the same page,
and we think better as a team than as just one person here, one person there, because then we
can't build on each other's ideas. And the team culture also is way nicer.
It's also way easier to onboard.
I am a strong believer in this collaborative,
let's work together approach.
And I sort of was challenged myself,
like how do I actually make that happen?
Because these other folks, they are brilliant.
They know everything by heart.
If I ask that engineer to do it, it's going to take them 15 minutes.
If I ask the team to work on it, it's going to take two hours.
But then, of course, five people will know the solution instead of just one.
This was a problem Pia could see as an outsider.
The things that had made Spotify
nimble and move fast when they were small were now working against them as they grew.
Pia brought this up. We had a fantastic community within our chapter lead group.
And I think that's the reason I kept fighting with myself on this problem sort of. Because
they were seeing it. And I had sort of this feeling like,
these are among the best engineers I've ever worked with,
but we aren't as impactful and as effective as we could be.
And we're a bit fragile, too.
Because if someone here goes on vacation, who knows those five systems?
But they were coming out of a culture where it was always possible to just reach out to that someone.
They would always pick up the phone and walk to the office basically and fix it.
And I don't think that is a sustainable way of living and working.
And it certainly wasn't a sustainable way of growing the company.
So is that literally true? Like everybody lived in the area and you could just walk in to the
office if something went wrong with that system you knew?
Absolutely. Yeah.
But there is an advantage to that, I guess, in the early days, you know,
everybody's in Stockholm and you could just walk in if your service falls over and kick it or?
Yes, absolutely. And I think it's a great way to start. As is working on a monolith in the
beginning, usually. You need to start somewhere and have people actually understanding the full
picture. But if you're successful, you will run into a place where that isn't going to be
effective any longer. Because many managers were seeing these challenges.
They were just like me running in between these rooms
and seeing, well, this team asked this question
and that team just had that question two weeks ago.
It's about challenging the status quo a bit.
And I think I did that a little bit more than maybe others
because I have this strong belief
that we can work better together.
Yeah, I have a hard time handling
when I see ineffectiveness.
This is why I wanted to do the episode.
This is Pia's mission, right?
And it's even bigger than the CI thing.
I mean, she absolutely needs to hit the CI goal,
but there's a bigger challenge.
Scaling an engineering org, keeping execution speed up, and both chaos and bureaucracy at bay
as you go from 10 engineers to 100 engineers to thousands of engineers. How do you do that?
It's a big challenge to tackle, but luckily Pia does have some leverage because she has the CI project
and it absolutely must get done.
And leaning on that,
since that was anyway aligning with my values,
I could use that as an example for the other teams
on, well, this is actually possible.
So she's got a countdown,
six months to get everyone on a secure, standardized build pipeline.
But there was another problem.
The platform team, because of the autonomy, the platform teams did not own the problem of adopting their own tools.
The team I led, we did not have the mandate to tell anyone because we had this autonomous culture.
This seems like a problem, but I think this is actually pretty cool. If a platform team can just
make everybody use their stuff, they might force teams to use tools that don't fit. I've seen this
before. Spotify avoided this trap. They said, we'll build the tools, but we won't force it.
If there's a better way to solve the problem, go use that.
But that meant that the platform team didn't worry about adoption.
Yeah, it wasn't about being lazy.
It was about being respectful of the autonomous culture.
As in, we should build something that is so good that they just want to adopt it,
and they will plan for that adoption themselves.
But as you can imagine,
very, very busy feature teams would not maybe have the chance to even know about some of these tools
sometimes, nor plan migration. So this old way of not pushing your tool out of respect for the
other teams, it just wasn't going to cut it. Not with the six-month deadline, not with an IPO.
The platform teams needed to switch gears. They had to drive adoption, not just hope that it would
happen. But the platform teams resisted this. In the very beginning with the CI team, when we were
like, okay, we're going to actually go out there to squads and help them migrate, which was entirely new.
And the sentiment in the CI team was, but they're not going to want this.
They will basically not let us in to the squad room.
And we're going to be looked upon as this sort of corporate folks that want to centralize.
And that's like saying that, like the devil,
being the devil almost, like, they're gonna hate us, basically.
This is a problem, right? Who would want to go team by feature team, one by one,
walk into their custom squad rooms and say, we're going to take over your CI. It's going to go our
way from now on. But Pia had a different idea. She told the Pipe Dream team:
maybe you've got it wrong. We're not here to control the other teams. We're here to help them,
help them get ready for the IPO. And to do that, we need to change who we are as a team. We need to
worry about adoption.
The platform teams did not think they were accountable for the adoption of their products. So it was like
both starting to take accountability for adoption, and that would lead to going out there to the
customers, actually sitting there, onboarding them, migrating them. And we had this mantra
that we still have, and it's still a part of our main engineering practices for the platform
mission, which we call "the platform takes the pain." It really helped us, actually, because it's short
and snappy, and everyone knew what that really means. Because it's tedious to migrate someone
off a Jenkins to another CI engine. This is not sort of the lovely ivory tower work
that maybe some platform teams had loved doing.
Deeply thinking about sort of orchestration
and creating some fantastic product.
But this is actually sort of going out there
to some feature teams, 34, 50 feature teams,
and sitting with them and helping them because they have different
challenges, all of them.
No two teams were alike.
These weren't cookie cutter migrations.
This was custom hands-on build work.
But the Pipe Dream team leaned into the grind.
They took pride in taking on the pain of each migration.
I remember this one team we went to
and we actually sat there with these engineers. It was only two from this other squad. They were
like just asking some really great questions like, okay, how would that work? And how are we going to
do this? And okay, so if you do that, then I can do this here. And like, okay, just get it done. Let's get it done. Sounds good.
Okay, good.
Let's go.
This was the total opposite of the pushback they'd expected.
And it kept happening as they met with more and more teams.
It turned out the big resistance wasn't coming from the feature teams.
Those teams were open to change as long as it helped them ship features faster.
Because the platform teams are the teams that really care about infra. So they wouldn't have wanted to have an infrastructure solution
being sort of asked of them to comply with, because they really care about these things.
But a team in e-commerce platform, in user platform, in the playlist platform,
they care about other things way more than what CI
infrastructure they are running on. So it was like a lack of customer awareness or understanding.
Had we known this earlier, I think this move would have been enabled way faster and way earlier,
much longer before I joined. This was a big realization.
The autonomy that teams valued,
it wasn't about specific technologies or workflows.
It was about owning their work,
having the freedom to build the stuff that mattered.
Not owning CI was fine,
as long as it didn't mess with their ability to ship things.
Once Pipe Dream got this, it was a light bulb moment
and they realized that most engineering teams would embrace
any changes that remove friction from their lives.
This gave Pia hope that with the right approach,
she could rally the teams around a shared mission to work better together.
But first, they had the six-month deadline,
lots of Jenkins files to write.
And at Spotify, you don't want to be a bottleneck.
You can't slow people down.
So they had to make sure that all their changes rolled out smoothly.
So the toughest thing with this migration was that there were so many Jenkinses.
And you can imagine around 300 teams that were impacted. So the scale of reaching all of these critical pipelines, actually migrating them,
that was one of the biggest challenges. The second one was the build templates differed so much.
There was no standardization on the build templates, which makes it very hard to actually
build them. However, we figured out we're going to just containerize the builds.
So that was the solution for building very custom build templates.
Through all this work, Pipe Dream transformed their thinking. Platform takes the pain? Yes. But platform also cares about their customers. Platform understands their customers.
Platform cares about impact. These ideas started to spread. There was no moment I can remember, at least,
where someone said, like,
right now we're going to move into team ownership
of adoption for all your products, platform folks.
It just happened gradually,
as I think people were seeing more and more
that it actually works to own adoption.
It's also a matter of like, how do you define success in your organization, right?
We started celebrating when teams reached high adoption of their products.
And we also tried to move the flywheel by celebrating successes.
This let Platform rethink some things.
Yes, they didn't want to be top-down, they didn't want to be corporate,
but maybe they were wrong about autonomy.
Where is the autonomy, really?
We started to learn how to nuance that understanding,
because autonomy has to do with impact.
We're not interested in being all alone in a room being autonomous.
That's useless. When we say autonomy in the engineering industry, right, we mean actually
impact. So I think for the platform teams, this autonomy word started to be equal to, okay,
if I actually own the adoption of my tool, I have more impact.
And if I don't own adoption of my tool, I don't actually have much impact.
And it doesn't really matter if I am autonomous.
This change shifted the team.
They went from ivory tower architects dreaming up theoretical solutions to customer-focused
engineers.
Because we really weren't customer-oriented at all when I joined.
Nobody had asked the engineers to be customer-oriented.
It wasn't that they didn't want to.
Nobody had thought about it, basically.
But this owning your own adoption, they have to become and they will become customer-oriented
because they're going to go out there and fix that adoption.
So did you hit your six-month deadline to get everybody?
We did.
You did?
Yes, it was a success story.
And the IPO happened in 2018, I think, April or so.
It was a massive success.
And the CI challenge was solved
through this containerization of the build templates,
which really helped us control all CI where we had to comply.
But Pia still saw problems to solve.
She had a success, a big success, yeah.
But Spotify still had silos, still had tribal knowledge.
She'd seen firsthand how not knowing where things lived
led to people reinventing the wheel, led to productivity bottlenecks.
And this was the hardest thing for her to handle, working at Spotify.
Because there were so many folks that were not open to this change.
They called it like, oh, we're going corporate.
I mean, they were fantastic engineers, many of them. And they had built so many useful things for Spotify, but not being
able to reach them was, you know, taxing and just, it was sad. It was sad. And one gets a little
angry as well, like upset. How can you not care about the bigger picture? How can you, who are so incredibly
smart, not see that this is wasteful? Why is your pet project more important? And because there were,
there were so many pet projects. There were so many, in Swedish we call them small popes, running around. And these were great, highly skilled people that had built really
valuable stuff. So not being able to reach them, I think, was my hardest learning in the beginning.
Here's the thing about autonomy. It can sometimes work against transparency. Even in a place full of talent, communication gaps are going to happen.
But reaching everyone
and breaking down silos
is a big problem.
So Pia decided to focus
on something smaller.
What frictions are getting
in the way of work at Spotify?
We were doing this service
where we basically sent out an email
people could reply to on like,
what doesn't work with us,
basically, what doesn't work with backend right now, and people were just filling out a form.
And it was also a little bunch of internal jokes. So we had a very good response rate on those.
And we were seeing a lot of times that people were struggling with being interrupted all the time and that people couldn't find things.
That seemed to be the top problems overall, quarter after quarter.
And this speaks to the challenge of silos, right? When one is in an organization like that, where we have a lot of siloed orgs and teams, I wouldn't know really exactly how to integrate with another
system. There will be several APIs that I may find myself through digging around in the infra,
but I wouldn't know exactly maybe how to integrate and which documentation that is up to date, etc.
So I would be tapping someone on the shoulder
that I know, oh, this person probably knows something about this system.
So we had a name for it even.
It was called Rumor Driven Development.
What we meant was you had to know
who actually had once upon a time written something in a system
and then you tapped them on the shoulder and asked, "Are you the right person to ask about this API?"
And then they would forward you to the next one, to the next one, to the next one, and finally you
would find your team.
And of course, sometimes people don't have time to go through that rumor chain.
But instead, they had to build it themselves.
And there we end up with fragmentation.
So we saw this need of like, oh, people just want some place where they can find everything.
And since we don't have that place, they are tapping each other on the shoulder all the time,
which leads to the second problem of being interrupted.
They decided they were going to fix this. Just like they were fixing CI, they would roll out
a centralized developer portal with the data that everyone needed. A place for all the services,
who owns them, their APIs, their documentation. Maybe if they had all that, this would fix
rumor-driven development. It would fix transparency. They needed a way to test this,
though. They needed a metric to aim for, a way to measure success.
The metric we decided to track was the onboarding metric.
We borrowed that from Meta.
We used the number of days it takes to make the 10th pull request, which is a very crude metric.
And of course, one has to follow up with a bunch of other things because developers do not only code.
That's not the only way to get onboarded.
But at the time, that was the metric we were using.
And it was spread all across the organization.
So everyone understood it.
It's a simple metric.
And we had over 60 days when we started.
And this was sort of the rallying cry for,
well, why do we actually need this?
Well, we pointed to this one metric.
Everyone could look at it.
Over 60 days to actually do their 10th pull request
for all these newcomers that were joining every single week.
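For readers who want to see how a metric like this could be computed, here is a minimal sketch in Python. It assumes you already have each newcomer's start date and the dates of their pull requests; the function names and the sample data are hypothetical and do not come from Spotify's tooling.

```python
from datetime import date, timedelta
from statistics import median


def days_to_nth_pr(start_date, pr_dates, n=10):
    """Days from a newcomer's start date to their n-th pull request.

    Returns None if the person has not yet made n pull requests.
    """
    ordered = sorted(pr_dates)
    if len(ordered) < n:
        return None
    return (ordered[n - 1] - start_date).days


def median_onboarding_days(newcomers, n=10):
    """Median days-to-n-th-PR over newcomers who have reached n pull requests.

    `newcomers` is a list of (start_date, pr_dates) pairs.
    Returns None if nobody has reached n pull requests yet.
    """
    values = [days_to_nth_pr(start, prs, n) for start, prs in newcomers]
    reached = [v for v in values if v is not None]
    return median(reached) if reached else None


# Hypothetical newcomer who opens one pull request per week after joining.
start = date(2017, 1, 2)
weekly_prs = [start + timedelta(days=7 * i) for i in range(1, 11)]
print(days_to_nth_pr(start, weekly_prs))  # 70
```

As Pia says, it is a crude number: it only sees pull requests, so it works best as one signal among several.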
It's a tough metric, right?
It's not just measuring whether existing people
will adopt the developer portal,
but whether new people can figure out
how to contribute to the code base faster. They could end rumor-driven development, but that might not be enough.
They needed to add enough transparency that it would speed up onboarding. And besides that,
even getting feature engineers to use the service if they built it, it wasn't a given.
We were still a very autonomous engineering culture, and we still are. So we needed to build buy-in for this idea that there would be one developer portal,
one catalog holding everything.
In fact, a solution like this already existed, and one of Pia's teams owned it.
It had never been sort of seen as like a core to our infrastructure at all.
It was like just a catalog of the backend services, basically.
In the past, because they had never worried about adoption,
the service directory never took off.
But why didn't it take off?
Well, it lacked features.
It didn't cover enough things.
It didn't cover front-end things.
It didn't cover data things or infrastructure.
So they could fix that.
They would call this new catalog, this project Backstage,
and they would use it to try to end rumor-driven development.
Exactly. The core idea of Backstage was to visualize the connection between all components
and all owners so that you would be able to solve, for example, an incident way more
autonomous than you had before. You should not have to know exactly anyone, actually,
in order to figure out who owns this data pipeline, and when was it last run, and who's on
call right now, and is this incident actually already logged there, and who's working on that,
blah, blah, blah. You could get that through a few clicks.
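The idea of a catalog that links every component to an owning team can be sketched in a few lines of Python. This is only an illustration of the concept described here, not Backstage's actual data model or API; the class names, fields, and sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    kind: str             # hypothetical kinds, e.g. "service" or "data-pipeline"
    owner_team: str       # the owner is a team, never an individual
    support_channel: str  # where to ask questions about this component


class Catalog:
    """A toy in-memory catalog mapping components to owning teams."""

    def __init__(self):
        self._components = {}

    def register(self, component):
        self._components[component.name] = component

    def owner_of(self, name):
        """Answer 'who owns this?' without asking around."""
        return self._components[name].owner_team

    def where_to_ask(self, name):
        return self._components[name].support_channel

    def owned_by(self, team):
        """List every component a team is responsible for."""
        return sorted(
            c.name for c in self._components.values() if c.owner_team == team
        )


catalog = Catalog()
catalog.register(Component("playlist-api", "service", "playlist-squad", "#playlist-support"))
catalog.register(Component("plays-daily", "data-pipeline", "data-squad", "#data-support"))
print(catalog.owner_of("plays-daily"))  # data-squad
```

The lookup returns a team and a channel rather than a person, which is the point: you no longer need to know who originally wrote the thing.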
But there's another issue, right?
It's a centralized service
trying to solve a decentralized problem.
One team trying to reflect
the autonomous decentralized nature
of all the teams at Spotify.
We were trying to sort of work out
the architecture of Backstage
so that it fit our decentralized engineering org and also engineering culture sentiment as in autonomous ownership.
We also saw it as sort of we needed to decentralize ownership of the plugins in Backstage to increase speed and never become a bottleneck.
Because the main focus for Spotify all the time
and still is speed of development. So whatever the platform folks like myself come up with
can never impede speed. So that was one of the core ideas as well, like whatever we do here,
if we are going to centralize the developer portal, it may not slow down the feature teams. And the way
to slow down is to create the bottleneck, right? Of a central team owning everything.
So they built Backstage around a plugin model that allowed for expansion. The idea was that
if Backstage was adopted, but it had a gap, then instead of the rumors, any of the thousands of
devs on various feature teams could jump in.
They could add a plugin. They could push things forward. There would be no bottleneck at the
platform team. If all the backend services were in Backstage, but the web components or the software
libraries or whatever weren't there, then the plugins were a way around that bottleneck.
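A plugin model of this kind can be sketched as a small registry: the central portal only knows how to look up and call plugins, and any team can register one for the kind of thing it cares about. The Python below is a hedged illustration of that pattern, not how Backstage itself is implemented; `Portal`, `register_plugin`, and the sample data set are hypothetical names.

```python
class Portal:
    """A toy portal core that delegates rendering to team-owned plugins."""

    def __init__(self):
        self._plugins = {}

    def register_plugin(self, kind, render):
        """Any team can add support for a new kind of entity."""
        self._plugins[kind] = render

    def render(self, kind, entity):
        plugin = self._plugins.get(kind)
        if plugin is None:
            # The core still works without a plugin; it just shows less.
            return f"{entity['name']} (no plugin for kind '{kind}')"
        return plugin(entity)


# Hypothetical plugin written by a data team, not by the platform team.
def render_dataset(entity):
    return f"{entity['name']}: retention {entity['retention_days']} days"


portal = Portal()
portal.register_plugin("dataset", render_dataset)

dataset = {"name": "plays-daily", "retention_days": 90}
print(portal.render("dataset", dataset))   # plays-daily: retention 90 days
print(portal.render("library", {"name": "audio-utils"}))
```

The design choice is that the core never needs to change when a new kind of entity shows up, so the platform team does not become the bottleneck.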
And the plugin that really showed the way that this could work was the data plugin.
You see, the data teams
needed different types of data.
They needed to know things
like retention policy.
We have some data sets
that are the root data sets.
And then there are a myriad of data sets
that build off of these.
And sometimes they are necessary
to be maintained, obviously,
super critical going forward as well.
And sometimes that was sort of for this one campaign,
for this one initiative,
and then that data set isn't as important any longer.
The data folks were just able to create this beautiful plugin
for all the data engineers at Spotify, several thousands of them, so
that they can sort of interact with their datasets through Backstage instead of other
portals that were already existing, of course.
This decentralized approach kept Backstage flexible and fast-moving.
And adoption spread internally.
Backstage became an important part of how engineering was done.
More plugins were developed, more information appeared.
But Pia knew that a portal alone couldn't transform the culture.
It's not possible that code shifts culture, I think.
It has to sort of come from a need that was already identified across the organization.
And then this technology speaks to that need.
And I think we were at this breaking point when I joined in 2016, and in 2017 even more,
where the scale of Spotify had become what was slowing us down.
And our former ways of working did not scale.
Everybody wanted the change anyways. They just didn't know how to accomplish it.
Does that sound true?
It sounds very true. And also, I think that the autonomy needed a refresh. This idea of
a complete autonomy, team autonomy, needed a refresh on like, well, what about the impact?
It used to be the case when we were a small company that complete team autonomy had the right impact because the company wasn't as big.
But in a large company, you won't have the impact that you're looking for.
And then autonomy becomes almost meaningless.
That's sort of the cultural change that we went through, at least as a company,
I think. We didn't have this kind of transparency because we had this employee number 10 kind of
archetype who were incredibly fast in producing valuable code and systems. And then the rest of
the team kind of struggling to keep up or maintaining the stuff that were produced.
And with Backstage, sort of the sentiment behind Backstage is team ownership.
There is no individual ownership.
It's the team owning everything.
I think the shift is very gradual towards team ownership.
But Backstage sort of softly moved the organization towards.
How does it softly move you towards team ownership? Because the rumor-driven development
fades, you don't need it as much. Because people
are like myself, I could just look up the team to
connect to in order to understand some certain API.
I wouldn't know that it was this person called Karin that actually had built it all from the beginning,
who knew everything there was to know about this thing.
I wouldn't have any idea.
I would just go to the person that was called a goalie in the Slack channel, in the team support channel, and speak
to them. And I would have no idea that there was this person who actually built the whole thing
from scratch five years ago. I think it's sort of put an interface in between these people.
That's simple, but it makes sense, right? In rumor-driven development, the individual who built this is the most important thing. But now, with service goalies and teams acting as the unit,
the individual fades. The team, the squad, is the main thing. Team identity isn't harmed
by transparency. If anything, it's increased.
Absolutely. It's also a gradual change. But looking back at before Backstage and looking now at mature Backstage adopter, I think the big challenge that at the time seemed so big that we just don't know where things are and what's going on hardly exists because you can see everything. Everything
is available. There's like no need to actually know anyone for me in the commerce platform.
Recently, I tried to look up a few APIs, understanding them and how they're integrating
to answer some questions from some other teams.
And I don't work in commerce platform, but I could just go and find them and trust that their APIs that they were surfacing on the Backstage page were the right ones, obviously,
because they were there and that's the place to go.
I could get very far on my own, basically.
And then I had some really relevant questions to
ask. And I knew exactly who to ask because it was like obvious which team was owning this.
Slack channels were just a click away. And I think it's this empowered feeling. It's something I
lacked in the beginning. I felt like a junior when I joined, which was very odd to me. And after having
spent like one and a half decades in engineering, I felt I hardly did not know
anything. I had to ask everyone for everything all the time, and everything
was different. Also, people were giving me five different answers, which were all to
some extent true.
You guys centralized, but you left the autonomy.
You left the team structure still.
Like, is it still silos?
I guess you guys can, it's easy to find information,
but are the people still in their pods?
But we never wanted to move away from strong team identity and belonging.
It's been the core to our culture and still is.
So we wanted to find this balance where standards set you free.
That was another one of these mantras that we invented to help people understand,
no, standards and centralization in some places actually empowers me. So I think today teams are
quite on board with that. And they're seeing that this did not lead to us becoming very corporate
or a top-down kind of engineering culture. Not at all. It just removed a bunch of toil.
The whole process took time. Building out Backstage,
adding plugins for data, plugins for documentation so that people could learn more about your
service, plugins for seeing builds, plugins for seeing Kubernetes stuff, and then templates for
building new services and so on and so forth. Along with this, the culture shifted,
not a complete 180, but towards a place where autonomy and team identity also meant embracing
standards, embracing transparency. Spotify embraced Backstage as the central portal,
and two years in, they hit their goal. It was now taking 20 days for 10 PRs for new developers.
They had cut their onboarding time in half.
And this is even though they were much larger than when Pia had started.
We celebrated a lot.
And one of the ways we celebrated was to open source it in 2020.
Backstage had become this super critical part of our infrastructure, and we were super
proud of it. And when something becomes that critical, you've got to protect it as well.
And one of the fears that was starting to arise was, hold on, this is now super important to us.
What if someone builds an internal developer portal and it becomes the industry standard?
At the time, there were no developer portals basically in the industry.
But we were sort of thinking we would have to migrate off of Backstage because we're not going to sit there with a bespoke system.
And that's going to be so painful for us because now we have every single team on Backstage and so many plugins, over 200 plugins actually, internally built.
So this was a real threat to sort of our speed again.
And so we made the decision like, we're going to actually donate this.
We're going to give it away and make sure to keep investing in open source Backstage because we need it.
We need it to be the industry leader for ourselves, for our speed.
So that was why we open sourced back in 2020.
That we really celebrated.
As companies grow, they can do more.
But each person can do less.
10 people can move fast.
1,000 can't.
But 1,000 people can accomplish a lot if they can work together.
The problem is coordination, right? Some big companies,
they're all administration. They're all top-down control and nothing happens fast.
Others, they just can't pull off big projects. There's too much internal politics. It's like
Game of Thrones in there. But it seems like Spotify maybe found another way.
With their focus on speed of development, speed of iteration,
they've kept that small company feel.
They've kept the autonomy while still being able to work together.
I really want to believe that.
I really hope so.
I think if one really wants to prioritize speed, then one has to figure out the solution because the top-down company isn't fast.
And usually, from what I have seen at least, it's difficult to make the right decisions because you lack so much information at the top.
And engineering industry is moving incredibly fast. So actually, the devil is in
the detail. One has to sort of understand a problem space to the lowest level almost in order to
have this sort of creative, brilliant idea that afterwards sounds like, oh, obviously,
we should have done that. I also think the real speed happens when one deals with ambiguity really great.
Because honestly, I think many companies,
we don't know exactly what will be most beneficial to build.
Because the space is so ambiguous.
The customers usually aren't exactly sure exactly what they need.
So one has to be fast at failing.
And in order to have a fast failure culture,
one has to be sort of empowered to make quick decisions,
try something out.
Oh, it failed.
Great.
Then we learn.
Next experiment, next experiment, next experiment.
That's how you actually win.
You out-experiment your competition.
So if one is serious about speed and success,
I think one has to figure out, like,
how do we enable empowered teams?
Because they are going to be the ones out-experimenting.
So here's the thing about how companies run.
Even as companies get larger, as they scale, it seems like it's possible to maintain some independence and some speed.
How do you make this happen? Well, if you're in a platform role, I think we know some of the answer,
right? Platform takes the pain. Take ownership of adoption, have a customer-centric mindset,
understand your users' real needs and
what's slowing them down and just be willing to take on the tedious work involved in aligning
teams. But there's a broader lesson here, one that applies to all of us. Don't underestimate
your power. Even the smallest change can make a difference. Be the connective tissue that a growing organization needs.
Be Pia.
I've always had this huge interest in understanding how people collaborate and thrive together.
I always ended up being the responsible to sort of arrange the parties and arrange the get-togethers. And that was kind of my informal role
of trying to make people come together in a sense.
So you're an organizer of people?
I hope I could say that.
Maybe I'm even more so a listener.
My go-to is to hear people out by being that friendly ear.
So that has always been something very dear to my heart.
I feel like that's been a red thread, actually, throughout my life.
A big thanks to Pia.
What an amazing person, right? She and the people she worked with at Spotify, I'm sure she'd be the first to say this wasn't all her, they created so much value, and they
did it while keeping the things that Spotify valued. It's amazing. And it's funny how hearing
some of the internal struggles at Spotify,
it just, it makes me interested in working there.
I'm not saying that because Spotify is paying me. Trust me, they're not. I just love hearing
stories like this. I love hearing about these changes, big and small, inside these big tech
organizations. So if you've got one, let me know because I find this stuff fascinating.
Also, if you're new here, sign up for my newsletter
and you'll get some behind the scenes details
about the episode.
And for a truly behind the scenes experience,
join as a podcast supporter.
The next two episodes that are coming out
that are amazing, I have to say,
they were very much shaped by some of the discussions I had
with the supporters, both on the Patreon channel and on Slack. So yeah, thank you so much to the
people who support this effort, the people who help create this podcast with their financial
contributions. And until next time, thank you so much for listening.