Software at Scale 58 - Measuring Developer Productivity with Abi Noda
Episode Date: June 13, 2023

Abi Noda is the CEO and co-founder of DX, a developer productivity platform.

My view on developer experience and productivity measurement aligns extremely closely with DX's view. The productivity of a group of engineers cannot be measured by tools alone: there are too many qualitative factors, like cross-functional stakeholder bureaucracy or inefficiency, and inherent domain/codebase complexity, that tools cannot capture. At the same time, some metrics, like whether an engineer has committed any code changes in their first week/month, serve as useful guardrails for engineering leadership. A combination of tools and metrics may provide a holistic view of, and insights into, the engineering organization's throughput.

In this episode, we discuss the DX platform and Abi's recently published research paper on developer experience. We talk about how organizations can use tools and surveys to iterate on and improve developer experience, and ultimately, engineering throughput.

GPT-4 generated summary

In this episode, Abi Noda and I explore the landscape of engineering metrics and a quantifiable approach to developer experience. Our discussion ranges from the value of developer surveys and system-based metrics to the tangible ways in which DX is innovating in the field.

We begin with a comparison of developer surveys and system-based metrics. Abi explains that while developer surveys offer a qualitative perspective on tool efficacy and user sentiment, system-based metrics present a quantitative analysis of productivity and code quality.

The discussion then moves to real-world applications of these metrics, with Pfizer and eBay as case studies. Pfizer, for example, employs metrics for a detailed understanding of developer needs, subsequently driving strategic decision-making; they have used these metrics to identify bottlenecks in their development cycle and strategically address those pain points. eBay, on the other hand, uses insights from developer sentiment surveys to design tools that directly enhance developer satisfaction and productivity.

Next, our dialogue around survey development centers on the dilemma between standardization and customization. While standardization offers cost efficiency and benchmarking opportunities, customization acknowledges the unique nature of every organization. Abi proposes a blend of both to cater to different aspects of developer sentiment and productivity metrics.

The highlight of the conversation is the introduction of DX's data platform. The platform consolidates data across internal and third-party tools in a ready-to-analyze format, giving users the freedom to build their own queries, reports, and metrics. The ability to combine survey and system data allows the unearthing of unique insights, a distinctive advantage of DX's approach.

Abi Noda shares enlightening perspectives on engineering metrics and the role they play in shaping the developer experience. We delve into how DX's approach to data aggregation and its potential applications can lead organizations toward more data-driven and effective decision-making.
Transcript
Welcome to Software at Scale, a podcast where we discuss the technical stories behind large software applications.
I'm your host, Utsav Shah, and thank you for listening.
Hey, welcome to another episode of the Software at Scale podcast.
Joining me today is Abi Noda, the CEO and co-founder of DX, a developer productivity platform. Welcome.
Thanks so much for having me, Utsav. Excited to be here.
Yeah. So, Abi, first, the thing I want to talk about is the fact that you graduated from UIUC.
So, virtual high five. I graduated from there as well. So, if you remember anything about
your CS experience from then, I'd love to hear a quick story.
Well, it's actually funny.
I actually did not ever graduate.
I did attend.
Okay.
But yeah, I spent a few...
Actually, hold on.
It's a pretty hilarious story.
So do you want to still go into it? I'm happy to go into it.
It's not like private or anything.
No, go for it.
Yeah.
Okay. So it's a funny story about my college experience, because I actually switched majors late in high school, after I'd initially applied to schools. I'd initially applied as an econ and finance major to a bunch of universities, and going into my senior year I changed my mind and decided I wanted to study computer science. And growing up in Illinois, UIUC was the obvious choice to attend in terms of cost and the strength of that program.
However, before actually going to school, I ended up taking a gap year and working as a software
developer. I had been able to pick up a little bit of basic PHP while in high school. And so I
was able to land a summer internship going out of my senior year working on WordPress sites.
So ended up taking a gap year, eventually did attend UIUC. And seven weeks in ended up dropping
out with a friend of mine to go travel the world and try to start a company. We never got a company
off the ground. And actually, funny enough, several of our other friends also dropped out
shortly thereafter to go and start companies. I would say that our cohort of friends who dropped
out of college together have been pretty successful in their entrepreneurial endeavors.
So that's good. But I unfortunately didn't get to spend
as much time at UIUC as I wish I could have. Really enjoyed my time there and made a lot
of lifelong friends from my experience. But yeah, my experience at UIUC was a little bit short-lived.
Yeah, there's only so much that research can help with before a podcast. You never know what's going to come out. That's a wonderful story. I'd love to have that kind of cohort of friends who are all so ambitious, like, "Oh, I just want to start my own thing and make it big." That's awesome. And after that, I remember reading about you, I don't know if you remember this, a long time ago on the Indie Hackers blog, and that whole circle of, "Oh, Pull Panda was this bootstrapped company, and then it became this big thing everyone's using, and it got acquired by GitHub." So maybe a brief story on how that got started?
I started Pull Panda, it was originally called Pull Reminders, and I started that company because I actually lost my job at a horrible time. It was over Christmas vacation. And I knew no
companies were hiring. There was no point to network and reach out to people or even try to
send my resume around. So I decided to work on a side project that I'd had the idea for almost a
year. I'd been working as an engineering manager at several companies. And at each company,
I had ended up spending a considerable amount of my energy as an engineering manager, just chasing
down pull requests and asking developers on the team to make sure they reviewed them or followed
up with reviewers to get them through the door. And I'd had the idea that perhaps this could be automated.
And I'd even tried to, for example, use Zapier to create an automated bot that would just post
a reminder to the team, sort of a status dump of here are all the pull requests that need to
be reviewed or need to be merged. And so the idea originated from that experience. And I worked on the first version of Pull Reminders for about a month.
The idea was not to turn it into a business.
It was really just to build something and put it out into the world and see what would
happen.
But as it turned out, there was, right from the get-go, a pretty strong amount of interest.
I had a few people reach out, even found me by SEO, which is pretty funny.
Typically, a brand new website or product is not found through Google.
But I guess I had just the right HTML title on my page that was really targeted toward
a specific use case.
And so within a couple months, I got my first paying customer who paid me $20 a month for
unlimited users.
And from that point on,
Pull Reminders became a business. And as you know, over the course of the next year and a half,
it sort of just took off like wildfire on its own. And I don't really attribute that to anything amazing that I did. It was sort of just being in the right place at the right time with the
right product and a number of
fortunate circumstances such as the rise of Slack. Slack was really still in a growth phase at that
point. GitHub had recently launched the GitHub Marketplace. And so Pull Reminders became one of
the featured products in the GitHub Marketplace. So thanks to occurrences like that, Pull Panda
really just took off on its own and
became a pretty successful business. Yeah, I think it's so interesting. And is that when you were
first interested in developer productivity? Or that was just when you were an engineering manager
who was trying to make sure your team was executing quickly? You mentioned that you'd spent about a year
thinking about this? Yeah, it's a great question. I was really thinking about the problem of how to measure productivity.
That was the actual problem I was trying to figure out and solve and potentially build
a product around.
And Pull Reminders actually eventually became Pull Panda, which was a suite of products,
one of which was a product called Pull Analytics, which provided the common Git and pull request metrics that you see across the industry today.
And so really, the inspiration for going down this path was to try to tackle the problem of
engineering measurement. However, the Pull Reminders feature or product was really designed to be a baby step for me to just ship something.
I kind of considered it like a warm-up lap.
I wrote down in my journal, "just ship this." But Pull Reminders by itself became quite a successful product.
And I later added the measurement and analytics features to the bundle.
Yeah, it kind of reminds me of breaking down a project into multiple milestones.
Your first milestone was pretty successful.
So nice work.
But I think you're coming close to the actual end goal, which you had in mind then, which
is figuring out
how developer productivity is measured. And that's with DX. And I really like the solution that you've come up with, at least so far, which is that you have these surveys. Because I think we've tried a bunch of different things; there are a bunch of tools in the space. The first one I can think of, that I'd heard of way back, was GitPrime, which I think got acquired. Those were all built around metrics from tools, and you all took a different approach, at least in the beginning, which is an actually very well-targeted survey that can give you a lot of insight. So when did you think of that as the problem? Was that something you were thinking of at GitHub, and that's where DX came out of? How did everything start?
Yeah.
It took me a long time to arrive at the current state of what I believe is the right approach and the right solution to this problem. Back when I was working on Pull Panda, I did adopt the
approach similar to tools like GitPrime of taking in the data that was available from source control tools
like GitHub and project management tools like Jira and using that data to try to produce metrics
that were useful for engineering managers or, broadly speaking, engineering leadership.
The story goes that I really just hit a wall with these types of metrics. I hit a wall in terms of the
value I saw in them myself as an engineering leader, as well as the limited value I saw in
customers who were using my products who weren't really able to do much with these metrics.
The pattern I kept seeing was that companies would get really
excited about the prospect of these metrics. They would purchase my product or products like
GitPrime. And a few months later, if I were to check in and ask them how things were going and
what they were doing with their metrics, generally speaking, they weren't really doing anything with them. In addition, another pattern I saw was that although most organizations weren't
really doing anything valuable with these metrics, they also were sometimes doing things that were
harmful with them. So for example, when I worked on Pull Panda, I would regularly see people trying
to export the data or reaching out with feature requests with specific types of
reports that were specifically for the use case of using this type of data to evaluate the
individual performance of engineers. And at this point, although I was still early in my journey,
it was very evident to me that these types of metrics should not be used for those types of purposes.
And so I knew that there just had to be a different approach.
The approach of just pulling data out of systems and using those metrics could only provide value in a very narrow use case of understanding code review. But if you wanted to understand bottlenecks or constraints or aspects of productivity beyond that, there really wasn't any data available within these systems.
And as I continued to ponder and struggle with that problem, it started dawning on me that any question that leaders were trying to answer with this type of data might be better answered another way.
And let me give you an example.
One question my customers started asking about was code review quality.
They were using the data from Pull Panda to increase the speed of code reviews,
but they wanted to understand if that was affecting the quality of the reviews.
And the way that customers asked us to solve this problem was to calculate the number of comments per code review. And to me, this seemed like such a poor proxy for code review quality compared to,
for example, just asking your developers whether they felt that
the code reviews were of quality or not. And so this pattern started to emerge as well. I found
that more and more, any question that we were trying to answer with Git data was better answered
by simply asking your developers. And this is before I knew anything really about
psychometrics or survey-based measurement approaches. But it was just that simple idea
that if you could just ask your developers, that would actually provide you the insight into a lot
of these questions that leaders were trying to ask. And that's really where the idea for the
current approach and the research I'm doing was born from. And it seems like there was a similar timeframe when the industry was thinking about developer
effectiveness metrics, things like DORA, like D-O-R-A, and SPACE. How is the timing related to
the industry thinking about these things versus your thoughts evolving?
Yeah, well, DORA and the book Accelerate came out while I was working on
Pull Panda. And so when that book came out, I actually got in touch with Nicole Forsgren,
who's the lead author of Accelerate and who I now work with at DX today. And I was inspired by that
research and by that book. And similar to the phenomena that I observed with customers and our product, I was very
excited by the prospect of these metrics.
It became clear to me, though, that there wasn't anything special about the DORA metrics
in particular, that they really sort of suffered from the same problems and limitations
that the Git metrics that I was working with at the time did as well. And so I think the DORA
metrics was a big thing for the industry and created a resurgence of appetite around this
problem of how to measure engineering organizations. And SPACE is something that came
out while I was actually working with Nicole later on at GitHub. And so she and I were working
together on product solutions for companies in terms of how to measure engineering organizations,
as well as how to understand and improve developer effectiveness internally at GitHub.
We were undergoing a pretty big transformation, and there was a big need for measurement to guide
our progress there as well. So my experience working in this problem space has intertwined
and intersected with things like DORA and SPACE at several points along the journey.
Yeah. And that makes sense to me. I think those metrics, that book, Accelerate, I remember, I think it was my first job. And I was working on a developer effectiveness team when my manager was super excited about it. We did both, right? We started measuring deployment frequency
as well as doing
these developer effectiveness surveys.
And I think both of those together
gave us a reasonable picture of what's going on.
I kind of want to move around in time.
So you mentioned that when tools like GitPrime existed,
people were excited about that.
What were people doing before tools like that existed?
How did organizations try to measure productivity way back?
And how did it get to where we are today?
Before tools like GitPrime, I don't think there was much formal progress around this
problem of how to measure.
When you kind of look back at history or the history of our domain, one of the most common ways was to just try to measure the output of developers. So metrics like lines of code or function points or velocity points were sort of the de facto ways in which organizations tried to understand and measure productivity. And in fact, those are still common approaches today.
You could argue they've somewhat been superseded by things like counting number of pull requests,
but really that's a similar approach to counting commits or lines of code or velocity points.
At the same time, there were many outspoken critics of this type of approach. For example, Martin Fowler wrote an article,
I want to say in 2001, maybe it was a few years after that, but a long time ago,
talking specifically about this practice and why it wasn't effective. There's also some famous
stories, quotes from people like Bill Gates, early engineers at Apple, who also pushed back
mockingly at the practice of trying to measure
software development using output metrics. So I think that's where things were when GitPrime
came into the picture. And in fact, one of the selling points of GitPrime was that it was an
alternative to just measuring lines of code. Unfortunately, the reality was that GitPrime also did measure
lines of code and in fact, really provided lots of metrics that were similar to lines of code,
although they weren't exactly lines of code. And so hopefully that answers your question.
Before GitPrime, I think the status quo was just measuring and counting some type of output of
developers based on their activity.
And do you think it's just like an organizational urge to have metrics so that there's something to track?
Is that kind of where you think all of this comes from?
Why is it a good idea to try to measure story points in the first place?
I mean, absolutely.
I think this is driven by really sensible needs and desires of businesses and leaders. I mean,
even today, tens of millions of dollars are spent by the typical organization on software
engineering. So to ask the question, "How good are we doing? How can we get better, guided by data?" is a very reasonable question for leaders to ask.
I think at the same time, having been in this position myself, this often is sparked by a CEO
or non-technical business stakeholder asking an engineering leader or a technical leader
for measurements to report up or prove or show or demonstrate the value or productivity of their
organization. And so I think the need is pretty universal across all levels of an organization,
but it's definitely a need that's not going away. And in recent years with, for example, COVID,
and now the tightening macroeconomic conditions, the need has only grown. It's not going away.
That's for sure. Yeah. And recently, it looks like there's a new paper coming out about a
developer experience framework, which you've been working on. Maybe you can tell us a little bit
about its conclusions, or something that y'all have been thinking about: how all of this can be tied up
into a set of questions that you can ask developers, or into a framework
that you can use to actually measure
and think about developer experience or developer effectiveness.
This paper is really a culmination of all the experiences
and all the research done both by myself
and by my fellow authors on the paper,
including Nicole Forsgren, Margaret-Anne Storey, who is one of the co-authors of SPACE,
and Michaela Greiler. And to convey the real goal of this paper, the other week, I was talking to
the CIO of a major bank. And he told me that internally, they were having heated discussions
about the right way to measure and understand developer productivity. And of course,
someone within the organization was advocating for the use of metrics like lines of code and
commits and activity based measures to do this. And he said to me that he realized that there is a better way, a different way, but that there wasn't anything concrete that he could offer. to measuring productivity than the conventional approach of measuring development activity,
development processing times, and other conventional metrics that are common today.
So this paper, the subheading is the developer-centric approach to measuring and
improving developer productivity. And that's really what this is about. What we're trying
to outline here is an approach to measuring developer productivity
that is based on the feedback and signals from developers themselves, rather than focused on
a top-down view of developers' activities and processes. And so in the paper, we go deep into
what that means. When we talk about the feedback and signals from developers, we give that a
definition and a conceptual model
around developer experience. And then we get into the more practical recommendations of how
to measure developer experience and the examples of types of metrics that organizations should use.
Yeah. As a business leader, you can imagine that the kind of questions I might have are,
am I shipping enough features given that I have
an X-sized team of developers? Is that something that this framework can help me answer?
No. I mean, that's a very valid question. And that's a question that is actually really
difficult to answer using even the conventional metrics. And I think that's where
organizations go wrong. Google recently published a research paper about the difficulty of measuring
developer productivity. And in that paper, they discuss the challenge of trying to measure
knowledge work (and software development is knowledge work) in the same ways that we have typically measured non-knowledge
work such as manufacturing, right? And so they give the example of coal shoveling. They talk
about how you can't measure software development in the same way that we would just measure the
amount of coal that someone shovels in a given day. And so to your question of, are we delivering enough features? Or are
we delivering a good number of features? That's not a question you can really answer based on
counting tickets, counting commits, or counting anything. In fact, that's probably a question that
is subjective within the organization. So in some ways, it does relate to the approach that we're
outlining for understanding developer productivity. But our framework is really focused on understanding
the root causes, the things that inhibit productivity or promote productivity. Our
framework isn't about measuring the actual yield or output of an organization. Because as that Google paper
discusses, that's not something that is actually feasible with knowledge work, you can't count
the output. It's not a factory where you can count widgets.
Yeah, I think this is more and I think it's absolutely correct, which is it's a framework
that you can use to debug your engineering organization, especially when it has, you know, self-imposed
blockages or small things that are basically impeding large amounts of productivity, right?
Which is exactly what you should be using something like this for. Is that fair to say?
Exactly. Yeah. I would argue that developer productivity is difficult to even define. We aren't as an industry even close to being able to
measure it in good ways. What this framework offers and what other previous approaches have
offered is ways to understand and debug productivity and improve it. But actually
measuring productivity itself, at least our conventional definition of what that is, meaning yield or
output, that's just not something that is really feasible based on all the research and collective
experience we have in this industry. Yeah. An analogy I like to think of is that it's similar
to product quality. Like how do you measure that a product is high quality? You can maybe try to
track how many bugs you're getting in, but if the product's really slow, and customers are just leaving, that might not translate to a bug count. So it's similar in that
way. What do you think? Absolutely. There's a great book called How to Measure Anything.
And the subtitle of that book is How to Measure Intangibles. And quality is an example of an
intangible, right? Whereas widgets
coming out of a factory are tangible, you can count them, you can see them. Quality is something
that is abstract and intangible. And in the book, one of the approaches that is discussed is
psychometrics, meaning that in the same way that we take out a thermometer to measure the
temperature of a room, the thermometer in that case is a measurement instrument. It takes in
input from the room and spits out a number. We can use humans as a measurement instrument as well.
Humans can observe input. They can observe the world around them, and then provide data back. And so when you're trying to measure something like quality that is intangible, the approach is to actually ask humans to rate things in a rigorous
way, and to use that data to produce quantitative and/or qualitative insight.
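To make the psychometrics point concrete, here is a minimal sketch of turning human ratings into a quantitative measure. The question wording, ratings, and thresholds are hypothetical, invented for illustration, not taken from DX or the episode:

```python
# Hypothetical sketch: using people as the "measurement instrument" by turning
# Likert-scale survey ratings into quantitative scores. The question and data
# are invented for illustration.
from statistics import mean

# 1-5 ratings for a survey item such as "Code reviews on my team are high quality."
ratings = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

average_score = mean(ratings)                                  # central tendency
favorable_share = sum(r >= 4 for r in ratings) / len(ratings)  # "top-box" share

print(f"Average rating: {average_score:.2f} / 5")
print(f"Favorable responses: {favorable_share:.0%}")
```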
And you've been thinking about developer productivity in this space for a really long
time.
But while working on this, there might have been aspects that surprised you, even though
you've been in the space for so long.
Is there anything that comes to mind that's like that?
For context, I have arrived at this understanding of
survey-based measurement as a practical and most often preferable approach to debugging
developer productivity.
One of the things that surprised me is how much opposition and prejudice there is against
survey-based measurement by tech leaders. A lot of tech leaders, when you bring up
survey-based measurement, scoff at it. They're not interested in it. And in fact, they don't trust it. And I
think that's one thing that surprised me, given that survey-based measurement is a well-established
means of measurement in other fields, such as healthcare, economics, and education.
So that's something I'm currently working on better understanding and working on helping educate leaders across the industry.
At the same time, the leading companies in our industry like Microsoft and Google and
Amazon heavily rely on survey-based measurement for insights about developer productivity.
And so I think there really just is an education gap right now and a gap in the way leaders
understand and perceive the effectiveness
of survey-based measurement.
Yeah, I have to agree.
I think even I personally was biased or thinking, what's the point of surveys?
We have all of the metrics we could possibly need.
We had Git metrics, CI metrics, pull request metrics, and like surveys are biased, people are going to complain.
But that actually tells you about all of the issues that you just cannot obtain from any of
these metrics, like, oh, a team is blocked, because they're waiting on security reviews.
And that information is in some Google Doc, hidden far away. There's just no way you're
going to get that through extremely quantitative means
unless you have an organization that tracks things perfectly in Jira or something like that,
which I don't think any organization is. So I have to agree. I think even my mind about this
changed after running or like at least seeing the results of a few surveys.
Absolutely. And I've spoken to leaders at companies like Google.
And in fact, I was speaking with a researcher who focused on developer productivity in Google. And
one thing they shared is, of course, Google is one of those companies that does have really
in-depth instrumentation across all their developer tools and systems. And so in theory, Google is an organization that could rely more
heavily on system-based methods of data collection and measurement. However, one of the things that
surprised me is that they told me that they use both methods. So they use survey-based methods
and system-based methods. And they found that their survey-based methods provide the same
information as their system-based or log-based metrics.
And the takeaway from that is that companies that aren't Google, companies that don't have comprehensive logging across the entire developer tool chain, which is most companies outside of Google, Facebook, and Microsoft, those companies can and should be relying on survey-based measurement to capture these same
types of metrics. So if it works for Google at their scale and their size of developer population,
this approach should really work for all companies and all leaders.
One thing that might concern people is this idea of survey fatigue, which is also called out in
your paper. How often should you be surveying people? Or is there like
a different way you should be thinking about surveys? Survey fatigue and engagement on
surveys in general is one of the sore points of survey-based measurement. You can't get around
it. And it's something that a lot of organizations do struggle with. I recently published an article sort of
sharing the approximate participation rates I've heard of across the industry. And I would say
across big tech companies, the average participation rate for quarterly surveys
hovers around 30 to 40%. By most comparisons, that is not a good participation rate for an employee survey.
And so it raises the question of why this is the case and how organizations can improve.
In my own personal experience working with a lot of organizations that use DX, we have seen sustained participation rates of 90% or above. And I don't think this can be boiled down to one feature or
one aspect of how we approach developer surveys. But I will say that I think a lot of organizations
make common mistakes when it comes to survey programs. And as a result, they don't see
sustained success over the long term. Just some common examples of mistakes I see.
One of them is just the design of the surveys themselves.
Typically, surveys suffer from questionable levels of confidentiality or anonymity in
the responses they capture.
Many of these surveys don't have a compelling purpose to them.
They're often coming from one particular
platform or tools team that's collecting feedback about their tools. It's not positioned as a global
developer listening program that is capturing data about the holistic developer experience
and how that data will actually be used to drive improvements. So having a clear and compelling purpose around a survey program is important.
But another really important part is to actually follow through on those promises and that vision.
So a lot of organizations promise the world when they deploy their survey.
But after one or two surveys, developers quickly come to realize that nothing seems to be happening
with the data.
Nothing seems to be happening on their team.
Nothing seems to be happening with leadership.
And of course, as a result, the entire survey program doesn't feel worthwhile and participation
drops.
And so ultimately, I don't think frequency by itself is the culprit of the challenges we see. I think frequency needs to
align with the purpose and the action that surrounds a survey program and really making
sure that those things are clearly defined and well executed upon. Those are really core pillars
in terms of driving sustained engagement over time with surveys.
Yeah. And that last point is kind of why I'm hesitant on, you know, rolling out a survey or
even a survey tool, because at least at my current company, I don't know if I'll be able
to actually fund everything that comes out of the survey, or even like 60%. There's a little bit of a fear in me. It's like,
okay, what if there's 30 issues that get surfaced and we solve six of them or five of them? Is that
enough? I know there's messaging we can do, but we don't have as much funding to work on this
versus other things. I don't know how often this objection comes up for you, or whether this is something that you hear about. This is a concern that I've had looking across the industry and a headwind I've seen in terms
of the adoption of survey-based measurement. My current view on this is that action or
specifically engineering initiatives, change initiatives spawned from surveys,
are not absolutely necessary to sustain a successful program.
What ultimately matters is the perception by developers that the program is worthwhile.
And collecting feedback can be worthwhile even if official initiatives are not launched as a result of the feedback.
For example, if the CTO of the company just simply told developers that the feedback was
really valuable and that although no action can currently be taken, the feedback is still
valuable for understanding trends and informing priorities in the future.
In my view, that would be enough.
And if that message were re-emphasized by other leaders and managers across the organization,
that would alone make developers feel like their feedback was valued and that the exercise
was worth it.
So I think the point I'm trying to make is that survey results do not need
to be turned into Jira tickets in order for that feedback to feel worthwhile and meaningful.
A simple acknowledgement, simple communication that values the feedback and communicates the
importance of that feedback to the business, I think, is most often enough to sustain a program
like this. I think that is very informative for me. And it makes sense, right? Even
telling developers, "This information is going to be useful; we may not have time
to work on it currently, but it's going to inform our future priorities," does make it seem worthwhile. I'm curious about how you've seen companies act on this.
So you talked about Google doing surveys
as well as looking at log-based metrics.
Have you seen companies that have taken things
like the developer experience framework
or learnings from tools like DX to make changes?
I don't know.
Are there some interesting case studies or just an example that you could share? Well, absolutely. I mean, these large companies run these quarterly
surveys, and actually augment quarterly surveys with more real-time feedback as well. But these
developer listening programs drive a lot of the priorities of the organization when it comes to
infrastructure, developer tools,
and developer productivity. Speaking personally with the organizations that I've worked with,
there's a lot of action and initiatives and change that has come out of the insights gained through
survey-based and system-based measurement. I think to bucket the two types of approaches
that I've seen, and I believe that both approaches
in combination are ultimately necessary. But generally, you see a focus on either bottom-up
improvement, which is a focus on local teams and local improvements. And on the other hand,
you oftentimes see a focus on top-down initiatives and top-down change. So to give you an example,
Pfizer is one organization we work with that has placed a huge focus on enabling their individual
pods or squads to get the data out of these periodic surveys and review them as a team
and drive improvements and changes at the local level all across the
organization. In contrast, another organization that I've worked with, eBay, is much more focused
on aggregating the insights at the enterprise level and driving larger, longer-term initiatives
through their developer productivity organization to drive change and improvement to all developers across the organization. In my view, both of these
approaches are good, but in an ideal case, I think organizations would use both approaches
at the same time. I think some problems in organizations lend themselves more to sweeping
top-down or executive-level initiatives,
whereas a lot of problems exist at the local level within teams, in a particular part of the code
base, or due to a particular workflow on a particular team. And so those types of problems
aren't going to be addressed by executives or developer productivity teams. Those problems
need to be addressed by the local teams themselves.
How have you seen engineering leaders decide
how much funding they should drive
towards developer productivity, platform-y kind of teams?
Engineering leaders have to think about funding,
infrastructure work, security work.
There's so many things to think about.
What have you seen or what do you think
the right approach is about how much total engineering
bandwidth should go towards developer productivity or developer effectiveness teams?
This is one of those questions where the answer is, it depends.
I've seen all kinds of examples from across industries.
I recently spoke with the former CTO of Atlassian and Shopify, and he advocates
for 50% of engineering headcount or FTE spend to go toward what he defines as platform work,
which includes things like reliability, productivity, internal tools, etc. In reality,
especially in the current environment, platform work is often being deprioritized and often can become second fiddle to core customer-facing work. And I think that's okay too. practice when it comes to what the right amount of investment is. It ultimately depends on where
your business is at on its journey, the context it's operating in, and what the potential ROI is
of investments in developer productivity and experience. And I think one clear trend I see
is that larger organizations, when they look at the amount of money that's being spent on their
developer organization, when you think about even 1% or 0.5% productivity or efficiency gains
multiplied across that headcount, there's potential for huge business impact in terms of
increasing overall engineering capacity to be able to be focused on delivering more value
and customer-facing work. And so I think that's one of the reasons why with the larger organizations
like Spotify, Google, and Microsoft, there's a huge emphasis on understanding and improving
developer productivity. And I think that's true even as you go down in size from those
large organizations down to organizations that are
100, maybe 200 engineers, typically, I see organizations start focusing on these types
of problems when they hit around the 30 to 50 engineer count as far as headcount.
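As a rough illustration of that back-of-the-envelope reasoning about small efficiency gains multiplied across a large organization, here is a short sketch. All numbers are hypothetical, not figures from the episode:

```python
# Hypothetical back-of-the-envelope: even a 0.5-1% efficiency gain is large
# when multiplied across a big engineering organization. All numbers invented.
engineers = 2_000                 # hypothetical headcount
cost_per_engineer = 200_000       # hypothetical fully loaded annual cost (USD)
efficiency_gain = 0.01            # a 1% productivity improvement

annual_spend = engineers * cost_per_engineer
recaptured_capacity = annual_spend * efficiency_gain

print(f"Annual engineering spend: ${annual_spend:,}")
print(f"Capacity recaptured by a 1% gain: ${recaptured_capacity:,.0f} per year")
```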
Yeah, I think that makes sense to me. It's like each organization has its own unique challenges.
And you also have to see how much does increased investment here improve leverage throughout
the rest of the organization.
Maybe a last question around surveys.
I was just reading the paper again.
How standardized do surveys need to be?
You know, like how, I guess, bespoke do you think surveys
should be across different companies versus how standard do you think a survey could be?
Could I reuse the survey that I used in a previous company in a new one and just change
the name of the tools?
What do you think the best practices are?
I think that, first of all, developing reliable and valid survey items is actually a very involved and
expensive process. And so most organizations, I think, do not have the expertise nor the resources
to design effective surveys that produce reliable measurements. And so that's one vote for standardization, in my opinion.
That being said, no two organizations are the same. And in a limited survey consisting of
n number of questions, it doesn't make sense that every organization would be focused on the same
things. And so I think to answer your question, the ideal case is some sort of combination.
I think organizations can
benefit from leveraging the work of experts or other organizations who have developed accurate
and reliable measurements that are survey-based. However, organizations will also benefit from
thinking about their own particular needs and use cases and developing their own measurements and data points that they want to
capture. One distinct advantage of the former approach of standardization is, of course,
benchmarks. And with survey-based metrics, similar to a lot of system-based metrics,
it's difficult to contextualize the data and inform decisions without being able to compare your data against
the data of others. And I think that's especially true when it comes to sentiment-oriented measures,
things like satisfaction or ease of use of tools. These things are difficult to understand in
isolation without contextualizing them against industry benchmarks or the data from
peer organizations. So I think that's also one point to consider when thinking about the benefits
of standardization versus investing in bespoke development. And to open it up a little bit,
I know that DX is working on getting metrics from tools directly as well.
So I think DX maybe started with an approach or like a focus on surveys, but now y'all are thinking
of expanding it to get metrics from GitHub and Jira. What do you think is going to be different
about your approach, given all your learnings over the years? It's a great question. It has come as a little bit
of a surprise to myself that I find myself on year 10 of this journey in engineering metrics,
having already built solutions multiple times that deliver system-based metrics to now find myself
working on the same problem again. That being said, it's clear that the solution
that organizations need for getting complete insight into developer productivity, but not
just developer productivity, even just operational performance or team operations, requires more than
just periodic survey-based data. And one of the opportunities that we've
seen is to provide a different way of unlocking system-based metrics and insights as well.
So what we're doing that I think is different than what currently exists on the market is that
we're providing an unopinionated platform for system-based data. We really are providing a platform rather than
metrics in a box, which is the approach that I've previously taken with this problem, and which you see
most companies and vendors in this space taking. There are a lot of vendors out there who offer
cycle time and DORA metrics and quote-unquote SPACE metrics in a box. They proclaim that they have research-backed
metrics that will solve all your problems as an engineering leader. And of course, we know from
our experience that those claims are far from the truth. What, however, is true is that there is
some value in different cases for those types of metrics. However, it really depends on the individual
needs and context of each organization. And so to provide a solution that can meet
organizations where they're at, what we're focused on is providing a platform that enables free
access and visibility and reporting across both third-party tooling, such as GitHub, as well as
bespoke internal developer tools, which more and more make up a considerable portion of the
developer tool chain, especially at larger companies. So what we want to do is provide
a third-party solution to really just save organizations the time and trouble that many
go through to build their own data pipelines
and data warehouse specifically for engineering data. And since it's unopinionated, like,
as an engineering leader, how would I approach using the platform? Like, would I be doing a
bunch of setup? Or is it that the metrics kind of come in, but I get to decide how to use them? If you could just walk me through like an example.
Yeah, so what we provide is a ready-to-go data schema and data connectors that
are ready for analysis.
So if you're an engineering leader or lead a developer platform team, you can use our
solution to quickly centralize all your data
across your internal bespoke tools, as well as third-party tools like Jenkins, CircleCI,
GitHub, Jira, etc. And what I mean by unopinionated is that we don't offer charts out of the box. Instead, we provide an enriched schema that enables organizations to
run the queries and reports and metrics that they are interested in for their specific needs.
So that's, I think, the major point of difference between our approach and the approach of a lot of
organizations out there is that we've seen that many companies who purchase off-the-shelf
metrics tools find themselves in a tricky place where the metrics that come out of the box aren't
exactly the metrics that they actually want from the system. Or worse yet, sometimes organizations
like the metrics that are offered, but actually want to display those metrics in other places, for example, in their Looker BI dashboard or within an internal developer portal. So with our approach, we provide
an extensible platform where we really just provide the data in a ready to analyze and use format.
But you can use that data in any way you want, whether it's plugging it into your existing BI tool to produce reports, or whether you want to build an internal developer portal for presenting these metrics in
a specific way. So we offload the costs and complexity of engineering work that goes into
building data pipelines and data warehousing solutions around these types of tools, but we don't tell you exactly what your reports
should consist of. Yeah, now I'm just imagining all sorts of things like if you could slice and
dice like, you know, developer NPS data on how good their developer experience is, if you could
break that down and put that right next to perhaps some operational metrics, it seems like it has a
lot of potential.
And I can already think of how I might use something like that.
Yeah, that's a good point. And I actually completely forgot to mention that. What you
just described is one of the key reasons why we're excited about this solution and the value
it can bring to organizations. One of the unique things about this platform
is that in addition to centralizing data from bespoke tooling and third-party tools like GitHub,
this solution also collates the data from surveys from our existing survey-based measurement
platform. And so combining that system data alongside the survey-based data can unlock unique insights
like what you just described for customers.
And that's one of the other advantages that we see in this approach of combining both
survey-based and system-based measures.
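As one possible illustration of combining the two data sources, here is a short sketch. The table names, columns, and values are invented for the example and are not DX's actual schema:

```python
# Hypothetical sketch of joining system-based metrics with survey-based sentiment
# per team. Table names, columns, and values are invented for illustration.
import pandas as pd

# System data: e.g. median pull request cycle time pulled from source control.
pr_metrics = pd.DataFrame({
    "team": ["payments", "search", "platform"],
    "median_pr_cycle_time_hours": [18.5, 42.0, 9.75],
})

# Survey data: e.g. average ease-of-delivery rating (1-5) from a quarterly survey.
survey_scores = pd.DataFrame({
    "team": ["payments", "search", "platform"],
    "ease_of_delivery_score": [3.8, 2.6, 4.4],
})

# Put sentiment right next to the operational metric for each team.
combined = pr_metrics.merge(survey_scores, on="team")
print(combined.sort_values("ease_of_delivery_score"))
```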
Yeah.
Well, I am still excited to see all of the future research papers that come out.
I'm getting to learn a lot.
So thank you so
much for being on the show. And I hope you had fun. Thanks, Utsav. Yeah, this is great. Thanks
for having me. Of course.