Software Misadventures - Cory Watson - Leading observability teams at Twitter & Stripe, how to succeed in a new org, effective ways to advocate for your team and more - #16

Episode Date: November 12, 2021

Cory is currently a Solutions Engineer at Jeli.io and very well known in the community for his work on Observability. His career in observability began at Twitter where he managed the observability team and then he joined Stripe, where he created and led the observability team, this time around as a Principal Engineer. We talk to him about how he got his start in customer support and the role it played in the later part of his career. We discuss his time at Twitter where there was a power outage in the data center on the day he joined and how once he had to stay up all night dealing with file handle leaks. We also discuss how he created and led the observability team at Stripe as an individual contributor, how one can succeed in a new org, how to navigate information asymmetry in the workplace, what are some effective ways to advocate for your team and how we all are just humans trying to get stuff done.

Transcript
Starting point is 00:00:00 I'm sort of on phone duty, sort of like being on call. It's my day to answer the phones. And I answer the phone and it's a woman who's having a difficult time getting onto the internet. And I was distracted. I really wanted to go play. I really wanted to go... Here I am in an era of dial-up and I've got a T1 behind me and I can play Quake with my friends or work on... I wrote a few patches to the Linux kernel way back in those days. And so all this stuff was so fun. I just wanted to get back to the phone. And I was frustrated with the woman. She was, you know, she didn't know what she was doing, which is fine.
Starting point is 00:00:29 She's not supposed to. That's why she called me. And I'm helping her. And at some point, she interrupts me. And she says to me, son, I know that this is frustrating. But you being mad at me ain't going to fix this any faster. And I don't know who that woman was. I don't remember anything about her,
Starting point is 00:00:46 but she had such a big impact on the rest of my career because other than getting a start in tech support, which to your point is tough, being mad at somebody is not gonna make it any better. Welcome to the Software Misadventures Podcast, where we sit down with software and DevOps experts to hear their stories from the trenches about how software breaks in production. We are your hosts, Ronak, Austin, and Guang. We've seen firsthand how stressful it is when
Starting point is 00:01:14 something breaks in production, but it's the best opportunity to learn about a system more deeply. When most of us started in this field, we didn't really know what to expect, and wished there were more resources on how veteran engineers overcame the daunting task of debugging complex systems. In these conversations, we discuss the principles and practical tips to build resilient software, as well as advice to grow as technical leaders. Hey everyone, this is Ronak here. Our guest in this episode is Cory Watson. Cory is a solutions engineer at Jeli.io and very well known in the community for his work on observability. His career in observability
Starting point is 00:01:51 began at Twitter as a manager, and then he joined Stripe where he created and led the observability team, this time around as a principal engineer. After Stripe, Cory worked at SignalFx as a technical director, where he focused on advocacy, customer engagement, and product improvement. We had a blast speaking with Cory. We talked about how he got his start in customer support and the role it played in the later part of his career. We discussed his time at Twitter, where there was a power outage in the data center on the
Starting point is 00:02:19 day he joined, and how once he had to stay up all night dealing with file handle leaks. We also discussed how he created and led the observability team at Stripe as an individual contributor, how one can succeed in a new org, navigating information asymmetry in the workplace, and what are some effective ways to advocate for your team. We had a lot of fun speaking with Cory. Please enjoy this highly entertaining episode with Cory Watson. Cory, super excited to have you with us today. Welcome to the show. Thank you so much. Thanks for having me. I love this. Like I said, software misadventures, this is going to be a little, or like I said, no one else heard it.
Starting point is 00:02:55 It's going to be a little therapeutic. I think I love to talk about this stuff. It's fun to talk about how the process happens. Yeah. So we did this for the first time. I tried this experiment and I posted it on Twitter. We haven't done this before. And I asked the Twitter community, hey, we're interviewing Cory. What should we ask him? We got a few responses and I want to pick one
Starting point is 00:03:18 which was suggested by Uma, who we have spoken to before. So Uma, if you're listening, we're asking this question. And by the way, I did not really get the context behind this, but it sounded really funny. So, we thought this is the right place to start. And she said that her favorite Cory misadventure is the horse head Zoom meeting freeze.
Starting point is 00:03:40 So, what is that? Yeah. Yeah. So, I have one of those rubber horse heads that, uh, you see sometimes. And I keep it. I don't have it with me today, but I do sort of keep it around because I find that levity is lacking in many meetings. And, um, one day, when I was working at Stripe, we were having a big meeting where everybody was on Zoom, of course, like we're being inclusive of remotes. This was pre-COVID. And whoever was on Zoom showed up on some big screens behind the head of infrastructure, who was Will Larson at the time. So if Will happens
Starting point is 00:04:14 to listen to this, then sorry, Will, for this happening. But Will's standing in front of everyone and he's presenting something about infrastructure. And there's a trick in Zoom. So this is a pro tip for everybody out there who wants to ruin their software infrastructure career or what have you, that if you turn the camera off and turn it back on again, it sort of puts you to the head of the queue. It makes you visual in, you know, like sort of, it basically brings you to the front and it shows you. And so I was doing this thing periodically where while the presentation was going on, I would like turn my camera off and on and then have my horse head on in a weird pose, looking up to the left, looking down to the right, drinking from a soda can, whatever it was.
Starting point is 00:04:50 But one of these times that I did it, Zoom crashed. And so instead of this fleeting, possibly subconscious picture of a horse head, instead I'm just cemented on the screen as a giant horse head. But here's where the plot thickens. The Zoom crash caused someone else's name to show up. Because, you know, our names show up in the bottom corner of the screen. Someone else's name showed up and I got away with it. So no one knew it was actually you?
Starting point is 00:05:17 I'm pretty sure everyone knew it was me, but I had a plausible denial. There are only so many people who are regularly known for showing up to meetings in a giant horse head. So this has happened for – I've been remote for a long time, and it's not the first time something like this has happened. When I worked at Twitter, being remote – this was many, many years ago when being remote was a little less sort of accessible than it is now. But there were big, big, big projectors or big TVs in the rooms, and people would often walk by rooms and see a giant Cory head projected out there. So, I mean, I'm okay with this. It's good for the brand. So, yeah, that was a shout out to Uma for remembering that.
Starting point is 00:05:49 I wouldn't have recalled it, but when she mentioned it, I knew precisely what she was talking about. So, I'm glad that's the impression I made and not like my skill or something like that, that instead my horse head shows up. So, yeah, that was a pretty good one. Something to remember Cory by. I mean, I think we need more of those in every single Zoom meeting. I think so too. Can we have that picture, and then have that be the picture that we have with this episode? I think that'll be pretty... Honestly, if you drop me a line after this, I will try to go find it, because I'm pretty sure somebody will have a screenshot of it. Because I know I took one, but it was probably like on a
Starting point is 00:06:24 Stripe laptop, which, since I don't work for Stripe anymore, I don't have. But I might have a picture of it somewhere, and I'm sure one of my colleagues at the time bothered to make a picture. So, yeah, you might be able to get that. It's not clickbait, you know, if that's literally your friend. Well, hey, if nothing else, we'll pretend. I'll take a picture with the horse head on and we'll put on like an Instagram filter that makes it look like it was old or something like that. So, old-timey horse head. We will do with the new one too, it's okay. I think the horse head, okay. Okay, yeah, we can totally hook that up. I'll send you one next week. Awesome. Okay, so, coming back to some of the other things you want to talk about.
Starting point is 00:06:56 Well, I mean, yeah, we would love to continue talking about the horse head, by the way, but I'm sure. So talking about some of your career stuff. So I heard in one of your talks, you mentioned that you got your start in customer support. Now, I have a lot of respect for folks who work in customer support. I mean, it requires a lot of patience to be in that position. I mean, while we say customer is always right, maybe.
Starting point is 00:07:19 Or they. Exactly. Like customers can also be very difficult. Yes. So can you share some stories from those days where you experienced some of this? Yeah, I'm really glad you asked about this. I don't think I've ever told this story in a talk, but I'll tell a little bit about the background. So I got started and this was probably in the mid 90s or 96, it's probably 96. Like internet cafes were a thing back then, right? Like our, most of us, if we were lucky, had ISDN as a way of accessing the internet, which was
Starting point is 00:07:52 usually about 100, either 64 or 128 kilobits a second, which is, which is pretty slow. But most of us had dial-ups. So we were using 33.6, 28.8, something like that. And so I wandered into one of these internet cafes that had opened sort of across the street from my high school. And I started asking questions and, you know, wanted to know like what their access was like. And they had a T1, which is I think 1.4 megabits a second or something like that, which is insane at the time. And so I asked enough questions that the person who ran the thing offered myself and my best friend, who were both sort of quizzing him, a job. But I didn't really ask a lot of questions about like, how are you gonna pay me? So I kind of got paid in computer hardware for a little while, which I didn't tell my parents about. Hi, Mom. But it worked out. And so anyway, by that
Starting point is 00:08:40 Did you mean, like, was that a euphemism for, like, you stole because they didn't know you? No, no. Yeah, I just took it home like to sell. No, they'd be like, oh, so, you know, we've got some like extra video cards laying around. You know, you want to take one of those home? And I'm like, sure. And, you know, at that time, I don't know how to negotiate a paycheck or do anything like that. So I'm like, yeah, that'll work. But I lived at home and was going to high school.
Starting point is 00:09:01 So like I didn't really need a lot of money. And so it all worked out. And sort of short story long, I ended up with a gig at a dial-up ISP in what is effectively rural Middle Tennessee. And this was, of course, like I said, era of dial-up. I've already said that, I think, three times. But anyway, the point was it was a lot of work. So, there were a couple of things that were fascinating about it. One was this was a bunch of – many of them high school kids who were working at this place. And sort of Linux was still something distributed on floppy disks. Like, you know, you had 112 Slackware disks
Starting point is 00:09:34 of which the 74th was almost always corrupt somehow and would abort your install. And I had only learned about Linux from a friend who had tried to teach me a few things. And then I joined this company sort of working, and it's a bunch of us being system administrators. And a friend of mine from high school was the system administrator who was teaching me stuff. And so I didn't have a lot to contribute. So I helped out on the phones. And so I'm taking calls and working the front desk, helping people sign up for internet. And this is an era when getting on the internet was tough. Most people were using Windows 3.1, which didn't even have a TCP/IP stack. So it didn't have any way of sort of accessing what we consider to be the Internet.
Starting point is 00:10:08 It only had like LAN protocols that you would use in an office if you were lucky. And so we had to help people install a TCP/IP stack. Shout out to Trumpet Winsock for any of my friends who remember what that is, any of us old people. And then we had to set up things like modem initialization strings. We had to get people online, and it was a lot of work. And these are people who, you know how to do stuff sort of on my own. I learned in a competitive environment, competitive environment, but that was very friendly. Like we're always trying to one up each other. Like, ah, look, I, I bought this DEC Multia off of, off of eBay and I'm going to set up this old operating system on it. And we thought that stuff was so much fun. Um, but I got a call
Starting point is 00:11:01 one day and you know, I'm, I'm sort of on phone duty, sort of like being on call. Like it's my day to answer the phones and I'm, and I answered the phone and it's a woman who's having a difficult time getting onto the internet. And I was distracted. Like I really wanted to go play. I really wanted to go like, here I am in an era of dial-up and I've got, you know, a T1 behind me and I can play Quake with my friends or work on, you know, I wrote a few patches to the Linux kernel way back in those days. And so all this stuff was so fun. I just wanted to get back to the phone and I was frustrated with the woman. She was, you know, she didn't know what she was doing, which is fine. She's not supposed to. That's why she called me and I'm helping her.
Starting point is 00:11:35 At some point she interrupts me and she says to me, and I'm not going to emulate the accent, even though I can do it because I'm from the Southern Middle Tennessee area. She says to me, son, I know that this is frustrating, but you being mad at me ain't going to fix this any faster. And I don't know who that woman was. I don't remember anything about her, but she had such a big impact on the rest of my career because other than getting a start in tech support, which to your point is tough, being mad at somebody is not going to make it any better. And so at the end of the day, we've got to be patient. We've got to look at these big complex systems that we operate that are sometimes responsible for billions of dollars and realize that we're just all humans struggling to try to get to the other side of it. And there are customers on the other side who are just trying to get the job done. And so that idea has been a part of my career, whether it was myself supporting internal infrastructure teams, whether it was myself supporting teams that were on the customer side of things, whether it was dealing with customer support themselves, or whether it's talking to a customer.
Starting point is 00:12:37 And sometimes in the other direction, sometimes speaking to leadership or speaking to vendors or what have you. Like that sort of humility that you need to look at your work with is important. And it was important to me. I don't have a, I didn't finish college. I didn't get a degree in this. I'm self-taught, and I've been very lucky. So shout out to that lady. I don't know who she was, what she's doing in her life, but, you know, I'd love to, I don't know, send her a gift or something, because it was pretty important to my career. So yeah, tech support was a good place to start for me. So when you made that, like when you came up with that realization, that's like a very important, as you said, almost like a life lesson, right? But were you able to just come up with that, at that instant? Or is it something that...
Starting point is 00:13:18 No. I'm imagining this like... In the fullness of time, I've come to respect it. I was probably annoyed at her for giving me that. But, and, you know, plus I'm old now, so maybe it didn't even happen. Perhaps it's just a delusion. But it's not something that I think fully clicked. Because I then went on to work for other ISPs. Later, I worked for like a computer repair shop that would do house calls.
Starting point is 00:13:41 And so that's a whole different thing, right? Like it's one thing to talk to somebody on the phone about, you know, their modem initialization strings. It's something else entirely to go to their home and fix the computer. But it also taught me sort of how plans never work. It's like the worst thing about having a plan is not having one. But the second worst thing is that they never work. And so you've got to have a plan when you go in. But that, I don't know, I gained a lot of patience for anyone who's delivering something that maybe shows up late or if you're in line. I was in line to pick up a prescription the other day and everyone was having some sort of problem. And it just made me more patient. I think it's made me
Starting point is 00:14:20 more patient with everything. Sometimes I think I'm too patient. Luckily, my partner is less patient and she will often advocate for me in ways that I'm bad at. But anyway, no, it took the fullness of time. It's something that I'm not sure I knew had a conscious effect on my career until the last few years. But I have tended to situate myself in companies where I want to help the people around me. And something I try to instill in the teams that I have built or in any situation I have to display leadership, it's that it's our job to make everyone around us better. I don't believe there are 10x engineers. I believe there are engineers who make 10 other people slightly better or something to that effect. That's certainly not a quote attributable to me. I'm sure I've read it somewhere else. But
Starting point is 00:15:03 that's what I want to be known for. And some of my peers have, I have seen conversations where I've been referred to in that way by people who I look up to. And I can't imagine a better thing to put on top of your career. That's a great point and a great story. It's the only one I got. The rest of them will be. I highly doubt that. So you mentioned leadership and team
Starting point is 00:15:27 building. I would just want to say I want to bookmark it, because we are definitely going to come back to this. But the other thing I wanted to touch on in regards to customer support, like one aspect you mentioned was patience, which is a great quality not many of us have. The other aspect that I think about when I think about customer support is empathy, or like just trying to understand what the other side is experiencing, whether it's on a phone or whether you're on a chat or something. Like, how do you think the empathy part carried forward with you in the career ahead, after customer support? I think, um, I think everybody's got certain qualities that they've developed over time. I
Starting point is 00:16:14 think empathy is one that I'm personally just okay at. And, you know, it's not, doesn't, you know, I'm bad at other stuff. I can't tell my left from my right. I have to do that thing where you hold your hands up and look at which hand makes an L. So like I'm bad at a bunch of stuff, but empathy is not one of them. And I think that, I don't know, there's perhaps some mindfulness, some thoughtfulness to like putting yourself in a position of, okay, what is this other person trying to accomplish? And it turns out to be very relevant to, I think, dealing with incidents and the misadventures of software that in the moment, no one's trying to be bad. No one's trying to make bad decisions. The reality is that we believe
Starting point is 00:16:52 that what we're doing is going to make it better. Or in some capacity, the it is even questionable sometimes. Like what is the it that we're trying to make better? But at the end of the day, in the literature around resilience, there's some good illustrations of like someone lost in a maze. And it's easy when you're looking from above the maze down, which is the way we like to represent the fullness of time, right? Like in hindsight, everything looks so clear. But in the moment, it's difficult. And having the empathy for the person on the phone, the person sending the ticket, you know, because even if they are being, back to your point, like unreasonable and they aren't right, they're still representing a view of the world that you do well to acknowledge. And so whether it's – and even ourselves, I think something that's been important to me in the fullness of my career is understanding that we even often can represent dualities in this.
Starting point is 00:17:38 Like I can think my software is really good, but if I can like check out for a day or two and then come back and use it with a slightly different perspective, that even my own perception of this thing is going to change quite drastically. So being of, I also think it hurts me sometimes in my career because I think I'm so empathetic sometimes that I forget to have an opinion. And I read something recently that said the difference between junior and senior engineers is that senior engineers have a feeling for how things ought to be. They have an opinion. They have some sort of flavor or taste as to how things should go. And I think sometimes I have not had enough of that. But I don't know. I'm not going to complain. It's working. So, we'll just keep going with that. Something to improve on maybe. But yeah, I think empathy is incredibly important to this because at the end of the day,
Starting point is 00:18:21 being able to see this from someone else's perspective is often what makes the difference in a good product than a bad one. So, it's helpful at that level or even just like dealing with a co-worker on a given day, right? And empathy, I think, comes out quite a bit in the conversation we've had so far. And maybe this is a bad question, but do you have anything – I know it's hard to, you know, to me, maybe empathy is something that's maybe a little bit more innate. But are there things that you can do to, you know, try to actively like improve upon it? The reason why I ask, the last time we spoke with Ashwin,
Starting point is 00:18:58 and he almost had a, this is like, to me, it's like approximation, first order approximation at empathy, by thinking about the customers, their backgrounds, and when he's trying to build something, to really just literally put himself into their situation. Right, right. I was wondering, are there any sort of tips that you have to just, you know, more things to do at work?
Starting point is 00:19:21 I think, I mean, perhaps it is innate. I was watching Comedians in Cars Getting Coffee the other day, the Jerry Seinfeld thing. And he was interviewing, oh gosh, Steve Harvey. And they were talking about how comedy isn't teachable. And sometimes I think comedy is about empathy because you're trying to put yourself outside of the norm and look at something in a funny way, like, you know, what's up with airlines, you know, or whatever. So I think that the important thing is like what motivates you? You know, it's often to me about motivated thinking.
Starting point is 00:19:55 Like what is it that the person that you're working with, like what is their motivation? Why are they doing this? Why do they need this? And if you – I think that's a pretty simple question to ask yourself. Like sort of what motivates you? Even in your own work, when you say something like, I need this done, why, what's motivating you? And if you can just think about motivations, it's a really good short step over to sort of helping with a situation because it isn't about like gaming it or anything. It's just about trying to understand what people's motivations are.
Starting point is 00:20:23 At the end of the day, we're all being motivated by something. And it's often something that comes up in incidents and in problems that what is the motivation? What are we trying to get back to? Why are we doing it? Because, you know, the motivation of a person in leadership who comes into an incident is usually, I have gotten yelled at by a customer. I need it back to normal. But the motivation of an engineer is often, I got paged because we have run out of file handles. And sort of the gap between these is really just about motivation. It's about what are you trying to get back to? And leadership usually decries the engineers for not having the sort of greater awareness. But that's super difficult because they got paged. Their context
Starting point is 00:21:01 is very different. And so putting yourself in the shoes of the person who's asking questions like, why is my leadership asking for a status update? Oh, it's because their feet are being held to the fire for something different. And I can empathize with that because I'm glad I'm not getting held to the fire for that. That's someone else's job. That's above my pay grade to worry about that particular customer or whatever. But putting yourself in that position is often what helps you move along in your career. Because if you can empathize in that way, and if you can say, look at what this other person is motivated by, how do I position myself so that we both benefit, then you become more integral in the process. And, if nothing else, you're viewed as someone who can be very helpful. I had a boss once who made himself available for everything from changing light bulbs to
Starting point is 00:21:43 dealing with big budget questions. And I think that can sometimes... You have to be careful, time manage. Don't let yourself get caught up changing all the light bulbs when your job is to do big budget management. But if you are willing to roll up your sleeves and contribute in that way, then everyone around you usually looks to you as someone who can be helpful and just try to get the job done. So I don't know, it's worked for me. Nice, I love that. Yeah, that's like super actionable, right? It's like thinking about the why when you, um, they do those things. And I think we often do that. I think you see that a lot
Starting point is 00:22:15 in sort of like open source help situations when people say what are you really trying to accomplish here and no one really likes that question because that's really actually pretty annoying. What are you trying to do? Yeah, what are you trying to do? What are you really trying to do? We've even got an acronym for it. I can't remember what it is. W-A-R-Y-T-T-D, something like that. What are you really trying to do? But I think it's kind of annoying too because we don't usually think that way. We're just trying to get a thing done. We don't want to reduce it to its sort of base components. But if we can work on that and if we can help people arrive at it without being snotty jerks about it in chat rooms, then I think it's beneficial for everybody.
Starting point is 00:22:55 But if nothing else, it's a good first order test of like, how can I be helpful to this person? If nothing else, it allows you to gain some of the patience because when you understand their motivation, it allows you to get out of your motivation because your motivation is usually to get the ticket closed as soon as possible. We see this so often in tickets when it's like, well, I put a reply and now I'm good until tomorrow, something like that because our motivation is usually to get on to something else. So jumping a few years ahead, you worked in infrastructure at a few places and then joined Twitter as one of the first hires in embedded SRE. Definitely very curious about what those early days, well, not for Twitter, but for you, were like at Twitter. I think you mentioned something about first day power outage. Yeah. So the first thing that I did after, well, first off, I was very excited to get to work
Starting point is 00:23:48 for Twitter. I sort of told myself one day that I wanted to try that. I taught myself Scala. I started to contribute to some of their open source projects. And I remember being very proud of some of my past work. And going into the interview and interviewing, I can remember who it was. I won't say his name, who it was. But I remember telling him about some of my past accomplishments.
Starting point is 00:24:06 And he was like, yeah, that's great. And then just moved right past it. So it was very humbling. I was just happy to get a free trip to San Francisco, but I ended up getting a job. The very first day, the first thing that I did was I went and got a cup of coffee, which I promptly spilled all over the person who sat next to me at Twitter. I spilled it all over their desk. So that was my first misadventure. So shout out to Stefan Zurcher if he happens to be listening to this.
Starting point is 00:24:32 Sorry for spilling coffee all over your desk. Thank you for being such a wonderful mentor in my first days at Twitter. But yeah, the power got turned off. So they were doing some power maintenance at one of the data centers. And this was pre-cloud stuff. I mean, the cloud stuff existed, but all the Twitter stuff at the time was, was, was physical data centers that we ran and something like every other row or half the racks got turned off. And I remember, um, I don't know how deep we'll get into this, but I remember feeling over, over time, over different incidents,
Starting point is 00:25:00 but I remember feeling very sort of cool at the time. Like, not cool as in like, that's cool that that happened, but cool as in, I'm not going to freak out. It's okay. You know, and helping to bring the system back online. And I remember it made an impression with some of the leadership at the time that they were like, wow, this guy just started and he's willing to jump in and sort of help with all this stuff. And it's easy to be, I think it's easy to be confident when you're clueless, which I was. Why confidence? Yeah, absolutely. Naivety breeds innovation.
Starting point is 00:25:28 That's a saying. But yeah, the power got turned off, and so I helped to bring it all back up. And so one of the difficulties of big systems like that is turning them back on is often harder because you don't know how they work, right? Like, you know how they work when they're working, but as we saw with the Facebook outage earlier this week, sometimes turning a system back on is really difficult because none of the caches are warmed up. You know, you're in the midst of what we call a thundering herd problem. So all these were tricky. But yeah, my first day, or my first week, I think it was in the first week, the power was knocked out in the data center and we had to bring it all back up. But I also spilled coffee all over everything. That was another thing I did.
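[Note on the thundering herd problem mentioned above: when caches are cold after a restart, many callers can miss on the same key at once and all hit the backend together. One common mitigation is request coalescing, where only the first caller loads the value and the rest wait for its result. The sketch below is a minimal illustration of that idea only. `SingleFlightCache`, `loader`, and `loads` are hypothetical names chosen for this example, and nothing here describes how Twitter or Facebook actually recovered.]

```python
import threading


class SingleFlightCache:
    """Cache that lets only one caller per key run the expensive loader."""

    def __init__(self, loader):
        self._loader = loader        # hypothetical backend call, e.g. a slow query
        self._values = {}            # key -> cached value
        self._inflight = {}          # key -> Event for a load already in progress
        self._lock = threading.Lock()
        self.loads = 0               # how many times the backend was actually hit

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                # First caller for this key: announce that a load is in progress.
                event = threading.Event()
                self._inflight[key] = event

        if leader:
            try:
                value = self._loader(key)
                with self._lock:
                    self._values[key] = value
                    self.loads += 1
            finally:
                # Wake the waiters whether the load succeeded or failed.
                with self._lock:
                    del self._inflight[key]
                event.set()
            return value

        # Followers wait for the leader, then re-check the cache. If the
        # leader failed, one follower becomes the new leader and retries.
        event.wait()
        return self.get(key)
```

With this shape, a burst of concurrent `get("timeline")` calls against a cold cache results in a single call to `loader`, instead of one call per caller.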
Starting point is 00:26:03 I also took down all of Twitter's observability stack once by naming a dashboard with emoji. That just happened one day. And I worked on the team. I was either the manager of the team at the time or the SRE that was supposed to keep the stuff working. But instead, I managed to take it all down by putting emoji into it. So shout out to the Twitter observability team who will remember that. Which emoji was it? It was the little spinny star, like someone bumped their head, then an alien, and then the poop emoji. I remember it very fondly. It was those three. That's worth it.
Starting point is 00:26:33 It's creative. Yeah, I thought so. Yeah, it was chaos engineering, I think is what it was. That's right. A little early, but yeah. I was like, I wonder what happens if I do this. I wasn't expecting it to take down the whole Twitter observability stack. It only took down the sort of read side.
Starting point is 00:26:47 The write side still functioned fine. But I took down, for internal use, I sort of took down all of Twitter's observability stack. Just so that no one was able to use it. That's it. Yeah, yeah. I don't remember what. I think someone had to go in and edit the database to sort of remove it, basically. And then we patched in support for emoji after that, probably.
Starting point is 00:27:03 But I was just curious what would happen. Okay. How many dashboards did you end up having after that Twitter? Did emojis proliferate in all dashboards after this? I don't remember. I know we talk about it. It's come up on Twitter once or twice. Those threads sometimes come up and say like, hey, senior engineers tell horror stories so that those of us early in our careers maybe know a few more things about how you messed up. And I think I've told that story a few times on Twitter that I broke it and then tagged in, you know, some of my old teammates who remember that pretty fondly.
Starting point is 00:27:35 But yeah, shout out to, oh gosh, I don't even remember who fixed it. I'm pretty sure Jen fixed it, but shout out to her. I won't call out Twitter handles. I don't want to put people on blast, but yeah. Um, so speaking of, uh,
Starting point is 00:27:47 so you're managing the observability. What was that transition like? Um, from when you started? Yeah. Um, I had done management previously, so I worked in the middle Tennessee area and technology stuff for a few years.
Starting point is 00:27:58 So I'd done like VP roles and stuff. So the managerial roles were not unfamiliar to me, but, uh, the transition was weird cause I went from, so I joined the Twitter observability team because I'd never seen anything more rad. Like, I'd never worked at a place where they had invested in that sort of stack. What we now consider to be normal observability was brand new, at least in name. Um, and I'd never seen anything so cool. I remember joining their, it wasn't Slack, it was HipChat at the time, but joining
Starting point is 00:28:24 their channel and being like hey i just want to say what you do is cool. And then leaving because I was afraid. And then I just like ran away from the channel. But later I ended up asking to be the SRE for that team, which thanks to Tung Vo, that was something that I was able to do. And then later took on the management of the team. It was weird. I remember I messed up pretty early because I was like so excited to be the manager that I got everyone in a room. And I said, let's write down everything we could do. And I thought that would be so fun because I was the manager. Now I got to help decide and then sort of the de facto product owner as a result. So I was like, oh, we're going to get to do this. And I remember my first managerial like feedback session afterward, I got dinged
Starting point is 00:29:01 because people were like, why did we put up everything we could do so that we could not do any of it? And I was like, oh, I didn't think about it from that perspective. So I guess I didn't empathize very well to go back to our earlier thing. But yeah, it was a weird transition. Thanks to everybody on the team who made that, I think, easier for me to become the manager. But again, empathy. I just tried my best to make the team work as effectively as I could and to give them what they needed to succeed. I just wanted to make the environment that I would be happy to work in. And I think we often leave for managers more than we do for anything else. So, I hope I did a good job by that.
Starting point is 00:29:36 And when did you decide that the management track was for you or at least you tried it out? That it wasn't for me? Was that the question? No, no, no, no, no. Oh, that it was for me. Yeah, yeah, yeah. Or that you decided to try it. I think that many people often feel like the IC role tops out somewhere and you start to feel sort of like a grass is greener thing.
Starting point is 00:29:56 Like I need to go into management to sort of further my career. And, you know, I think still to this day, like the rise of staff engineering has, I think, given us a sort of another level. But still, if you want to be the head of engineering hither and yon, then you probably need to go into management at some point to advance. So there are very few, I think, you don't see a lot of extremely senior IC roles that have the scope maybe that we shoot for. So I still think there's this idea that you need to go into management. But I thought it would be good for my career. Um, at the time, at the time, I think I had aspirations that I, I've always thought it would be really rad to have a CTO title.
Starting point is 00:30:30 I think in practice now, I, I, I don't want that anymore, but, um, at one time I thought it was. And so I thought it would be a good step for my career. Um, but if, but the real reason I did it is that I wanted to see the observability team realize our goals. I wanted us to build a suite of tools that made running Twitter effectively and efficiently. I wanted that to be as good as possible. And so I thought that would be the best way to realize that goal was the real reason I went into it. And the power and the money, all of those.
Starting point is 00:31:00 Not necessarily in that order. I definitely want to get your thoughts on advice for people who are starting into management, knowing that you realize that some of the things you could have done better. But you also mentioned that you've thought about the CTO title. And it sounds cool, but now you don't want it anymore.
Starting point is 00:31:21 Can you speak why that is? Yeah, I don't want the responsibility anymore. Power comes with, let's go back to the empathy thing again. Like the power of being the CTO or whatever comes with a huge amount of trade-off. Like your customer changes. Like you often think, man, if I was just in charge of this place, we'd finally be able to invest in that technical debt that I think is so important. Probably not. Like your manager or your manager's manager or your manager's manager's manager probably does not have it out for that little bit of the tech stack. There are probably legitimate trade-off things that you say that when you look at them with
Starting point is 00:32:00 almost all disagreements to me come from information mismatch. And so if you are frustrated by what your leadership is choosing, that's usually because you don't have visibility into the tradeoff that your leadership is making. And vice versa, right? Like you wish you could work on this thing instead of that feature that you've been told you must work on. But that's almost always because it's counterfactual. Like, you only are saying that because you saw this fail once. We didn't know than the few that have been exposed by those incidents. So I think, to me, the amount of tradeoff that you have to make doesn't really match up. Maybe at a smaller company, I think that would be fun for me. But I just don't personally – I mean, I know some CTOs at big places nowadays. Like, shout out to David at – David Singleton at Stripe.
Starting point is 00:33:03 Like, he's got a hard job. I don't want that job. Like working with him. And I definitely don't want to do that job like that. I don't know how he does it. Like he balances so many things and does so much stuff that I don't understand. So hell, and that's a CTO. I don't even, like, VP of Eng, VP of Infra.
Starting point is 00:33:18 Like I don't even know, man. It's so much easier to sort of just be like you, your keyboard and a technical problem. Be like, this is what I want. But I don't know. Maybe that'll change in a few years. But at the moment, I think I'm a bit burned out on stuff breaking. So, that means I don't really want the – I don't want the responsibility. I really related to the sort of information asymmetry.
Starting point is 00:33:40 I think that's something. Oh, yeah. Information asymmetry. Thank you. Yeah. Yeah. Being sort of the source of most, like, misunderstandings. Um, how do you think about, like, at the end of the day though, right, like especially if you're in an IC position where
Starting point is 00:33:52 you don't have access to that information, how do you go about not getting sort of frustrated or upset at the fact that, you know, yeah, it is not what you want, right? Like because of information. Yeah, I think, I think, I mean, we come back to empathy here and motivation, like, well, why, you know, what is the push? You know, at organizations I've worked at, like I've worked at startups that were scrambling for revenue. And so when revenue is king or, you know, year over year revenues or whatever, you know, it is, it's, uh, you know, timelines viewed, you know, what, what's the motivation? What is the thing that leadership is being pushed that they must do? Like, it's reasonable. I mean, we, we all do this. Like if
Starting point is 00:34:32 you, if later this evening, I'm like really having a sweet tooth and craving some chocolate, I'm probably going to eat some chocolate. If then tomorrow I go to the doctor and they tell me that my cholesterol level is bad, like, eh, I made a tradeoff. That was what I needed at the time. You know, so I think for helping this sort of thing, you first got to think about what the motivation is. Like, put yourself in the shoes of the business. Because ultimately, that's what we're here for. We're not here to write code.
Starting point is 00:34:56 We're here to do work that either helps the system accomplish the task or helps the system work effectively such that the task can be accomplished, you know, whatever. This is capitalism after all. And so, if that's the case, put yourself in that situation to say, what is it that the business needs? How do I best align my work with those needs? And I'm not saying how do I best accomplish those needs at all costs. I'm saying, how do I align my work? What's my section of the business? You know, it'd be easy, you know, working in observability at Twitter or Stripe as I have, to say, well, I'm just here helping get charts built and stuff like that. And I often would talk with people on my team and we'd discuss, how do you know your work is worthwhile and valuable to the company? Well, I couldn't think of anything more valuable than seeing the chart that showed that Stripe was working effectively and efficiently or seeing the timelines reloading quickly at Twitter. Those are, to me, visualization and visibility into the system is the most important part of it.
Starting point is 00:35:53 Like, how do we know it's working at all? And so I think you just have to. So anyway, I'm not talking about observability as the way I'm saying you have to find a way to align what your work is with the needs of the business. And when those two can work harmoniously, you're doing the right thing. And if you can't find that, you might be doing the wrong work, either because the business doesn't need what you're doing, or because you aren't doing what you need to be doing. So you may need to change teams. You may need to change jobs because you may need to find the place in the organization where your passion and the thing you're working on aligns effectively with what the business needs.
Starting point is 00:36:27 That's what we're there for at the end of the day. They're not just employing us to draw cool charts and make cool JavaScript visualizations. As much as I would like that to be my job, that isn't my job. I mean, today I was working on incident reviews at my job. And I would love to just work on that all the time. Actually, that is my job. So that's different. But for you, that may be be different so you know you have to find where your passion
Starting point is 00:36:49 and your work aligns with what the business needs. That is my advice for that. So you talked about alignment earlier. You mentioned that a lot of misalignments happen because there is information asymmetry. Uh, one thing that is slightly tangential, but related: when people are in a meeting and people don't get each other's perspectives, one way people ask this, and okay, I'm going to piss a lot of people off by saying this. One way of saying, let me repeat what you said just so that I make sure I got it. And sometimes it's like, well, I think I was pretty clear in what I said. And I think people say this many times, and they have good intentions, but in that moment it doesn't really come across that way to the person who's trying to get a point across.
Starting point is 00:37:50 And I saw a chart the other day that had like other ways of saying, I wish I could find it now. Maybe we'll find it for the show notes. But there was a thread on Twitter that was like, if you're trying to say this, maybe you should say this instead. But I mean, I think that repeating something back that form of sort of active listening can be helpful. But like anything, it can become unhelpful depending on your tone. I think something that I would remind people is that people can – I've said this before. My last job, I had a – you know how you can set your Slack
Starting point is 00:38:17 bio. Mine had like 50 alternative titles I had given myself. And one of them was voice smiler. Because if like right now, I'm sort of making a frowny face, but right now I'm smiling and you can sort of hear the difference in our voices. And we bring ourselves to work in that way. If you're frustrated, like that, going back to the beginning of this discussion, that woman that I was working with on the phone that I was doing tech support with, she could hear me being upset. She couldn't see me, but she could hear me. And I think that you have to be really careful because your whole self is expressed often in your language and in your tone and your position of your body and stuff. You have to be careful and think about that.
Starting point is 00:38:55 I'm not saying you have to treat yourself like an actor and sort of like adjust to how, you know, and sort of pretend. But if you're frustrated, it's going to come through. Maybe you should just not talk. You know, maybe you should just wait. A friend of mine has had this acronym, and I'm sure he didn't come up with it, but W-A-I-T, wait, why am I talking? Like, are you accomplishing anything by talking right now? Or, you know, could you just wait a minute? Because I often find that if I shut up, someone else will say it. Because I'm in a room full of smart people, right?
Starting point is 00:39:23 Like, that's why they hired us. And so, anyway, I don't know that I'm answering the question. I think it's very specific. But I think that you have to just be careful about how you bring your tone and how people think about your work. I'm also someone who likes to speak in public. So I think it's easier for some than others, right? Like they often hire us that, you know, they give us these big quizzes and stuff and have us, like, I don't know, implement red-black trees and reverse linked lists. But in practice, our real jobs, going back to the questions you're both asking, are just trying to figure out how to work with other meat sacks, pretty much. Like, we're complicated and it's really hard.
Starting point is 00:39:58 Uh, completely agree with that. Uh, I think people deal with emotions differently. In some cases, emotions overpower how the tone comes across, even if they don't mean it. And that's why I said people are coming from good intentions and they're trying to understand the other side or the perspective. I think it takes a little bit of practice to be able to do that effectively, even if they wanted to. And sometimes you're going to lose. It's okay. Okay, I want to switch gears a little bit. So we talked about observability at Twitter. You also led observability at Stripe. But this time around as an IC, and by IC for folks who might not know,
Starting point is 00:40:42 which I think all of them do, we'll just say IC is individual contributor. As I understand, like this was a different company, different culture. Though you bootstrapped the team and you led the observability initiative at Stripe, what were, what was different in terms of challenges? And what was easier this time around? Yeah, well, one of the challenges was, can I do that again? Like the observability team at Twitter existed. I had no involvement in the creation of that team. I merely took the work of others and wrote a blog post about it.
Starting point is 00:41:17 And then, you know, contributed to some of the team later on perhaps. But the stuff at Stripe was different. Number one, when I joined Stripe, I had a reputation for some observability work. And so I remember my interview. Part of my interview at the time was effectively an implementation of StatsD, which went on. During my time at Stripe, we wrote a StatsD implementation called Veneur, which a lot of companies use.
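For listeners unfamiliar with StatsD, the core of any implementation is parsing its plain-text line format, `name:value|type` with an optional `|@sample_rate` suffix. The following is a minimal Python sketch of that parsing step only. It is not Veneur's code and not the interview exercise described above; the `Metric` class and `parse_statsd_line` function are hypothetical names, and the sketch assumes numeric values (so set-type metrics with string members are out of scope).

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """One parsed StatsD sample (hypothetical container for illustration)."""
    name: str
    value: float
    kind: str            # e.g. "c" counter, "g" gauge, "ms" timer
    sample_rate: float = 1.0


def parse_statsd_line(line: str) -> Metric:
    """Parse a line such as 'api.requests:1|c|@0.1' into a Metric.

    Raises ValueError on malformed input. Only numeric values are
    handled in this sketch.
    """
    name, _, rest = line.strip().partition(":")
    if not name or not rest:
        raise ValueError(f"malformed StatsD line: {line!r}")
    fields = rest.split("|")
    if len(fields) < 2 or not fields[1]:
        raise ValueError(f"missing metric type: {line!r}")
    value = float(fields[0])  # raises ValueError if not numeric
    kind = fields[1]
    sample_rate = 1.0
    for extra in fields[2:]:
        # An optional '@0.1' field means only 10% of events were sent.
        if extra.startswith("@"):
            sample_rate = float(extra[1:])
    return Metric(name, value, kind, sample_rate)


if __name__ == "__main__":
    print(parse_statsd_line("api.requests:1|c|@0.1"))
```

A real server would go on to aggregate these samples per flush interval, scaling counters by `1 / sample_rate`; that step is omitted here.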
Starting point is 00:41:39 And I ended up joking that I've just been trying to ace my interview for the next three and a half years. But the things that were difficult, anytime you come into a new organization, some advice that I often give is for any of you who happen to cook, eggs can be somewhat difficult because if you put cold eggs into a hot thing, they will usually cook. And so the technique you're usually taught to help with eggs is to put some of the hot liquid into the cold eggs and sort of bring the temperature up of those eggs and then add the eggs into the hot thing. So that way they don't sort of instantly cook and turn into scrambled eggs. You instead can incorporate the egg in its more liquid form. And I think that's how it has to work in organizations.
Starting point is 00:42:20 Like, I'm sure that you think you know everything and you have probably had a lot of success, which you are pretending isn't confirmation bias. But when you come into an organization, they have a lot of context that you don't. For example, I came into Stripe from Twitter where Twitter was a very heavily metrics driven organization. They didn't use logs for anything. And whenever I came into Stripe, Stripe is a very heavily log-based organization. And I spent the better part of my career sort of fighting, trying to get us to be much more metric oriented. By this, what I mean is not metrics as in we make decision with metrics, but as in charts that we draw with approximations of 10-secondly metrics or whatever the standard is, instead of, say, taking log files and assembling
Starting point is 00:43:06 them at read time in large queries. The answer was somewhere in the middle. I fought that, and I shouldn't have. I should have spent more time learning and incorporating what the culture of Stripe was into my own culture. It ended up working out because there was room. But in larger organizations, sometimes there isn't. You can't go in and make those types of changes. But when I joined Stripe, there were just a few hundred people and it was a little easier to make those changes. But I think where I did succeed was I came into that org and I didn't immediately start saying, you're missing this and I need to go make it. I paid my dues. I worked with my teammates on projects that Stripe was doing at the time that
Starting point is 00:43:44 were not the thing I wanted to do because those were aligned with what the business needed. And so by demonstrating to the organization that I'm willing to do whatever the organization needs to do within reason, I mean, I wasn't, you know, working crazy hours and not getting paid anything. I wasn't getting paid in computer hardware or anything crazy, you know, I was getting paid real money. Yeah. I was getting paid real money. So I did the job and then I asked the organization for, okay, I did this. I did what you wanted. It's no different than any other relationship, right?
Starting point is 00:44:11 Like, hey, we've been watching what you wanted on Netflix for the last two weeks. I want to watch this show. And you shouldn't suggest something that's so off kilter that they're not going to want to watch it. Nobody wants to do that. My partner doesn't like to watch Formula One. I do. I'm not going to suggest, would you watch the race? She's not going to want to. But if I suggest something that's a little more aligned, it should work out. And Stripe needed some of the stuff that the
Starting point is 00:44:32 observability team was going to do. And so I was asking for something that made sense based on the fact that I had paid my dues on these other projects. And we started small. I took what they'd give me and I demonstrated value. And we did it week over week over week. And, you know, we sought out the advice of our peers. What do our peers need to make this work? How do we make people happy? And optimally, if we do that right, you don't even have to ask me if we're doing a good job because the people around us will tell you that we're doing a good job. And so we made a reputation inside of Stripe as an organization who was going to help you get your job done effectively. And, you know, we built a, we built, we basically did everything. We were very sort of full service.
Starting point is 00:45:09 That's not the way I would necessarily do it in every organization, but it was, it was a way as an IC to get what we wanted and to bootstrap the team. And it ended up being a team at peak of somewhere over a dozen people, you know, handling the largest con like some of the largest pieces in terms of what we spent on outside vendors and stuff like that across all of Stripe. I don't, I'm not sure what I'm allowed to say about any of those things, but it was big. I mean, it would be big, it would have to be big. But, you know, obviously stuff like AWS being more expensive. So in terms of like paying dues, you know, before you can do what you want to do that
Starting point is 00:45:43 still align with the company, is that what you already knew when you joined Stripe? It's like, oh, you know, maybe like two weeks in, it's like, ah, I think this is what I need to do, but, you know, I'm two weeks in, uh, let me go do this other thing. Or was it something that maybe you didn't have a clear idea, but then by the year mark, it's like everything just kind of came into place? Um, I don't remember, but I think that, I think that in any organization that you join, I don't know that I'm well calibrated for it. I think I've joined some organizations and maybe waited too long to manifest an opinion, and in other organizations I've probably manifested one too early. Like I think with Twitter, or with Stripe, I manifested one a bit too early about, like, we shouldn't care so much about logs. I never won that.
Starting point is 00:46:28 I mean, we cared about them forever. In many ways toward the end, I think I incorporated a lot of log stuff in. It's kind of the last minute. I wish I'd done it earlier. But I think you have to sort of – there is no answer to this, right? You shoot your shot. And sometimes you're going to be early and sometimes you're going to be late. But the best thing that you can do is listen. Because at the end of the day the org is going to tell you what
Starting point is 00:46:48 they need. But, you know, some of us, some of us take those shots and we succeed. And did you take sort of a startup kind of MVP idea when you pitched the idea, being like, hey, you know, give me a certain amount of time, let me show you some... Yeah, I basically did it by myself. So I asked like, can I just do a little bit on QA and sort of show you the power of some of this? And then sort of worked my way up from that. And then I didn't say like, okay, I need five people and 10 billion dollars and all the unobtainium you can find. I just said like, yeah, sure, like can I just get a little bit of help with this or that or the other? And you also have to be willing to sort of take on, you have to do some horse trading. So there were parts of the stack that I wanted to take because I wanted to fix them. And there
Starting point is 00:47:34 were others that I didn't want, but I was willing to take them because I knew I'd be able, it would be more appetizing to other people. Because again, what is your motivation? Your motivation is that you need to like get rid of some of this ownership of some systems that maybe you don't care as much about. Well, I'll take that. And in exchange, you can give me this. I mean, at the end of the day, making a compromise is everybody getting a little something out of it. But you're also losing a little something. And so my recollection of it, at least, is that we positioned the team in that sort of way. And then we tried to demonstrate success. So myself and the folks that, that helped me get the team started.
Starting point is 00:48:07 So Andreas and Josh, you both helped get the team going. Like we, we just tried to do good work and then attract other people to that. And it worked internally and externally, but there were people who joined Stripe to work with us. And then there were organizations internally that, that,
Starting point is 00:48:22 you know, loved to work with us. And then there were some that didn't. We had to go help them and convince them. But you're not going to make everybody happy. But I try really hard to make everybody happy. I once was told, this is one of the better compliments of my career, because again, I think I'm too empathetic and often too slow to manifest an opinion. Someone said, Cory will be super nice to you, but you'll do what he wants.
Starting point is 00:48:41 I don't know. I wish I could get more of that in my life. But yeah, I don't feel like we did that most days. I think we mostly just tried to make Stripe good. Like what could we do to make Stripe good when Stripe was busted?
Starting point is 00:48:55 Cause a lot of it was just like reading incident. Like I was involved in every incident I could get into. I helped set up our early incident review stuff. Like I came to every incident review I could go to. It's sort of, you know, bloomed into the future into me now only really working on incident stuff in my job at Jeli. But we – I don't know. You can learn so much by being – you learn a lot. Sorry, to bring it back to the content, the subject of what we talk about here.
Starting point is 00:49:20 At the end of the day, you learn a hell of a lot about systems by how they break. And being willing to sort of muck around and learn from that. Like some of the best career advice I can give folks is just hang around when something breaks and learn. Just learn about the org, learn about the people, learn about the systems. Like you will learn both technically and socially so much about what's going on around you. Yeah. Uh, one thing which I would add to that is, uh, it's so true, uh, at least in my experience. Like when I joined LinkedIn four years back, one way I got to learn about how systems work was by just participating in incidents, even when I was not on call.
Starting point is 00:49:53 And I did not realize it back then because I was like, I'm curious. I want to know how this works. But I think especially when you're new, like being part of incidents and seeing how they break, that learning is almost exponential. Like there's no amount of document reading or reading the code that's going to help
Starting point is 00:50:09 unless you see how it broke. I wish that organizations would be, because I'm convinced that pretty much everybody who, you know, the three or four people who were interested in listening to me after this is over are all nodding in agreement to this. They're saying like, yeah, I learned so much from that. Yet as organizations, we do so little to capitalize on it.
Starting point is 00:50:30 Yes. You know, like we, at best, we write a postmortem that is a recitation of the facts and a reminder that your data is valuable to us, you know, or something like that. But at the end of the day, like, I'm giving a talk, or I'm about to record a talk for RubyConf soon. And one of the things I'm mentioning there is there was a flight that crashed many years back. And the NTSB saved the wreckage of that flight in a hangar for 25 years and trained
Starting point is 00:51:01 investigators with it. And even now in 2021, when they're finally getting rid of it, it was TWA Flight 800. And whenever they're getting rid of it now, because they're finally going to get rid of the wreckage, they still digitized it. Because it turns out, incidents are usually the moment when we take what we thought. So, we have an idea of how things work, right? And then in an incident, we learn how they actually work. And they're never in agreement because mental models are always slightly wrong. Well, they're at least slightly wrong. Sometimes they're very wrong. But, you know, an incident is when we learn that. And there is no more sort of vulnerable time for us to peek under the covers of how our systems work and learn from them. And I don't know, I sort of feel like more companies need to hire incident analysts and bring that idea in and talk about it and tell stories, you know? Because what's better, you know, what your podcast is all about and why I think it's so interesting, is what's better than hearing a story of how the system ultimately prevailed? Because there's usually some pretty crappy stuff in between. A lot. Yeah, a lot.
Starting point is 00:52:10 Yeah, we spoke with Lorin Hochstein from Netflix a few weeks back, and he mentioned the same thing. Huge fan of that. Huge fan of that guy. I'm blanking on the name he mentioned, who Netflix hired to be in that exact role of an incident analyst, and that is a full-time role, and I think that is incredible. Like, not many teams or orgs do that. Yeah.
Starting point is 00:52:28 I, I've often felt it would be, you know, whenever, like if tomorrow there were some, I don't know, problem where, you know,
Starting point is 00:52:34 all of one type of tree fell over simultaneously, then a bunch of news networks would scramble to bring in an arborist who really knows that type of tree well, right? And it's because we often don't have the context. We need to know about that tree and what was interesting about it.
Starting point is 00:52:50 And you can tell I just looked out my window and what the first thing I saw was a tree. But we need that. We often need someone who can come in and tell us like, well, this is why we made these decisions. Like it's a storyteller job, but we also can't sit for three hours with a hundred people on a phone call and listen to a story. There's a balance here. But I mean, a lot of the work that I do today is really just about that. I've chewed up a non-trivial amount of my life to the uptime gods and I'm not better for it. I mean, I've been very lucky. I've built a good reputation and I've probably made a lot of money off of it and stuff like that. And this is, you know, it's an industry. I'm very lucky to have fallen in the career that I fell into.
Starting point is 00:53:27 But at the end of the day, I don't know, my family sacrificed, I sacrificed. Like, I haven't told any of the horror stories, you know, we haven't talked about too many of those. But I remember a night when, and this was, I think it was on Thanksgiving. I actually had to set an alarm on my phone and wake up every hour to restart a thing inside of Twitter that had a file handle leak. We didn't have time to fix it. So I had to sit up, babysit it all night, an hour at a time. And that stuff adds up. I've had incidents where at a startup that I was at, I destroyed a Kafka once. And it was the Kafka. It was the important one, the one that I had all the data in it. And I had to give the incident to another engineer
Starting point is 00:54:02 so that I could go lay down. I couldn't mentally deal with it. Which we think we can, you know, we often idolize that sort of behavior, but it's also okay sometimes to give up the reins and give them to somebody else because I didn't have the... so yeah, I don't remember where this was going. But I just wanted to make sure that got onto the podcast because it's important. We have to admit that sometimes we don't have the fortitude to stand up to it. Some days we do, some days we don't. I used to think I was good at it, but now, I don't know, over the last few years, when an incident happens, it's just like, I feel it. I feel it in my stomach. I feel that drop. You know, sometimes we rise to the challenge, sometimes not. Yeah. Well, I think it's... but that's why we have friends, because they take drugs
Starting point is 00:54:45 Yes, that's true. And I think it's actually really important to acknowledge that sometimes it's difficult, even for people who are as seasoned as you. I mean, I don't know if you would like that word if I used it, but like people who have had so many experiences. It's better than old. I've been professionally software engineering in some capacity for 25 years, because, like I said, I got a gig at 16. And it counts. I was getting paid for it. We had outages. You know, I broke production probably a whole bunch of times, whether it was the dial-up side or the billing side or, you know, breaking the DNS servers or whatever it was. Like I've been breaking software continually for about 25 years. And it's been really rewarding, but it's also, some days you just don't have it in you. And, you know, I think
Starting point is 00:55:36 we have a culture of assuming that everyone around us sort of knows everything. And we'd like to pretend like we always have the answers. But I can tell you, I've reviewed a lot of incidents and we're all kind of dumb. Like we are building things that we barely understand. And it's getting bigger and bigger and more complex every single day, but that's okay. Look at the innovation, look at the stuff that we're doing. It's really rad. But we have to acknowledge the fact that the more we build and the bigger it gets, the more specialization and awareness and understanding we have to have. Other industries, I think, are better at this than we are, but we're young and silly. Making progress.
Starting point is 00:56:13 Yeah, getting there. I want to talk about team building a little bit. You mentioned that you bootstrapped and started the observability team at Stripe. I think it's amazing to do that as an individual contributor because you're working with a lot of influence without that authority. How do you actually go about requesting a team? So you mentioned that you did the POC or an MVP.
Starting point is 00:56:37 You showed some progress. There were people who were helping you. But at some point, you're going in and requesting resources and saying, hey, I want a team for this to be successful. You're not managing them directly, but you're still leading the team. I think, if I recall or if I read this correctly, you are the product owner, almost, for observability. Yeah. I have rarely seen that, where it's actually a person who is not in a management position be a product
Starting point is 00:57:05 owner, which is, I think, incredible. How does that happen? How does that work exactly? How did you get there? It happens because you just make it happen, through force of will sometimes. But I don't know that there was a plan. I would love to pretend as if there was, but, I mean, I guess there was to some extent. My perception of it was that organizations have to decide what to build and what to buy at a sort of fundamental level. And so you know that the thing that's your core competency is the thing that you want to build. If we're a company that makes, I don't know, software for tree production stuff, I don't know, the tree analogy from earlier. If we're a company that makes software for arborists, then that's the thing we want to do.
Starting point is 00:57:54 That's our bread and butter. That's what we make. And so do we want to build our own servers? Probably not. But that's at one end of the spectrum. And so I envisioned that we could build a tightly coupled to Stripe's infrastructure and software observability stack that was nebulous. Like if you had said to someone, I mean, I think now if you ask people what observability is, they're going to have five different definitions. But what's important is that it's a word people understand now. Like SRE. So I was the second SRE hired at Twitter. And back then, SRE was like a what now? And nowadays, if you say we need SRE, everybody just nods. Nobody gets fired for having SREs anymore. But I think there's a transitional period where if you're lucky, you can either create that. was. So for those that don't know, Kafka is this sort of really interesting combination of, I'm not going to say database, but it's sort of this interesting combination of sort of a very explicit, like very deliberate use of a log and the capability to write to it and to keep track of the position of it and then to have many consumers make use of it. That concept is otherworldly at the point that it was created. And so now, though, everyone just uses it as a building
Starting point is 00:59:25 block. You just say, we're going to use a Kafka. And so I think in some respect, if you went back and you ask Jay Kreps or the other folks that worked on it in its early days, they'd probably say, well, it was a crazy idea. Maybe they've got some confirmation bias, or maybe they were deliberate. Maybe he did a great job. But now when you look at Kafka's Confluent business, it's like, well, of course they're successful. They're like a big public company now and all this other stuff. But at the time, no one knew what it was. And so I think I benefited significantly from the fact that the greater industry was materializing that observability was a real thing. And so therefore, I had some legitimacy. For people who need to carve that
Starting point is 01:00:02 path without that benefit, I'm not sure how. I think I lucked out from that. I also contributed to it. I mean, I gave many a talk about observability some of the very earliest, but at the end of the day, the rest of the market took over. And so I think to some extent, you're trying to carve that path and demonstrate that value. But there's a concept from biology called stigmergy, which is this idea that when someone sees you, like it comes from ants. Like ants don't have central coordination. There's no like big ant, you know, in a traffic control center telling ants what to do. But if you're an ant and another ant walks past you, like the opposite way, carrying a donut, you're going to be like, oh, damn, I like donuts. Where did he get that?
Starting point is 01:00:42 And you're going to go to where the donuts are. So if people see you being successful at something, then they're probably likely, if they also seek that success, to sort of like at least come by and check it out. And I think you have to, again, we're all humans and we're working together. You got to think about it in marketing terms. What are you doing to get yourself out there? How are you showing your success? I think to a fault sometimes within Stripe, we advocated for our own team. We went out of our way to broadcast what we were doing and the success that we were having. We probably annoyed the hell out of some people.
Starting point is 01:01:14 I remember a manager of mine, shout out to Kaz, or not a manager of mine, but a manager at Stripe one day said, hey, Cory, you succeeded. Shut up. You don't have to keep doing this. That wasn't the way he said it, but shout out to Kaz. He's a great guy. But you have to advocate for yourself. That goes back to a lot of the conversation we've been having here before. It's an extremely delicate line. I think it's harder for... I've also been a huge beneficiary of the fact that I'm a white guy in tech. So a lot of this stuff is easier, quote, for me. It's harder for people for whom that is not the way they're expected to behave. And I think we can do a lot. We obviously need to do a lot to make that better. But I think you have to advocate for yourself. And you have to learn
Starting point is 01:01:55 when that's okay and when it's not. When is it okay to stand up and crow about your success? And when is it better to just shut up and work? And it's really delicate. And I don't think there are answers for how. There's no formula for that. Man, I was going to ask for a marketing 101 for engineers. What's the thing that you can do there? But is there anything, though, that people are not doing enough of, maybe, or doing too much of? I mean, I don't know what you should do, but I can tell you, you should do the thing that works the best for you. Like, it's easy to say, ah, man, like, I don't know, so-and-so's got a super great Twitter game. I need to step up my Twitter game. Or man, like, they keep writing all these
Starting point is 01:02:35 amazing blogs. Why am I not doing that? Or man, their TikTok is fire. Like, how do I, it doesn't really matter what you do. It matters what is the least effort for you and what can you do that will help show your work that doesn't require you. You can't be other people. You have to be the thing that works for you. There's no version of you better than the one you are. You just have to do the thing that works for you. So if you make great TikTok videos, then great. Do that. If you make great little memes, do that. Whatever it is that works for you. I speak and I write, and so I can do those things internally and they're low effort for me. But for other people, they're higher effort. Don't try to do the thing that's high
Starting point is 01:03:18 effort in addition to your work, which is probably hard for you to begin with. You're not going to have the energy. It's like you've spent all day slogging away at this project. And then now you're expected to go write an amazing blog post for it. If that's not easy for you, do something different. I do think that all of us would benefit from spending more time learning how to write. At the end of the day, it's one of the most effective ways for us to communicate. So if I was going to give a tip, I'd say take a writing class. So I'm glad you mentioned that because I've read some of your blogs. I've listened to a lot of your talks over the last few years, and I've really enjoyed them. And I think it's not just, hey, I'm consuming so much information through reading or listening, but it's actually fun.
Starting point is 01:03:59 That's been my experience. I think that you owe... my personal opinion is that if you're going to take the time to read this, the least I can do is entertain you a little bit. I used to always, at the end of anything I sent in a big internal email to Stripe, which has a very well understood culture of writing internally and lots of stuff, I would always put a funny GIF at the end.
Starting point is 01:04:18 I was going to guess emojis. I do a little bit of emojis sometimes. But if you get to the bottom and you've read all of this, the least I can do is show you a funny GIF with like a baby goat in it or something, you know? Like that's the least I can do because you took the time. I believe that if you're going to talk to me, the least I can do is entertain you on occasion, because you're listening, and I appreciate that. So here's a funny... so you mentioned getting better at writing
Starting point is 01:04:47 and it's something that is a goal of mine, and I know Guang's too, and so many other people's as well. How did you get better at writing? What has worked for you, and something that people can think about or try out? Because again, like you said, you need to know what works for you, but it takes a few iterations. So I would love to know what worked for you. I think that it's important to have a voice. That's hard to describe, but basically when you write,
Starting point is 01:05:16 don't try to write like somebody else. If you're good at tweeting, write in the form of tweets, write in small doses. I have a tendency, like over the last few years, I've steeped myself in a lot of literature around resilience engineering and the like. And it has meant that a lot of the blog posts I've written in the last few years have gotten a little more, I think, academic. I feel like I need to have more citations and less fart jokes. But I think that you have to stay true to what works for you. Because again, you need to make this low effort for yourself.
Starting point is 01:05:45 Do what works for you. Keep it in your voice. I also think that if you want to get good at writing, you have to read. Who are some good writers that you admire their work? And that you can read and say, ah, I'm learning from this. Like, oh, that's a cool turn of phrase. Or that's an interesting word. Or whatever.
Starting point is 01:06:06 Reading helps you be a better writer. And then there's no substitute for doing it. I also, I read something a while back that says you can't edit and write at the same time. Like, don't be critical of yourself when you're writing. When you say I want to be a better writer, sit down and write. And then the part of your head that's like, Oh, that's not good enough. No, no, no, no, no. That, that voice can go and like take five and then come back tomorrow and then look at that critically. I often admire friends of mine who like write stuff and then they like have a cool thing at the bottom that's like, thanks for these five incredibly amazing people who I think are so cool.
Starting point is 01:06:39 They read that. That's hard. It's hard to get people to do that. Don't worry about that. You don't need editors. Just write. And some good stuff will come out of it. Some of my earliest writing, I think, was like professionally bloggy stuff was about Java garbage collection. And it's still, according to when I bothered to open those emails from Google, like some of the most read stuff that I've ever posted. I don't even know. I don't think it's relevant anymore. But for whatever
Starting point is 01:07:01 reason, it's still there. And I was writing that for me. I was writing that because I... I mean, a good example of this is b0rk on Twitter, B-zero-R-K, Julia Evans. Like she writes her way and it's amazing. And she's done such an amazing job of carving that out. But if you had said before, like, this is the style of writing and drawing and zines you're going to see, you would have said like, well, I don't think that's going to work. But now you see so many of those hand-drawn type things because you're speaking to yourself. And I think she's amazing because she, I've worked with her and she is amazing. And she's amazing because she just asks great questions. And I think we have to remember that of ourselves. When we remember that not everybody in the room knows,
Starting point is 01:07:41 for your example earlier about pointing out what an IC was, not everybody knows. And so if you think, I don't know, we're getting a little off topic. But I think you have to write, you have to read, and there's no shortcut for it. Some people are just better at it naturally, but I don't know, don't be too envious of those people. Yeah, I would just plus one to b0rk, or Julia. Julia's writing is awesome. I've loved her blogs. One thing we're trying to go back to, which you mentioned, was advocating for your team and for yourself. And as the lead for observability at Stripe, I want to know how you balanced this, because I think there is a fine line between advocating for your team, showing your team's great work, but also not having it come across as, oh, look how great I did. It's the team's work. What are some effective ways
Starting point is 01:08:27 of doing that, where again the team feels that they are the ones getting the credit, when you are publicly speaking or over emails? I think that when you show your success, you should show it in the frame of the person that you helped succeed. There is no success if it's yours alone, right? There's a person that I worked with many years back who was... and I was a manager and I had to manage this person. And they were good, but they weren't bringing the team with them. And I said, this is a team race. You can't cross the finish line and then stare at everybody like, why are you so slow? Because we don't win until the last person crosses.
Starting point is 01:09:04 And I don't know. That's a silly sports analogy. I'm not really like that. I just think that at the end of the day, if you have done something cool, show it in the context of where the cool was beneficial. So you see this with companies. What do you do when you make a startup? You try to find that customer who will let you put their logo on the front because that instantly transmits to other interested parties that you have helped this customer be successful. And so if you as a team have done something cool, then, well, first off, I wrote the email. So why do I need to mention myself anymore? Right.
Starting point is 01:09:39 So now you can take that opportunity to plug the work of your peers or your teammates or what have you, or of the person on the other team who helped you to get there. Like we did a huge migration from vendor A to vendor B when I was at Stripe. And we plugged the people on the teams that did the work. Because, yeah, we did it. And, yeah, that was my call. And, yeah, it was a big deal. But at the end of the day, we were only successful because a bunch of other people helped us. And so I think when you're advocating for yourself, first advocate for the benefit that you've done. Like if you, I don't know, solve a cool disease,
Starting point is 01:10:10 like you don't just be like, look at this cool microscope picture I took. No, no, no. You show the people who were cured. You show the benefits. You don't show the work. Nobody cares about the work. They care about the benefits. And I know we get caught up in that in technology because the algorithm or something. Like I saw somebody today working on io_uring to make it like really, really fast, doing a gazillion IOPS. And that's rad. But show me the database benchmark, you know, six months from now when it hits the mainline kernel and someone's able to deploy it and put it on their MySQL cluster and they show a big advantage. Like you show me that, and that's what you put in your highlight reel. You know, you don't put the check signing in your highlight reel.
Starting point is 01:10:44 You put the touchdown catch. I don't know why. Why am I using sports analogies? It's working. I don't use sports analogies. Okay. But I think that's what you have to do. You have to.
Starting point is 01:10:52 But again, this is how I do it. I don't know how other people would do it, but it's how I do it. And I think that you show your success by the success of those people around you because that's what makes you successful in an organization. That's what's repeatable. That's why they pay me. They pay me to make the org successful, not because I'm cool. If they would pay me because I'm cool, I'll just do that. Like that'd be a way better job, be way less work. You were on the observability team. I've repeated that I don't know how many
Starting point is 01:11:18 times at this point. So you saw everything at Stripe when people are building new dashboards. A lot of old dashboards got migrated to the new platform, as I heard in one of the talks. What were representations of the systems that they monitor. So in many cases, they contain the biases and the, just the concept, right? Like we are attempting to visualize and to make understandable a system, which under the hood doesn't do anything that we think, you know, like we used to, computers used to do what we asked. We used to tell them to do a thing and they would do it. But now with the rise of Spectre and Meltdown and stuff, we know that the computer is actually just arbitrarily doing other stuff all the time that we didn't even ask it to do. It's predicting what we're going to do. It thinks it's smarter than us, that silly machine.
Starting point is 01:12:17 And so dashboards often contain that sort of thing. In many cases, even though I have advocated so much for observability, I would recommend you kind of blow them up all the time and start them over again, because they are a series of past mistakes that you've tried to make visual for people. There's also a cool thing about dashboards. And I wish I could remember where I got this from. I did not consider this. And so there's no science behind it, but I know I got it from a scientific paper, that when people look at dashboards, when they're an expert, like if I look at a system I built, then I look at that dashboard, I'm not looking at the dashboard for an explanation of how the system works. I'm looking for a confirmation of the things I expect
Starting point is 01:12:58 to see. So if I just did like, if I ran a command and I expect certain charts to have moved, that dashboard is sort of the visual representation of my expectations. That makes it pretty much useless for a newbie. So if you just got hired into the company and you're looking at that dashboard, it doesn't mean anything to you. It's all Greek, you know, unless you happen to speak Greek, in which case it's something else. And so for a lot of the things that we did, well, first off, like every dashboard is just full of mistakes. Like, oops, that's in gigabytes. It's supposed to be megabytes. Or oops, those are nanoseconds, not milliseconds. There's a lot of that. Pretty much all visualizations are terrible. Like going back
Starting point is 01:13:35 to the earlier discussion that we were having about like they hire us for one thing and we're really talking, you know, as meat sacks to one another. We hire people by doing all these algorithm stuff and then we ask them to make dashboards and make data visualizations. Bar charts are bad. Stacked area charts are horrible. I don't know why people use all these things. They're really bad. They hide things. Our choices of visualizations are biases themselves. So I think the better question would be which ones of them aren't terrible because they're kind of all terrible. But they're the best thing that we've got. And so I think the biggest misadventures I would describe in there is just the silly idea of undertaking large migrations like that.
Starting point is 01:14:21 Like it's amazing to think, oh, man, I'm finally going to get to go and make this giant change until you're like 75% of the way through it. Because that's always the phase of the movie where the hero gives up, you know, and they have to, and somebody like has to come remind them why they do stuff like that. That was the worst part of that work. But, but at the end of the day, I'd say for dashboards, at least most of the misadventures are in how wrong they almost always are. And they're always missing something. You know, you, I think that what you should invest in is understanding how dashboards are made and how to ask questions of the system. That's what, that's what observability is about. It's about asking questions of the system that you hadn't thought of until just now. And that's really hard to do.
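An illustrative aside on the unit mix-ups mentioned in this answer (gigabytes shown as megabytes, nanoseconds as milliseconds): one way to make them harder to commit is to keep the unit attached to every value and convert explicitly when a chart is rendered. The sketch below is a minimal Python illustration under that assumption, not something described in the conversation as being used at Stripe; the `Measurement` class, the `UNITS` table, and `axis_label` are hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical unit table: unit -> (dimension, size in the smallest unit).
# Time is counted in nanoseconds, data size in bytes (decimal SI multiples).
UNITS = {
    "ns": ("time", 1),
    "us": ("time", 1_000),
    "ms": ("time", 1_000_000),
    "s": ("time", 1_000_000_000),
    "B": ("size", 1),
    "MB": ("size", 1_000_000),
    "GB": ("size", 1_000_000_000),
}


@dataclass(frozen=True)
class Measurement:
    """A metric value that always carries its unit."""

    value: float
    unit: str

    def to(self, target: str) -> "Measurement":
        """Convert to another unit of the same dimension, or raise ValueError."""
        src_dim, src_scale = UNITS[self.unit]
        dst_dim, dst_scale = UNITS[target]
        if src_dim != dst_dim:
            raise ValueError(f"cannot convert {self.unit} to {target}")
        return Measurement(self.value * src_scale / dst_scale, target)


def axis_label(name: str, m: Measurement) -> str:
    """Render a chart label that states the unit explicitly."""
    return f"{name}: {m.value:g} {m.unit}"


latency = Measurement(2_500_000, "ns")  # raw value reported in nanoseconds
print(axis_label("p99 latency", latency.to("ms")))
```

Because a conversion across dimensions raises an error instead of silently producing a number, a panel that tries to plot a byte count on a milliseconds axis fails loudly rather than looking plausible.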
Starting point is 01:14:53 So one thing you tell me if we should move on from this one, uh, in terms of team building, like how do you think about the kind of people you need on the team when it's starting out or when you started the team, even at Stripe, like we can take that as an example. Because one thing recently that I'm thinking about is what kind of members you want on the team as we're doing a similar migration, but that's a different story. But like, as one of my colleagues pointed out, like, hey, you need to think about do you need multipliers or do you need leaders, do you need optimizers, etc. I'm curious, how have you thought about that in the past, in some of your experiences? I think the most important thing you can have on a team is diversity. Because everyone brings... some of my favorite co-workers have come from completely different walks of life or professions. Like one of the most influential people in my career was an arts major. And you need those different perspectives,
Starting point is 01:15:51 those different ways of thinking, because you make dumb thoughts all the time. Like, imagine the power of working in user interface and having someone on your team who has, I don't know, sight problems. Like I remember working with someone at Twitter who I saw through a meeting room that they were using the zoom feature because they had vision problems and they would like repeatedly zoom into different parts of the screen. Imagine how much it changes your role at Twitter when you realize how important alt text is, or when you're working on a team. And I'm not just talking about physical diversity, although it is very important, but there's also stuff like diversity of what your background is. If you work in
Starting point is 01:16:27 an infrastructure team and you've got somebody who is willing to come over from the product side, think about the amazing things they bring to your team to help you understand how to more effectively work. Or imagine working with someone from finance who can tell you why you're not getting budget approval for some of the things because you don't speak that language. And that's what we're all searching for in diversity is to try to find, you know, a richer view of sort of what's going on around you. And so, I mean, I think you do need some of those things like settlers and builders and multipliers and whatever -ers that you have. But I think at the end of the day, finding people who have a shared vision about what the team is supposed to be and what you're trying to build and who bring diversity. You want people who are at different stages in their career.
Starting point is 01:17:10 Think about the power of having a junior engineer in your team who can ask, why are we doing that? What's this? What's that? My parents taught me that if I can't explain something, I don't really understand it. And so if a junior is there, they're constantly helping you challenge that to make sure that you know why you're doing this, you know why you're doing that, you're questioning those things, or that you have the representation of someone in the room who comes from a different part of the world or a different part of life or a different background than you do. And so I think the most important thing that you can do is seek out people with other views, because ultimately, you may find, oh, now it's a little harder to get to where we're trying to go because I've got to
Starting point is 01:17:47 convince a more motley crew of people to get there. But I guarantee you, you'll have a more resilient and interesting end product than you do from finding the five people who all agree that you should do it exactly this way. Because that's probably scary and bad. Yeah. The reason I started smiling as soon as you mentioned, if you're an infrastructure team, get someone product. I can attest to that. And it happened yesterday in my team. We just have a new person joining. It's an internal transfer.
Starting point is 01:18:19 He's coming from product. And we are doing a migration that impacts a lot of product folks. We were internally discussing, hey, is this a required must-have, nice-to-have? I mean, priorities and trade-offs. And this person just jumped in and said, hey, by the way, I'm coming from that team. This is how we use it. Here are the links to why. So yeah, now make the choice. And that made the decision so much easier instead of speculating. So yeah, great advice. I mean, that's all I'm saying.
Starting point is 01:18:47 It does raise the bar of coordination sometimes because that means that going back to the information asymmetry, that is the definition of diversity. It's this asymmetry of people and their experiences and what they represent and how they do their jobs. But it does take more effort, but the end result is better because when you work on that information asymmetry and you come out, I mean, you know, you think about the number of features that get put into like social media platforms. And then you find some group who is underrepresented and they're like, oh, did you think about how this could be used?
Starting point is 01:19:18 And you're like, oh, you didn't because you didn't have anybody around for that. And at the end of the day, one of the biggest parts of resilience is about anticipating the types of things that are row of any of any stripe. Like, it doesn't matter. You know, if you're working on a database, you hire a bunch of database nerds. Why do you not have somebody from the product side? They're ultimately the ones that use this. You need their visibility. You need their insight into how the organization works. That wasn't a rapid fire answer. That was a long answer. I got to do better on these short answers. This is why we're running out of time, because I talked... No, no, we appreciate it. Well, there are definitely more questions. I mean, I can tell you the things we wanted to dig into, like your time at Stripe again,
Starting point is 01:20:10 vendor management, the work you're doing at Jeli. And I feel that as soon as we go in that direction, we're going to take way too much of your time and it's already 6:30 there. So what we would ask for instead is maybe a round two at some point, if you have the time in the future. Oh, yeah. Yeah, I'd be happy to. I think that the way I would summarize to what you just spoke, like, I jumped into my work early in the Bay Area because I really wanted to work in the Bay Area. I just thought, like, this is where it happens, you know, so often that's what we're inspired by.
Starting point is 01:20:39 And then I did it. And then I did it a couple more times because I was like, I wanted to prove I could, that it wasn't a fluke. And then, you know, I got into it. Like, I loved what we were making. And I believe very much. And I picked a company that, like, aligned very much with what I was hoping for. And then now sort of I think I've changed in my career that I'm, now what I'm after is, but why? Like, how can I make this better for the people that follow me? I think too often we throw engineers because, you know, we've had this like side thread going on where we keep talking about how people are harder than technology.
Starting point is 01:21:13 But I think that at the end of the day, we are hiring for people who are good at technology and then we're forcing them to do a lot of stuff around people. And I just want to find ways that we can make companies better at people. And so the work around resilience, the work around incidents, the work with Jeli is about, at the end of the day, I can't imagine a more noble cause. I mean, there's a whole lot of noble causes, I guess. There are plenty of more noble causes. Let me re-say that. I can't imagine something that I can do more, that I can have more of an impact on than helping organizations learn how to make more effective use of these wonderful people that they've brought together and seeking out the information that they have and bringing that information in and sharing it. So going back to what we were talking about earlier about incidents, like how do we harness that trauma?
Starting point is 01:22:00 Because that's what it is in many cases, right? Like that poor person. How do we harness that trauma for learning and how do we bring that back through the organization? Because all these misadventures, we highlight the ones that are sort of traumatic that mean so much to us. I think I remember more about incidents than I do about the good things that I've done at most of the companies I've worked at. But at the end of the day, those were just so expensive. They hurt. And I just want to make them better for everyone and try to make us more effective at it. I feel like we burn up younger engineers by throwing them into this cauldron and expecting them to be good at it. And it's a learned thing. And so anyway, that's sort of the highlight of my career. And also to find ways to sort of do that without having to be the CTO or whatever. Like I don't want to have to be, I don't really want the power.
Starting point is 01:22:49 I've often heard that people who are best with power are the ones who don't want it because they're willing to give it up. So maybe that's why I'm doing it, but I don't want to be a huge organization and have a ton of people that do what I say. I want them to do it because it's good. And I want to show that and have them follow
Starting point is 01:23:05 that stigmergy that I mentioned earlier. They see me having a good work-life balance and being happy in my career and having lots of cool donuts and then saying, I want some of that too. How can I do that? I'm like, well, here's my way. But as you go through and you make this trail, please just leave a trail for the other people around you so that we all get better at this. Because at the end of the day, it's just a bunch of humans trying to make stuff work. Thank you so much for sharing your ways with us today and leaving that trail for a lot of listeners. I said this earlier before we started recording, and I'll say it again. You're very modest, Cory, and you're awesome. So thank you so much again for taking the time. This was a blast.
Starting point is 01:23:46 Yeah, no problem, man. I was happy to do. Hey, thank you so much for listening to the show. You can subscribe wherever you get your podcasts and learn more about us at softwaremisadventures.com. You can also write to us at hello at softwaremisadventures.com. We would love to hear from you. Until next time, take care.
