Software at Scale - Software at Scale 10 - David Cramer: CTO, Sentry

Starting point is 00:00:00 Welcome to Software at Scale, a podcast where we discuss the technical stories behind large software applications. I'm your host, Utsav Shah, and thank you for listening. Hey David, welcome to the Software at Scale podcast and thank you for doing this. David, or as his username is dcramer, as I've seen around, is the founder and CTO of Sentry, which is an exception monitoring tool that I've used quite a bunch, and it's a pretty popular tool. David used to work at Dropbox, where he used to work on the internal CI server at the very least. And that's where I know of him from Git Blame and various things like that. Welcome, David, to the show.

Starting point is 00:00:45 Hey, thanks for having me. Good to be here. Great. Can you walk us through, you know, the start of Sentry? Like, it started off as like an open source project in 2012, is that correct? Or even earlier than that? Yeah, so honestly, I forget the dates these days. It's been so long. It started off as this very, very simple open source project. It was basically just an extension to a framework to dashboard errors. I think it's May 2008 is the first commit back when Google Code was still a thing. And then 2012, I think, was when we started the SaaS business, I want to say. But yeah, so it's been around a very long time.

Starting point is 00:01:27 Just kind of naturally grew. It was always like a hobby project. Like anybody that works on open source, you don't do it because... You kind of do it just because it's fun. It's interesting. It's like community stuff, right? You just like interacting with your peers. And so that's kind of what Sentry was, among a number of other things.

Starting point is 00:01:44 And I think productivity was always a passion of mine just to make my life easier, you know. So just born out of that desire and need to improve things. Yeah. And clearly at that time, there wasn't like any great system or like even open source framework in order to like easily track exceptions or things like that, right? Yeah. It was kind of honestly, I feel like it was still early internet to some degree. If you look at the maturity of everything now, I think there was in 2008, I actually

Starting point is 00:02:12 don't know that there was anything public, even like a paid service. I do know that by the time we started the SaaS business, there was one other, there might've been a few other paid services, but the oldest one that I'm aware of these days is called Airbrake. Considered one of our competitors, but they were very focused on Ruby at the time and we started with Python and kind of graduated from there. Okay. So that's, yeah, I mean, 2008, like, it doesn't seem like there were a lot of, like, services and all of that. What was, like, the major thing that got people interested in using Sentry other than you? How did it get, like, its initial traction?

Starting point is 00:02:57 That is a very good question. I think if you build something useful and you do it in a fair way and make it accessible, or more so if you build something and make it accessible, if it is useful, people will use it. I think that's definitively what you see in open source. So Sentry, I think just gained traction because it actually helped people. It was open source. They could run it for free, no strings attached kind of thing, which is one of the things

Starting point is 00:03:21 I would value out of open source is that as somebody who seeks validation all the time for my work, that I want my work to actually be useful and people to enjoy it, open source is a very good channel to achieve that, right? Because people will contribute. Um, and you'll get a lot of that feedback in like a normal cycle. So for me, early days, it was just like, okay, people liked it. They gave feedback. They contributed, at least from ideas and bug reports. Um, yeah, I was still earlier in my career. So it was fun actually, uh, working with folks on like,

Starting point is 00:03:54 how do we make this better? And I would just spend a lot of my free time on it. Um, again, I was like a kid, uh, just moved to San Francisco somewhere in that timeframe. So yeah, and I don't know, I'm a big believer in that kind of methodology of just build good things. You'll be rewarded along the way, you know, kind of one way or the other. But I will say like over the lifetime of Sentry, a lot of the growth and traction was just sort of continued effort. So I was a very active member of the Python community for a long period of time, especially the Django community.

Starting point is 00:04:32 So I would be at conferences, I'd be speaking, I'd be hanging out with my peers, we'd be talking about things, which still today, even though now I'm an executive at a company, is like, I would rather spend my time that way, you know? And you do that enough and you mean well enough. And I think people, they're going to remember that. It's brand, right? Like even Dropbox.

Starting point is 00:04:51 You use Dropbox because it just works, because you enjoy it. You tell your friends that you like Dropbox because it's well-built kind of thing. And that's how we did Sentry too. It's like, well, we just try to build it. We try to build it so it's fun to use, easy to use. And people, you know, they take that with them. They go tell their friends. Do you remember the experience of like the first pull request or first issue you got? Because I remember like I had a really tiny library that I used for like a, like a hack

Starting point is 00:05:18 week, not even a hack week, like a hackathon in university and somebody actually contributed. I felt so good. Do you remember any experience like that? I don't really remember what my first was. I remember one of the first tickets I created, um, on a different repository. And when you're, like, young and inexperienced, you're often not professional at all, right? Like, you just do stuff. You're like, hey, help me with this or something. I was certainly like that. Um, I mostly remember it because I could not get anybody to make the change. And it was like, even today, I would say it was clearly a good change to make. And it was just such this vivid thing for me, because it was like, how do I convince them that we should do

Starting point is 00:05:57 this? And I never did. I think eventually 10 years later, they closed the ticket as won't fix, which I was proud of because I'm like, at least you said it's over with now. We don't have to think about is this going to change or not change. But that's probably one of my older memories. I will say when I got started though, it was fairly different because I mentioned Sentry was on Google Code when I first created it. A lot of my interactions back in those days were IRC. And so you would have, like, a big community channel on IRC, which if you're not familiar is like Slack but not run by a corporation. Um, and you would just be talking to the community about, like, hey, what about this idea, what about this idea, kind of in the same way you would in real life, like if you're just hanging out with people, right? Um, you know, we

Starting point is 00:06:40 didn't have pull requests. I don't even frankly really understand how people contributed prior to GitHub. I do remember using Trac and these other technologies and trying to attach patches, but it was a wildly different thing, right? Like, full-on wild, wild west of open source. Yeah, there's a git send-email or something, which I think you need to use in order to contribute to the git repository

Starting point is 00:07:04 because they don't use GitHub pull requests and all of that. Yeah. That's cool. So you did kind of the user validation by asking people on IRC whether this is something they would use or not. Did you do any of that for Sentry or did you just start building it because you really needed it? So Sentry was actually born kind of in the reverse.

Starting point is 00:07:24 Like somebody's like, hey, like how would you go about, you know, capturing errors and putting them on a dashboard? I'm like, oh, that's kind of an interesting idea. Let me show you an example of how I would do that. And that was the first commit. Oh, interesting. So I didn't actually see the need necessarily for it at the time, but it was just compelling enough that I wanted to help first off,

Starting point is 00:07:43 which was important. And I'm like, oh yeah, this is actually kind of interesting. And then I don't know when that changed, when it became like, oh, here's like an example piece of code I gave somebody that works to, oh, I'm actually using this and find value in it. It was certainly over the next couple of years. But I do remember one of the defining moments for me was I had joined this company called Disqus. It was a, I guess it still exists, a comments widget. And at its kind of, at its prime,

Starting point is 00:08:10 it was one of the larger consumer technology companies in the sense of it's just reaching scale of infrastructure. And that's why I joined ultimately, but they were using Sentry when I joined. It was called something else at the time. And I'm like, well, one, this is cool. It's like a company using my stuff, which was not that uncommon. It's just, like, when you join a company and they're using your stuff, it's very different than, you know, some company is using your stuff, right? Um,

Starting point is 00:08:32 and it was also really bad. And so my memory was mostly like, I don't remember what I did, but in the first week, actually, let me set the stage a little bit. Disqus was like 10 people or something when I joined. So it was not really a company. It was a bunch of kids building some internet software. Right. And so just, you know, keep that idea in mind. And then the first week I joined, I did something that caused a bug, shipped it to production, took down all of Disqus, which was like comments on CNN.com and stuff like this. So it was not a great thing to do. Um, and then it actually made it really hard to recover from the downtime, or took longer, because Sentry was actually causing a problem, because, like, Sentry was having a scale issue. And that was such a vivid memory because that was a forcing function to get me to kind of actually spend some time on Sentry, to, like, rewrite it into something that would scale, um, or at least scale better. And so I did. In, like, that first month of Disqus, I, like, as part of my job for

Starting point is 00:09:28 Disqus, technically speaking, I fixed Sentry to work better and to handle that situation better. And I also think for me, it was such a defining moment in my career because that's also when I recognized the value of testing because I'm a self-taught software engineer. I just mostly make things up. And so I didn't go through some rigorous study or have mentorship to teach me all the ways, right?

Starting point is 00:09:51 But that's also where, you know, eventually I started doing this at Dropbox. Like, I picked up all of those beliefs and habits from my time at Disqus, where we went from no testing to being sort of a technology leader in continuous integration, at least like from an opinion point of view. And so it was a really good experience. And I still rely on the opinions I developed in just those few years today. So. That's super cool. What was the initial architecture of Sentry when it completely went down because of that bug? Yeah. So it was basically like a plugin for the Django framework. So that meant it ran in the same process and it would use Django's database connectors and all these other things to talk to a SQL database. And so what you would have is each error was a row in a database. It would have like a text field to capture the stack trace and all this, right? And then you would have another table, which basically was a row per aggregate error. So kind of a straightforward relational model, which works in a small scale, no problem, right?
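A minimal sketch of the two-table model described here, one row per error plus one row per aggregate error, using Python's built-in `sqlite3` so it runs anywhere. The table names, column names, and the `signature` key are hypothetical choices for illustration; they are not Sentry's actual schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, not Sentry's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE error_group (
        id INTEGER PRIMARY KEY,
        signature TEXT UNIQUE NOT NULL,      -- identifies "the same error"
        times_seen INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE error_event (
        id INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL REFERENCES error_group(id),
        stack_trace TEXT NOT NULL            -- the big text field
    );
""")


def record_error(conn, signature, stack_trace):
    """Insert one event row and bump the counter on its aggregate row."""
    conn.execute(
        "INSERT OR IGNORE INTO error_group (signature) VALUES (?)",
        (signature,),
    )
    (group_id,) = conn.execute(
        "SELECT id FROM error_group WHERE signature = ?", (signature,)
    ).fetchone()
    conn.execute(
        "INSERT INTO error_event (group_id, stack_trace) VALUES (?, ?)",
        (group_id, stack_trace),
    )
    # Every occurrence of the same error updates this one aggregate row,
    # which is where contention shows up under high concurrency.
    conn.execute(
        "UPDATE error_group SET times_seen = times_seen + 1 WHERE id = ?",
        (group_id,),
    )
    conn.commit()
    return group_id
```

With few errors this is cheap; the per-occurrence `UPDATE` on a shared aggregate row is the part that stops working when many identical errors arrive at once.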

Starting point is 00:10:47 If you don't have that many errors, you're totally fine. But as soon as you get like a really high concurrency going on, especially on the same error, like you're constantly trying to write all these rows and it just doesn't work out for your database, right? But it was really, really simple. That's all it was. It was like a little middleware

Starting point is 00:11:03 and then it was a SQL database with a couple tables. Yeah. So it would write to the exact same database as the production database. Yeah. For better or worse. Which, to be fair, I think early days, that actually was hugely important because it made it really, really easy to set up. All you had to do was, if you're familiar with Python, you pip install the package and

Starting point is 00:11:22 then you would add it to Django's version of a plug-in registry. You would just drop it in there and you were ready to go. It just worked from there. It would do something maybe like capture all uncaught exceptions or something and write them to the database. Basically anything that was like a 500 error, it would just grab all that information. Super easy to get started. Um, phenomenal at that stage of even kind of the internet, because, like, internet properties, even though the internet was big, it was such a different scale from today, right? Like, we've just seen

Starting point is 00:11:55 everything massively grow from there. So you can no longer get away with some of those easy things. But, and, like, in your first week or two at Disqus, when you said you had a change, or your first month, you said, how did you, what was the next evolution of Sentry? Was it then moved to some other database? What changed? Yeah. So from there, I think the first thing we did, so we went from running in process to a traditional client server architecture.

Starting point is 00:12:23 We decoupled it from the process, right? And then not much changed. Like it was still SQL database. We threw in some cache and stuff like that, specifically like a write cache to make it so when there were a lot of errors, it was a little bit cheaper to manage. And that actually is like the majority

Starting point is 00:12:39 of Sentry's history in a nutshell is just that. Like today it's a lot more, actually today it's immensely more complex, but that's only in the last couple of years. And I actually... I'm a big fan of getting into technology debate. I still would tell most people, you can solve many problems with just a SQL database. We use Postgres primarily, and we pushed it to its limits, and still do today, technically. But it works really well and it solves a problem. People understand how to use it and stuff. So, and you know, over time we adopted some other technology, like for example,

Starting point is 00:13:14 errors are really big, like the size, like bytes, total bytes. If you look at a stack trace and Sentry does not just capture stack traces, it actually will give you the surrounding source code. It will give you potentially variables. So in Python, for every frame in a stack trace, it will give you all of the local variables that are present at that time. And so that ended up adding up to a lot of disk space. And when you put that in something like a SQL database, it's super unhappy. And so then we started just abstracting things out.
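To make the "local variables for every frame" point concrete, here is a small sketch that walks an exception's traceback using only standard Python attributes (`__traceback__`, `tb_frame`, `f_locals`). The function name `capture_frames` and the shape of the returned dictionaries are assumptions made for this example; this is not how the Sentry SDK is actually implemented.

```python
def capture_frames(exc):
    """Return one dict per stack frame, including that frame's local variables."""
    frames = []
    tb = exc.__traceback__
    while tb is not None:
        frame = tb.tb_frame
        frames.append({
            "function": frame.f_code.co_name,
            "lineno": tb.tb_lineno,
            # repr() of every local is what makes the payload large.
            "locals": {name: repr(value) for name, value in frame.f_locals.items()},
        })
        tb = tb.tb_next
    return frames


def divide(numerator, denominator):
    return numerator / denominator


def handler():
    try:
        divide(1, 0)
    except ZeroDivisionError as exc:
        return capture_frames(exc)
```

Calling `handler()` yields a frame for `handler` followed by a frame for `divide`, the latter carrying `numerator` and `denominator`. Multiply that by deep stacks and large objects and the storage cost mentioned above follows quickly.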

Starting point is 00:13:44 And so from a technology angle, that's as simple as like define the most constrained interface you possibly can for the data model you have, and then just leave it at that. So we took that giant amount of data blob, and we started something. Actually, this is not even that old. This was during my time at Dropbox when I was still working on it. We just created a key value abstraction. So it's like you needed that big blob of exception data. Okay. You write it to a key and you can get it from the key and that's the only interface. And then all of a sudden we could take that out of SQL, put it in something that was easier to scale or use. And that's kind of how we always approached, or at least how I always approached technology was like, you know, just

Starting point is 00:14:21 push it until it breaks and then go fix the problem and do it in the most pragmatic way, I guess, you can get away with. Yeah. And I think that also keeps things simple because you don't have to, you don't end up building over complicated things that you don't need. Exactly. So when you said like Postgres scales, like how much does it scale? Like are there any numbers that you could throw out?

Starting point is 00:14:46 Like how many writes were you doing? That's a good question. I don't even remember what the stats are. I can tell you that our servers were basically the maximum we could get for a node on single hardware. But the way I would encourage folks to think about something like a SQL database is that you are mostly limited on throughput because of the transaction model. So the MVCC, I believe is the right acronym. I'm not a DBA. I just have hacked at it enough years that I know what's going on. And so when you try to write to a row, it often has to take out a lock, like read or write locks kind of thing. And so the way you scale it is you just don't write to the same row at the same time. It's literally as easy as that.

Starting point is 00:15:31 Now, it doesn't eliminate the cost of locking because it still has to do the lock. So you actually take a huge performance hit just by default from sort of a theoretical maximum write throughput. But if you actually just avoid, you minimize the cost of locking, which you don't have to be like a crazy educated or experienced engineer to sort of reason through what might need a lock. And so if you think about it that way, you'll be mostly correct and you just minimize that cost. So ours, for example, I told you we had this row in the database that was, we call it an issue today, but it used to be called a group. And it was like the aggregate of the errors, right? So these 10 errors are the same stack trace. So we write that group row for it. We had a counter in that group row, which was the total number of times seen. You can imagine

Starting point is 00:16:11 if there's a lot of errors, incrementing that counter is really, really expensive. The solution was to just batch all the increments for that counter versus finding a new database or something like that. It's way simpler to just batch the writes, so. Yeah, and the interesting part is it's okay if you lose a couple of counts, right? Because it's not that critical if you're like, you have seen this exception 60 times versus 55 times. Exactly, yeah.
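The batching idea can be sketched as follows: increments accumulate in memory and each flush issues one `UPDATE` per group rather than one per error. This is a toy, single-process illustration with hypothetical names (`CounterBuffer`, `error_group`, `times_seen`), assuming an in-memory `sqlite3` database in place of Postgres; it is not Sentry's buffer implementation. As noted in the conversation, anything still pending when the process dies is simply lost.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE error_group ("
    "id INTEGER PRIMARY KEY, times_seen INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO error_group (id) VALUES (1), (2)")


class CounterBuffer:
    """Collect counter increments in memory and write them out in batches."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = Counter()
        self.writes = 0  # UPDATE statements issued, to show the savings

    def incr(self, group_id, amount=1):
        # No database work here: just remember the pending increment.
        self.pending[group_id] += amount

    def flush(self):
        # Swap the buffer out first so new increments land in a fresh one.
        pending, self.pending = self.pending, Counter()
        for group_id, amount in pending.items():
            self.conn.execute(
                "UPDATE error_group SET times_seen = times_seen + ? WHERE id = ?",
                (amount, group_id),
            )
            self.writes += 1
        self.conn.commit()
```

A thousand calls to `incr(1)` followed by one `flush()` touch the group row once instead of a thousand times, which is the whole point: fewer writers contending for the same row.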

Starting point is 00:16:38 And that was the other good thing we had going for us is if you lost a little bit of data because there's a lot of errors and stuff, it generally wasn't the worst thing in the world by any means, right? Um, now, for some businesses that's a little bit different, right? Maybe those errors are actually, like, very critical for auditability. But we didn't have to provide those constraints. It's like, as long as you roughly got the shape of errors and you roughly knew what was impacting people the most, good enough. Yeah. So at what point do you, like, batch the errors? Is this in the application framework?

Starting point is 00:17:08 Or is there some other middle piece that does that? Yeah. So one of the first things I did, and actually, I'm pretty sure it's still alive today, at least in a significant way, but maybe not the same, is I wrote the system that I just called buffers. And the idea was basically just to debounce and buffer these counter increments and stuff. And all it would do, again, I'm just like, how do I solve this problem the fastest possible? So don't judge me here. But I would also encourage folks to recognize that Sentry certainly surpassed 20 million in revenue with just this. So whenever an error would come in, we would basically hash that error. We'd fingerprint it, as we call it. And that's basically just MD5 or something similar of the stack trace. It's a little bit more

Starting point is 00:17:47 complex than that, but you get the idea. From there, we would write a key into Redis, which is one of the greatest, like, whether you call it a database or not, one of the greatest tools that I think has come into existence in the last 10 years. We would write a key to Redis on a hash that would contain basically sort of a last write wins update as well as an incremental update. So we can basically say the last time this happened was this timestamp and also increment this counter in Redis, which are both atomic operations. And because Redis has this pipeline or transaction-ish concept, we would batch all those writes into

Starting point is 00:18:21 one so it'd all be like a single network payload, which made it pretty efficient. So we would do that. And then the part that was a little bit harder to scale was every, say, 10 seconds or 60 seconds, we would just take all of those pending writes, flush them and just fan them out into a queue to write them all to the SQL database. And then we would just take out simple, cheap locks

Starting point is 00:18:41 from there. And the only goal was like, don't write to the same row at the same time. And it actually works surprisingly well. Yeah, that makes a lot of sense. And it's super simple. That's exactly how I would, I feel like I would go about solving these problems,

Starting point is 00:18:55 which is, yeah, avoid taking locks on the same, like row or whatever at the same time. And you've basically eliminated a lot of like issues or like a lot of contention. I had the question about sharding then. So how do you shard your database? You said that you have like the largest database instances, right?

Starting point is 00:19:15 So you don't want to provision one database per customer. Yeah, so I guess high level, Sentry is multi-tenant, right? So, you know, that's not entirely fact anymore. We offer multiple things, but mostly it's a multi-tenant architecture, right? So all the customers' data going into shared database instances. We to date, I'm gonna say I'm 99% sure on this.

Starting point is 00:19:39 So just go with me. We to date have not horizontally partitioned any single database table. And by that I mean, for anybody who doesn't know, we have not taken one table and split it into two tables, like partitioning it by key or timestamp or anything like that. We have vertically partitioned, we have moved tables to different database instances, and we have removed things like foreign keys and different database constraints. Now, that's not to say it wouldn't be better if we did some version of sharding to make a lot of our lives easier. But as we've grown, we've been able to upscale

Starting point is 00:20:11 hardware, we've been able to move tables around. And over time, we've adopted some different technology, which is arguably better for us than sharding would have been. I think we'll still have to do the sharding. But I think the primary reason it's never been done is because it's complicated. Like if you ever had to build this or manage any of these systems, it's just such a nightmare to get it off the ground. And like, especially when you're a small,

Starting point is 00:20:38 like this was a hobby project for a long time and it was myself and my co-founder, who's a product designer, he's not an SRE, right? And so I'm not signing up to manage a bunch of database servers that are sharded and have, like, crazy reliability concerns, right? And so I'm like, how do I make it easy, cheap, understandable? Um, and that's still somewhat true today. Sentry is a very pragmatic company. We try to not, um, there's a stereotype of technology companies, and I'm sure Dropbox was there at some point, but they often spend a ridiculous amount of money when they don't need to. Now, eventually, you recognize it and you fix it, but we never had that money early on to spend, so we didn't spend

Starting point is 00:21:13 it. And that's kind of stayed true for all of Sentry. So we would just fix things when they needed fixed, hopefully a little bit before they needed fixed, and kind of take it from that approach. But I think it's just an interesting thing. I think it's getting easier. And I know some folks that I know from Dropbox and other companies, they started a company around Vitess, PlanetScale, I think it's called,

Starting point is 00:21:36 which if anybody's not familiar, it's a MySQL, I don't know if it's called a cluster, but it's a way to scale MySQL onto many nodes, right? Which I've never used it personally, but I've heard good things. And I think if you've had more of that kind of technology, Sentry's upbringing might be a little bit different. Like we might've just adopted that, right? But 2008, 2012, even, you want to do any of this stuff, you're either a massive tech company with a lot of money to spend on it, or you are doing it probably at the application layer,

Starting point is 00:22:05 and you're dealing with a lot of complexity. Yeah. Well, it is really surprising to me, though, that you haven't horizontally sharded, like, a table. And I think people, like our listeners, should kind of recognize that you don't have to do these things just because, like, large tech companies talk about these things in conferences and all that. Like, you don't need that for a really, really long time, and that's clear. Yeah, exactly. And that's, if there was one lesson which so many people need, it is that a lot of these problems are actually not that complex and they can be solved with simple solutions. And, you know, and I think actually one of my complaints of my time at Dropbox, because

Starting point is 00:22:44 you only know what you know, right? Like you kind of get it from experiences you've had before, you've seen it before. And Dropbox, my time there was a lot of headcount growth. And a lot of the people coming in were new grads. And so you often have not experienced a lot of these things yet. And you would always have all these complex conversations about like, well, how do we do this? And you would quickly end up in this overkill solution for whatever it is, which sometimes you actually need it to be fair. But a lot of the times it's just like, let's step back and think about what the requirements are of what we need. And what's, if you're like me, the goal is like, what's the fastest way to those requirements? Not what's the most interesting way to those requirements. Right.

Starting point is 00:23:24 Yeah. Taking a step back a little bit. We were talking about MVCC and Postgres, one of the larger outages that Sentry has had in recent history, not recent anymore, I think it was 2015, was the famous transaction ID wraparound. Can you talk a little bit about that and what exactly happened? Why that's a problem with Postgres databases? Yeah. So I'm going to give you my summary version, which is mostly correct. But it's just my mental model. So to implement transactions in a database, what they do is they have a counter

Starting point is 00:23:55 on each row effectively. So each entity in the database. And that counter is almost like a version is how I think about it. And again, this is not correct. This is just how I mentally model it. To make changes or to understand what the latest version of the row is, you sort of are using that counter to understand things. And that counter concept is basically, it's not exactly like that, but that same idea is used in different ways. So transactions, I guess specifically in this case, it's like what I would say is first off, read up on how MVCC works, if you actually want to understand this. But effectively what happens is that counter for transactions, which I believe is like something like per table and maybe per column.
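As a rough illustration of why a fixed-width transaction counter is a problem, the sketch below models transaction IDs as 32-bit values compared modulo 2**32, which is how Postgres orders normal transaction IDs. It is deliberately simplified: real Postgres also reserves a few special IDs and relies on vacuum to "freeze" old rows, neither of which is modeled here, and the function names are made up for this example.

```python
XID_SPACE = 2 ** 32  # transaction IDs are 32-bit values
HALF_SPACE = 2 ** 31


def advance(xid, count):
    """Return the transaction ID reached after `count` more transactions."""
    return (xid + count) % XID_SPACE


def xid_precedes(a, b):
    """True if ID `a` counts as older than ID `b` under circular comparison.

    The difference is taken modulo 2**32 and read as a signed 32-bit value:
    a "negative" difference means `a` is in the past relative to `b`.
    """
    return (a - b) % XID_SPACE >= HALF_SPACE


row_xid = 100                                 # transaction that wrote a row
soon = advance(row_xid, 1_000_000)            # a million transactions later
too_late = advance(row_xid, HALF_SPACE + 1)   # a bit over 2 billion later

# Shortly after the write, the row is correctly seen as being in the past.
# More than 2**31 transactions later, the same row would compare as being
# in the future, which is the situation Postgres refuses to get into.
```

In this model `xid_precedes(row_xid, soon)` holds while `xid_precedes(row_xid, too_late)` does not, so an old row that is never cleaned up would eventually look newer than current transactions. Refusing writes until the cleanup has run is the safe alternative.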

Starting point is 00:24:41 And there's a lot of these, right? It's not just one counter, but it's a 32-bit counter. And so that counter, what it does is when the database does its garbage collection, they call it vacuuming in Postgres, when it cleans up everything, which in some other databases, I think works roughly the same as like merge trees and like compaction. But it's like a nightly task or a weekly task, depending on the scale, that will go through and it will sort of fix all of the data on disk. So it will get rid of data that was deleted on disk. It will make sure all of the other data is sort of merged together and all these other

Starting point is 00:25:12 concerns, right? Depending on the database implementation. Anyways, it uses that counter to understand what data is correct, what is like latest or whatever. And so in this particular case, the counter rolled over from 32 bits. It hit the maximum, I guess, number it could be before it had to run that garbage collection job, right? And now the reason that that's a challenge is because the garbage collection job, the vacuum, as it is in this case, is very expensive because it has to go through and rewrite a bunch of data,

Starting point is 00:25:42 right? It has to delete, like physically delete data on disk or scrub it. It has to, if and when it can compact data, though Postgres doesn't really compact data. I can talk about that if you want, but anyway, so we hit the 32 bit limit and what Postgres does at that point is it says, sorry, you can't do any more writes or anything that would require changing the transaction counter until we've run that garbage collection. And that garbage collection is not fast by any means. And if the database is under load, it's very expensive. So for many years, even still after that, we would fight with it. How do we balance the aggressiveness of the garbage collection with the available resources on a server? And what unfortunately makes some of these problems even worse

Starting point is 00:26:26 is once you add replication to other databases in the mix, you have kind of this problem compounded because you're fighting for resources for normal just database usage. You're fighting for resources for this vacuum process. You're fighting for resources for replication and all this stuff. But that's like the gist of it. Now, I told you the counter is not just database wide, and it's not just table wide. Fortunately, this has happened actually to us

Starting point is 00:26:52 multiple times. So I don't actually remember which one was 2015. But that was the most significant outage we had. When this happened one time, it was as easy as truncating a table that turns out was not that valuable. And it was a business decision, right? It was like, it was mostly a table that was used as a cache. We're like, you know what, just delete the data. We'll get everything back online. We'll fix this after that, right? I think the 2015, which is hugely embarrassing, but recognize that when this happened, I think I was the only engineering employee at the company. And my co-founder was the only other employee at the company, the designer. And so I think what happened is we hit one of

Starting point is 00:27:30 these issues. Actually, I'm not even sure. I think that was a different outage. We had a lot of outages when we were young. Let's just say that. Transaction ID thing, super complicated, still not fixed. Some might say it doesn't need fixed, but it's still 32 bit, most importantly. But I will say the learning lesson from it was just like, think about how you can recover from disaster scenarios, right? Like I told you, in the one case, we were able to just truncate a table, which might have been that case. Again, I would have to refresh my memory. I'm sure our blog post talks about it. Another case, we literally had to wait until the process finished, which I think took like 45 minutes. And that is not a fun place to be in when you don't have control. Like if you've ever used a cloud provider and the cloud provider

Starting point is 00:28:14 is down and all you can do is blame the cloud provider, it's super uncomfortable, right? So same kind of thing there. It's a very scary, scary outage situation. Yeah. If you don't mind me asking, like, what was like the customer reaction? Were people like understanding? Were people upset? And do these things get baked into like your contracts? Like, oh, this is how much of an outage budget we have. Like, yeah, if you can just walk me through, like, you know, what is the impact of something like this? So in our case, this was early 2015. We probably had a couple thousand customers who paid us, but they probably didn't pay that much money. They probably paid like 20 bucks a month kind of thing, right? And so when you're paying 20 bucks a month, if you're reasonable, you also think about what you're paying for. You're like, well, I'm only paying 20 bucks a month. This sucks. We deal with it kind of thing. In that particular scenario, I don't think anybody actually said anything. I think they were supportive, which you should not, you should be supportive, but you

Starting point is 00:29:13 should also say something as a customer. Cause like when you have a serious outage, like that does matter. A lot of people need to recognize the importance of that. Like, and I say this as like the product owner, the business owner, right? The business and product and engineering groups need to recognize the importance of uptime. And you don't really recognize it until you've caused an outage, in my opinion.

Starting point is 00:29:37 We had a significant one recently, which was way too big of a deal for how big we were, or are, rather. And it's just one of those things. It's like, you want to help your customers as much as you can. You can't solve problems or prevent all problems, of course. But again, it's like what you do when there is a problem is the most important thing. And, you know, I'm not even trying to boast here, but people have complimented me over the years on how I've been able to make decisions to recover from incidents. And I think if people just think about that, like what's the fastest way we can fix the

Starting point is 00:30:14 customer concern here? Not what is the most correct way or what kind of rewinds time, but what is the fastest way we restore commitment to the customer? That's how I think about it. And so I'm always like, what's the shortcut I can take here that might cause me a lot more work, but it's going to restore exactly the behavior they need right now. And I think a lot, I'm able to do that because I have a lot of years doing this stuff. But I also think it's, even as like an entry level person, you can kind of reason through that many times. And again, most important thing. And I guess the only other thing that really matters that I've seen historically in those situations is just communication. Like even our communication, this most recent outage, I'm like, come on, guys, like we've

Starting point is 00:30:58 got to be better than this. We're, you know, Sentry's like 140 people now. We're like a pretty significant company. I'm like, we got to, we're supposed to be the best at this, not like, oh, we forgot to do that thing. And I think if you communicate with your customers in those kinds of situations, they're still gonna be upset as they should be, but they're way more understanding than if you don't. Communicating thoroughly and quickly are the most important things. Okay. So like concretely, that might mean something like provide like a status update, like send

Starting point is 00:31:26 like a blast email saying we're having this issue and we will get back up as soon as possible. Thank you for bearing with us. Something like that. Exactly. And I think there's a good lesson in the industry from recent weeks, which is like Robinhood and the whole GME thing. And I was, I've been nitpicking at Robinhood for a little while because of that. Like, I don't think their comms were good.

Starting point is 00:31:49 It is an outage. It's the same as if your infrastructure went down, right? Like they took down trading and they didn't really explain why. There was no real communication. They're just like, hey, we're going to limit this, which is like, fine. That may be totally fine. But it's like, hey, Sentry is going to be offline. If that was our communication, like you should stop paying us money. Like, it's just super unprofessional. And so, again, you can only control so many things.

Starting point is 00:32:14 But I think it's how you react to the situation is the most important thing. I think that makes sense. I want to play a little bit of devil's advocate here, which is like, maybe the Robinhood people were scared that, oh, we're losing money, or like they don't want that incorrect messaging to go through, like, oh, we cannot fulfill trades, and they didn't want people to panic, so they had to be like a little opaque. And I don't want to debate the Robinhood issue too much, but let's say like in Sentry's case, where you don't want people to think that, you know, you're going down all the time. So how do you manage that trade-off? Like how do you restore that confidence while you're going through something like an outage? That is hard. Yeah, I think it's only by action, right?

Starting point is 00:32:54 Like we try to be as transparent as we can without creating business liability, I would say. Security incidents are another good version of this, right? When there's a security incident, one, you do need to communicate very quickly, but you also don't want to like scare people. Yeah. And it's not just, you don't want to scare them because it will look bad on you. You don't want to scare them because if you go to a customer and you're like, hey, we had a breach, they actually are required by law, many times, to go through a ton of steps, which costs them a lot of time and money.

Starting point is 00:33:20 Right. And so like, and we've had those kinds of scares, too, where we've had a vulnerability that we identified, and we're like, oh, no, has this ever been exploited? And we have to very quickly say, what is the potential risk here? Who do we contact? What do we tell them? And how much information can we give them to be as accurate as possible, even when we don't know exactly what's going on? And so I think, again, there's so many different kinds of situations where that comes up, which is like threading that balance. Sentry has been fortunate that we're a little bold headed, and we are kind of willing to just be fairly transparent and accept

Starting point is 00:33:58 the risk to that in many situations. If it comes to like a customer's data or something, we won't be. Like, we'll protect the customer by default. But if it's something we messed up, we will generally be pretty transparent about how we messed it up. Like the transaction ID thing, right? Like it was about as transparent as I could be back in the day. And I think it's important because one, we sell to developers and developers value that transparency because it creates trust, right? It's a super important thing. Two, it helps people. Like it helps the industry learn and recognize things.

Starting point is 00:34:24 Like, I don't know if we're still number one, but it used to be if you Googled transaction ID wraparound, a Sentry blog post came up for better or worse, right? And I think three, it helps the people recognize these potential challenges in software because not everybody's experienced them. And unfortunately, that transaction ID thing, almost everybody experiences at least once. And so it's an unfortunate day for everybody. But it helps weight the importance of things, right? Like if we all recognize that that's a super serious problem and there's a path to a solution, I'm sure we'll come up with a solution, right?

Starting point is 00:34:58 And so I don't know, same as open source, right? There's not necessarily a reason to give a lot of things for free under open source or even to make them public, right? A good example is I have a whole bunch of home automation stuff and I made it open source. Nobody uses that. It doesn't really matter. Maybe they'll learn something from it, but in all honesty, probably not. But I did it because I value this idea that there's a lot of information that you can learn from and that you can see from prior experiences and see how somebody else did something and all these things. And it's the same way, the same reason Sentry and a lot of stuff I've done historically, even Changes when I was at Dropbox, were open source. Cause I'm like, it's not really sensitive first off. And it's very

Starting point is 00:35:37 valuable for folks to understand how other people think, how we're building technology so they can learn when they haven't experienced it directly. So it's just kind of a broad umbrella just for how I think about things. No, I think that makes sense. And I think like Sentry's blog, when you search for transaction ID wraparound, I don't know if it's first, but it's definitely in the top 10

Starting point is 00:35:55 because I've seen it a bunch. And I think we had a scare for like our internal CI changes at Dropbox as well. And we might've even had an outage before I started. But yeah, it's definitely a real problem for like write heavy databases. One thing I want to poke at is,

Starting point is 00:36:13 you mentioned that there were only like two employees of Sentry in like 2015 or so, and now you've grown to like 140 people. So how do you like maintain quality when there's like so many more people? And of course you can't be the one making every single decision or approving and rejecting every single decision. Like how do you maintain that, you know, we're shipping high quality stuff, you're not adding

Starting point is 00:36:34 a lot of tech debt or something? Yeah, it is a challenge for sure. You know, I think there's been a change in the industry. So when I joined Dropbox, I think, I don't remember if it was engineering was 250 or total headcount was 250, but it was not a tiny company anymore. And I joined as the first person on developer productivity. So that idea, like, if you think about that, 250 people, whether engineers or not,

Starting point is 00:37:02 no developer productivity before that. And it was certainly a lot of engineers, right? Sentry at 140 people has two people on developer productivity. So that idea right there alone has already shifted in the industry, of investing into tool chain. And that idea that you're spending a lot of money on engineers, maybe it's worthwhile sort of compounding that investment, right? Just like we spend a lot of money on tooling for salespeople and all these other things. So I think that's already a good thing. And we invest in that, right? I think you've got to focus on engineering culture. We use company values and things like

Starting point is 00:37:35 this, which are pretty important. The other thing we think about a lot is just like the tool chain and the processes around the tool chain. So I think everything important in a company happens from the top down. So if something matters to your company, it starts at the top and it's echoed from the top, right? My analogy here is like at Dropbox, it was instilled into me how important recruiting is. Like day one, it was talked about recruiting, growing the team. It's better for me to spend my time doing that than arguably doing what I was hired to do. Right. And I think about that a lot because it's a lesson I learned in terms of how you, well, one, not only like the recruiting aspect itself was super important, but also how you think about what matters to your company. So at the top,

Starting point is 00:38:18 I spent a lot of time making sure everybody recognizes that we care, how much I care. And then it's cascaded on engineering velocity, on code quality and user experience and all these other things, right? Like if you look at the design of Sentry, it's great because it started at the top, because we said how much we care about great design and great product experience, right? Like we've got a phenomenal team, but we have a phenomenal team because we cared about it at the top, right? And so when I think about it, it's like, okay, I approve budgets and stuff now. That's how I can show what matters. I can give budget to that developer productivity. I can give budget to tooling and things like that. I mean, you still got to trust that people will do the thing you need them to do, but it is a lot of like, how do you invest and where do you invest

Starting point is 00:38:58 and things like that. So I think the interesting challenge that we keep seeing is it's culture more than quality. Quality is hard, but it's really easy to do quality by a tooling process. It's like you're a bank, how do you maintain quality? Well, you implement infinite process so nobody can get anything done. So that's super easy. But culture to me is more about like, it's like, how do you build code? And for us, how you build code is just like, well, we want to be able to ship new things to our customers as quickly as possible. One, we just prefer doing that. But two, it's also the way that the best businesses are going to operate, right? Velocity first. And so I think that's going to be an evolving problem, probably forever at this point. Like the industry has changed a lot. The scale of technology is much

Starting point is 00:39:42 different. The scale of people within technology companies is much different. Most of what we focus on today is just how do we take humans out of processes as much as possible. And, you know, Dropbox, it's the same way when I was there. It's like, how do you take humans out of processes? And I think if I had sort of a singular goal as the company grows, given my position of authority, it is to continue that investment of like minimize process, asterisk when it protects the company. So my version of that is have a very annoying vendor policy so you don't send customer data anywhere, but make it very easy to spend money and spend time on tooling improvements and automations and invest headcount into people that are going to build tooling improvements and automation and stuff. Like an analogy I actually give people. So we're trying to hire like a security, kind of the first full-time security person at

Starting point is 00:40:37 Sentry to build that program, even though it's more like to take what we have and turn it into something real. And I always explain to people, one of the things I valued at Dropbox was they took an engineering approach to a lot of these things, which makes sense, but it's not something I would have just defaulted to assuming, right? So some of my favorite Dropbox people were people who did what arguably might be considered boring work by some, so like IT security and stuff like this. Because if you think about IT, it's like you're fixing printers or giving laptops out, which IT does, right? But there was a huge investment in just what were engineering solutions to IT security, right? Again, automation and tooling. It makes everything scale better. And it's straightforward

Starting point is 00:41:20 when you think about it. But I think a lot of us just don't sit down and think about what the problem we're trying to solve is. And I try to tell this to people every day. I'm like, just go back to like eighth grade or whatever it is where we learned scientific method and like critical thinking and stuff. Like just think through the problem and you're going to come up with a far better solution than what you might just jump to by default. Okay. So maybe like one more concrete example would help me. Let's say that there's like a process where people are reviewing metrics every week and you want to automate that or like you want to scale that out. How would you go about, like, you know, improving that? Yeah, I guess I would start with, why do you need to review metrics every week or anything? Like what are the metrics there to

Starting point is 00:42:00 serve? So one thing, so I recently brought in somebody, I transitioned out of CEO. So I hired a CEO for Sentry a year, a little over a year ago now. And one of the interesting things I've seen is I'm very in the weeds. I'm very much, I act like an IC in a lot of ways, where I look at the problems in that narrowly scoped lens. Whereas on the counter side, experienced executives look at it at a very strategic level, which I think you need both. And so one goal I've had, because I think it helps me and people like me, or it helps an IC persona, is how do you take a complex goal that it's really hard to understand what it is and how you contribute and stuff to it and break it down into something that you

Starting point is 00:42:41 understand how to move towards that goal. So a simple extrapolation would be like, okay, say we have a revenue goal for a year, right? How can an engineer contribute to the revenue goal? I want to give them the thing that is most effective. So you take that same kind of problem space and you go, well, like, say we have a goal to like ship more features to customers or something. When you just set it like that, it's really hard to think about how you might impact that, right?

Starting point is 00:43:08 Like, oh, I can't write code any faster. I can't do this. So everybody's like, if we just hire more people, we can ship features faster, right? That would be the go-to. But then it's just, you step back and you say, okay, the biggest thing blocking me, it's like root cause analysis, incident management, right?

Starting point is 00:43:21 It's the same thing. The biggest thing blocking me from shipping features faster is that our CI takes 20 minutes and then a release takes 20 minutes. So literally the biggest blocker. So if you just cut that down, we've already done it. We don't have to hire anybody. And most things are as simple as that. But the way I would reason about it is I'd be like, okay, at high level, we see the numbers and we're able to extrapolate and you can do this. You don't have to be an executive to do this. You can do it as an IC. We're able to extrapolate that the best way we can impact velocity is to unblock engineering from releasing code. And they're blocked on this specific thing.

Starting point is 00:43:55 So that should be our plan. That's where we're going to invest money and time. And so I will say we spend an annoying amount of time on planning these days, but we do it because the goal is like, if we've got a solid plan, we don't have to talk about it after that. Like it's, it's go execute, get it done. Right. Complete freedom and autonomy. But again, I think it's just like, it is root cause analysis. And I don't know if it's just my engineering brain thinks this way, but that's how I always reason to people. Like you just ask why,

Starting point is 00:44:24 why is it the way it is? That makes sense. Then maybe kind of a follow-up is, what was the hardest part in this entire journey? Like going from having a product that doesn't have any customers to having your first customer, or like scaling that out to like the first few hundred or even further than that, like, you know, hiring your first team and scaling it out even. What was like the hardest part for you? Yeah, that's tricky. So when you start a company, the first thing you do is you go find everybody that you worked with that you think is really good, right? You're like, who are the people that I can trust to hire at this stage? You know, I can talk for hours about all the lessons learned in that, but that's

Starting point is 00:45:05 where you start. Once you go beyond that, just like hiring anywhere, it's really hard. Once you move beyond referrals, right? Like beyond network, it's really hard. And it's hard because like evaluating talent is way more difficult than people would have you believe, right? It's why we do these annoying whiteboard interviews and all this other stuff that people hate, right? But I think it's at least still straightforward. You find the people, you evaluate them, you hire them. To me, the biggest challenge was actually as the company grows, you hire different kinds of people, right? So you're not just hiring like an engineer that can kind of do full stack. You all of a sudden need specialized engineers, or you need support people, but maybe the support

Starting point is 00:45:45 people need an engineering background or salespeople, but those salespeople need an engineering background. And most of the scaling is actually that. And that is actually, I think, I don't think I'm good at it. So I probably don't have great advice other than to say it's very difficult. And I think it's difficult because for me, I don't know how to reason about what a lot of these people look like for them to be great. Like I've never worked in support or sales engineering or professional services type situations, or technically I've never even worked in like an operations role, like a DevOps role. And so I think you really quickly end up in this place where like, okay, I can understand what the shape of things might look like because I can reason through it.

Starting point is 00:46:25 But I don't know who the right people are to help solve those problems or what the right solutions to those problems are kind of thing. And so it's just one of those things that I think everybody does it differently. It's the same way with software ultimately. Like you kind of know what you've got to get to. You don't necessarily know the path there and you just try to reason through what is a reasonable solution to get there, right? And so I think it's all kind of the same. You know, I actually think, I won't speak for VCs, but I assume that the good VCs,

Starting point is 00:46:56 they would tell you that it's better to invest in engineering founders than any other persona because we can solve a lot of our own problems and we take like a builder mentality to things. Like we try to actually go through like a logical reasoning to solve problems. Has its weaknesses on other sides of things, but, you know. Yeah, I want to go a little deeper into the hiring part, right? Like how do you evaluate someone once you've gotten out of your network? Let's not tackle like the harder problem of like hiring, you know, support engineers when you don't have a lot of experience, but let's

Starting point is 00:47:29 say even regular software engineers once they're outside your network. Yeah, that's tough. So I mean, I would do what Dropbox does and send t-shirts to everybody and hire them straight out of college. Honestly, like that's actually even like how we build our hiring program now. You find where all the talent is, and you kind of, sorry, what's the quote? Pray and spray or spray and pray or whatever. But yeah, I don't know. It's really, really difficult.

Starting point is 00:47:57 And it's gotten more and more difficult over the years, I think, for two reasons. One, because engineering has gotten more valuable. So senior engineers, kind of the people you would want to go after when you're hiring, right? The best people. Really difficult because one, they're actually really well compensated these days. They're given an enormous amount of flexibility. And oftentimes, like they're like late in their career, they don't really want to like deal with all these other things anymore. Some people do. Some people sign up for chaos. But so you already have that

Starting point is 00:48:28 challenge with senior engineers, right? And on the counter side, because the engineering ecosystem has gotten so big, it's actually often hard to find the right kind of talent, like the talent that you need at your organization or your scale within your team. And I say that in the sense of like the stage of your company, the stage of a product, like those things matter so immensely. Like it's easy when you think about a startup, like when you're just starting something from scratch and you have nothing, there's a very specific kind of personality that works really well there. You've got to be like motivated. You've just got to suck it up. You can't talk about work-life balance, right? Like you just got to accept it. Like it's going to be painful and hopefully there's a reward for it being painful. Right.

Starting point is 00:49:07 But then when you like get farther along, say you're Dropbox and you want to start a new product at Dropbox, that's wildly different than starting a new company. You might still be starting kind of from zero on something, but you have like roadblocks here and you have like resources over here. And it's like, well, what kind of person succeeds in that situation? Like it's got to be somebody that like either is really good at managing all those roadblocks, or just can like tear them all down, right? And to me, that actually is wildly different than like a founder persona, because the founder persona is just like, oh, I'm really energized and excited about this. The other one's just like, I will brute force, like deal with all this really

Starting point is 00:49:43 annoying complexity kind of thing. And what I think has been most problematic for me when it comes to hiring is when you look around the industry, you have all these different personas you need, but the industry has gotten so big that it's not easy to target these people. It's hard to know what somebody is going to be like from that kind of perspective and if they're going to be the right hire. And they're certainly not all clicking the apply button on your careers page, right? Like you're trying to source those people. So you spend a lot of time just like, well, this person's got a cool resume. Maybe they'd be good for the job. Oh, completely unrelated to the work we have at hand or something like that, right?

Starting point is 00:50:18 And I don't know how anybody does it anymore, to be honest. Which is, again, why it's like, okay, build a great new grad recruiting program as soon as you can. Because one, they're all eager to, you know, sorry, new grads, you got nothing better to do yet. You're still young. And you're also they're like naive, but I mean that in a really positive way. Like, they're not jaded by what's like already happened in industry. Like if you see a lot of like San

Starting point is 00:50:43 Francisco, it's everybody complaining about everything, or older technology people who are just like, yeah, this and that, and we hate it all kind of thing. Right. And it's kind of like refreshing to not have that being like the constant state of mind. But I don't know. It's tricky. I've yet to find a good solution. I think it's because the problem keeps changing year over year. Right.

Starting point is 00:51:02 Even if you look at the problems you have to solve in software today, there was not such a thing as like a front-end engineer like five years ago. It's like a brand new idea that JavaScript is like a specialty thing or something, right? And so it's like, okay, now where do you find those people? Like, what university is teaching people anything remotely similar to JavaScript development, right? So, yeah, I don't know. I don't know that there's a good solution other than like my approach has always been

Starting point is 00:51:29 stick to my beliefs. And I guess ideally, or eventually if they don't work, if you can't succeed under your beliefs, you got to reevaluate that kind of thing. But like stick to your beliefs of what are the right kind of people you want to hire and just try really hard to find those people. You know, many things I believe in today still come from what I learned at Dropbox when it comes to hiring. That's things like a rigorous evaluation program. Like real lessons at Sentry: like our interview process is not difficult enough. And some people would say, well, that's okay because not every job is. And what I would tell you is you want it to be difficult

Starting point is 00:52:08 because you want people that are stimulated by that challenge. It's a persona thing, right? And Dropbox had a very difficult hiring bar, right? Like that. And I'm sure it's still similar today. It's probably changed a little bit, but it was like, no, we have a really rigorous thing. We're trying to hire the best people. Now, they had a lot of advantages that made that easier. But it was an important lesson for me about like, you've got to have sort of, you've got to know what you're reaching for and then just like really go towards that. You can't just keep changing directions along the way, right? I actually listened to an interesting talk the other day by a former Dropbox person, Jean-Denis, who's at Plaid now. He's the head of engineering there. And I had not heard this concept, but he talked about spikes. And I didn't

Starting point is 00:52:50 know what he meant, but I think of a video game where you've got like these stats and like your blob of stats, which direction is it moving towards? And he's like, well, you might be able to do recruiting really well, which Dropbox did, or you might not. And so maybe instead you should focus on mentorship or like individual growth in your company, or maybe you can't do that. So you should focus purely on retention and just continuing at the pace you can, which I thought was a really interesting way to think about things. It's not an idealistic way, right? Like the idealistic way is you can do whatever you set your mind to. But it was a very practical and probably accurate way that things work. So it's been top of mind for me for a few weeks since I heard that idea. So, yeah. Yeah, I think it's an interesting idea, like how much can you focus on like hiring

Starting point is 00:53:36 new people versus retaining existing people. And sometimes I've also heard that the people who were at like an early stage of a company might not be the best fit when the company grows bigger and becomes like more mid stage or late stage. Yeah. I used to think that retention should have just been like top of mind for everyone, but that might not actually be the best for every company. I will tell you that we take into account attrition when we do budget and planning for headcount, and we actually classify people who left the company as regrettable or not regrettable. So sometimes, and it's not necessarily their fault, sometimes people, like you said, maybe they're early stage and they can't grow with the company, and it's actually healthy for them to move on and go find something

Starting point is 00:54:21 that they're going to be happier in. And it's healthier for the organization. Or, you know, maybe the job changed. Especially at a small company, like you can't have very narrow scope roles, right? So it's really actually hard to be a specialist at a small company because six months from now, that specialty may be meaningless because you had to change what you need to do or what was important. And so, yeah, I don't know. It's a challenge, especially when hiring is a challenge. You know, if you lose anybody, it's like, well, now we got to go rehire somebody and just the endless cycle.

Starting point is 00:54:50 Yeah. So one thing you mentioned was transitioning off the CEO role and going into like the CTO role. And that's something I've been thinking about even before, because I've seen that on your LinkedIn, right? You transitioned like a year, year and a half ago. How do you evaluate that you should be doing that, first of all? And then the second step is, how do you know who the right CEO is?

Starting point is 00:55:12 Especially when you've co-founded the company, it's your baby. You don't want it to be in the wrong hands. Yeah. So I think it's probably a little bit unique to people. So I guess my perspective on how everything went down and why. So first people might not recognize this about me, but I am fairly humble. Like I will quickly tell you when I'm not good at something or when I don't want to do something. Um, I'm also very confident in my ability to just like find a way to like through

Starting point is 00:55:42 stuff to get it done. So when I was CEO for Sentry, like I started the company with my co-founder and I raised all the money and everything, right? And I started the business. Well, first off, at that stage CEO is not a job. You're just a founder, so it's a make-believe title. Um, probably the prior two years, so once we got to like 60 people maybe, um, CEO kind of felt like a job. It felt like I was actually doing something different than I was doing before. You know, I was spending time talking to investors, which, you know, I'm sorry investors, I hate it. Um, I really don't enjoy spending my time evaluating talent that is not in the engineering curriculum. I don't enjoy spending my time hiring marketing

Starting point is 00:56:21 leaders and folks like this, right? It's just, I don't think I'm very good at it. And if I don't enjoy it, I'm going to be worse at it because I'm not going to try as hard, is ultimately how I view things. And so basically the things that ultimately the CEO is responsible for were not things I was interested in. Now, the reason I didn't hire somebody earlier is because I was not under the belief that it was a good investment for the company. Because my belief was like, I can do the job well enough for the stage we are. I don't believe I can hire somebody that's going to do it better for the stage we are.

Starting point is 00:56:51 And there's a cost structure associated, right? Like bringing in executives is expensive. CEO is the most expensive. So I was mostly under the assumption ever since like our first investor asked me if I wanted to be CEO. And I said, you know, I'll do it for now. It was always my response over the years, even when somebody asked me that. And my assumption was always at some point, if this company succeeds, I will not want to do it. Which, you know, was kind of a good, I think it was a good timing where the company was big enough that we could hire somebody that was like good.

Starting point is 00:57:23 That was not just like a temporary CEO, but somebody that could take us all the way to the finish line. Um, and that I had recognized fully that I didn't want to do it anymore and like I was failing at some of the things because I just was not enjoying some of the things. Um, and so that's kind of how it got there. And then the process, uh, effectively it's like you make the decision to do it. And then it kind of looks like other hiring processes, but the way you evaluate talent is really different. So one, it was a nine month process, I think, for me, give or take, from the time we started it to the time I think the offer was signed-ish, maybe call it a year from end to end. And so during that, basically all you do is you meet people

Starting point is 00:58:06 and you talk to them about their opinions. And like, you're also selling them at the same time. And it's weird because, unlike, say, hiring an engineer where it's like, come and do the onsite. Let's whiteboard five different things today. Straightforward evaluation. This is kind of like, well, how is this person going to make decisions?

Starting point is 00:58:22 What are they going to focus on? Because now they're going to be my boss. Is it going to be the right focus for the company? Are they aligned with what we want to do as an organization? And you also just get to know, will I enjoy working with this person? Will they be a great partner for the leadership of the company? And that seems like a really hard thing to evaluate, but I actually don't think it was that hard. One, all the candidates are awesome. Like they're all like really qualified. And if they're not qualified to be CEO, they're at least like they're successful already. And you learn something out of every conversation, which is kind of fun. And two, because you have so much

Starting point is 00:58:57 time to get to know them, you kind of really quickly learn like how their behavior is going to be and if the communication style is good and all these other things. Um, and so when I met, uh, Milin, um, I actually remember like the first conversation. He was more excited than any other candidate I had talked to about the product, which was important because like you're not just here to sell the company and do all these other things. You actually have to believe in the product and get the product because especially for us, it's a developer tool, right? Um, and so that passion and that interest was obvious. And then what I would tell people is probably the single biggest decision factor I had for choosing the CEO was he was like sending me messages at 7am and at midnight. And I have one very specific belief in success. And it's that you have to work hard.

Starting point is 00:59:41 It does not come, you work your ass off, you make compromises along the way. There is no substitution, no matter what anybody would try to say. The people that say otherwise are not successful. They just want to believe this is how they would get there. But it is just a lot of hard work. And there's a famous saying, I don't know who said it. It turns out it's a quote. But one of the CEO candidates actually said it to me, or maybe it's a CRO candidate. It doesn't matter. But do the work. It's just three words, right? Like you've got to do the work. And I'm like, okay, I believe this guy would do the work. He will figure out what needs to be done. He will work to make sure it happens. And that's kind of all the job is, right? You just got to figure out what happens day to day. It's really most jobs, if you're honest with yourself. And it's also something that I kind of look for when I hire at any level now. Like, how can I evaluate if they're just going to like kind of roll up their sleeves and do the work when it needs to get done? That makes a lot of sense. And wow.

Starting point is 01:00:35 Yeah. And I think the first part, which you said, like, you know, reflecting on yourself and trying to understand what you enjoy and what you don't enjoy basically gives you the answer. And I think it's also great to hear that, you know, as an engineer, you might not be interested in all of these other things like evaluating these leaders who you have no experience on. And that's fine. You can still start a company and maybe eventually hand it off to somebody else who is more excited about doing this role. Absolutely. And I think, yeah, I think a lot of people have different reasons for wanting to start a company, but I don't know if wanting to be a CEO is a good reason to start a company as an analogy, just because the skills are very different. Just like the skills for starting a company. I've learned so much stuff that has no other purpose in life now. Like fundraising, what am I

Starting point is 01:01:20 going to do with that ever? It's like, I actually sort of get it now, but I'm mostly done doing it in my lifetime. I will probably, you know, we'll see, but I will probably not start another company. I will probably not ever have to go and raise a seed round ever again. So the only value I now have out of that knowledge is like, I can pass it on to other people. And that knowledge is just my one singular experience, right? But it is really interesting. I do think, one thing I regret though about Sentry is so I had started the Sentry business three years before I worked on it full

Starting point is 01:01:51 time. Same with my co-founder. He was like two and a half years. And that's interesting because one, it gave us validation. Like I basically, when I went full-time was the day I left Dropbox. But it gave us validation. We had a safety net. Like I left Dropbox and I was able to match my salary, kind of safety net. Right. And that's a good place to be in. I didn't know what I was going to do at that point.

Starting point is 01:02:13 I didn't know what we were going to do with the company, but it was a good situation to be in. Right after that, we raised money. We're like, let's, let's build it to the moon. Like as big as it goes. That was the goal right then and there. But because we were already pretty far along, we actually missed that traditional early stage founding company,

Starting point is 01:02:30 founding product thing, which I had been through before a few times because I'd always joined early stage companies. But that stage of the company is so much fun. Like you're just building something and there's literally no roadblocks. Like you have very little else going on. And it's such a good experience for people that, I don't think everybody needs to start a company, but everybody should go through, like, especially engineers where, you know, we can apply ourselves right off the bat, just working at like a sub 10 person company. And just having that level of ownership and trust is such an empowering thing and it's such a good experience that I would encourage everybody to try it at least once in their career. Yeah. So what are some specific things that people learn? Just like

Starting point is 01:03:09 the idea of talking to customers and having like basically very high bandwidth conversations, learning a lot about how people actually use your product? So you learn a lot, but it's just the freedom to actually make the change, right? Like you learn something today, you might have it like in front of customers today kind of thing. It's so empowering. Like my favorite experience is early days Sentry. So, you know, Sentry collects errors, right? We could see when a customer hit an error, what we would do. And they were just, they were so enjoyable. And this is when I was the only engineer. So still working on it in my spare time. An error comes in, and maybe it's evening or maybe, I don't know, it doesn't matter. But an error comes in. I see it, I'm like, oh, that's not good. I can fix that really quickly. A lot of these fixes,

Starting point is 01:03:55 they're like five minutes, right? Especially at a smaller scale. And you would have something that's like, oh, they hit an error on our billing page or something like this. I would go in, fix the bug, commit publicly, it's open source. I would take that link. I would email any customer that hit the bug. And I'd be like, hey, I just saw you hit this issue. Really sorry about that. Here's the commit that fixed it. Should be live in the next couple minutes kind of thing. And like the wow factor from those customers that are like, whoa, like this is insane to me. And you cannot do that at any large organization. It doesn't matter how cutting edge you are, right?

Starting point is 01:04:30 You can never get that idea back. But it was such a cool experience. And I'm like, it must be amazing to be a customer and have that kind of experience, right? So I think it's just that idea that you just have so much freedom to do stuff and to try stuff and, um, to try different jobs even to some degree, right? To like figure out where you can apply yourself best, to try different skill sets in the same job, right? Um, it's a good chance to learn about a lot of different technology because the product is still young. Um, yeah, I don't know,

Starting point is 01:05:01 lots of different learnings, I would say, you can have. Yeah, if I was a customer, I would basically commit to the product at that moment. Like if I got an email like that, unless I didn't need the product at all. Cause like I've been using a product that they just had like a half an hour onboarding session with an actual human being. And it's an expensive product,

Starting point is 01:05:23 but I just can't stop using it cause I feel like the whole experience of that onboarding, which was like a year ago, it just felt really valuable and felt like, you know, it's a great customer experience. The things that like leave the mark, you know?

Starting point is 01:05:37 Yes. But yeah, it's actually funny because as we've gotten bigger, we've lost that sort of openness where it's like, customer support ticket, we'd link them to the GitHub issue kind of thing. We lost that openness along the way. Talking about engineering culture, it happens everywhere, right? It's not just in engineering. And we lost it because I was no longer doing customer support. And I really thought that was a fun idea, cool idea, and it's how I engaged. And so we actually now have an

Starting point is 01:06:00 open source team, that's two folks. And their mandate right now is just to reopen Sentry to like bring all that back, but bring it at scale, right? Like to bring back the things we thought were great about the company and the culture, um, things we thought made customers love us. Like they still do, but like the things that were extra special. And so they're just trying to figure out how to do that with the, you know, 140 and growing headcount now. So how do you do things like that?

Starting point is 01:06:23 Like, do you, yeah. What are some specific things that the team could do? As an example, instead of using Jira, which is internal for us, not public, let's bring it all to GitHub. Let's bring the conversation there. Okay, support team, when there's a bug, you can report it via Jira or whatever, but what we're going to do is take that GitHub link, give it back to the customer. We're going to let them see the openness there. Um, we have a Discord, a public Discord channel for Sentry. It's like,

Starting point is 01:06:47 hey everybody, you should be on Discord. You should interact with customers every single day. You're gonna become so much more knowledgeable so much faster than just the few enterprise conversations you have, right? So just like, you've got to force people to be in these situations they might not just default to, right? Like, I think I, as a founder and a builder, I default to like, how do I go find more people to talk to about the thing, right? But not everybody's that way. It doesn't mean they can't be that way. They just, maybe they don't know that that's an opportunity or they don't understand why

Starting point is 01:07:16 they might want to do that and stuff. So it's just kind of like, well, how can we build a lot of automation, again, going back to how we scale, to make things more open? How do we make sure it's in the culture, the investment, uh, behind it, right, from the top? Um, and then just find good ways to engage that community and customer base and stuff like that. Yeah, the whole experience makes it sound like you're just like a partner team or like a sister team in the same company rather than you're trying to like be a completely different company, because that's what that experience is if you can just directly talk to people and yeah. Yeah, I agree. It's a, I've never thought about it that

Starting point is 01:07:49 way, but it is very much the same. It's like almost like an internal customer base, but you treat them the same. You treat them as if like you really have to help them and you need to get the best feedback you can and stuff like that. Okay, so maybe like one or two questions just to wrap up, which is like, how would you think things like building like a developer tools company has changed? Like, or like, you know, you started Sentry in 2008. It started as a company in 2012. I think the ecosystem has gotten much bigger, but like, what are your thoughts? And like, what is your advice to like an engineer who wants to start like a developer tools company today? Like how would you ask them to think about stuff?

Starting point is 01:08:26 Yeah, that's interesting. Five years ago, there weren't really developer tools companies, which is super important. Literally, in some cases, maybe not laughed out of the room, but silently laughed out of the room if you went to a VC. Trust me, I was there. So that's changed immensely. Open source has also changed immensely.

Starting point is 01:08:48 Sentry was free open source. It was almost unheard of back in the day. In fact, I don't know that there's another single SaaS business in the world that was ever started as free open source, right? Where there was not an actual requirement to pay for the product to get certain functionality. So a lot has changed over the years. I think one of the best changes we've seen over the years is people have credit cards and they have budgets and they can just spend money. Again, you would always spend a lot of money on Salesforce if you're a sales team, but if you ever wanted anything in engineering, you had to go through some complicated process of like, well, why do you need these tools? And that's actually shifted immensely. And you combine that with like, and I don't know how Dropbox is these days, but from what we see in the industry is like the CTO, the CIO is not buying software. They're not making purchasing decisions anymore. Right. It's like, okay, this team wants to buy something, whatever, go buy it. There's some compliance and some vendor controls and stuff that exists at large scale. Right. But there's a lot of freedom for developers to choose their tools, which means you can actually focus on your customer. And that's what we do, right? Like I always told people, so ever since we raised the first round,

Starting point is 01:09:50 I said, our target was New Relic. We were going to beat New Relic from a business landscape point of view. And I did that for two reasons. One, New Relic is the APM industry, which encapsulates basically any monitoring that matters. That's how I view it. Two, New Relic was like a hugely successful company. And most importantly on two, it was successful with small businesses and mid-market companies. It was not just an enterprise company, right? It was like, you could sign up with a credit card and that's the model I wanted. So I took that. And then I would always explain to people like why New Relic has failed.

Starting point is 01:10:23 And I say that and people might be like, well, they make like, I don't know, a billion dollars or something. They haven't failed. And sure, that's true. I'm exaggerating. But I believe they failed because they once had a great product. Their APM product when it was launched was cutting edge. It did all this great tracing stuff for developers. Really, really valuable. And then over time, they stopped actually building for their customer, the developer. They built for the IT manager, whoever else it was that was buying their software, right? And one would actually argue that the reason Sentry is here and so successful is because New Relic did that kind of thing, or companies like New Relic, first off. So that's a really important takeaway. But we said, we are never going to stop building for the developer.

Starting point is 01:11:01 So we have north of 25,000 customers now. And even at this scale, we don't build anything that is not used by a developer. We don't build complicated analytics or I don't know even what else. I don't even know what you would do. Like what's a VP even do, right? Like they just look at numbers and stuff. So I don't know how to offer value to them. Like the numbers I show them, they're either going to believe them or they're not, but it's not changing their life kind of thing. And so, and this actually reminded me of Dropbox a little bit. Like when I joined Dropbox, I remember getting the spiel, like we don't sell ads. We don't do any of this stuff because we don't need to. The customers pay for the product. So all we got to do is keep giving our customers what they want. And that's how we've treated

Starting point is 01:11:40 Sentry. It's like, we just keep giving developers a better product for the developer, right? And that to me, it's not like a tactical thing you can do to grow faster or anything, but it's like a strategic long-term position of how do you build something that is actually going to be relevant and matter over time? You just keep building a thing that the customer needs. So that's how we approach it from a developer landscape point of view. No, I think that makes a lot of sense. And I guess some advice that you can tease out of that is engineers should focus on building stuff that developers need rather than trying to go upstack

Starting point is 01:12:14 if they want to build a successful developer tools company because they don't need to sell to the CTO anymore. Yeah, in my opinion, absolutely. Now, it's not to say you can't build a traditional enterprise product, even if it's for developers. Like a good example is security tooling. Yeah, there is security tooling that is more like bottoms up style. But realistically, it's like big companies, big purchase prices and stuff, and it goes from the top. So I think you've got to think about it from that lens. But if you want a product that is, you know, thoroughly for developers, you got to know,

Starting point is 01:12:46 ultimately the developer is going to have a say in what you're building. So if you're not really focused on serving them, nothing else matters at the end of the day. Cool. Yeah. Well, thanks so much for taking the time here. I had a lot of fun. I think I learned a bunch. Yeah, for sure. Like I enjoyed the conversation. It's always fun, especially with COVID, we're all kind of trapped indoors. I used to go to conferences a lot, you don't get the hallway track when you're trapped indoors. And I honestly very much miss it. So it's always fun. Yeah. Thanks for taking the time again and I'll talk to you soon. Thanks a lot.

Starting point is 01:13:22 Thanks.
