Software Misadventures - Early Twitter's fail-whale wars | Dmitriy Ryaboy
Episode Date: August 13, 2024
A veteran of early Twitter's fail whale wars, Dmitriy joins the show to chat about the time when 70% of the data in the Hadoop cluster got accidentally deleted, the financial reality of writing a book, and how to... navigate acquisitions.
Segments: (00:00:00) The Infamous Hadoop Outage (00:02:36) War Stories from Twitter's Early Days (00:04:47) The Fail Whale Era (00:06:48) The Hadoop Cluster Shutdown (00:12:20) “First Restore the Service Then Fix the Problem. Not the Other Way Around.” (00:14:10) War Rooms and Organic Decision-Making (00:16:16) The Importance of Communication in Incident Management (00:19:07) That Time When the Data Center Caught Fire (00:21:45) The "Best Email Ever" at Twitter (00:25:34) The Importance of Failing (00:27:17) Distributed Systems and Error Handling (00:29:49) The Missing README (00:33:13) Agile and Scrum (00:38:44) The Financial Reality of Writing a Book (00:43:23) Collaborative Writing Is Like Open-Source Coding (00:44:41) Finding a Publisher and the Role of Editors (00:50:33) Defining the Tone and Voice of the Book (00:54:23) Acquisitions from an Engineer's Perspective (00:56:00) Integrating Acquired Teams (01:02:47) Technical Due Diligence (01:04:31) The Reality of System Implementation (01:06:11) Integration Challenges and Gotchas
Show Notes:
- Dmitriy Ryaboy on Twitter: https://x.com/squarecog
- The Missing README: https://www.amazon.com/Missing-README-Guide-Software-Engineer/dp/1718501838
- Chris Riccomini on how to write a technical book: https://cnr.sh/essays/how-to-write-a-technical-book
Stay in touch:
- Make Ronak's day by signing up for our newsletter to get our favorite parts of the convo straight to your inbox every week :D https://softwaremisadventures.com/
Music: Vlad Gluschenko — Forest
License: Creative Commons Attribution 3.0 Unported: https://creativecommons.org/licenses/by/3.0/deed.en
Transcript
There was one outage that I distinctly remember.
Long story short and without blaming anybody,
a bunch of data in Hadoop got deleted.
Like 70%, no backups, it got deleted, just gone.
And the two emails I sent, one to Eng and one to the whole company,
are I believe still linked at Twitter.
Twitter has a Go link service.
So if any current Twitter employees want to go to go slash best email ever
and go slash second best email ever, those are both mine.
And it was like, you know, good news.
We have lots of space in Hadoop.
Bad news. It was totally an email I wrote with a sort of, this might be the last email I write to this company because I might get fired after this because we lost all the freaking data.
And my job is the data.
It's sort of pressure makes diamonds.
Later, I heard people say, you know, the joke was that like, I love hiring ex-Twitter people because no matter how much everything is exploding, they just go like, eh, I've seen worse.
Because there was stuff that was really, really bad, but also sometimes, like, the worst times are the best times.
One thing you mentioned from the acquirer side, like doing the technical due diligence, which is something you were involved in once you joined Ginkgo.
So what does technical due diligence look like
from the acquirer's standpoint?
At what point do you feel like,
okay, I'm satisfied enough that this looks okay?
You're looking for deal breakers, right?
If you're doing technical due diligence,
unless it's specifically like, we're acquiring the magical technology and it's going to be magical, and if it's not magical, the deal is not worth it, right?
You're usually acquiring for some other reason.
You're probably acquiring for a combination of there's some code and really good talent.
And, you know, it positions us well for whatever like strategic reasons, right?
So if you're at a point where you're doing technical due diligence, you're looking for deal killers, not for like, I wouldn't have done it quite that way.
Right?
Like that's not a deal killer.
That just adds to your integration estimate.
We're interested in not just the technologies, but the people and the stories behind them. So on this show, we try to scratch our own itch by sitting down with engineers,
founders, and investors to chat about their path, lessons they've learned,
and of course, the misadventures along the way.
Welcome back to the show, Dmitriy. Fun place that I thought we could continue this conversation at
is your LinkedIn profile. You mentioned being a veteran of early Twitter fail whale wars. Maybe you can take us back a
little bit and talk about that. Yeah. So I joined Twitter on whatever the first working day was,
January 2nd, January 3rd of 2010. And Twitter was a bit north of 100 people
and it was sort of very popular,
but not nearly as popular as it was going to get eventually.
And already in the throes of the infamous rewrite from Ruby to Scala. The fail whale was the error page. So whenever there was an internal server error, what the users got was like a cute cartoon of a whale, which was then christened the fail whale. So we were trying to keep the site up while also rewriting it from the original Ruby on Rails app into kind of a microservices architecture on Scala, and trying to keep up with the user growth and all of that. And that was a really kind of
stressful and fun environment at the same time. A lot of very, very rapid learning as the whole
company grew, had to grow very rapidly. It was kind of the hyperscaler situation.
When I joined, we were in a colo.
Cloud services existed but weren't as popular.
We essentially were renting servers in a co-located facility.
And it was kind of a virtualized situation.
So we didn't actually have good visibility into where the servers were in relation to each other.
And by the time I left, we were running in several data centers that we were running
ourselves.
So it was kind of a major migration.
And there's just kind of one aspect of the scale and the growth.
And a lot of, I think for me, those were, like, personally, hyper-growth years as an engineer,
just because of the exposure to so many
people working on these problems and so many of these problems and kind of how they were emerging
It was really fun, it was really stressful. There was a lot of, uh... I really learned how to be on call well, and how to respond to incidents and, you know, firefight without losing your mind.
This was the data, this was like the data platform team, right?
Like, did you guys have like dedicated?
I was on the data platform team, but like,
first off, data platform wound up getting involved in surprising ways.
But second, when I started off,
the company was small enough that I just sort of knew the other teams pretty
well too. And like,
I would get pulled in just to like figure stuff out.
But to give you a sense of like how the data platform would occasionally get looped into
total site outages in 2010, that was the first year I was there. We were still in the colo and
we were extremely constrained by the network bandwidth within the data center, between the hosts. And so we were getting into a situation where we were getting a lot of errors because the web servers couldn't access the memcache servers. So there was like a big memcache cluster, and there were the web servers, which were like very, very light servers that were meant to be stateless. And their requests were timing out, like they just couldn't get through to the cache servers, which is basically where all the timelines are on Twitter, or at least were at the time. The way the whole thing
worked was you post a tweet, there's a background daemon that sees that a tweet happened, and it
materializes the timeline for everybody who follows you and updates it and sticks that in
the cache. So when you actually read stuff, it's not coming from a database.
It's kind of pre-built for you.
Right.
And I'm eliding a bunch of technical detail, but that's the big picture.
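To make the fan-out-on-write idea above concrete, here is a minimal sketch in Python. It is purely illustrative, not Twitter's actual code; the names (cache, follower_graph, on_tweet_posted) are hypothetical stand-ins.

```python
# Illustrative fan-out-on-write sketch (not Twitter's real implementation).
# Assumes a memcache-like key/value store and a follower lookup; both are
# hypothetical in-memory stand-ins here.

from typing import Dict, List

cache: Dict[str, List[dict]] = {}          # stand-in for the memcache cluster
follower_graph: Dict[str, List[str]] = {}  # stand-in for the social graph store

TIMELINE_LEN = 800  # keep only the most recent N entries per user

def on_tweet_posted(author: str, tweet: dict) -> None:
    """Background daemon work: materialize the tweet into every follower's
    cached timeline so reads never have to touch the database."""
    for follower in follower_graph.get(author, []):
        key = f"timeline:{follower}"
        timeline = cache.get(key, [])
        timeline.insert(0, tweet)             # newest first
        cache[key] = timeline[:TIMELINE_LEN]  # trim to a bounded size

def read_timeline(user: str) -> List[dict]:
    """Serving path: a plain cache read, the timeline is already pre-built."""
    return cache.get(f"timeline:{user}", [])
```

The trade-off, as described in the episode, is that writes fan out to every follower, but reads become cheap cache lookups, which is exactly why the web servers losing access to memcache was such a serious outage.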
And so it's a very big problem when your web server can't hit the cache, right?
Because that means that you can't get anything.
Right.
And long story short, we were like, we have to kill anything that uses any bandwidth whatsoever.
So, for example, we turned off the Hadoop cluster.
Like in the particular moment when things got particularly bad, because everything was sitting on top of each other, we were like, okay, when the web server logs are being logged to Hadoop, to HDFS, that's taking up too much bandwidth.
And if, God forbid, HDFS decides to rebalance and shifts a bunch of things between the data nodes,
that will just flood the network. We can't have that. So we just shut down the whole Hadoop
cluster. And the day we did that was the day that Jimmy Lin joined Twitter. He was a professor at the University of Maryland.
He is now in Toronto, I believe.
And he literally joined because he was like,
I'm going to, he was analyzing like graph data,
social networks.
He's like, man, I'm going to use
their awesome Hadoop cluster.
Sorry, there is like literally no compute for you.
Let's just whiteboard some stuff.
And the issue turned out, it was like, it just didn't make sense.
But the issue turned out to be that kind of the virtualized network our colo provided to us hid from us how they actually grew Twitter's footprint.
And so you start off with Twitter at a certain size, and they grant you whatever number of servers, right, of different configurations.
And they're sort of in a physical same pod somewhere in that data center.
And so you've got your memcache servers, you've got your web servers,
you've got your Hadoop servers that are like different configurations, right?
Hadoop servers, massive disk, memcache, lots of RAM, web servers, very thin,
but like you have lots of them because you want lots of parallel cores.
And then you want to grow your web servers,
or maybe you want to grow your Hadoop cluster,
maybe you want to grow something else.
And there's no physical room in the pod anymore,
so they get the next pod, right?
And they start allocating your things there.
But to you, it looks like a flat network, right?
So now what's going on is eventually you get to a situation
where all of your cache,
like two-thirds of your cache are over here on the left side,
and your web servers are on the right side because maybe you didn't need to scale them as fast.
And so they have to go through the interconnect between the pods, which is like really thin.
And your data nodes for Hadoop are spread across all your pods.
And so whenever you do any MapReduce job
and it does a shuffle,
it just completely saturates the network
of your web servers.
And any call between your web server and your cache server is trying to get through while MapReduce is trying to shuffle, like, 100 petabytes.
It was probably 100 terabytes back then.
But, and like, we had no idea of
that network topology. One of our first sort of data viz hires actually spent a while working
with the network engineers and the ops folks to do like PCAP captures of network packets
and figure out which protocol is talking to which IP and sort of create a map so that we kind of figure
out that like, okay, all of these servers are over here and all of those servers are over there.
And this is talking to that on the MySQL protocol and that's talking to this on the TCP protocol.
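A rough sketch of the kind of flow-mapping analysis described here, under stated assumptions: it assumes packets have already been captured to a pcap file and uses the third-party scapy library to parse it, and the port-to-protocol labels are illustrative guesses, not the actual tooling the Twitter team used.

```python
# Rough sketch of reverse-engineering "who talks to whom" from a packet
# capture. Not the actual tooling used at Twitter; assumes a pcap file and
# the third-party scapy library (pip install scapy).

from collections import Counter
from scapy.all import rdpcap, IP, TCP

# Illustrative port-to-protocol guesses; real analysis would be more careful.
PORT_LABELS = {3306: "mysql", 11211: "memcache", 50010: "hdfs-datanode"}

def flow_map(pcap_path: str) -> Counter:
    """Count (src_ip, dst_ip, protocol_label) flows seen in the capture."""
    flows = Counter()
    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt:
            label = PORT_LABELS.get(pkt[TCP].dport, f"tcp/{pkt[TCP].dport}")
            flows[(pkt[IP].src, pkt[IP].dst, label)] += 1
    return flows

if __name__ == "__main__":
    # Aggregating the top flows gives a map of which services sit where.
    for (src, dst, proto), count in flow_map("capture.pcap").most_common(20):
        print(f"{src} -> {dst} [{proto}]: {count} packets")
```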
So now we know like who's doing what because we needed to reverse engineer that. So when you ask
like what the hell was data doing to, you know, involved in the
fail well stuff, like you think it's adjusting settings on your web server or like dealing with
timeouts, it's that kind of stuff, right? Because the world was bananas. So that was one. There's
also one where like our first move to data center was hilariously disastrous and there was both
fire and rain. Sorry, sorry, before going into that. Wait, so, but basically shutting down Hadoop,
so poor Jimmy, but then that
did solve the problem. It's like from
that decision, then you guys were
like, okay, well, now the problem's gone,
so obviously, problem solved,
or like, how did you...
No, no, no, that was just to get through
like World Cup or something,
because it's spiky.
Like, let's shut down Hadoop for today.
One story leads to another, because then that was like, obviously.
So first we were like, we'll keep running HDFS, because we don't want to lose logs. Logs were, you know, writing to the distributed file system.
We'll just not run any MapReduce jobs. And then we discovered that the occasional rebalance of the data nodes
making extra copies on other nodes and not knowing about the actual layout,
even that would cause enough chatter that it would create a spike of errors
on the Twitter website.
So then we had to shut the whole thing completely down.
We just ate not having logs,
which is hilarious when you're trying to troubleshoot a problem
firsthand.
Right, that seems a bit drastic.
I don't recommend this.
This was definitely a kind of last-ditch thing.
That story just comes to mind
because it is so particularly egregious.
Most of the time, it wasn't stuff like that. But yeah, there was a bunch of yelling at our hosting provider to get
them to move servers around, to get them to add network. That's how you actually solve it.
And also, try to... I think we did some stuff where we added some small caches on the web servers
so that there could be more user pinning.
So there was like, we started getting clever, but the first thing is just address the immediate
pain and then, like, re-architect. Don't go re-architect first and wait for it to eventually get fixed.
Right, right.
That was one of the lessons: first, restore the service. Second, fix the problem. Not the other way around. The engineers really want to go, but why is it happening? Hold on, like, let's really observe it. No, don't really observe it. The site is failing. It needs to be up.
I literally said that this morning to a colleague of mine, it was like, mitigate first and then figure out why it's happening.
Yeah. And that's where, like, as you get experienced, you also learn how to capture the right telemetry, right?
Like get your core dumps, like capture all the possible state that you can capture.
And then, like, if the solution is to reboot it and it mysteriously works, reboot it.
Yeah.
For now.
And that's how you reproduce the problem on a staging server.
Right?
Yes.
With a decision that big, to be like, okay, let's be done with this Hadoop cluster, how do you go about even escalating that decision? Because there's multiple teams,
not just Jimmy, that are impacted by this, right?
Yeah. I mean, it was a very small...
Now people are like, well, how would you do it at Twitter?
Right now, I guess Elon would just say shut it down.
But it was a very engineer-driven culture.
And so, you know, when there was a big problem,
I think, I mean, we had a CTO,
but I couldn't tell you who was between the CTO and my boss.
And I was an engineer at that point.
I wasn't managing yet.
I don't know that we had VPs or directors or anything.
It was the people who are sort of like, well, there's the guy who knows about the web services, and, you know, the woman who is like queen of cache, and an ops guy who knows all the ops things. And they're like, oh shit, everything's on fire. Hey, what do you think about shutting down the Hadoop cluster? Well, we have to ask Dmitriy about that. And I go like, well, I want the site to work. So yeah, let's shut down the Hadoop cluster. I'll tell everybody. That's kind of it.
It was very organic, a lot of war rooms, you know. What were you saying?
I was just saying, it sounds like a lot more fun of an environment.
There's a different kind of fun in being in this environment where you're just going through a bunch of fires and learning things together.
Yeah, yeah, absolutely. It's sort of, pressure makes diamonds. Later, I heard people say, you know, the joke was that, like, I love hiring ex-Twitter people because no matter how much everything is exploding, they just go like, eh, I've seen worse. Because there was stuff that was really, really bad, but also sometimes the worst times are the best times.
Oh yeah, for sure. I think this was an experience that a lot of engineers don't necessarily go through unless they worked on the ops side at some point in their careers.
And this is something I see pretty regularly
when it comes to incident management.
Like you see a bunch of, let's say, for example,
folks who have been SREs in their past lives
or are SREs today,
no matter how big the outage is,
like you see the calmness and how they're
dealing with the incident.
You see how they are able to walk through the problem, able to mitigate it in time instead
of like, what the fuck's happening?
Sight's on fire.
Yeah.
Yeah.
Well, there's definitely like not freaking out helps.
And also there's a lot, I think I put some of that in the book, or maybe I didn't
and just meant to, but there's a set of skills you learn for dealing with an incident. Like,
like, it's, it sounds simple when I say it, but like, so many people don't do it,
giving updates, right? You know, the thing is broken, you're working on it. Even just saying,
I'm still looking at it, or I am looking at this
particular area. Is anybody else looking at another area? Hey, how's it going with looking
at that thing? Dropping in observations in the Slack, nowadays it's Slack threads, right? In the
Slack thread of just like screenshots and, oh, this thing looks funny or whatever. Just like
being vocal, being loud in a sort of dedicated space for a serious incident,
not for just some minor troubleshooting.
It is extremely helpful because then you can get multiple eyes on it and people know what's
going on.
There are so many even very experienced and smart engineers who, when faced with something that is failing, kind of go dark. And they might show up at some point, you have no idea, right? They might be working feverishly hard and just going all out, but because nobody knows anything, it's very hard to manage the incident. It's very hard for your support people to support you and support the users. It's hard for managers to answer, you know, CEO emails that are like, what the hell?
And a lot of it is just about communication.
You can be doing the same exact things, but good communication, the incident will go so much smoother and probably be resolved so much faster.
Yeah, I remember a lot of the feedback that I've gotten from, like, the second manager I had pretty early on was, you need to bring the team along. Which at the time, right, I was very young and foolish, while still foolish today, I was like, what do you mean, man? I'm, like, trying to do all this, you know, fixing all these things, like I ain't got time for that. But now that, you know, I'm on the other side, like, holy, yeah. It makes such a huge difference in terms of actually creating a sense of, oh yeah, we have this under control. Like there's a process to it, rather than just having things being very chaotic.
And when a lot of people are involved, because like maybe there is a problem where it's just
unclear what exactly is going on, right? Like we're seeing these errors, but, you know, we live
in a microservices world, right? Like things propagate in weird ways. Back pressure builds up.
Who knows, right?
What you're seeing is, well, the site's erroring and MySQL has errors.
Are the MySQL errors at all related to the site erroring?
We don't actually know.
Where else can we look?
There's 5,000 Grafana dashboards, right?
It's just like there's a lot to do when it's not clear what's going on and the system's complex. And now I'm thinking more like Twitter circa 2015, right? I mean, there's just a lot, and there's a lot of instrumentation, and it becomes its own problem.
Having experienced people who just take on the role of incident manager. And not just, like, sometimes the incident manager is just the person who does the reporting, who says, we will update you in another 15 minutes.
But just sort of coordinating, traffic cop, right?
Okay, we're exploring these three hypotheses about what's going on.
Here's where we are on this.
Here's who's working on that.
If anybody has any new ideas, right, like funnel them through me so we can keep everything organized.
Right?
And just kind of keeping everybody coordinated on that is also super useful. And so there was a lot of stuff like that,
that we either invented or reinvented or learned in those first few years of Twitter going bananas.
Yeah, it was really fun. There was a data center that caught fire.
And after it caught fire... That's a great start to any story.
It never ends.
This podcast could go for a very long time.
Very entertaining is what I would say.
Okay, so the data center that caught fire.
Yes, please go ahead.
You know, we moved in a bit early.
The data folks were the first people to move into this data center that was being built out. And it was sort of, our colo host couldn't keep giving us space, so we needed to move somebody out. And it was like, well, the offline data doesn't need to be as physically close, and if the connection drops or whatever, we can survive that. Nobody wants anybody to be out. But offline data processing, it's both big and separable. So it makes sense
for us to be the first through the breach. But they were still building the data centers,
it turned out, when our racks got installed in there. So at some point there were like guys welding something on the roof and they
didn't protect it properly. And I guess there were sparks.
And so.
Spark in this context usually means something else.
No, no, no. Literal, literal fire, fire through the roof.
And then, so there was a small conflagration, and then the sprinklers turned on, right? And so it's flooding. So first you have fire, then you have flooding, all the while the servers are supposed to be running. That one was fun.
Yeah, there was a bunch of things. There was one outage that I distinctly remember. Long story short, and without blaming anybody, a bunch of data in Hadoop got deleted.
Like 70%, no backups, it got deleted, just gone.
It was sort of a combination of misconfiguration of a tool, a tool that allowed you to do that sort of thing in the first place,
and intense environmental pressure that caused kind of fairly extreme usages of the tool to be sort of normal in the first place, because normally you wouldn't do it, but we had to. And then in that case, a slight misconfiguration was catastrophic, right? As usual, this happens over the weekend. You know, the person who ran the fated command reported to me. We figured out this was happening. We're like in the office, the small office, literally sitting back to back. He's trying to fix the thing. I'm trying to recover what we can recover and
update the whole company about what happened. So there were several things out of that. One is
I wrote several emails updating the whole company. It was just all@ or something, whatever the
handle was at the time. And the two emails I sent, one to Eng and one to the whole company,
are, I believe, still linked at Twitter.
Twitter has a Go link service.
So if any current Twitter employees
want to go to go slash best email ever
and go slash second best email ever,
those are both mine.
And it was like, you know, good news.
We have lots of space in Hadoop.
Bad news.
It was totally an email I wrote with a sort of,
this might be the last email I write to this company
because I might get fired after this
because we lost all the freaking data, and my job is the data.
And, you know, the engineer was mortified. I don't even know how many people still know who that was, because I just kind of wrote it all in the first person and was like, this happened.
If you have any questions, ask me, blah, blah. Kind of tried to give him space and cover. And I remember my boss at the time basically was like,
one, you know, this happens, don't worry about it, I got you. And two, eventually this was gonna happen. It doesn't feel like it right now, but, and this was that first year, it was either 2010 or 2011, it is gonna be such a relief that this happened now and won't happen anymore in this company, than if this had happened three or four years from now.
And sure enough, the amount of data we lost, it seemed so massive.
It was like everything.
But three or four years later, we were generating that much data in a single day.
Losing that data then would be like, eh, it's a day's work.
It sucks.
And when public reporting is based on data in Hadoop, when all the machine learning recommendations, just, like, everything is so tied into the data platform that we built, it would be so much more impactful and so much more hurtful. That first year, it was like, okay, well, some of the operations we used to do we can't do, and some we can recover, but it'll take us a few days. And it felt huge at the time. And in retrospect, it just didn't matter that much.
It didn't affect the trajectory of the company at all.
In the big picture, it didn't matter.
You don't want those things to happen,
but also it's useful to have this sort of perspective
of how bad is it now versus where are we going to be
in a couple more years, right?
And obviously doing the right thing when it happens, right?
Like when it happens, making sure it can't happen anymore.
It's so wise what your manager said, right?
In terms of like, oh, this is like bound to happen in that.
Okay, now when I say it, it sounds a bit cliche,
but where has he worked before that gave him sort of this perspective?
So this was Kaylee Torgerson and he had his own sort of data consulting practice before
Twitter convinced him. He was consulting with Twitter for like setting everything up
and then they convinced him to join full-time. I think before that he knew some of the folks who wound up being at Twitter through maybe his work at or with Yahoo, something like that.
Remember when Yahoo was a super legit tech company?
Back in the day.
It's interesting.
I've heard this sort of argument before, right,
where if you're having a lot of issues really trying to fight for this idea that's about prevention, and that's super important, sometimes the answer is just, hey, just let it drop, right? Like, let it actually, you know, catch on fire, and then people actually look at what happened, instead of you wasting all this time trying to, you know, advocate for it.
Is that something that you've like, you, I guess, agree with, or you've seen it done well?
I think it depends on the specific thing that we're talking about failing, right? There are
some things that can't fail live, right? I don't know, missile control systems, right?
Like things involved with people's lives,
operations that are regulated, right?
Like you can't do fraudulent transactions
and things of that nature.
But there's a lot more that can fail than people think.
I'm trying to remember the expression.
It's something like, we all test in prod.
Some people also have a staging environment.
And I think there's less talk of it now.
A few years ago, people talked about it a lot,
but the notion of testing in prod, and if you know the book Accelerate, you know they sort of lean on that stuff, where the idea isn't that you only test in prod. The idea is that you know that stuff is going to fail in prod, and if you put so much trust in your staging, your testing, your QA and acceptance and everything else, then you're gonna fall really hard when things fail in prod.
Nothing tests your system like real life and real users.
So if you start from the assumptions that things will fail
and you need to have the observability, the ability to debug,
the ability to capture all the data you need to understand things,
and the ability to recover from errors.
So save the appropriate state so you can do replay,
expect that a transaction might happen several times
and you need that idempotency, things of that nature.
You'll be in a much better place, right?
So it's less, I think, about just let it fail,
but more like expect that something will fail
that you didn't expect.
Think about how you're going to recover. So, some of us are lucky enough to have been actually taught something about distributed systems. All of us who are working on the web are writing distributed systems. It's just that some folks haven't been told that anytime your web server makes a call to a database, congratulations, you're in a distributed system now. If you made a write to a database, the database might be able to understand transactionality,
but you're outside the thing.
If you have multiple web servers trying to make the same write, they might write it three
times and the database won't know it's the same thing unless you thought about it ahead
of time.
It's a distributed system, and a lot of folks are just sort of, I make the call and then the thing happens. And well, what if you make the call twice? What if you need to undo the call, because it's actually part of a, you know, the classic example, shopping carts, right? I tested it, I tested it with my unit tests, how would this happen? Yeah. Or you create these crazy elaborate systems that, in some cases, are a very good investment, but not everybody can make that investment. And sort of being prepared for these kinds of errors, and being able to detect the errors, being able to recover from the errors, is hugely important. And I think that goes, I mean,
that's a data engineering thing, but it's also any kind of service engineering thing, right? And operating a 24-7 web service or web-based service,
you need to know how, you need to think about those things
and not just rely on sort of testing.
And testing is good.
Definitely test.
Test as much as you can.
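To make the duplicate-write point above concrete, here is a minimal idempotency-key sketch, assuming a hypothetical in-memory orders store; it is not code from the episode, just one common way to make "the same call made twice" safe.

```python
# Minimal idempotency sketch: the same request applied twice has the same
# effect as applying it once. Names (orders, add_item) are hypothetical.

import uuid
from typing import Dict

orders: Dict[str, dict] = {}      # order_id -> {"items": [...]}
processed: Dict[str, dict] = {}   # idempotency_key -> stored response

def add_item(order_id: str, item: str, idempotency_key: str) -> dict:
    """Add an item to a shopping cart at most once per idempotency key.

    Retries (timeouts, double-clicks, replayed messages) reuse the same key,
    so the item is not added two or three times."""
    if idempotency_key in processed:
        return processed[idempotency_key]        # replay: return saved result

    order = orders.setdefault(order_id, {"items": []})
    order["items"].append(item)
    response = {"order_id": order_id, "items": list(order["items"])}
    processed[idempotency_key] = response        # remember we did this
    return response

# The caller generates one key per logical action and reuses it on retry.
key = str(uuid.uuid4())
add_item("order-42", "book", key)
add_item("order-42", "book", key)   # retry: no duplicate item
assert orders["order-42"]["items"] == ["book"]
```

In a real service the processed map would live in a durable store shared by all web servers, which is exactly the "you thought about it ahead of time" part.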
So some of the things you're mentioning are also in the book that you have,
which is The Missing README.
We'll definitely link it in the show notes
and highly encourage people to go check it out.
I get like 10 cents per copy.
Please do.
We also encourage people to buy the book, not just check it out.
What prompted you to write the book in the first place? I think it was running into situations multiple times where
folks who were new grads or kind of a couple years out would be good programmers and very capable of
doing whatever we needed them to do, but wouldn't know these things that a lot of us take for
granted that are kind of unwritten. Once you're in the industry for a while, you kind of pick up how the industry does things.
And because we sort of just picked it up from our peers,
we forget that it's knowledge you have to acquire.
And then the new person comes along
and they're sort of struggling for a while.
And then maybe they pick it up.
It doesn't have to be that way, right?
And a lot of problems can be avoided if you just explain
why things are the way they are.
And so after yet another round of sort of explaining to somebody
who is very bright, intelligent, capable,
but is doing things kind of the wrong way,
just because nobody ever told them
what observability is or
like, you know, logs are not just print to standard error or something, right? Or like what goes into a log, right? It's just stuff like that. It's not rocket science, it's just stuff that you need to know and you can eventually learn, right? So why don't we just write it down in one place so that people can... My
vision for it was a tech lead has a stack of these things on their desk and they get their new batch
of new grad hires or interns or whatever, and you're just like, read this over the next three
weeks. If stuff is weird, do you understand why we're doing a standup or something, right? There's
answers probably in there, but I can also, I'm happy to help you,
but like here's an answer so that you can just read that and we'll be like 80% there.
And so I shared that observation with somebody I knew from Twitter, actually,
like on Twitter, not from the company.
Chris Riccomini, who was also tweeting about something along those lines
and sort of we decided that it would be fun
to actually write a book.
We played around with ideas of courses and other
things, but we settled on a book.
It was really fun.
And it was surprisingly hard. Sort of write down
the basic stuff. Oh, how does the basic stuff actually
work?
In that process, did you learn
new things that, even though, right,
like you said, these concepts are fundamental, but did you learn new things in the process of writing it?
I think so.
There were definitely some concepts that I knew but had never examined or tried to explain, that trying to write them down caused me to re-examine. I also read a lot of sort of different approaches to how to build
the valuable systems or what kind of log goes where or how to handle exceptions, you know,
dealing with null pointers. And at some point you sort of just have to decide which of the sets of
ideas you go with. But that part was really interesting. Like most of the time writing the
book isn't the actual writing, it's reading
and trying to synthesize like,
what am I trying to say here?
And what do I actually believe?
And I read this argument,
do I actually believe that argument?
So there was a fair amount of that.
We also have a chapter there about Scrum and Agile, even though I'm not a particularly strong believer in capital-S Scrum and capital-A Agile. It is hilarious to me that the Agile manifesto, the first thing it says is something like people over process, or getting things done over process, something like that.
And then we codify these super elaborate processes, and there's the retros and the stand-ups and the planning poker and this and that.
There's just like so much of it.
And people talk shit about Agile and Scrum
because what they're exposed to is the process
and the process doesn't work for them, right?
So I spent a bunch of time in the book
kind of explaining what is the process trying to do
and trying to say that, understand why it's there, and
then you'll be able to use it or toss parts of it out.
What you can do is either just say, this process is stupid.
I'm not going to do it.
And then just have no process because you're probably going to fail.
These things emerged for a reason.
Or follow it in a sort of cargo cult way, right?
Just sort of, well, we're supposed to write stories
as a blank, I want to blank.
So I'm going to write a user story that says,
as a sysadmin, I need, you know,
the package version to be updated to 1.3.7.
And it's like, no, that's not a user story.
That is just a waste of words.
If that's all you want, just write that down.
I remember writing exactly that and being like, this seems really stupid.
I mean, you know what's funny, like the number of hours I've seen spent on how story points should be, should they be based on number of days, number of hours,
some random thing you make up and you say, well, it's logically this thing or relative to how much
time other things take. And so the reason I was laughing so hard through the process, because I
had a colleague on my team who was a scrum master. I love working with that colleague. Anytime he
would have a stand up, he would basically say, I'm going to wear my scrum master hat. And I would always laugh out
loud when he would say that, because I'm like, Hey, man, I care the least about your scrum master
hat. Anyway, everything that you were saying was just reminding me of all those conversations and
how many hours people burn with the process, to plan to get stuff done. And sometimes the amount of time it would take to get that done is less than the amount of time it takes to just put the process behind it.
Blind following of the process is very bad. And the guy who invented story points is now like, I would like to take that back, because people did not get what I was trying to do here. Whatever it was meant to be,
the agile consultants have taken over.
But there is merit to trying to say,
these tasks are different sizes.
How much can I actually take on?
Having some sort of methodology to not overcommit.
Because without that stuff,
every single team I've seen, you commit to a bunch
of stuff and at best half of it is done at the end of your time box. Because you're kind of bad at
estimating. We're all bad at estimating. Let's acknowledge the fact that we're bad at estimating
and bake it in. How do we get better at estimating? Well, if we had some sort of projection,
we could see how big our error is and then we could drive it down. So the logic makes sense, it's there. But then it's, oh, is a point a day? Should it be a Fibonacci sequence? And there are reasons for making it a day, there are reasons for making it a Fibonacci sequence. But if you take it as religion and dogma, it doesn't do anything. It fails when you kind of take it as dogma. It works when you understand why it's there.
And just like with any kind of, you know,
like when you're in high school and they teach you the essay structure, and it's very stilted and they insist that you write it that way. And you're like, but when I read, you know, Joan Didion or somebody who's an excellent writer, they never do any of this. And that's just like, that's because they know the rules.
When you know the rules,
you know which rules to break
and which rules to stretch.
When you don't know the rules,
you just write nonsense
that is impossible to follow.
So learn the rules,
understand what's there.
Once you understand what's there,
absolutely toss it.
And don't follow the thing
without understanding why it's there.
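As a small illustration of the "see how big our error is and then drive it down" idea above, here is a hedged sketch in Python; the sprint numbers and the estimates-versus-actuals structure are invented for the example, not anything from the episode.

```python
# Toy illustration of tracking estimation error across sprints.
# The data is made up; the point is just that comparing estimates to
# actuals gives you a number you can watch trend over time.

sprints = [
    # (points committed, points actually completed)
    (30, 14),
    (28, 17),
    (25, 20),
    (24, 22),
]

for i, (committed, done) in enumerate(sprints, start=1):
    error = (committed - done) / committed  # fraction we over-committed by
    print(f"sprint {i}: committed={committed} done={done} "
          f"overcommit={error:.0%}")

# A simple forecast for the next sprint: commit roughly what you have
# actually been finishing, rather than what you hope to finish.
recent_velocity = sum(done for _, done in sprints[-3:]) / 3
print(f"suggested next commitment: ~{recent_velocity:.0f} points")
```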
I did not expect this to be an Agile conversation. I was hoping to hear about data engineering.
I think you take every engineer and talk to them about Agile, and they have some strong emotion associated with it. Either they hate it or they love it, or hate to love it, or love to hate it.
All of that. So in terms of writing the book,
like you mentioned,
it was a very hard thing to get through.
And some other folks have also mentioned
that writing a book is not just a lot of hard work,
but it might not be as lucrative as one might think.
Did you know, like what, no,
well, one, I want to know your opinion,
whether that's true or not.
And again, I know the measure of lucrative can be very different.
I'm giggling because Damesh is giggling.
That's why.
I don't know what's going on.
But I'm trying to validate what I heard and whether that's true or not.
The other part is like, did you already know going in that this is what would happen? So first off, unless you have some sort of ridiculous hit on your hands, and there are a few, chances are you're going to make like pennies per hour for the amount of time you spend writing a book.
So from a financial perspective, it doesn't make sense. For some people, it makes sense if they're using it to advance their consultancy, right?
Or kind of create a brand and part of the brand is an author of technical books.
Or there are some people who write these runaway hits and they, and they probably do make
decent money from it. Although I suspect the ones that I'm thinking of, we're not setting out to,
to make money. They were just like, this book needs to be written. So I'm going to write it.
And then if I make some money from it, for example, you know, in the data space,
Martin Kleppmann's book, right, Designing Data-Intensive Applications, it's like 10 years
old now. And it's an absolute classic that everybody gets as their first recommendation for
if you're involved in data
or distributed systems at all,
you have to buy this book.
By the way, you have to buy this book.
So he's probably made
some decent money out of it.
I highly doubt he wrote it because he was expecting to cash in. He wrote it because he felt that this book needed to exist. And he's very good at explaining things, and he has a very encyclopedic and broad knowledge of the space.
So, yeah, I knew that going in.
I know a few folks who have written books before,
and they all said that.
I think Josh Wills showed up on your podcast,
so him and a bunch of others.
I was pretty lucky to work with folks who were very knowledgeable and published a bunch of O'Reilly stuff.
It was the sort of thing that I felt is a service.
I was hoping, like, the vision wasn't piles of money in my bank account.
It was stacks of books on tech leads' desks, like the vision I described before.
And I was like, that would be awesome and cool, and I want to make that happen, so we can just generally lift up the level a couple inches.
Yeah, you know, so that was it.
What is the process of writing like, especially when you're working with a co-author?
Do you want to know the actual writing, or sort of the pitching and everything else?
Actually, both.
So one aspect is like both of you, like you saw the co-author on Twitter and you're like, hey, seems like there's some common topics we are talking about.
So let's get together and write the book.
But then when you come together, like what does the process of writing a book look like?
Because and the reason I'm asking this is we've spoken with folks about writing in general before.
It's something that is a topic
that both of us are interested in.
And we've also heard a lot of our listeners
interested in like how to get better at writing.
And one aspect is like,
it's hard enough to write a blog post,
let alone write a book.
So what does that process look like
going from outlines to chapters to the final book?
And I would also love to know some aspects of like finding a publisher, for example, like how do you go about doing that and things like that?
So I guess going backwards a little bit in terms of the actual writing, when you have
sort of an outline or at least a general idea of what the different chapters are, I don't
know how it works for other co-authors.
For me and Chris, it worked pretty well because we're both pretty experienced open source
contributors.
And we're very used to this general idea of you say, I'm going to do a thing.
There are some people weigh in on the design and what to watch out for.
Maybe you have a first draft of it and they say, oh, don't do it that way.
Do it a different way.
And you rewrite it.
And maybe they put a patch on your patch to sort of adjust some things.
And then some third person says, hey, you should add a test here or there.
And you just have this very iterative, heavily reviewed, collaborative development process. And both of us sort of marinated in that kind of environment.
Like both of us are Apache committers and on Apache PMCs.
So that part of the collaboration came very natural, right?
We just, I wrote a thing.
I hate it, but there's some stuff here.
Please take a look.
He does a review, edits it with the edits
visible in a Google Doc, accept, accept, accept. Oh, let's talk about this gigantic sidebar discussion in comments, hit resolve, and so on. But it felt very natural. It felt like writing
discussion and comments, find the resolve and so on. But it felt very natural. It felt like writing
code except in English. So that part was great.
We had some hijinks in like going between Google Docs
and trying to use Git
and trying to use Microsoft Word Live
because of like publisher constraints.
That part wasn't so fun,
but the actual sort of exchanging comments
and figuring out what we're trying to do,
I think we had like maybe three calls
slash video calls through the whole time.
The rest of it was just back and forth on the docs.
Wow.
I would say, until like a month ago, Chris and I hadn't met.
We've known each other since like 2011.
We hadn't met in person until a month ago.
So you wrote a book together,
but you had never met the person.
Wow.
Yeah.
And we live like an hour and a half
away from each other. But it was like pandemic kind of, you know, but, but yeah, that part was
great. And we sort of, in the first step of that was writing the pitch and sort of making sure that
we're on the same page and talking through like, what do you actually want from the book? What
should we include? What shouldn't we include? Why is this important, right?
And just making sure we're on the same page.
And I think both of us kind of,
there's enough mutual respect, you know,
that we don't feel it's a loss of face
to give on something or,
and we trust the other person,
like if they insist on something
that like they know what they're talking about, right?
So that was good.
In terms of finding a publisher,
most tech publishers have very clear sort of proposal guidelines.
O'Reilly has a Google Doc that you can copy that basically you fill out.
And it asks, you know, what is this book about?
Why are you the right person to write this book?
You know, give us an outline.
Are there other books about this?
How is yours different?
Those kind of questions, right?
And you just, like, email workwithus at oreilly.com
and like they get back to you.
And then for The Missing README,
we actually wound up going with No Starch,
which is a different publisher
because we thought that they got
what we were trying to do a little bit better,
because The Missing README is kind of a weird book
in terms of defining its audience,
because it's a technical book,
but it's not about code,
but it's not a, it kind of straddles a bunch of things.
And for that one, we also, we had a draft chapter.
It was chapter six.
The first thing we wrote was the chapter on testing, because that's what everybody loves, right? And we asked for a sample edit from our publishers, like the publishers who were like, we would like to publish the book. We said, okay, we have several publishers who said they would like to do that, we want to go with somebody who gets what we're trying to do, can we see what an edit looks like? Because we hadn't ever worked with an editor.
Wait, so this is before you said yes to the publisher,
they would edit the chapter that you wrote?
Yeah.
I don't think that's particularly common,
but for, I don't think that's common,
but we were in a situation where we were choosing between publishers
and they were pretty involved and both of them were sort of pushing the book in
different directions. And we're like, well, how do we decide? I think the real question will be, what is the actual value of working with a publisher versus self-publishing online, right? It's the professional editor and the distribution network, right? And so we wanted to find out. We understand what the distribution network does.
We don't understand what the professional editor does
because we hadn't worked with one.
So we wanted to see what that feels like.
So we got notes back.
Was it super different?
Like the, from like the different editors?
Like the tone and the...
Yeah.
Yeah, it was.
Now it's been years, so I don't quite remember, but I do remember that both the kind of comments we got and the volume of comments we got was quite different. And I think we actually went with the ones who gave us more, you know, red ink, because we wanted the tough love. And they were very helpful. Financially, self-publishing probably would have gotten us more money, but fewer readers, because you take a much bigger cut
when you publish on Amazon or something.
You don't get as much distribution, not as much visibility.
But I think the editor contributed a lot
and sort of pushed us to stay with some conventions and structure that we were deviating from, being sort of newbie writers. They really helped us clarify things, be more succinct. I would sometimes go on, as I do in this podcast, and have flowery language, and they were just like, cut it out, ruthlessly. Just make it very easy to read, you know, appeal to a broad audience, not all of whom are native or very fluent English speakers, things like that. So I think it was very helpful.
So there's one interesting thing. Some of the blog posts
that I like reading are people kind of describing things in a very organic way, as if like if you're reading their blog post, it's as if the person is talking to you.
One example that comes to mind is Tim Urban. When drafting a blog post, and I was thinking about this for myself, I'm like, okay, I can make it super succinct and cut it and make it crisp. On the other hand, I can just write as if I was talking. And I like the latter, in terms of, when I read stuff in general, I tend to fully read those blog posts, as opposed to the ones which are very formally written, because it's just easy to read them. So when you're trying to...
Can I ask you a question?
You ever look up recipes online?
Recipes online? Yeah, yeah, all the time.
Yeah. So do you read the, like, I was going out for a walk with my dog and I was watching the fall foliage?
No, I just don't.
No, no, no.
It's just like the bullet points,
like step one, step two, step three.
Yeah.
Okay, it makes sense.
What you're saying is
it depends on the content of the book in this case.
And how much you vibe with it, right?
There are actually some recipe writers where, like, I enjoy the essay enough. I'm like, yeah, actually, I'm kind of enjoying this, because it's creative writing, right? Like, they just write an interesting essay about, like, whatever, what the smell of the soup evokes for them.
And it's engaging.
And some of it is just like, it's tired,
like you're just padding the page
so you can shove more ads in front of me.
Like just give me the ingredients, right?
I think it really depends on the writer and the audience it targets and, uh, the stakes for the writing.
No, no, where I was trying to go with this is, is it you who defines the tone of what the content should be?
Or is it up to the editor to push you in a direction and abide by the voice that they want you to have in the book?
I think the publisher reserves the right to not publish your book if they think it's bad.
Oh, okay.
But you don't have to take the editor.
Ultimately, you're the writer.
The editor is making suggestions.
Interesting.
Yeah.
So, like, I on my personal blog and Chris on his blog have much more of a sort of individual voice, maybe.
But yeah, I guess it was a little bit more sort of clear, succinct.
The more you insert asides and stories and other things of that nature, the more sort of you stand the
chance of losing the reader. Yeah, that's fair. In this case, for folks who might
noodle with the idea of writing a book, would you recommend it?
I'd say know why you're doing it, because it'll take longer than you think, it'll be more work than you think, and it can be highly rewarding. You know, I hear from people who've read the book, and I get such nice feedback. It is very rewarding to know that it's out there and that it's actually helped some people. Definitely don't do it for, like, money or fame, you know? But if you're driven to sort of, like,
I know this book needs to exist
and I don't know who else will write it,
so I might as well do it, right?
And you think you can actually go through it.
In the end, like any large project
that you finish and are proud of, right,
has its own reward.
But that's how I view it, right?
I got some money out of it.
It paid for that stupid Microsoft Office Live license.
It was more than that.
It was more than that.
But it's definitely not, you know,
I could earn way more money with that time
doing like, I don't know, expert interviews or something.
In terms of roughly number of hours, I know it's super hard to quantify this.
How long do you think this took?
I don't know.
It was so spread out over the pandemic year and I didn't keep track, but easily in the
hundreds, like low hundreds, maybe 200.
I see. I think Chris published, Chris Riccomini, my co-author, published a blog post about
writing a technical book when it came out, and he might have an estimate for how much
he spent.
True, true, true.
Because he's a cool author.
People could Google it.
Yeah.
You will find that link in the show notes.
Go ahead, go on.
Do you think you would have done it if it was just you?
I'm asking because for me, for this podcast, for example,
Ronak has been huge in terms of kicking my ass,
being like, yo, Guang, where's this thing you promised to do two weeks ago?
Go, go, you know, just kidding, just kidding.
But did that kind of play into a factor
in terms of like getting the thing actually out the door?
I think definitely having a commitment to somebody else and knowing that they depend on me to do my part definitely helped me actually finish it.
I probably wouldn't have finished it if it was just me being like, I could sleep an extra hour or I could get up and go, right?
Like, I'll go for sleep.
I'm also very grateful for the prodding.
Thank you.
Thank you.
Oh, same, same one.
Uh, these days he has been pinging me every day about what things I need to do.
Excellent.
Anyway.
Um, cool.
Yeah.
So maybe doing a bit of a pivot in the last 15 minutes, but
yeah, let's talk about acquisitions.
So this is something that you mentioned before, you've been on sort of both sides of it. And the reason why I'm very curious is that recently I had a friend whose sister kind of went through this process, where they worked at a startup for a few years and then they got bought up by a bigger company. And it seems pretty interesting in terms of, like, how you think about how much equity is worth, right? Like in the interview process, right? You know, if the company is really good, they maybe give you some projections, right, in terms of, oh yeah, if we exit for this much money, this is how much your stocks are worth. Yeah, just very curious in terms of how to think about acquisition, I guess from an engineer's perspective. And also, how do you integrate with a company and all those sorts of things?
Yeah.
I think I don't have a lot of insight about the financial part of it.
You know, stock options are not real money until they are.
And all of those projections about, like, if we exit for this versus that, like, this
is how much you get.
Don't factor in dilution
or if they do, it gets very complicated
and there are calculators online
and like you can play with those,
but it's monopoly money, you know?
But in terms of integration,
yeah, so I was kind of on both sides of this.
I've been at Twitter when we acquired companies
and we integrated them into our teams,
also at Zymergen and at Ginkgo Bioworks.
At Ginkgo, I led acquisitions on a couple of things, or did technical due diligence.
Actually, I've had all three of them.
I did technical due diligence.
And the biotech company that I worked for, Zymergen, where I was the CTO, got acquired by Ginkgo.
And so my job was to make sure that the software team gets properly integrated
into a new environment.
And I think there are several different ways
that this can work.
And it's important during the acquisition process
for all parties to agree on what they're doing.
If it's an acqui-hire
or intended to be a growth of the acquirer's engineering team,
it's one thing.
If it's the Facebook acquiring Instagram or WhatsApp kind of situation where you're acquiring
a product and a team and it's going to be standalone, it's very different. You definitely don't want... And I think the second
case is you want to keep the team intact. You want to have a conversation upfront about what does and
doesn't need to change and on what schedule, in terms of transitioning onto the parent company's infrastructure, in terms of adopting their standards versus not adopting their standards, and so on.
Most of the acquisitions I've been on either side of have been more of the first nature, where you integrate the teams. And the danger there is that the integration doesn't actually happen. And, like, for
years afterwards, people are like, well, you know, these are company A
engineers, or that's a company A practice, you know, and that's why things look the way they do.
And you want to be very clear about sort of how and in what order you're going to integrate things
or if there are competing solutions, which one's taking over and when, and kind of work aggressively
to merge. One thing I did very intentionally in the Ginkgo-Zymergen acquisition
was to work with my counterpart on the acquirer side
to, as much as possible, have mixed teams,
because the software engineering teams were actually roughly similar.
I think we added like 30% to their engineering team, maybe even more.
So we shuffled the teams so that there are acquired engineers on multiple teams within the acquirer's department.
And that does two things.
One, it sort of prevents that us and them kind of mentality, right?
Because like your team is your team now, and that helps get over that barrier mentally.
But two, you also have these kind of tendrils in multiple teams, right?
So it becomes that much easier to find out how things work
when you're like, what's going on
with whatever core services team?
Well, I know somebody there, like it's baked in, right?
Because I worked with her in my previous company.
So whatever good stuff kind of comes with your culture
can sort of be introduced to multiple teams all at once.
And the communication flows a little bit better,
because so much communication in a larger company
flows not through, like, organizational lines,
but through who you know.
And a team coming in as an acquired team doesn't know anybody, right? So fundamentally they're at a disadvantage, and I don't mean disadvantage as in, like, a competition of us versus them. But, like,
it's like you're joining a new company, except you didn't have an interview,
right? They have no idea what you're doing or why. So having this kind of support
network that's spread throughout lets the organic network work.
So I think that's pretty important.
And yeah, remembering that the acquirer is the acquirer and there's a reason they
acquired you.
If they were hoping to, I don't know, modernize their architecture and you come
in and the teams are like, but our stuff works.
Well, with all due respect, the whole point was that we're going to change some things.
So let's talk about what things make sense to change and which things don't make sense to change.
And maybe not change all things at once, but we're going to modernize the architecture because that's the whole point of us being acquired.
And you can't come in and be like, you all are
idiots, you're doing everything wrong, we're gonna do things our way, you know? Like, how could you
possibly have made these choices? Like, it's a legacy system, of course there are gonna be
things that are broken. If you looked at your own system with fresh eyes, you would also see a bunch
of things that are broken, right? Like, chill out a little bit,
cut out the "well, at Ginkgo we..., at Google we...," whatever, right? Just try to find what to appreciate in the new environment. And remember that there's a
reason they brought you in and there's value in what you're bringing, too.
So in a way, as you're going through the acquisition process, it sounds like, and
I might do a poor job of describing it, there are two goals you're trying to fulfill.
One is from a technology stack standpoint: a successful merging of teams means you don't have two independent stacks, but you have one stack,
where aspects of the acquired stack are kind of embedded into the new one and slowly evolve, with parts that you would still keep, but towards the goal of reaching one end state where it looks the same. And the second aspect is the team, where
the teams don't end up being two independent teams but rather one team, and people are bringing some
different ideas to the table from the acquisition side, but at the end of the day you have all of
the new folks spread across the entire company to kind of spread that connectivity, in a way. Yeah. Yeah. I think that's right.
I think you captured it better than I did, actually. Yeah. You got it. And also, especially
when it's a small startup being acquired by a much larger company, knowing that there are
opportunities to do something else. And part of the benefit,
something you get at a larger company
that you don't get at a small company,
is an opportunity to sort of get a different job
without changing companies, right?
So if you've been doing whatever,
data infrastructure for a couple of years
and now you want to do, I don't know, front end, right?
Like you're just really, like there's a way to transition.
Like you can just make the connection,
you can transition
and like they might have the support systems
and all of those things.
Or just being like, you know,
I've been on, I don't know,
support systems and I want to go into ads, right?
When companies get acquired by the likes of Google or Salesforce or Meta, right? There's just so many different things that engineers can do, right? It opens up opportunities internally. And a lot of the times, I know from the time we were at Twitter and Twitter was pretty
acquisition happy for a while and acquired small teams that were really good. And some of the best
acquisitions wound up being, you know, it's four people that a year or two later all work on different teams.
But it's like, oh, yeah, they came from that startup.
Like, they're all really good, you know.
And their influence is felt kind of throughout the organization.
They're not necessarily like a small little team that you kind of deploy to different places to fix things, right?
It's just that they wind up having an influence all over the place.
And one thing you mentioned from the acquirer side,
like doing the technical due diligence,
which is something you were involved in once you joined Ginkgo.
So what does technical due diligence look like
from the acquirer's standpoint?
It really depends on the context, right?
Like you might go through the tech stack,
understand, start developing a vision
of how it would integrate with your tech stack.
Depending on the nature of acquisition,
you might get to interview people
or you might just get to talk to sort of the CTO
or sort of like the engineering leads,
but they don't want to tip their hand
and they don't want rumors to start internally,
so they don't expose you to the team.
So it can get a little bit nuanced.
So maybe you'll see some design docs and things like that.
There's usually something called a clean room.
But once there's a fairly strong understanding that this might actually happen, the company that's being acquired starts sharing a bunch of information about their financials, about their cap table, about their product, about their customers, like all that.
And it goes into a sort of a third party where you can look at the documents there, but then if the deal falls through, your access gets revoked.
So as an engineering leader, you might get access to that sort of thing and look through and ask
additional engineering questions and so on. As an IC, you might be invited into sort of
architecture discussion or presentation if things are a little bit more open.
And so they're like, we really want to understand whatever, how their streaming platform works,
right? Like let's get our streaming platform person.
They're the only person who understands any of this.
We need them in the room.
I see.
And in this case, like many times,
the actual system, as it works, is rarely the same as what you have in the
design docs, partly because with design docs, you start somewhere,
you start implementing, and the
system evolves, just like a living organism of sorts.
When you're doing the due diligence, you want to make sure you understand not just the good
parts of the system, but also the limitations.
And it's not that the person on the other side is trying to hide something, but they
might not see the limitations the same way you do.
So they might not be as forthcoming with some information as you might expect them to be.
So at the end of the due diligence, like what does the goal look like?
At what point do you feel like, okay, I'm satisfied enough that this looks okay?
Yeah, I think you go into it assuming that things don't work 100% of the time or 100% perfectly.
And you're looking for deal breakers. If you're doing
technical due diligence, unless it's specifically, like, we're acquiring the magical technology and it's
going to be magical, and if it's not magical, the deal is not worth it. You're usually acquiring for
some other reason. You're probably acquiring for a combination of: there's some code, and really good
talent, and it positions us well for whatever strategic reason, right? So if you're at a point where you're doing technical due diligence, you're looking
for deal killers, not for like, I wouldn't have done it quite that way. Right. Like that's
not a deal killer. That just adds to your integration estimate.
Makes sense. I think that's, yeah, that kind of puts things in perspective for me, at least. So that makes sense.
Yeah. An important thing there is not so much, yeah, to like look for problems, but look for what will it actually look like to integrate or incorporate this into what we're doing, right? Like, you know, how are they doing authorization and authentication?
Is that going to be an easy lift and shift,
or are we going to, like, have to rethink the whole thing?
Some subtle gotchas that are kind of deep in the weeds
but can really change the timeline for, like, getting the thing to actually work.
And maybe, like, adjust the strategy, right?
Like, okay, we should let this run separately for a while because, you know, they run on GCP, we're on
AWS, that's a massive migration, maybe we should just never do it, right? Like, it'd be easier to
rewrite it, like just have them write a new one on our system or whatever, right? Those kinds of
things. You're working out a high-level initial plan of what's likely to be, and hopefully talking to your counterpart there so that you're all on the same page about what technically needs to happen.
You're not getting down into like, you know, how they deal with transactionality.
Yeah, you use Memcache, I would use Redis.
That's not the conversation you're going to have.
Yeah. But, like, they're using Redis, we're using Memcache,
maybe you talk about that. Like, do we run it for a while?
I see.
So in a way, technical due diligence is kind of a data point for, I would say, the CEO and
others who are the decision makers in this process, to figure out at what point this thing
would be operational as part of an integrated stack, how much it would cost us to make
that happen, and if there are any deal breakers in the process.
Yeah, that sounds right.
Well, Dmitriy, sorry about running late on time, I didn't pay attention, but this has been
another awesome conversation with you. And we wanted to talk about what you're doing next, but
maybe we'll talk about that some other time.
And thank you so much.
Let's totally do that. And thank you so much for sharing all the stories.
All right, my pleasure. Thanks so much. Take care. Bye.
Hey, thank you so much for listening to the show. You can subscribe wherever you get your podcasts and learn more about us at softwaremisadventures.com.
You can also write to us at hello at softwaremisadventures.com.
We would love to hear from you.
Until next time, take care.