Software at Scale - Software at Scale 21 - Colin Chartier: CEO, LayerCI
Episode Date: May 19, 2021

Colin Chartier is the co-founder and CEO of LayerCI. LayerCI speeds up web developers by providing unique VM-like environments for every commit to a codebase, which enables developers, product managers, QA, and other stakeholders to preview code changes extremely quickly and removes the need to spin up a local environment to showcase demos. This enables interesting workflows like designers signing off on pull requests. Colin was previously the CTO of ParseHub and a software design lecturer at the University of Toronto.

The focus of this episode was on developer productivity, management of a CI system and company, and even a little bit of cryptocurrency mining.

Apple Podcasts | Spotify | Google Podcasts

Highlights
0:00 - What does LayerCI solve?
2:00 - CI is generally resource-intensive and slow. What makes LayerCI fast? A lot of similarities to the Android Zygote. We've even floated the idea of Python Zygotes at a previous job.
5:00 - The story behind LayerCI.
12:00 - The architecture that serves LayerCI. The cost of nested virtualization and each additional hypervisor. OVH.
15:00 - Rate limiting. The impact of rising cryptocurrency prices on free tiers of CI providers - read more.
30:00 - The power of building high-quality infrastructure. How both developer tools like LayerCI, as well as low-code/no-code tools like Retool and Zapier, are important for the future.
37:00 - Colin's course for DevOps academy
47:00 - Hiring philosophy for startups

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.softwareatscale.dev
Transcript
Welcome to Software at Scale, a podcast where we discuss the technical stories behind large software applications.
I'm your host, Utsav Shah, and thank you for listening.
Hey Colin, welcome to another episode of the Software at Scale podcast.
Quick intro for listeners, Colin is the CEO and co-founder of LayerCI, which is
a really unique and interesting CI company. And I'm not going to try to butcher the details so
much. So maybe you can talk a little bit about it and welcome to the show.
Yeah, thanks for having me. I mean, yeah, LayerCI is a different CI company. It's primarily focused on people making websites.
And it doesn't really focus so much on running tests as getting more subjective feedback. So
the experience that led to making LayerCI is you're reviewing another developer's change, and they edit a CSS file. And you want to know what the ramifications of that are,
because no unit tests will fail, obviously.
And I guess that eventually led to LayerCI.
So at a very high level, that's what we do.
I've seen companies like Facebook and maybe GitHub provide functionality like this
to their internal developers on pull requests.
And you probably know more about this than I do. Have you seen
bigger companies generally do this? Yeah. I mean, there's a concept of a PR box,
which is like, usually it's in your cloud environment. So if you're doing like AWS
production environments, you'd set up a Slack bot. You know, you'd spend like 100 engineer hours
and make a bot that you could send a Slack message to,
and it would create a production environment.
So that's not really what we do.
I guess there's a bunch of downsides to that.
One is it takes a long time to provision an environment.
Two is it's expensive.
You need to micromanage when to turn them on and off.
If each of those environments is 1% of production and you have 100 of them, then you're doubling your production cost. So we're more of an
approximation. We primarily focus on just running front-end, back-end
database, something along those lines. So a reviewer
can see at the application level how things are going, but you don't necessarily have all of your
lambdas, all of your data stores, everything. It's all just
basically the MVP of what you need to review.
Okay.
That makes sense to me.
So how do you make it more efficient?
Do you just run, you run less stuff,
that's the first thing.
But I would imagine that it's still pretty costly
to spin stuff up on every commit.
So what are some interesting tricks that you do?
Yeah, well, I mean, the big idea
for LayerCI is that, like, when you set up some environments, you're doing a lot of repetitive
work, it's always, you know, set up a database, put some fake data in the database, you know,
run the database migrations, start some microservices, you know, like start your
GraphQL provider, like all of these minutiae
that you always have to do on every pull request. And like, it's not really different per pull
request, like you don't really change your infrastructure configuration very often. So the
idea for LayerCI is you do all of that, and it kind of automatically gets snapshotted. So we take a
memory snapshot, as if we were hibernating the machine doing all of
the setup. And then we just make a bunch of copies of it. So the next time you make a change,
instead of running all of the setup, again, it's like if you don't edit the files that do the
environment setup, it just loads the snapshot. So it's like five seconds to get a new copy of
everything for your new change. This reminds me of Android Zygotes.
I don't know if you're familiar with that.
My brother actually worked on that team.
Oh, interesting.
Yeah, for listeners,
since spinning up a new JVM is really slow for Android,
what the Android OS does is it keeps a hot JVM
plus some stuff running in memory.
And this is super outdated
knowledge of mine from maybe seven, eight years ago of Android development. But basically,
the Android OS just clones the Zygote process so you don't have to spin up a JVM. You basically
get a warm JVM for free for every app and that's how it keeps things snappy. That's
kind of a similar idea, right? Yeah. I mean, we also get a lot of other advantages. So I guess we can maybe talk
about architecture later. But since we're focused specifically on developer environments,
you know, we're not promising reliability, we're not promising
like your customers will have a good experience if they use these environments.
That means we can do things like really aggressive disk caching. We can basically make all of the I/O act as fast as a RAM disk, and we
can have really good internet connections and caching for everything.
We basically made our own AWS clone that is specifically tuned to developer production
needs.
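The snapshot-and-clone approach, like the Zygote analogy above, boils down to paying the expensive setup cost once and then handing out warm copies. Here is a minimal fork-based sketch in Python — a loose analogy only, not LayerCI's implementation; `expensive_setup` is a stand-in, and `os.fork` requires a POSIX system:

```python
import os
import time

def expensive_setup():
    # Stand-in for slow startup work: booting a JVM, running DB
    # migrations, seeding fake data, starting microservices...
    time.sleep(0.1)
    return {"warmed": True}

def fork_warm_copy(state):
    """Fork a child that inherits the already-warm state.

    The fork is near-instant because the OS shares the parent's
    memory pages copy-on-write instead of redoing the setup.
    """
    pid = os.fork()
    if pid == 0:
        # Child: already warm, could start serving immediately.
        assert state["warmed"]
        os._exit(0)
    os.waitpid(pid, 0)
    return pid

if __name__ == "__main__":
    state = expensive_setup()    # paid once, like the zygote's JVM boot
    for _ in range(3):
        fork_warm_copy(state)    # each copy is effectively free
```

Hibernation-style memory snapshots generalize the same idea across machines and reboots: the "parent" is a serialized VM image instead of a live process.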
Interesting.
So the concept of caching these intermediate layers
sounds remarkably similar to Docker, right?
So can you maybe talk about the difference?
Like one thing that stands out to me is that
Docker is not doing any of this stuff in memory.
And plus I'm sure they're not trying to be super efficient
because it has to be more correct
because you use it in production. But what else is there? Yeah. So I guess the original idea for
Layer was based on, I mean, my experience before Layer was at another tech company.
And we used Docker in CI. So we used GitLab CI with Docker base images or whatever. And that experience is okay, but it's just annoying in very subtle ways.
So like you need a base image because, with Docker, you have your configuration
file for the stuff that needs to be in your CI.
And you also have your configuration for what the pipeline actually is.
But they duplicate a lot of the same things.
It's like if you add a step in your CI pipeline that needs a library, you'd edit your base image to add the library,
rebuild your base image, push it. There's basically duplicate work between the two places.
So that's one thing layer files solve. And yeah, the memory snapshotting is also really annoying
because Docker can't keep processes running between build steps. So you can have all of the files you need, but files are a small part of a CI environment. You often want a microservice running or a web
server, or if you want to use something like Bazel, these build agents that keep running in the
background. You can't have a Dockerfile that starts Bazel and then,
in the next directive down, uses the running Bazel to build something, because Bazel would shut down between
the build layers in Docker.
So I guess we just took the idea of
using Docker and CI and then
cut out the extra configuration. We just
extended Docker files
with what you'd usually use in CI,
like if statements and stuff like that.
And
we just made them automatically be detected.
So the same way you have Docker files
that are built every time you push,
it's just like you push to CI, all the layer files run.
There's no extra configuration needed
for like the meta, how to glue things together,
which services run in which order.
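That auto-detection is conceptually just a repo walk. Here is a sketch under the assumption that each config is a file named `Layerfile`; the discovery function itself is invented for illustration:

```python
from pathlib import Path

def discover_pipelines(repo_root):
    """Find every Layerfile-style config in the repo.

    Each file found becomes its own pipeline on push, with no
    central configuration gluing the services together.
    """
    return sorted(Path(repo_root).rglob("Layerfile"))
```

The point is that adding a new service to CI is just adding one file next to the service's code, rather than editing a central pipeline definition.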
So that makes me think that, you know,
base images are basically just like a performance hack
for Docker.
That's what you're
implying in a sense. I mean, base images for the CI use case are a performance hack.
Yes. Yeah.
But I mean, that's what lots of people end up doing, right? They make their own Travis clone by
making a base image that has Ruby and Go and Python and all of the production versions of
everything installed in it. And then they do all of their pipelines based on that
because they don't want to reinstall Python all the time.
But that's no better than Travis,
which is the way CI was 10 years ago.
So there's not a huge amount of value
that Docker brings at that point.
Yeah, I can personally say that I can feel this pain.
One of my first tasks at my new company
was trying to upgrade the
version of Mongo that's on our CI base image.
because it differs from the version of Mongo we use in production.
And coming from a world where we used Bazel to basically
hermetically build everything.
This felt like a step in the opposite direction,
but it also felt like since we don't upgrade Mongo every day,
it makes things more efficient.
But what I'm thinking now is that layer is fast
because it basically takes the state of your RAM
or your memory and serializes it on disk
or it just keeps it around somewhere.
Is that accurate?
Yeah, so I mean, another way of describing layer as a platform is it's
just a snapshots platform. For example, you run a pipeline on Friday. You've done stuff.
It fails at 4 p.m. You're like, well, I have to do this meeting, and I'm not going to have time
to look at this today. Monday rolls around. You look at your pipeline. There's some nebulous error.
How do you find out from the logs what failed? You can rerun the pipeline, wait another five
minutes or whatever. It's kind of annoying. Or if you have the memory snapshot around,
you can just wake up the memory snapshot and shell in. The fact that we keep all of these
memory snapshots around means if any pipeline fails and the memory snapshot hasn't been deleted yet,
you can just shell directly into it. If you want to view the web server in a pipeline, it's like you visit the web server,
we show you a spinner while we wake up the memory snapshot, and then we forward your requests to it.
So you get these free ephemeral environments. If you want multiple services running in parallel,
you just load two memory snapshots, one for the front end, one for the back end.
And then because we tie the memory
snapshots to which files were read to get to that point, you don't really have to micromanage, oh,
copy package-lock.json first so that the cache doesn't get invalidated. You just copy everything,
you do some actions, those actions cause files to be read. And then we'll map that back to which
snapshot we can load that's consistent with the files in your new diff.
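In other words, each snapshot is keyed by the contents of the files the setup steps actually read. A rough sketch of that mapping in Python — illustrative only, not LayerCI's code; `do_setup` and `take_snapshot` are stand-in callables:

```python
import hashlib
from pathlib import Path

# snapshot id keyed by a fingerprint of the files the setup step read
snapshot_cache = {}

def fingerprint(read_paths):
    """Hash the contents of every file the setup actually read."""
    h = hashlib.sha256()
    for p in sorted(str(p) for p in read_paths):
        h.update(p.encode())
        h.update(Path(p).read_bytes())
    return h.hexdigest()

def run_setup_step(read_paths, do_setup, take_snapshot):
    """Reuse a snapshot when none of the files it depends on changed."""
    key = fingerprint(read_paths)
    if key in snapshot_cache:
        return snapshot_cache[key]         # load in seconds, skip the setup
    do_setup()                             # expensive: migrations, services
    snapshot_cache[key] = take_snapshot()  # hibernate-style memory snapshot
    return snapshot_cache[key]
```

A new diff that doesn't touch any of the read files hits the cache and gets the warm environment; editing one of them naturally invalidates only the steps that depend on it.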
So all of these things are relatively annoying to do in Docker,
but they're all kind of magically done
if you make a CI provider
that specifically cares about these snapshots.
What have you seen? Have you seen any customers' jaws drop
when you're doing a demo?
Or some interesting customer feedback that has been like, this has changed my life?
Maybe you can just share some stories around that.
Sure. I mean, we just talked to a customer, we interviewed them,
and I think the video will be published soon. They
said that in their inno days, their innovation hackathons...
I think this is a common policy at a lot of companies:
once a month, all of the engineers get together on some Friday and they make some
cool things and see what kind of wacky ideas they can launch.
And they did this.
And then it was the first month, the first inno day, after they'd installed
LayerCI, and everyone could just demo live what they built, because you have these links for each environment
instead of fighting for the three staging servers or asking Infra to provision more specifically
for InnoDay or whatever. It's like you just push your code, the environment exists, you didn't need
to configure creating the environment, and you can just share the link. You give the demo
on Zoom, then you post the link in the Zoom chat,
and all of your coworkers can just play around with things.
So, like, just the ability to have these snapshots around and demo things per branch is really
useful. What are the memory requirements for saving one of these snapshots? I have no idea how much
this ends up being in terms of file size.
I guess there's some magic going on in the back end. But memory for most applications goes
from one to 100 gigabytes. So if you're shuffling around 100-gigabyte files, it kind of lends itself
poorly to using an existing cloud provider. So we have some
unique architecture on the back end to deal with that. And we do a lot of copy and write stuff,
if you know what that is. It's like avoiding making entirely new copies of things. Because
if you have a chain of snapshots, you can deduplicate only the parts that have changed
between them, which is how Docker works. Okay, yeah. And that brings me to your architecture, right?
So you don't use a cloud provider and we were just talking about the fact
that you use Kubernetes on bare metal. So, first of all,
how's the experience of that?
How did you decide that that's what you need to do?
It sounds like you were just like efficiency bound and like that,
that that's what your constraint was,
but maybe you can talk a little bit more about your
setup. Yeah. So I mean, I guess by nature of having these snapshots, we're really limited
in what we can do. We need to run our own hypervisor, which means we need access to KVM.
We need access to the kernel hypervisor stuff. And so if you're running an EC2 instance that's VM-based, every level of nested virtualization
is 20% slower.
So we're going to get huge performance loss if we use spot instances to run our worker
nodes.
And also, it's going to be really expensive because these worker nodes have hundreds of
gigabytes of memory to be able to run all of these VMs.
And I mean, we're at the point that each production node has a terabyte of memory.
So like these AWS bare metal instances that we'd be setting up would be both expensive and difficult
to maintain and not really significantly better than doing it ourselves anyways. So production
for us just looks like a bunch of like bare metal servers in OVH, which is like a French server provider. We're in
Canada, so we're familiar with it, I suppose. And over that, we have a Kubernetes cluster
where each service is sharded kind of onto different nodes. And individual customers
are also sharded onto groups of nodes so that we don't have to copy these gigabyte memory snapshots
all over the place. They're just on the node that the customer last
ran their stuff on. Interesting. And I'm guessing that also provides like isolation because if
you're like a CI company, one customer can easily overload like, or without any checks in place,
like one customer can easily use up a lot of capacity that's meant for everyone else. There's cgroups and all of that.
It's even more interesting because we run our hypervisor within containers. So we expose KVM into the container, and then the hypervisor
interacts with KVM but is still limited by cgroups, which are the Linux way of stopping certain processes from using
too many resources.
So it's not actually very difficult for us to limit users interacting with
or like overloading the node or whatever,
because it's exactly the same like Kubernetes configuration you'd use in a
cloud provider without doing hypervisor stuff.
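Mechanically, applying a cgroup limit is just writing a few files. Here is an illustrative cgroup v2 sketch — the group name, limits, and helper are made up, and this is not LayerCI's configuration (on a real host it needs root and a cgroup2 mount at the usual path):

```python
from pathlib import Path

def limit_job(pid, name, cgroup_root="/sys/fs/cgroup",
              cpus=2.0, mem_bytes=2 * 1024**3):
    """Cap a CI job's CPU and memory via cgroup v2.

    Works the same whether the process is a hypervisor talking to
    /dev/kvm or an ordinary container workload.
    """
    cg = Path(cgroup_root) / name
    cg.mkdir(parents=True, exist_ok=True)
    # cpu.max is "<quota> <period>" in microseconds;
    # 2 CPUs' worth of time = quota 200000 per 100000us period.
    (cg / "cpu.max").write_text(f"{int(cpus * 100_000)} 100000")
    (cg / "memory.max").write_text(str(mem_bytes))
    # Moving the pid into the cgroup applies the limits to it.
    (cg / "cgroup.procs").write_text(str(pid))
```

This is also why running the hypervisor inside a container is convenient: Kubernetes manages these cgroup limits for you, exactly as it would for any other pod.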
So, but how about the case where a customer just has
like 100 commits or like 1,000 commits coming in?
That might not overload the node itself,
but it might just use up capacity of your cluster.
Or does that just generally not happen?
So we have rate limiting.
We only promise, I mean only,
but we promise 12 parallel VMs per seat.
So a company with 10 engineers can at most make 120 things.
That should be more than enough for most people.
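A per-seat concurrency cap like that is a small amount of bookkeeping. A minimal sketch — illustrative, not LayerCI's implementation:

```python
class SeatLimiter:
    """Cap concurrent VMs at seats * PER_SEAT for one customer."""
    PER_SEAT = 12

    def __init__(self, seats):
        self.cap = seats * self.PER_SEAT
        self.active = 0

    def try_start_vm(self):
        if self.active >= self.cap:
            return False  # over the cap: queue or reject the job
        self.active += 1
        return True

    def vm_finished(self):
        self.active = max(0, self.active - 1)
```

So a 10-engineer company tops out at 120 concurrent VMs, and a runaway bot starts getting rejections instead of monopolizing the cluster.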
Yeah.
And I mean, like nobody really hits the cap legitimately,
but sometimes if they have a poorly configured Dependabot or whatever.
Sounds like something that might've happened.
Surprisingly often Dependabot is not very good at rate limiting.
Okay.
That brings me to a blog post that you wrote super recently
that was doing really well on r/programming, at least:
the rise of crypto mining and how that's annoying CI providers.
So maybe you can talk a little bit about that.
Sure.
I mean, to summarize the blog post, crypto has gone up like 10x in value in a year, like the market cap of all cryptos. Of the top 10 cryptos, two or three have a system called
proof of work, which basically means like you can burn CPU time to make money. It's not very much money.
You can spend $100 of AWS credits to get $10 of money.
But if you can somehow find free tiers available in the wild and you have nothing better to do,
then you can just make a full-time job of attacking these things.
So AWS has the free tier, but they are
really restrictive. I don't know if you've tried to set up a free tier AWS account any time
recently, but it's really difficult now. You need a phone number, you need a credit card, you need
two-factor authentication. To get any free credits, you need to have an incubator partner.
And the same thing's happening in CI. So CircleCI is being attacked.
We were being attacked.
Shippable was being attacked.
And GitLab and Shippable have both
worsened their free tiers in the past year
because of the nonstop attacking.
And because it's profitable for people
to just make a full-time job of attacking,
and attackers have an advantage in this sort of thing,
it's really difficult to defend because even if you have a full time team of defenders, the attackers will just keep making new accounts. When we
were being attacked, it was really bad for a couple of weeks. And we banned the entire
country of Indonesia because that's where a lot of the attacks are coming from for some
reason. And then the second we banned the country, there were like 15 new IPs on corporate networks in the
US. So they had the resources to pay like a dollar a month for the IPs. And they're probably paying
crypto for these IPs. So it's not traceable regardless. And there's not really much you
can do defense-wise, except for just
huge broad stripes like banning countries
and banning...
We use Cloudflare, so we use Cloudflare's
VPN detection,
and we just made Cloudflare ban all VPNs.
So some developers
might not be able to access us with a VPN,
but there's not really much we can do about that.
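Edge-level blocking like that usually reduces to checking metadata Cloudflare attaches to each proxied request: Cloudflare sets a CF-IPCountry header on requests it forwards, using "XX" for unknown and the pseudo-code "T1" for Tor exits. A minimal sketch of app-side enforcement — the blocklist and function are made up for illustration:

```python
# ISO country codes, plus Cloudflare's "T1" pseudo-country for Tor.
BLOCKED_COUNTRIES = {"T1"}  # example policy: block Tor exits

def is_request_allowed(headers):
    """Decide from Cloudflare's geo header whether to serve a request.

    Cloudflare adds CF-IPCountry to proxied requests; "XX" means the
    country could not be determined.
    """
    country = headers.get("CF-IPCountry", "XX")
    return country not in BLOCKED_COUNTRIES
```

As Colin notes, this is a blunt instrument: it costs you legitimate users on VPNs, and determined attackers just move to clean residential or corporate IPs.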
But these people
will be attacking mostly through the free tier.
They'll be automating creation of accounts
and then just running stuff.
Is that how it works?
Yeah.
So, I mean, they basically use the free tier
as a command and control script.
So they'll set up a fake desktop.
They'll set up something that connects back to their connection.
They can bounce it through any IP they want.
They can do the same thing in GitHub's IP range
so that you can't block list their IPs.
You can't block list any AWS IPs
or your customers won't be able to connect to AWS.
And so they can just rent a free VPS in AWS
and they'll command and control out of that. And then they'll, you know, they can
run a browser in there and sign up for other CI services with your CI services IP, like they can
use it as a proxy, they can mine crypto. It's like, you know, you get arbitrary code execution
by nature of being able to run things.
Yeah. When we were running CI, it was just one of the biggest fears, that that's how people could abuse it.
We were worried about a supply chain attack, just like SolarWinds,
because it's literally remote code execution as a service.
And we worried a lot about all of that.
But ultimately, there's only so much you could do.
We tried to just have audit trails
and just prevent people from ever deleting branches and stuff
just so that you can at least see if there's somebody attacking you,
they can't just hide everything.
But blocking everybody completely was considered really hard.
So at least let's just have audit trails and stuff.
What is the legality of all
of this? Are there vectors where you can say that this is illegal? Or does that not matter,
like regulation hasn't caught up, or regulation has caught up but enforcement is really hard?
Yeah, I mean, it's certainly illegal in the States. There's unauthorized use of a computer
system, obviously. It's against our terms of service.
It's against GitLab's terms of service.
But are they going to send FBI agents to Vietnam or whatever
to catch these people in internet cafes paying with Bitcoin?
It took five years to bring down Silk Road, and that was in the US.
So if people are using Tor and are using crypto to pay for things,
it's both difficult to find them, and it's difficult to even prosecute.
That makes sense to me.
And it's also like, unless a bunch of companies come together to try to drive enforcement or something... I can just think of how, you know, Epic itself can't sue Apple over a monopoly, but five companies together, like Epic and Spotify and all of these different companies, might stand
a chance. It might be something similar here, where one company alone probably can't
do much.
What do you think the future of this is?
Do these attacks keep happening
unless the value of crypto goes down?
Maybe that doesn't seem like it's happening anytime soon.
I mean, I think the only long-term solution is to restrict proof of work, like if the people making the cryptos choose different sorts of difficulty
metrics. As I mentioned in the blog post, Ethereum is moving to something called proof of
stake, where you don't get more Ethereum for hacking computers at that point.
You're just burning CPU cycles.
It doesn't give you more Ethereum once they switch to proof of stake.
So if the popular cryptos make it not profitable to just burn CPU time, then it'll become
unprofitable to just make a full-time job of attacking free tiers.
And if that doesn't happen, then basically free tiers will just go away because there are forums
on the other side where they're trading information about how to circumvent security
measures, IPs you can use. I read some of the comments to my article, and people were saying,
oh, why don't you just block the IPs?
Why don't you just block the connection to the pools?
And it's like, people are essentially using corporate networks
as IPs to connect to LayerCI already.
So if they're using random IPs from AWS to connect to us,
they can just make a tunnel from there.
There's just so little you can do. And they're using Selenium to run crypto miners, so you
can't even do executable analysis. It's just very, very difficult as a blue hat to deal
with this.
Yeah, the Ethereum migration to proof of stake, it's going to probably take a year
plus.
And I don't even know if the others like Dogecoin, which is just so popular suddenly, maybe because
of Elon Musk tweeting about it all the time.
I don't even know if they have a plan to move to proof of stake.
But yeah, I think that would also be better for the environment if we finally moved away
from proof of work.
Beyond even Bitcoin, a lot of the people that were attacking us were mining these
random little small caps.
They were mining SugarChain and they were mining Monero.
SugarChain, yeah.
I think Monero is really the only one that's in the top ten that is ever really
used for this.
And that's because it's specifically designed to be profitable to mine with CPUs.
There's big memory requirements so that you can't use graphics cards or integrated circuits for it.
So I think Monero has a big ethical problem with their design.
And it's also hard to track down Monero users versus Bitcoin, even though Bitcoin
is pseudo-anonymous.
Yeah.
I mean, I don't think tracking down is the problem, as I mentioned.
If you track down one of these people, there's thousands of people that have the same capabilities
that have the same lack of scruples.
If $200 a month is enough money for you to make a full-time job of it,
it's like there's a lot of people in the world that $200 a month
is worth doing something full-time for.
Do you think it's individuals that are attacking, or is it like a consortium?
Do you know or do you have like a gut feeling about that?
So when we got attacked, it was all at once.
We were only founded a couple of years ago, so as we were growing, we didn't have to have
all these security protections in place.
We didn't have to have the heuristics for whether a job was good or not.
We were happy if people were coming in and trying our product.
And then basically all at once, 10 different individuals were trying to mine Bitcoin on
our service.
So I'm pretty sure what happened is we got added on a list
in some onion forum somewhere about platform-as-a-service companies
that offer a free tier that you can attack to make money on.
So I'm sure they're individuals, but acting sort of as teams
where they share resources and
have their own lists, you know, like the GitHub lists of free things
that are great for developers. There's a list somewhere that's like free things
that are great for mining. Yeah, it's like EasyList for ad blockers, but just the opposite.
Philosophically, you don't have to answer this or go too much into detail, but what's the impact of having a free tier? What do you think the impact would be to your
company if you removed the free tier? And if not your company, even to CircleCI? What's your intuition on that?
I mean, in 2021 you essentially
can't have a SaaS company without a free tier.
There's very, very few companies that someone will try if it requires a credit card.
AWS can get away with it because they're big and they're ubiquitous.
And it's like, well, I understand why I need to put in a credit card to use AWS's free tier.
But just the identity problem is essentially unsolvable for most companies. Would you try Calendly if you had to pay for it?
I probably wouldn't pay a dollar a month for Calendly in the early days.
And so you always choose early on in your usage of a tool,
when it's just starting to be important to you,
you'll always choose the free one because you're not getting enough value from it yet to pay for it.
And then as you get enough value for it, then you upgrade to a paying customer.
So basically all SaaS companies need a free tier or something that people can evaluate.
And if you don't have that, then you just lose to the people that do have a free tier.
That makes sense.
It's just a competitive thing because there will always be someone who will be willing to burn VC money to give people free tiers and get customers.
Or someone that's as big as AWS that can have a full-time team of...
AWS has teams of 10 or 20 engineers working on crypto detection and crypto prevention.
It's like 20 engineers cost millions of dollars a year.
So small companies can't afford that sort of resource.
So in the end, the only people with free tiers would be the Herokus and the AWSs,
because they're the only ones that can afford it. You know, for a huge public company,
spending a million dollars a year is a blip to them.
What about like no charge credit cards?
Like you have to put your credit card in, but you don't get charged.
Is that like still a super high blocker to people?
When was the last time you did that? I probably wouldn't give my credit card to
some random company just to do their trial. Gym memberships have burned people from this.
You sign up for your 30-day gym membership and you need to mail them something for them
not to charge you. A lot of companies will do that. So
it's difficult. Yeah. I've seen companies trying to roll out no charge credit cards,
but that's only companies that previously had credit card trials. So clearly the metrics
would help there. But I've never seen a company just start off with
a zero-dollar credit card trial.
I think it would be an interesting experiment that maybe somebody has an idea about.
I mean, another funny thing is that there's all of these companies that let you
make virtual credit cards now, right? Like Visa and MasterCard have these APIs that you can
use to make virtual credit cards. And you'd say, well, why don't I just use one of those to make a virtual credit card to sign up for LayerCI, and I'll use that
for the free tier. But it's like, you can detect those, like Visa tells you whether something's a
prepaid credit card or an actual credit card. And obviously, we'd have to block them. Because if
someone stole someone's credit card, they can make 100 virtual credit cards on top of it,
and then make 100 accounts. And then, you know then same problem. So there's basically no world in which anything but a
credit card or a phone number can be used for authentication. And even then, it has to be a
+1 phone number, because there are VoIP providers that give you these random international phone
numbers and you can get thousands of them for a dollar.
And if you can get thousands of phone numbers,
then it's not a good rate-limiting resource anymore.
Here's a thought experiment, and you can maybe shoot me down,
because I just thought of it.
What would you think of a government-provided,
like Auth as a Service, where you can basically check
with a government API, like a US government API,
that this is a legit person or not
and provide like a free tier to them
without a credit card or anything
Would you be in support of an idea like that?
Suppose lawmakers created a bill
tomorrow and sent out a request for comments,
would something like that make sense?
I mean, I think this is more
broadly tied into the problem of identity.
I'm in Canada, but the US and Canada
have the Equifax problem:
you have some nine-digit identity number,
and if people have that number, they can claim to be you,
and there's no other way for you to authenticate yourself.
So, like,
if you're buying something online
and you fill in the
information and it gets breached for any reason... like there was a famous thing where
British Airways had a credit card skimmer attached to the JavaScript on their payment page. And so
you'd pay for your flight with British Airways and they'd steal all of your personal information,
the three-digit code on your credit card, all of your birthdate,
all of your information because you have it for your boarding pass. And then they just bought a
bunch of stuff with it. So the problem is that it's symmetric. It's like a password stored in
plain text. Countries like Estonia have something that is like a digital ID card. And it almost acts like a YubiKey. So you
can tap it as an NFC device, and it proves that you are the owner of the card. So I think
that's something that most countries are going to need in the long term. Because again, it's
profitable to just breach people's identities and trade these huge lists of things and buy
free tiers and mine Bitcoin with the free tiers or whatever. So unless crypto gets worse, there's going to be a big identity problem coming
up. That makes sense to me. So then let's just take a step back. And we spoke about the initial
motivation for LayerCI, where you were annoyed at Docker. But can we talk a little bit more about that story? It's one thing to be annoyed at Docker, but what made you decide that this is something that you want to build and, you know, start a company and go through YC and everything?
Yeah.
I mean, I guess my personal motivations for doing things
is I just like building stuff.
I think a lot of humanity's problems will be solved
by the people building
the automation. I mean, you hear a lot of people claiming, oh, when the truckers are automated,
there'll be this huge work problem, blah, blah, blah. Automation is bad. Society's not ready for
it. But for hundreds of years, it's been like, oh, the people in the textile mills are going to
be replaced by machines and they're going to lose all their jobs.
So I think automation is really the only way to get out of all of these problems that we're facing.
The climate problem, adding more resources to people, automating cars and automating shipping, and automating politics so that these things can happen.
Automating identity.
All of these are important problems that need to be solved in the next 50 years. Housing. Toronto has a big housing problem.
It's like if people make factories to build skyscrapers, which is technology that exists
in China, for example, that would solve a lot of housing because you could just make
prefabricated buildings. But people aren't building these things because there's not
the infrastructure for it. People are driving coast to coast because in the 60s, the US government prioritized infrastructure for cars.
But no one's really prioritized the infrastructure for the internet.
With maybe the exception of Starlink lately.
So I think as someone building developer tools, we like building the tools that will move humanity forward, because it's the developers and the people building automation, and supercharging them so they can build things faster, that's actually going to affect the world.
That's a fascinating and wonderful response. When people talk about the future and developers, people often bring up no-code tools.
And I'm sure you've seen things like Retool
where you don't have to write that much code
or it's low-code.
You write a little bit of code
and mostly everything gets solved for you.
Does LayerCI ever have a vision
of helping test those things?
Because I can see a symmetry there.
You build a low-code tool, but it's really hard to test.
Maybe layer can help you with that.
Does that make sense at all?
Could layer help with low-code someday?
So, I mean, I think philosophically,
low-code was never really intended to replace programming.
It's like as more things, you know,
as the
internet eats the world or programming eats the world, more and more things involve programming
and the programming gets more and more complex as like the baseline of things gets built.
So you know, you can make like these ridiculously complicated video games with one person studios
now, whereas like Quake took, you know, I mean, took John Carmack and all of the resources of id Software.
Assuming I'm remembering the studio correctly.
But now you can make things because of all of this automation.
And so no code basically backfills that.
It's like the things that were hard to make 10 years ago are now being made easy to make
with things like Zapier and Retool.
But it's just opening up higher up, like the programming challenges that used to be impossible.
So self-driving requires lots of programming now.
But 10 or 15 years ago, it was essentially impossible.
So I think Retool and Zapier
aren't even necessarily competing with programming.
They're just building a new world.
Also things like Webflow,
these tools that facilitate things
that were traditionally programmer tasks.
You need someone doing the programming
to make it an industry. And then once the industry is big enough that it supports a
no-code solution, then it can be backfilled by no-code solutions. But unless there was the
WordPresses originally, then Webflow wouldn't exist, because Webflow needed people to know what a website was in order to exist. So we're not ever going to focus on no-code. We're going to
focus on the bleeding edge, people that are developing and making hard things.
And I mean, that starts with web apps because that's where a lot of the innovation is going
on right now with things like Vanta, for example. But in the long term, it doesn't necessarily mean
web apps because I'm sure a lot of the things that people are currently doing in web apps will get automated. Next.js is less and less code. And then as Next.js becomes more and more standardized
and Auth0 and all of this identity stuff becomes more standardized, then you can build whole
websites without needing programming. So we'll keep chasing the people building novel stuff.
Mm-hmm. I think soon enough, there's going to be a day where there's a billion-dollar business running on something like Replit. This is just my prediction, I don't know if it's going to be in five years or 50 years, but there will be that one person in their basement or whatever with a billion-dollar business just running on Replit. That would be an interesting world to live in, I think.
But there are already businesses you can make without programming, and you can't do it alone.
So I think the stereotype of just one person is probably not legit.
But I do agree that there's probably big businesses
that can be made with no programming.
And I think a lot of businesses hire programmers too early
for the repetitive tasks.
Why hire a programmer to make a website when
you can just use Webflow? Is your custom-coded website really going to bring value to your
developers? So I guess this is maybe a tangent, but I was thinking, for this course I'm making, the DevOps Academy, there's a system design component. And I was thinking, what are the component types, and what do people use for those component types? So there's databases, there's, you know, MongoDB versus Postgres, and all of these traditional system design components. But then I realized that no-code is now a system design component. Instead of choosing what front-end technology am I going to use, oftentimes it's what website builder am I going to use, and am I going to use Cloudflare Pages or GitHub Pages? So for early startups, it doesn't make sense to build these websites and host them yourself anymore. You choose that as a system design component.
So I think engineers should warm up to the fact that no-code or low-code tools are part of system design.
The same way that you wouldn't code a database from scratch anymore, even though you would have in the 80s.
It's like you just take that as a no-code component. Postgres is sort of low-code in the same way that Webflow is. So I think more and more components will become low-code, and developers should just use those as tools for building these interesting, novel things.
That makes sense to me.
And that brings me to a bunch of new questions. One thing that I think about, and do you worry about this? So Postgres and Python, as you said, they were the original low-code tools, so that you don't have to write that much code on your own.
But they're open source and they're kind of community driven.
With Cloudflare Pages and Webflow and everything,
it's all individual companies building out these new components.
Does that worry you or is that just a fact of life
and the way things are
going to be moving forward? I mean, I think usually the way it goes is there's a closed
source offering that kind of bootstraps things, and then they're disrupted by open source and
open standards. And then it kind of settles into like an ecosystem of closed source and open source
building over open standards. So you know, So there was Mosaic in the early days,
then there was Netscape, and then there was Firefox, which was open source. And Firefox
and its contributors really moved the internet forward. And then there was Chrome that had to
be open source to compete with Firefox, and then that moved the standards forward. And then there's
closed source stuff that was built on top of that, like Internet Explorer, and then Internet Explorer
turned into Edge, and then Edge now uses open source stuff for their rendering because they had to backfill.
So I think Webflow is a great idea, and Figma
is a great idea, but if you can make versions of these that are
self-hostable, like a self-hosted Figma
is probably a company that will exist 10 years from now. And people
will switch to it because they'll want the ability to self-host, or the certainty of not getting switched off of it.
And we saw that in our batch even.
We had someone making an open source version of Intercom.
So it's like, again, like about 10 years
after the closed source version,
the open source version becomes possible
to make a business of.
And we saw an open source Firebase, to plug them, in Supabase, and Papercups. If you want to set those up as a developer,
you're probably more likely to go with the open source versions that are hosted somewhere
because you want the certainty that if something goes wrong or if you need to scale it or add
features, you can use the source code. I mean, I think it's just natural that closed source leads the way.
And then they'll have to embrace open source on the back of that, or they'll be disrupted
by someone that does exactly what they do, but open source.
That makes sense to me.
And it's like the free market, right?
You make the closed source version in order to, like, capture the market and gain money.
And the open source version is another part of the free market where there's a competitor
that has this one feature,
which is a really important one,
which is being open.
It's like GitHub versus GitLab.
Android versus iOS.
It keeps happening that someone makes something
closed source and then five or 10 years later,
someone makes the open source version
and eats their market share.
They can both coexist.
iOS and Android coexist on different merits.
Yeah, and now let's talk about your DevOps course.
Like, I don't know too much about it.
So maybe you can just tell listeners, like, what are you building and why?
Sure.
So the course we're building is called DevOps Academy.
We're releasing it with a partner.
And the idea for the course is: most of the DevOps stuff you read is by solutions architects. It's, you know, someone at AWS trying to teach you DevOps concepts through the lens of AWS, or someone at some consultancy that does DevOps for you teaching you AWS in the context of buying their services. So it's very exclusive right now. You hear DevOps, and you think of consultants being paid $300 an hour to set up an Oracle database for you, and you think of Java and unit tests and Jenkins, kind of the old world of DevOps. But DevOps is not actually a scary buzzword
and it's like reasonably accessible even to startups.
And startups that adopt DevOps practices
do go much faster.
Even if they don't know that they're doing it,
it's maybe intuitive in the modern world.
So the idea for DevOps Academy is to teach things from a startup perspective: make these processes, but don't over-engineer them, and realize that they're an investment. So I didn't set up CI when I was making LayerCI initially,
because why would you set up CI before you have any customers?
CI is a tool to reduce churn, and it's a tool for collaboration.
So if you're one developer that's been working six months on something,
you don't need to write tests and you don't need CI.
So that's not something that a solutions architect would usually tell you, but from a startup lens, and from a DevOps company lens, that's something we can talk about. So, what are the components of code review automation, and roughly when should you set them up? When is a reasonable time to invest in them? That makes total sense to me, and I will be so interested to read this course once it comes out, because not enough people talk about this. There are a lot of podcasts that talk about a CEO's experience of building a company and scaling and all that, but not the actual technical details. That's one of the reasons why I started this podcast, just so I can get those kinds of conversations going. So I'll be really interested to read that once it's out. What is something that, as part of researching and developing this course, you found interesting? Maybe you can give us a sneak peek. So the CI one was interesting. What else have you found? When should you do code review? Yeah.
So we've been researching by talking to our customers. One of the benefits of being a CI
company is we talk to customers that are prioritizing developer tooling and setting
up things. One of the surprising things was nobody really has a clear picture of the space.
So you have engineers that have worked at previous companies.
They know how things are done at those previous companies,
and nobody really compares.
So it's like if at your previous company you used Lambdas,
at your new company you'll use Lambdas.
You don't really understand the pros or cons of it.
If at your previous company you used Docker,
at your new company you'll use Docker. You're just used to that technology.
And it basically just boils down at a lot of companies to who's the first technical hire
and what have they used in the past? But obviously, there's a whole world of pros and cons
out there. And it's important to have someone on the team that knows the scope of things.
We've taught a lot of our customers about linting, which I thought was something that was ubiquitous. Don't have code reviews that involve comments about semicolons; that seems like a reasonable thing in a team of any size, because it's so easy to set up ESLint, or golint in our case.
But people don't know it exists. So they do their
code reviews. Their developers
are bogged down for a day doing repetitive
code reviews where half of the comments
are, like, needs whitespace, needs semicolon.
And you
don't realize that workflow is broken if you've never
had an automated version of the workflow
in another company.
That makes sense to me.
And I think it's also hard to inject linters into people's workflows, right? It's hard to set up pre-commit checks unless you have a setup script that runs for everyone initially. So it's very hard to backfill linters if you already have a lot of developers not used to that.
Yeah, we talk about that in the academy.
But you can actually set up workflows
that automatically lint and reformat things.
So when a developer pushes, a linted version of their branch will be automatically created by the CI.
How do you do that?
Is that just through GitHub Actions or something?
Yeah, I mean, if you're using GitHub Actions as your CI, then you can just git checkout in GitHub Actions using a deploy key.
It seems obvious in hindsight that that's something you could do.
And then instead of opening your merge request from the base branch, you open it from the
linted branch.
And then when the person merges it, it's all linted.
You don't get any comments about whitespace. Interesting.
But unless you hear that,
you don't really think about how annoying the workflow of linting, repushing, rerunning all of your tests is.
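The auto-lint branch trick Colin describes can be sketched end to end. This is a minimal sketch, not LayerCI's or GitHub's exact setup: it runs against a throwaway local repository, the `linted/` branch naming is an invented convention, and a `sed` pass stands in for a real formatter like `eslint --fix`. In actual GitHub Actions you would check out with a deploy key and `git push` the linted branch at the end.

```shell
#!/bin/sh
# Sketch of the auto-lint workflow: CI creates a formatted twin of the
# developer's branch, so the merge request is opened from the clean copy.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email ci@example.com
git config user.name  CI

# A developer pushes an unformatted change on a feature branch.
printf 'x=1;;\n' > app.js        # double semicolon: our fake lint error
git checkout -q -b feature
git add app.js
git commit -qm "feature work"

# CI creates a "linted/" twin of the branch, auto-formats, and commits.
git checkout -q -b linted/feature
sed -i.bak 's/;;*/;/g' app.js && rm app.js.bak   # stand-in for eslint --fix
git commit -qam "auto-lint" || true              # no-op if already clean

# In real CI, this is where you'd `git push` linted/feature with a deploy key.
cat app.js
```

From there, the merge request is opened from `linted/feature` rather than `feature`, so the diff reviewers see is already formatted and nobody comments on whitespace.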
I think it's getting more and more obvious
that I need to read this book or this course
as soon as it's out.
Maybe we can talk a little bit about LayerCI, the company,
which is, how big is the company
now?
How many employees do you have?
We have six full time.
Okay.
And any engineers?
So what was your process like hiring the first engineer?
Like, did you hire just someone you know?
Or what was your framework for deciding this is who the first engineer should be?
The first employee engineer in a sense?
Yeah. So I mean, I guess as a CI company, well, my framework for hiring was I wanted
people with good fundamentals that were willing to learn quickly. I guess my previous experience
was if you hire people that check all the boxes, they often actually do worse than people that don't check all the boxes but are hungry to get equity and hungry to learn. Because if people check all the
boxes, they're basically acting as a consultant. If you're hiring them because they check boxes,
then they'll do what you hired them to do, and then they'll stagnate most of the time, at least.
Because if they're interviewing for places based on their technology stack or whatever, then that's what they're comfortable with, and if you ask them to go outside of that, then, you know, that's not really the way startups hire. So our hiring was: in the job posting, I didn't mention what programming languages we used. It was, we want someone that knows operating system fundamentals, because that's required for editing things. We want someone that has an intermediate level of experience. You know, you don't want to hire a junior developer that needs to be told exactly what to do for your first hire. So we interviewed for that. And the interview
itself is just questions that I've had to solve while building LayerCI.
Building the MVP of LayerCI, there was some graph theory, there was some operating system stuff, there was some math.
And you just take the questions you had to solve while building your product.
And then you find people that could also solve those problems.
And if they could solve the existing problems, it's a pretty good indicator that they'll be able to solve the upcoming problems as well.
That makes sense to me.
At what point did you know that you had to hire someone? Maybe the answer is
obvious, but I'd just like to hear it.
Yeah.
I mean, it became obvious once
we started getting people liking our product.
Before you have product
market fit, or before you have people that actually
like your product,
it doesn't really help much to hire engineers.
The Mythical Man-Month and all of that:
if you have one engineer working on something and you increase it to three engineers,
it'll be maybe 20% faster.
So it's a very questionable value to hire people
in the early days before users actually enjoy your product.
But then in about September,
right after we finished Y Combinator, we noticed that people were actually starting to use
our product consistently. They were starting to... Usage had increased 10x in four months.
And so people were really starting to push lots of commits, and were starting to find bugs with the product, and were complaining when it went down. That's a good indicator that people like your product. And that's when we started making job postings and hunting. We didn't end up hiring until April, though, so it took about six months to get people on full-time. And, you know, it ended up being: I, as CEO, don't want to be programming. I want to be doing podcasts and explaining, or doing the CI course,
And that's something I can do as CEO,
but you can't really hire people for very easily.
But if you have a product and you need features built
and you want to collect customer feedback
and start doing stand-ups and scrum
backlogs and stuff, that's totally something you can hire for.
So it was around the time that users liked the product, and I wanted to do other things, and it was obvious that there were better things I could be doing with my time than programming, that we hired.
Yeah, that's a very simple and understandable explanation.
And that makes total sense to me. So, you already mentioned this, where you want to think about podcasts and marketing, but how do you think your role is going to evolve over this year? You're working on, I would say, some kind of building a following, that's basically what you said. What else do you think you're going to be doing over the course of this year? Yeah. So, I mean, it's a bit complicated, because LayerCI doesn't currently
have a CTO. And it'll depend a lot on whether we get a CTO or not. Because as CTO, I'm doing a lot
of product management. I'm doing a lot of talking to customers. You need your technical leaders to actually talk to customers
because otherwise your product vision gets bad. So I'm currently doing a lot of that.
And I will probably still be doing that by the end of the year because it takes a long time to
either hire or groom someone to be CTO. And it's a big mistake to rush that.
But as CEO, a lot of the work is just aligning people.
I realized at my previous company that if you don't consistently tell people what the company is doing,
they'll very quickly forget.
If you hire an engineer,
and the engineer is working on some particularly interesting optimization problem,
it's really easy for them to veer off course from the important stuff. If the second developer you've hired is Dockerizing your CI or whatever, it's like, oh, is that really necessary for our company? Which is funny for me to say as a CI provider, but it's an easy mistake for developers to over-optimize things really early on. And so as a CEO, the big thing is just continuously telling people,
this is the direction we're going. Like I include everyone that works full-time in our
investor update, which is something that a lot of people don't do, but I think it's like,
it's best for people to know: this is the cash in the bank, this is what the company's doing,
this is the customers we've closed, this is what went wrong, this is the future of the company,
this is what every individual person is doing to further that goal. And as the team scales, that's going to become more and more of my time. So with a 30-person company, it's basically a
full-time job to just keep everyone aligned. So I guess the CTO hat will shrink, the developer hat
will shrink, and the leadership getting everyone aligned hat will keep growing.
Yeah, that makes sense to me. And maybe some final questions or final thoughts on
what will the developer experience, in your opinion, look like in five years? So today,
there's standard workflows. People use npm serve or some version of that locally to test things.
Layer CI is kind of changing that and making it easy to share a version of your code on
a particular commit and do it extremely quickly.
What do you think in like five years, like how are people going to be developing differently
than they are today?
And maybe a flip side of that, what's going to stay the same?
Well, I don't think npm serve is going anywhere anytime soon.
We use ngrok at LayerCI for various things.
If you want to test something locally,
if you want to iterate very quickly without pushing to a repository,
npm serve and ngrok do very well for that.
So there are always going to be the local developer workflows with VS Code, code sharing,
and ngrok.
But I think pull request automation is going to become more and more popular.
So our vision for LayerCI is to do something like Slack, where your specific company has
specific needs for their workflow. For lunch in Slack, you can
install a poll bot that will automate the workflow of choosing a place to go to lunch, or you can
install the GitHub integration to automate notifying people when something is pushed.
But in GitHub, or in whatever source code management tool you use, it's always just: here's the code diff, and here are some checkmarks with buttons next to them.
And for a lot of teams, that interface just isn't good enough.
Like for websites, it's not good enough for evaluating CSS changes.
For apps, it's not good enough for running QA.
Basically, no company is perfectly served by that. So our vision for the future is
make something that is as extensible as Slack so that people can put blocks for the various
parts of their workflow. Run Cypress. If Cypress fails, put the screenshot directly in the view.
Assign these reviewers. If something visual changes, assign the designer and require their
check. Otherwise, assign the code owner for that piece of code. And then you don't have to wonder
who you should assign to this. You don't have to ask your manager who should review your code.
It's just you push code. All of the relevant stuff happens automatically. And then if you pass all
the gates, like you pass the CI, the reviewers all say it's okay,
then you just merge it and it's shown to customers. I think that's the future where instead of
fighting for days and weeks with notifying people and pinging them on Slack and setting up screen
sharing sessions and wondering why your CI is failing or whatever because you have bad
observability into your tests.
If you can put all of that in a Slack block
system right in the pull request,
I think that's the future of code reviews.
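Part of the reviewer-routing piece of this vision exists in rudimentary form today: GitHub's CODEOWNERS file automatically requests review from the owners of whatever paths a pull request touches. A small sketch, where the `@acme/...` team handles are hypothetical:

```
# .github/CODEOWNERS -- patterns use gitignore-style syntax; the last matching line wins.

# Fallback reviewers for anything not matched below.
*           @acme/default-reviewers

# API changes go to the backend team.
/src/api/   @acme/backend

# Visual changes pull in the designers, per the "assign the designer" idea above.
*.css       @acme/design
```

It stops well short of what Colin describes, with no conditional gates and no screenshots in the review, but it shows the direction: reviewer assignment driven by the diff itself rather than by asking your manager.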
That makes sense to me, and I
would like to live in a world where we have
tools like that that are super easy to
integrate and not too expensive so we can
convince our managers so that
we should buy them.
Well, thank you so much for being a guest. I think
I learned a bunch from this conversation. It was great being on your podcast. Thanks for having me.