The Infra Pod - Turning Gaming PCs to Serverless CI for AI! (Chat with Aditya from Blacksmith)
Episode Date: August 25, 2025

Tim (Essence VC) and Ian (Keycard) sat down with Aditya, CEO of Blacksmith, to explore the inception and innovative approach of Blacksmith in the CI/CD space. Blacksmith uses gaming CPUs and NVMe SSDs to deliver high-performance, serverless CI compute. They discussed the challenges large companies face with CI systems, why GitHub Actions was chosen as the primary CI system to support, and how AI-driven code generation is pushing the need for faster and more efficient CI solutions. Aditya also highlighted the future potential of CI observability and other improvements Blacksmith is focusing on to maintain their edge in the market.

00:29 Founding Blacksmith: The Origin Story
01:16 Understanding Blacksmith's Serverless CI Compute
02:21 Challenges in CI/CD and Blacksmith's Solutions
05:14 Technical Deep Dive: Performance and Optimization
06:53 Why GitHub Actions?
08:08 Innovative Hardware Choices for CI
10:06 Scaling and Managing CI Workloads
19:14 Future of CI/CD and AI Integration
28:02 Spicy Futures: Predictions and Hot Takes
Transcript
Welcome to the InfraPod.
This is Tim from Essence, and let's go.
This is Ian, a lover of performant, efficient compute in my continuous integration and delivery pipelines.
I'm super excited today to be joined by the CEO of Blacksmith.
Aditya, please introduce yourself and tell us why in the world you decided to start Blacksmith
and also what in the world is Blacksmith.
Thank you, Tim, and Ian. Nice to meet you both.
And thanks for having me on the pod. I'm a pretty huge fan.
I have two co-founders, Ayush and Maru.
We were engineers before starting Blacksmith.
They used to work at Cockroach, and then Ayush worked at a startup called Superblocks.
I was an engineer at Fair, and we'd interned at a bunch of companies before, and every single company struggled with CI/CD. Especially at Fair, which had about 400 engineers, there was a pretty big platform team, and a big chunk of that platform team spent their time working on CI.
And I'm happy to go into the challenges of it, but at a high level, we saw teams do the same thing over and over again at every large company, and we felt there was a better way out.
And we started looking into that, thinking about CI from the ground up from first principles, and we landed at Blacksmith.
Now, what Blacksmith does is we offer really fast compute that is instantly provisioned for companies to run their CI workloads. So we're like a serverless CI compute provider.
Incredible. And what exactly does that mean, a serverless CI compute provider? What does it look like to use Blacksmith, and what do I get out of the box? How do I just pick up and use it today?
Yeah. So the distinction I want to call out is that we're a compute provider.
We're not a CI system. The only CI system we work with today is GitHub Actions. So developers can still continue to use GitHub Actions as the control plane, use the same YAML file, use the GitHub UI. All they'll have to do is install the GitHub app and, you know, replace one line of code in their GitHub Actions workflow file, pointing it to Blacksmith. And their CI workloads will run on our compute, and it runs much faster than GitHub-hosted runners or than if they're self-hosting on their own cloud account.
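As a rough sketch of what that one-line change typically looks like in a workflow file (the Blacksmith runner label below is illustrative, not taken from their docs, and the test command is hypothetical):

```yaml
jobs:
  test:
    # Before: a GitHub-hosted runner
    # runs-on: ubuntu-latest
    # After: point the job at Blacksmith's compute (label format is an assumption)
    runs-on: blacksmith-4vcpu-ubuntu-2204
    steps:
      - uses: actions/checkout@v4
      - run: npm test   # hypothetical test command
```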
Very cool. And taking a step back, what are some of these things you kept seeing people do that were causing problems? What were these pains that you kept feeling at all these different companies? I mean, I've spent a lot of time building CI/CD systems, or on top of CI/CD systems. What type of challenges are people running into that made you go, hey, someone's just got to get in there and do something about it?
Yeah. So the number one challenge that we saw at much larger companies was just about making sure CI runners or VMs were available when developers needed them.
This comes back to the nature of CI. CI workloads are peculiar. They're not like production workloads. They tend to be sparse and spiky.
So if you visualize the vCPU utilization for a company, you'll see that for a big chunk of the time it's actually zero, especially when developers are not pushing code. It's going to be zero for most of the day.
And depending on the company you're at, let's assume that every time a developer pushes code, they need to run 50 CI jobs. And let's say, collectively, they use 500 vCPUs. So every time a developer pushes code, you're using 500 vCPUs. It runs for five or ten minutes, then goes down to zero. And there are times when three or four developers are pushing code at the same time. So you might go from zero vCPUs being used all the way up to thousands, and then back down. And that's what I mean by it's super spiky. So consider this scenario.
Let's say you're self-hosting, which a lot of companies do, like what I saw at Fair. When a developer pushes code, you can go and ask AWS for 50 EC2 instances. And the drawback to doing that is you'll have to wait to get all these instances before the jobs even start. And if it's during peak business hours, that can take five to ten minutes for all your jobs to start. I'm not even talking about the duration of the CI job, I'm just talking about the jobs starting. And that's a pretty big drag on productivity. So what do you do to get around this? What most companies do is they maintain a warm pool of compute that's ready to go. They're typically using Kubernetes for this, they're using Karpenter. So they'll keep some amount of nodes warm. Let's say that covers 500 vCPUs. If one developer pushes code, things are good, it instantly gets picked up. But what happens when four developers are pushing code at the same time? You again have to autoscale, that needs to kick in, and it's not super quick. Now, one thing companies could do is over-provision to always absorb the load for four engineers, but then you're burning money. So there's this tradeoff between how much do you want to spend versus how much do you want to keep your developers waiting? And that was the
biggest problem. The second problem is performance. Now, GitHub hosted runners are running on
pretty old machines. These machines have much lower single core performance than the latest
machines, and the same for machines that are running on AWS. At Blacksmith, we run them on
consumer gaming CPUs, not server chips. And gaming CPUs, you know, because they were optimized for
gaming and making sure the game runs really fast actually has the highest single-core performance
compared to the same generation server counterparts. The other peculiar aspect of CI jobs is that
you're copying files around constantly. Think of what you're doing during code compilation. You're
constantly copying files around. And the hyperscalers most of the time tend to push you towards
using network-attached storage like EBS volumes. But for CI, you actually want the opposite. You
actually want locally attached NVME SSD so that it's super quick and you're not bottleneck by iops.
So we identified those bottlenecks and we started, he started Blacksman.
Now, the way we're getting around those problems is, A, we're using gaming CPUs with locally attached NVMe drives.
That helps with the performance problem.
Now, we have, you know, over 500 machines right now, and they're only running CI workloads.
So when we're only running one workload, things get a lot more predictable and smooth at scale.
And even if one customer wants 500 vCPUs, we can provision that instantly.
And at scale, this works out.
That's super cool.
I think there's so many questions I want to ask here.
So I want to try to divvy it up.
Let's start with: why GitHub Actions, maybe?
Because obviously for CI, people are running all kinds of CI systems. We've seen a lot of Bazel-based setups, a lot of other things. GitHub Actions is certainly probably the most common I've seen, especially for a startup: I don't want to install yet another new thing, GitHub is already there, Actions is already there. But maybe tell us the reasoning behind picking GitHub Actions as the starting point. Is it the ecosystem, or the performance just sucks, or just everything
combined? So it mainly had to do with the fact that GitHub Actions is the most popular
CI system today. Most people use GitHub and it works out of the box. Most companies starting out,
they don't really want to consider any other solution.
And we think that's where the market is consolidating.
If you look at, if you look at CSED at large, most companies are on Jenkins, like legacy
enterprises, but companies that were started five years ago are almost always on get-up actions.
And even companies like five years before that, so from 2015 to 2020, they're probably
on Circle CI and they're migrating to get-up actions in most cases.
So we were like, to start with, let's better.
on the actions market.
Okay, so I think that kind of confirms my suspicion, but I'm really intrigued about this gaming PC thing, by the way. I'm picturing like 500 Alienware machines running somewhere, you know, with a shiny logo and neon lights. I wouldn't actually assume that's the thing you're doing, even though it does make sense that it's optimized for the workload. Is this something that you just knew before going into Blacksmith? Or something you tested, like a bunch of different kinds of servers, and somehow the machine that's been running a bunch of games turned out much better? And how do you even host this damn thing? Do I have to actually go into server rooms and do it myself now? Because I don't think there's a gaming PC provider for you.
Yeah, I can go more into that. So the way we landed on this was
we were asking ourselves, you know, how can we make CI faster? And my co-founders, who worked at Cockroach, had this observation.
And it was pretty simple.
When they were working on Cockroach,
they could either build CockroachDB remotely on a server in GCP,
or they could do it on their gaming rig at home.
And they noticed that the gaming rig was substantially faster.
And it started with that, and we were like, why is that the case?
And we found out that, you know, there are two factors here,
single-core performance and locally attached NVMe SSDs.
And so we found that.
And when we were starting the company, we were like, okay, where can we get these machines?
And it turns out Hetzner was offering these machines because they hosted a lot of Call of Duty servers.
And now we also have another region in the U.S., where we work with a much larger provider. We have a contract with them, and they manage our machines: they rack them up and lease them to us.
But that's how it got started.
We were pretty scrappy.
And, you know, who would have thought that you could get Call of Duty servers and repurpose them for CI/CD?
Okay, very cool, first and foremost.
You know, you have this problem statement. So far, what we've talked about is really a large scheduling problem, and then a hardware optimization problem: let's make sure we're using hardware that's suitable for the job, and let's also make sure we schedule things so that people can use CI as much as they want and it doesn't break the bank, right? Which is broadly the problem you're talking about. What are other improvements or optimizations you're making under the hood, beyond just the raw, hey, look, we have better servers and better scheduling?
Yeah. So I think this is one of the pros of running
a single workload like CI. We can make a lot of optimizations tailored towards that.
Before I go into that, I want to say that when you're running production workloads, like a database, you care about a lot of durability guarantees that don't necessarily apply for CI. One good example of that is something like fsync, where you're flushing the page cache to disk. That's important when you're running a database, but it does not actually matter for CI workloads, because if your CI job fails, if your lint job or unit test fails, you can rerun it. And that's something that we disable on our end. And we've made a number of optimizations like that to just make CI run a lot faster.
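Blacksmith does this at the VM and filesystem layer, but as a minimal sketch of the same idea that anyone could try on a stock Ubuntu runner, you can wrap a test command with eatmydata, which turns fsync and friends into no-ops (the test command is hypothetical, and this is not Blacksmith's mechanism):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests with fsync disabled (sketch, not Blacksmith's mechanism)
        run: |
          sudo apt-get update && sudo apt-get install -y eatmydata
          eatmydata npm test   # hypothetical test command; fsync/fdatasync become no-ops
```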
And we've experimented with a number of different file systems and made a lot of modifications of our own, purely for performance.
So that's one example of what we're doing at the software layer.
But if you're only running one workload, you can also do workflow-specific optimizations.
Let me give you an example.
A lot of our customers do Docker builds.
Like that's at the heart of like how most companies like deploy.
And you probably notice this when building Docker images on your laptop.
When you do it for the first time, it's slow because it's building each layer.
But when you rebuild it the second time, and let's assume that your Dockerfile hasn't changed,
it's instant.
And the reason for that is the Docker layer cache is in your machine.
It doesn't have to rebuild all the layers from scratch.
But when people build images in CI, you're often doing it in a fresh VM.
The Docker layer cache is not present and it's slow.
And so we looked at that problem and we were like, hey, how do we solve this for our customers?
And right now what we're doing is we're mounting a Ceph block device with the org's Docker layer cache bind-mounted into the runner.
So it's just there.
There's no downloading the layer cache from somewhere, you know, which is another approach we see at companies today.
There's no doing any of that.
It's just there ready to go.
And it's almost like having your Docker layer cache persisted across CI runs.
And in a lot of cases, your Docker builds are near-instant. And this is extremely important when you're trying to get a hot fix out.
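As a sketch of what such a Docker build job can look like when the layer cache is handled by the platform rather than by explicit cache upload/download steps. The runner label is an assumption, the image name is hypothetical, and the build actions shown are the standard Docker ones (Blacksmith may ship its own drop-in equivalents):

```yaml
jobs:
  build:
    runs-on: blacksmith-4vcpu-ubuntu-2204   # illustrative Blacksmith label
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      # No explicit cache save/restore steps: per the discussion above, the org's
      # Docker layer cache lives on a Ceph-backed "sticky disk" mounted into the runner.
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          tags: myorg/myapp:${{ github.sha }}   # hypothetical image name
```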
And so these are Hetzner servers, right? I'm not super familiar with Hetzner, by the way, because I know they're more like a, I guess, boutique hardware provider with more selection. But I don't think they have the extensive S3 and all that kind of stuff with it, right? You basically have to do everything yourself. So how is it able to handle this sort of, I have a cache that I can bind-mount in? Most people use either EBS or some sort of variant in Amazon, right? Because it's readily available. It costs more, but you have more knobs to tune. How do you do that with Hetzner servers? Do you have, like, a separate
NetApp server running next door?
So we have two types of caches. One is a drop-in replacement for the GitHub Actions cache. These are cache artifacts, you know, you can think of your npm modules that you're downloading each time. So we run our own MinIO cluster, which is an S3-compatible object store, effectively as a replacement for S3.
And for the Docker layer caching primitive, which we call sticky disk, we're actually running a Ceph cluster ourselves.
So there are a lot of benefits to, you know, being on AWS.
But, of course, when you go bare metal, you're trading off a lot of that for, you know,
in our case, like performance for our end user, but that also means having to do a lot of
things ourselves.
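For the GitHub Actions cache replacement mentioned above, the appeal of an S3-compatible backend like MinIO is that the caching step keeps the exact same inputs; only where the tarball lands changes. A sketch, where the drop-in action name, cache path, and commands are purely hypothetical:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4            # stock action, backed by GitHub's cache service
        # uses: useblacksmith/cache@v5    # hypothetical drop-in backed by a MinIO cluster
        with:
          path: ~/.npm                    # hypothetical cache path
          key: npm-${{ hashFiles('package-lock.json') }}
          restore-keys: |
            npm-
      - run: npm ci && npm test           # hypothetical commands
```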
How much does locality matter in CI/CD workloads? Can you just throw these things anywhere and go for the cheapest power, the cheapest data center with the best hardware? Because oftentimes with production workloads, we're sitting here thinking, okay, how do I get this as close as possible to my customer? Can I move the data to the edge, how much of this can I cache at the edge, how much can I make this a near-zero response time, so they load the page and use their credit card before they realize?
Yeah, so the answer is somewhere in the middle. It matters for some jobs and it does not matter for others. Let me give you an example.
If you're running vanilla unit tests, it doesn't really matter. They can run anywhere in the world as long as, you know, it's power-efficient, like you said.
But it does matter for jobs like Docker builds. Let's say you build an image and you're pushing it to a container registry where your services are deployed. Say you're building an image in our EU region and you're trying to push it to US West: there is network latency there that could slow it down. And that was one of the reasons we decided to start a U.S. region, because we had a number of customers pushing to container registries in the U.S.
But it does not have to be super close. As long as it's in the same region or, you know, even just in the U.S., it works out. Okay. So it matters somewhat, depending on the job.
And do you have metadata about what the job is? How much can you infer? I think you're reusing the GitHub Actions YAML file format, which gives you the ability to parse this stuff out. How much can you infer from these formats about the actual job type, and are these things you can use to optimize scheduling? Help us understand: okay, so we're at the point where there are hardware benefits, scheduling benefits, cost benefits. Help us understand what you can do on top of that.
So I think it is possible, if we wanted to, to go and try to understand where a customer might be running things and auto-route jobs, but we're not doing anything too sophisticated right now.
We pin an organization to a region,
and they can email us, like, the default is the U.S.,
but if they want to run in the EU,
they can message us and we'll move all of their jobs to run in the EU.
It's a simple approach, but it's worked well so far.
Got it, got it.
But it's truly fascinating that because you had to choose your own hardware, you basically had to recreate Amazon to some degree for yourself, right? So you have storage, you have these sort of, you know, Alienware servers, I just keep picturing that in my head. What are the other things you've had to do yourself that are just not typically available outside the clouds? Because networking is actually not that easy, you know? IP addresses can exhaust really easily, you know, VIPs and stuff like that. What other stuff did you actually have to reinvent yourselves that turned out to be more work than you hoped?
We've had to tune our networking stack quite a bit. And networking is something that we're learning as we go. Like, for instance, the number of machines in a subnet: once we exceeded that, we had to figure out, and we're actually still figuring out, how do we solve those problems? I think networking is something that we're actively working on.
Yeah, because that's the thing I thought of: your IP addresses run out quick. And there's a lot of reuse, but you want to be efficient on costs, right? So you can't just keep taking them forever.
It's super fascinating with your customers, though, right? They just want fast and cheaper. Are you trying to always make sure both are satisfied for customers? Or do some folks actually tell you, you know, I just want cheap, I don't care about fast? But there are some customers whose workloads maybe inherently take hours or minutes or whatever, and they do want it faster and are actually willing to pay you more to get it faster. I've seen those situations before. And I wondered, how do you play that tradeoff here, or do you just not?
Yeah. So I think when we were starting out, this is a question that we wondered about ourselves: which value prop matters more here, and why are customers choosing us? I think we've learned that performance matters more than anything. Especially with developers, they care about using the fastest product. I think being cheaper than GitHub really helps us get through procurement and finance, because when you talk to someone from finance and you tell them that, hey, we can halve this bill, they're like, great, and the conversation ends there. But performance is what gets our customers excited. It's what keeps them happy, and that's why we keep getting more customers.
There are a lot of workloads that look like CI/CD. I mean, batch data pipelines are a great example. You know, a training run of a machine learning model is another good example. There are tons of these sort of non-real-time, batch-style, workflow-based use cases. Tell us the vision here. You know, you started in CI/CD. Do you want to go further, or do you think there's just so much opportunity there that CI will keep you busy for a long time? What do you think the broader opportunity is for a company like Blacksmith to go after? I mean, it makes sense why you start here. It's an incredible starting point.
Yeah.
I think for now we're laser-focused on CI.
And we think there's a lot there.
And we think there's a lot more that we can do for our customers.
I think compute was a really good wedge for us, for customers to start using us, but there's a lot more value that we can add
when it comes to CI observability. And that's something we're working on right now. Especially right now, with people writing a lot more code, they're writing a lot more tests and they're running these pipelines a lot more. Making sure their CI pipeline is healthy, is not failing, and is not riddled with flaky tests matters a lot. And because our customers are running their workloads on our VMs, we can offer a lot of this observability out of the box, without them having to configure anything. That's the key. One example of this: in GitHub Actions, you can go and search your CI job logs, but only for a single job. You cannot do a global search across all of your jobs historically. I'm sure you've seen the scenario where you see an error and you're wondering, did I introduce this or has this existed before? Now, unless you're automatically making sure that your CI logs are going into your Datadog or New Relic or Logics or Honeycomb, there's actually nowhere to figure that out in GitHub Actions today. But because it's running on our VMs, we actually parse these logs and expose them in our UI. So you can actually go to Blacksmith and do a global log search across all of your CI jobs. And that's helping a lot of our customers fix problems faster.
And we're going to help them fix their flaky tests and catch them in the future with that same
approach.
I mean, that's pretty cool. Because having spent a lot of time, especially early on, back in the Salesforce days in like 2013, we were heavy Selenium users. One of the first big at-scale Selenium users was basically Salesforce: they had a huge UI service layer, and we were trying to get Selenium at scale to work with all of the flaky UI tests and all the ways that Selenium itself works. We've had Paul from Browserbase on, and I talked a lot about some of my experience with Selenium, how awful it was. But it also had this native issue of flakiness, of how do I actually understand it. Do you think the next step after
observability is improvement? Like, how do you actually improve things for people, in a semi-autonomous manner: hey, we noticed this test is broken, here's the patch. Or hey, we noticed this thing, here's what you can potentially do. Is that the ultimate outcome of the flow you see from having this data, or where does this go from here?
We're still figuring it out. But I will say that that's something that we're actively thinking about. We have all of this information about what's happening in your CI, what's breaking right now. In addition to surfacing it, can we help you fix it? We're still figuring out the medium in which it's most helpful. We actually have something coming out soon where we're going to surface CI errors and post them as a PR comment. And eventually, we're experimenting around, can we have them do something about it, maybe kick off a Cursor agent to help fix it, or use Claude Code? We're still playing around with that.
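A minimal sketch of the "surface CI errors as a PR comment" idea using the preinstalled gh CLI; this is not Blacksmith's implementation, just the general shape, and the test command and comment body are illustrative:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write   # needed so the job can comment on the PR
    steps:
      - uses: actions/checkout@v4
      - run: npm test        # hypothetical test command
      - name: Post a comment on the PR when CI fails
        if: failure() && github.event_name == 'pull_request'
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh pr comment ${{ github.event.pull_request.number }} \
            --body "CI failed: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
```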
I was reading a tweet from Mitchell, you know, from HashiCorp. Mitchell was talking about how he was able to survive alongside Amazon, right? Because his motto, I think the sales pitch, was: I will support Amazon faster than Amazon. He would tell that to everybody. I saw it yesterday. And so, with that in mind, you're working with another incumbent, right, GitHub in this case, and trying to be like, I'm better than GitHub at doing their own thing. Do you think they just don't care enough about the problem, that they have too many other things to solve so they don't want to solve these little performance problems? Or what is your way of surviving alongside incumbents? Is it just, I have special expertise, they don't? Maybe tell us more about how you've figured out that there is enough for you to continue without being squashed by, you know, the big boss here.
Yeah, so there are a few different angles here. The first is, in the words of our customers, GitHub has stopped focusing on Actions and is more focused on Copilot, and there hasn't been any material improvement in Actions in a number of years. And we think we're doing a lot of things to actually help with that. We're making GitHub Actions as a platform much better and more feature-rich. And we think that's actually going to make GitHub Actions more appealing as a platform as companies think about where they should migrate to from Jenkins: should they go to GitLab, or should they go to GitHub? And if they see that GitHub has a much richer ecosystem, then that's actually better for them.
And is there any acceleration from AI for you guys at all? Does it matter to you or not? Maybe the unit testing is the same amount and it doesn't matter. I feel like vibe coding is getting more code in general out there, but I don't think the tests are really that much more yet, is my assumption. But does AI affect you at all, I guess?
Yeah. So the answer is AI has a pretty big effect on our business. And I'll say we're one of the second-order beneficiaries of this AI codegen boom. I'll break this down for a few reasons. First, people are pushing out a lot more PRs than before. And every time someone pushes code, they have to run CI. And they're running it on us, so we're running a lot more of their workload.
The second is developers are writing a lot more code than before and a lot more tests than before, which means their build times are actually going up, and the number of tests that they have to run is going up, and that's also taking more time. And that's actually creating more demand for us, because people are getting more impatient and they're like, hey, CI needs to be faster. And we're seeing this in the data too, from how many jobs our customers are pushing and how much they're also spending on us.
There's a third effect too. There are a lot of tools today, like Codex, Claude Code, Codegen, that are automatically pushing PRs and iterating on them. And I think PR Arena actually has this really great dashboard that tracks the number of PRs merged across all these providers and also aggregates it by volume on public repos. You should look at some of those numbers: they're growing exponentially. And all of those PRs have to run CI.
I know there are a lot of ephemeral compute companies, like an E2B, a Daytona. In addition to what you just said, a lot of the net-new thing about AI is high ephemerality and the task-based workflow and this sort of scale-out, right? The future of development certainly does feel more like a scaled-out branch-test-experimentation pipeline than it feels like the systems of the past. I'm curious, do you envision CI getting significantly more integrated directly into these, whether I'm using Cursor locally or I'm using some external coding agent like a Devin or a Codex? What is the future of CI and CD? Because they are the encoded feedback loop for the LLM. And so I'm kind of curious, stepping back and just thinking about the purpose of CI, the purpose of CD, the purpose of blue-green deployment, the purpose of why we even started all this stuff years ago. Where does this all end up? Certainly some part of this is a giant feedback loop that can feed into the broader machine, the broader experimentation machine, the broader, how do we actually generate something that says, yeah, this is good code or bad code, and how do we scale out the usage of AI in code development?
Yeah. I want to break this down into two different categories. I agree with what you said about feedback loops, but there's immediate feedback, which is when an agent makes a change to a file or a class, and it might run the tests associated with that. But there's also the final step of running all your tests and just making sure nothing has regressed or nothing has broken. I think that's still going to happen. And I think that matters, especially before deployment, just to make sure that things are still working. I don't see that going away. Long and short, I still think CI will persist, but the agents are going to iterate differently. And I agree that today's approach, of Devin pushing a PR and polling GitHub Actions to see if something is broken, I don't think that's going to be the case forever.
Got it.
Well, we definitely want to go into our favorite section of our podcast called the Spicy Future.
Spicy Futures.
So I'm very curious now.
Give us your spicy hot take that you believe that most people don't believe yet.
I believe that companies should 5 to 10x their CI budget.
For the reasons that I mentioned, you know, companies are pushing a lot more code, their builds are taking longer, the number of commits is exploding, and the amount of software being written is going up. I don't think companies realize that they're going to have to rethink their compute budgets. A lot of companies are introducing token budgets; I know a lot of companies are giving their employees $500 in, you know, OpenAI or Claude credits. They're going to have to do that for compute as well. It's already happening, but I don't think companies have adjusted to that
reality yet.
So, 5 to 10x CI budgets. I think the first correlation is like, hey, we should have more people able to get faster CI to unblock them, so productivity will increase, that's the first thing I thought of. Are there any other benefits you're really trying to push for there? Like, what is the effect of getting a 5 to 10x bigger CI budget, to do what?
Sorry, I should have elaborated on that more. So the reason they're going to have to increase their budget is because people are going to run so much more CI. As developers push out more PRs, the number of CI jobs is going to go up by 5 to 10x.
Got it. And you think the CI jobs are growing five times because the code we're generating is 5 to 10x more, right? Basically.
There's more code, and also the number of commits that they're pushing out is actually going to go up drastically. That's the main driver.
Got it, got it. Of course, like Claude Code and Gemini, whatever. The whole batch vibe coding thing: it's changed from vibe coding in my IDE to vibe coding all my code at this point. So it's such a drastic effect.
And just to add to that, you know, today a lot of people are using, like, Claude Code, but eventually you're going to have agents orchestrating other agents that are going to keep pushing code all the time. And every time you push code, you're going to have to run CI. Another way to put this is: every time someone pushes code, they're spending X dollars on running CI. And when the number of times they push goes up five to ten times, the downstream spend goes up five to ten times as well.
That's probably the most interesting part. When we talk to engineering leaders, everyone's interest right now is so much in AI and AI budgets, like how much am I spending on OpenAI and all of these inference providers, right? And maybe a bit on the AI tooling and platforms. But I guess we haven't really even figured out the downstream effects on costs. You're saying, hey, if you're going to push 10x more code, all the rest of your platforms are going to see a 10x cost increase as well. And people are already noticing the productivity improvement is huge. So I think that really justifies a lot of the cost right now. But it's also terrifying, basically. I'm already spending a lot on R&D. And now I'm like, maybe R&D headcount has to come down to make up for that. Or I basically have to push more products and more value out, right? Because there is sometimes a limit to how much product I can actually sell; it doesn't even matter how much code I can output. So what do you think the prediction is here? Code is getting generated way more, fewer engineers per team, so I basically don't need that many people now, and that will just get even more exacerbated over time. Or do you think something else will happen?
My prediction here
is that companies are going to do more with fewer people. So you're going to see small engineering teams do things that historically would have taken, you know, 5x more people. And I'd argue that paying for compute is a lot cheaper than paying for labor. And I think companies are going to be okay with that tradeoff. That's my bet.
What do you think about the complexity of CI jobs? One thing we've talked about: you've basically positioned your 5 to 10x CI budget around the velocity of code changes, broadly speaking. And I'd consider that a second-order effect. But there may also be a third-order effect, or maybe a second-order one, whichever order we want to pick, that with a lot more of these agents, the complexity of what we will need to do in CI will increase as well, right? We're probably going to get to some formulation of simulation in CI/CD to deal with the fact of the upstream velocity, and we're going to have fewer humans to test all these new features and test all these new interfaces and test all these different things. And so you're going to want to look at systems in a true simulation-style testing experience, something more akin to, maybe it's not the end-to-end testing of, you know, 10 years ago. But certainly the complexity of what you need to do in CI will probably change drastically as well, in a way that it hasn't, because we had other ways to manage, you know, that risk.
I agree with that. And we've talked about this a lot inside the
company. And our prediction is that people are going to need tools to run a subset of their tests. And here's why: like we spoke about, people are going to write a lot more tests. It doesn't make sense to run every single test when you've only changed a small fraction of a code base. Now, of course, if you're on something like Bazel, and Bazel can figure out which targets have changed, it can run specific tests. But most of our customers, and we see this a lot, are not on Bazel, and they shouldn't be, given the overhead. So we're going to need better ways of predicting which tests to run. And that's going to be a challenge of its own.
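A much simpler, path-based approximation of "only run the affected tests", using the community dorny/paths-filter action. Real test selection as described above would predict impacted tests rather than rely on directory layout, and the filter names, paths, and test commands here are made up:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: changes
        with:
          filters: |
            backend:
              - 'backend/**'
            frontend:
              - 'frontend/**'
      - name: Backend tests
        if: steps.changes.outputs.backend == 'true'
        run: go test ./backend/...          # hypothetical command
      - name: Frontend tests
        if: steps.changes.outputs.frontend == 'true'
        run: npm --prefix frontend test     # hypothetical command
```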
Yeah, that's sort of... Maybe this is a little side question, to be honest, but I'm actually very curious about this. Because we talked to Paul at Browserbase, right? The AI wave is creating new infrastructure primitives that are required, because web agents and more actions and automation are happening. Besides AI pushing more code to you, are you also thinking about new primitives that are required? Maybe the first thing I thought of was GPUs, obviously: I need more special-purpose compute. Or do I need to actually start to have specialized frameworks to help people test their AI stuff? How much broader or deeper in the stack do you think you need to go because of the AI changes, if at all? Or do you maybe just focus on the basics and make it fast?
Yeah, I think right now, for most of our customers, the kinds of things they're doing, they're mostly bottlenecked by CPUs. But we do have a few customers who are asking for GPU instances so that they can test some of those workloads that require GPUs. And I think over time we'll see more of those, but I don't think we're going to see a shift where all CI instances need GPU support. I don't see that happening.
Got it, got it.
Okay, this is super fascinating. We have so much we could ask, but just in the interest of time, I think we'll probably end here. Where can people find out more about Blacksmith? If I want to sign up as a user, where should I go to sign up and learn more about Blacksmith?
Yeah, yeah.
For anyone who wants to learn more about Blacksmith,
just go to blacksmith.sh and click the sign up button.
We're actually just a few clicks to try out.
And that's it.
Don't even need a credit card.
Amazing.
Amazing.
And you get gaming servers ready to go, to run Call of Duty and CI. Both, right?
Cool.
Super appreciate you being on our podcast.
Yeah.
This is a lot of fun.
Thank you for having me.
Thank you.