The Pragmatic Engineer - How AI is changing software engineering at Shopify with Farhan Thawar

Starting point is 00:00:00 We've been using AI tools for a long time in engineering. I believe we were the first company outside of GitHub to use GitHub Copilot. And the reason... Really? Yeah, the reason I know that is because when Thomas Dohmke became the CEO of GitHub, the same day, I emailed him saying I would like GitHub Copilot. This is 2021, a year before ChatGPT. He messaged me back saying, it's not available for commercial use.

Starting point is 00:00:17 And I said, that's not what I asked. I don't care what you have for commercial use. I would like this deployed for all Shopify engineers as soon as humanly possible. I think it took him about a month. And we were not charged for two years because there was not a SKU to charge us. And we said, in exchange, we'll give you lots and lots of feedback. And so we were using Copilot for a long time. And then we worked with them as we started looking at their roadmap. We then deployed Cursor

Starting point is 00:00:37 internally as well. Actually, we were very new to Cursor, about a year into Cursor. So we're kind of trying all these things. And as we try them, we want to see what's working, what's not working, and then we let more engineers use it. The most interesting thing about Cursor is that the growth in Cursor at Shopify is happening a lot outside of engineering. And outside of R&D, finance, sales, support. Those are the teams using Cursor. What happens when a company goes all in on AI? Shopify did exactly this a few years ago, and Shopify's head of engineering, Farhan Thawar, told me exactly how it's going so far. In this episode, we discuss how Shopify works closely with AI labs and why Farhan paired for an hour with an engineer at Anthropic working

Starting point is 00:01:15 on the Claude Code team. Why Shopify is planning to hire 1,000 interns in a year and how they incorporated AI into their interview process. Why Shopify has no cost limit on how much an engineer or team can spend on AI tokens, and much more. If you're interested to know what an AI-first tech company operates like, this episode is for you. This podcast was recorded as a live podcast at LDX3 in London. If you enjoy the podcast, please subscribe on your favorite podcast player and on YouTube. So welcome, everyone, to this very special live podcast with Farhan Thawar. And we're going to talk about AI at Shopify.

Starting point is 00:01:50 But before we start about AI, I just wanted to talk a little bit about you, Farhan. So I did a little bit of research. I talked with a couple of engineers at Shopify. and they told me, your role is the head of engineering. And engineers told me that they see you in everything. You have reworked Shopify's intern hiring program. You've defined what Shopify does. In fact, you even were in the weeds to deploying an internal hack fest to more than

Starting point is 00:02:17 getting the Wi-Fi right for a hackfest. So I'd like to know what does a head of engineering do at Shopify, and specifically what do you do or maybe what do you not do? Yeah, so I think what's interesting about Shopify is that we use this line, we're not a swim lane company, which means that we don't try to put people into these roles where you are like, you're in product, so only think about product, or you're in engineering, only think about how the code is written or architecture. We're very much just curious problem solvers. And so if something's broken, we expect curious people to go and look at the problem, it doesn't even matter what their role is, and try to solve it. And so last year at our Shopify Summit, which is our employee event, where, you know, 7,000 people come to one space, the Wi-Fi did not work super well. And so I spent all my time in that summit for three days trying to fix the Wi-Fi. I became known as the chief Wi-Fi officer. And so this year, it was two weeks ago, I deployed 300 Ubiquiti APs, 800 switches,

Starting point is 00:03:14 access points. Yeah, to have a much, much smoother Wi-Fi experience, which is how that all got started in terms of, and the team built lots and lots of memes because they thought the Wi-Fi was not going to work, but it did work. And so I did publish all those memes. I need to ask you because a lot of leaders would say, look, like, your time is super valuable. You're kind of a head of an organization.

Starting point is 00:03:35 How many people are? 3,000. 3,000 people. You're kind of up there, and your time is now, you know, if you measure it in dollars, it'll be expensive. It's a lot better to just get a specialist to do these things. Why do you still do these things? And do you disagree with that advice of, like, focus on the high leverage things? Yeah, so we don't love the, there's that notion of like hire smart people and get out of their way.

Starting point is 00:03:56 Instead, what we say is like hire smart people and pair with them on problems. So the difference for us is we really like to bring in, obviously lots of smart people, hands-on people. And instead of just saying, hey, you take this area and then go away and just solve it for me and come back, we like to pair with them and say, you're smart, I'm smart. Can we work on this problem together? This episode is brought to you by WorkOS. If you're building a SaaS app, at some point your customers will start asking for enterprise features like SAML authentication, SCIM provisioning, and fine-grained authorization.

Starting point is 00:04:23 That's where WorkOS comes in, making it fast and painless to add enterprise features to your app. Their APIs are easy to understand, and you can ship quickly and get back to building other features. WorkOS also provides a free user management solution called AuthKit for up to one million active users. It's a drop-in replacement for Auth0 and comes standard with useful features like domain verification,

Starting point is 00:04:44 role-based access control, bot protection, and MFA. It's powered by Radix components, which means zero compromises in design. You get limitless customizations as well as modular templates designed for quick integrations. Today, hundreds of fast-growing startups are powered by WorkOS, including ones you probably know, like Cursor, Vercel, and Perplexity. Check it out at WorkOS.com to learn more. That is WorkOS.com. If you want to build a great product, you have to ship quickly.

Starting point is 00:05:11 But how do you know what works? More importantly, how do you avoid shipping things that don't work? The answer? Statsig. Statsig is a unified platform for flags, analytics, experiments, and more, combining five-plus products into a single platform with a unified set of data. Here's how it works. First, Statsig helps you ship a feature via feature flag or config.

Starting point is 00:05:35 Then, it measures how it's working, from alerts and errors to replays of people using that feature to measurement of top-line impact. Then you get your analytics, user account metrics, and dashboards to track your progress over time, all linked to the stuff you ship. Even better, Statsig is incredibly affordable, with a super generous free tier, a startup program with $50,000 of free credits, and custom plans to help you consolidate your existing spend on flags, analytics, or A/B testing tools. To get started, go to statsig.com slash pragmatic. That is S-T-A-T-S-I-G.com slash pragmatic. Happy building.

Starting point is 00:06:10 So this leads me really nicely into the next thing I heard about you, which is at Shopify, you have gone kind of AI first, and we're going to talk a little bit about that. But one thing that you do is you have access to some of the top AI labs, and what you told me just before is that you pair with them. You just told me that the other day you were pairing with Anthropic engineers. Can you tell me about this whole pairing and how you work with these AI labs? Yeah, because again, we're like super curious problem solvers.

Starting point is 00:06:37 And so when there's something interesting happening in the industry, we want to be close to it. So obviously, ChatGPT comes out and then Anthropic builds their model, Gemini has their model, Cohere has their model. We want to be as close to the leading edge as possible. And so we try to be close to those people and what they're working on. One of the things we do is when something comes out, so in this example, it was Claude Code. We deployed it inside of Shopify, we saw people using it.

Starting point is 00:06:59 This was in May, right? Three weeks ago. Exactly. So Claude Code came out before that, let's say six months ago. And then we started spending time with Anthropic. I wanted to see how Anthropic was using Claude Code internally. Similar to how OpenAI is now releasing Codex, I want to figure out how they're using it internally.

Starting point is 00:07:14 And so what I did was I reached out to Anthropic and said, hey, I'd like to pair with one of your applied AI engineers. We spent an hour building a bunch of things. And I learned how he sees Claude Code being used internally at Anthropic. Then I can take that back and say, well, here's how we're using it at Shopify. And we can together figure out where we think this thing can go. Is this just a matter of like you just asked them? Yeah, I mean, because we are Shopify, we're very lucky.

Starting point is 00:07:37 People want to see how we work. We tend to work in an unorthodox way. And they also are interested in how we use it, but they also use it. And so it was more of a me, again, a pairing. It was a true pairing. It wasn't like me as a customer trying to push them and say, I really want, it was, we have a shared Slack channel. I pinged in there saying, hey, who would want to pair with me for an hour on something I'm building inside Shopify?

Starting point is 00:07:56 Somebody put their hand up, an applied AI engineer, and said, we'll show you how we work and you show us how you work. And we can together figure out where we think this product could go. So I do it with a lot of the providers. And as you're pairing, is it just you pairing and then you later tell people or like do you? Yeah, so I might record the session and then share it internally to say, hey, I just did a pairing session with Anthropic. They might use it for a case study that they're building. But the idea is, again, because I'm just curious, when somebody says something, I'm like, oh, how does that work?

Starting point is 00:08:24 How can we use this? And so I think it's a way for us to just learn. And one recent thing that I just wanted to go into before we get into some more of the AI things, it's a really fun fact that most people don't know. You just came out of a code red that lasted seven months. I think this is the first time we're talking about it. What was this code red? What was it about? And what did people outside and inside of Shopify see of it?

Starting point is 00:08:46 Yeah, so we were internally seeing a lot of things that weren't bubbling to the surface, but that can actually affect the operation of a system. So you would typically call it tech debt, but we had a lot of signals showing that the tech debt was growing, not just the general ones, like taking longer to update a piece of software, or build a version two of a product, or people having a hard time

Starting point is 00:09:08 maintaining some of the software stack. We saw things even, not externally, but internally, that were shooting out signals. Lots of exceptions growing. Even in some cases, because we build things at the very low level of the stack. So for example, we're core contributors to Ruby, we patch MySQL. I believe we have the second largest MySQL fleet in the world outside of Meta.

Starting point is 00:09:26 And we were seeing things like segfaults and things like that, where I was not willing to feel like we should move on and just build features as normal. And so we actually got together, spent, we didn't think it was going to be seven months. We thought it was going to maybe be three months. And we took something between like 30 to 50% of engineering. We didn't have an actual number. We just said, these things have to not grow anymore, like exceptions. Unique exception counts have to shrink. There should be zero segfaults.

Starting point is 00:09:52 We should understand everything that's happening in the system. And so we spent almost seven months doing that over the last, you know, from November until now. How did you make sure that you fixed the right thing? Did you, like, talk to the team saying, hey, bring up the things? And also, how did you decide when you will stop? Because like whenever I hear these code reds, it's always, you know, a lot of companies go to code yellows, code reds. But sometimes they can drag on forever. Yeah, so we had multiple metrics that we used.

Starting point is 00:10:15 One was literally what I said, exception counts, unique exception counts, and segfaults, where we would literally say, like, segfaults are at zero now, and exceptions should be going down. But we also looked at other metrics that we have, like, you know, we want to have four nines of reliability across our different surfaces, storefronts, merchant admin, point of sale, like we looked at all these things, and we also used the 28-day rolling average of those things, and once those were all green, and we saw that, again,

Starting point is 00:10:40 segfaults were zero, and unique exceptions had stopped growing and were shrinking, we then felt like we were in a place where we can start building features again. And now going on to AI. So you're a huge early adopter of AI coding tools. Can you tell me what tools you started to use and what you're using right now and what you're liking, what engineers are liking specifically? To focus more on software engineering, maybe even outside of software. Yeah, so we've been using AI tools for a long time in engineering.
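The code red exit criteria described just before this point (four nines across surfaces on a 28-day rolling average, zero segfaults, shrinking unique exception counts) can be sketched roughly as follows. This is a minimal illustration, not Shopify's actual tooling: the function names, the input shapes, and the choice to compare the latest 28-day window against the one before it are all assumptions made for the example.

```python
# Hypothetical sketch of "code red" exit criteria; names and data shapes are assumptions.
TARGET = 0.9999      # "four nines" of reliability
WINDOW_DAYS = 28     # rolling-average window mentioned in the conversation


def latest_rolling_average(values, window=WINDOW_DAYS):
    """Mean of the most recent `window` daily values."""
    if len(values) < window:
        raise ValueError("need at least one full window of data")
    return sum(values[-window:]) / window


def code_red_can_end(availability, segfaults, unique_exceptions):
    """availability: dict of surface name -> list of daily availability ratios.
    segfaults / unique_exceptions: lists of daily counts, oldest first."""
    # Every surface must be "green" on its 28-day rolling average.
    surfaces_green = all(
        latest_rolling_average(days) >= TARGET for days in availability.values()
    )
    # Zero segfaults across the whole recent window.
    no_segfaults = sum(segfaults[-WINDOW_DAYS:]) == 0
    # Assumption: "shrinking" means the latest window averages lower than the previous one.
    previous = latest_rolling_average(unique_exceptions[:-WINDOW_DAYS])
    current = latest_rolling_average(unique_exceptions)
    return surfaces_green and no_segfaults and current < previous
```

A rolling average smooths out single bad days, so one short incident delays a green status rather than resetting it, which is one plausible reason to pick it over a daily threshold.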

Starting point is 00:11:08 So we have, I believe we were the first company outside of GitHub to use GitHub Copilot. And the reason, yeah, the reason I know that is because when Thomas Dohmke became the CEO of GitHub, the same day, I emailed him saying I would like GitHub Copilot. This is 2021, a year before ChatGPT. He messaged me back saying, it's not available for commercial use. And I said, that's not what I asked.

Starting point is 00:11:27 I don't care what you have for commercial use. I would like this deployed for all Shopify engineers as soon as humanly possible. I think it took about a month, and we were not charged, I think, for two years, because there was not a SKU to charge us. And we said, in exchange, we'll give you lots and lots of feedback. And so we were using Copilot for a long time.

Starting point is 00:11:45 And then we worked with them as we started looking at the roadmap. We then deployed Cursor internally as well. Actually, we were very new to Cursor, about a year. Really? Yeah. They were a tiny startup back then. Yeah, so yeah, I think that's super late. You're saying it's super early. I feel like it's very late to Cursor. We knew about Cursor for a long time. We like to have one tool at Shopify for the things that we're doing. And so we don't like to have multiple

Starting point is 00:12:05 tools. So like, we have like Figma. We don't try to have lots of other visualization tools. We have MySQL. We don't have other databases. So we try to focus on one tool. And so our bet was VS Code. But because of AI's proliferation, and since we don't know what's going to happen, we then started making a change in our stance around trying more tools than just one. Maybe we consolidate, maybe we don't. But right now we have Cursor and VS Code

Starting point is 00:12:28 as our two AI tools and then Claude Code as the agentic workflow, and we started using Claude Code. We tried Devin last year. So we were kind of trying all these things. And as we try them, we want to see what's working, what's not working and then we let more engineers use it. The most interesting thing about Cursor is that the growth in Cursor at Shopify is happening a lot

Starting point is 00:12:46 outside of engineering. And outside of R&D, finance, sales, support, those are the teams using Cursor. What do they use it for? So what's happening there is, because Cursor is actually so ubiquitous for building things, they're using it to build MCP servers outside of the regular engineering domain. So for example, Salesforce, Google Calendar, Gmail, Slack.

Starting point is 00:13:10 They're building MCP servers to access those services, and then building like home pages for themselves, like you're a salesperson, connect to Salesforce, connect to Google Calendar, connect to my email, and tell me what opportunity I should be working on. So just going back to an MCP server: when you have a service, you can put an MCP server in front of it and then you can activate it with, for example, Cursor or with an agent. Exactly.

Starting point is 00:13:33 Do engineers help build that MCP server? Sometimes yes, but a lot of times no. Now that these services are coming out with their own MCP agents so that you can literally just download GitHub's MCP server, right? You don't have to actually build it yourself. But a lot of times, they're actually just building it themselves, and they're watching, like, videos, and they're actually just deploying it on their own without any engineer intervention at all.

Starting point is 00:13:52 Now, it's N of one, right? They're building software for each individual person, not, like, infrastructure for all the people. Does this remind you of anything? You've been around the tech industry for a long time, but the fact that, you know, the non-technical people are now kind of building, do you see any parallels from the past? Did this happen before, or is this new? This coding part is new, although you remember, like, WYSIWYG editors and, like, people starting to like self-serve.

Starting point is 00:14:17 I think it reeks of that, but they are, I mean, they're not reading the code, right? They're very much vibe coding. And if it doesn't work, they just delete and start over. But even if you read Anthropic's documentation on Claude Code, like they say one-third of the time you can kind of get something working with one shot. And so we're starting to see that. And people are getting used to the fact that if it doesn't work,

Starting point is 00:14:35 they just delete and start over and try to get it working. I find it really interesting how this is happening. Is it changing the dynamics with engineering? Because one thing I noticed when working at places like Uber, everyone was a bit jealous of engineering. They wanted engineering resources. They couldn't get it. They really wanted one of these things.

Starting point is 00:14:52 Do you see any of this change? I'm going to say yes and no. So there's a weird thing happening whereby, so let's say you're a PM, and let's say you're a technical PM. You now probably have enough knowledge to vibe code, yourself, some new feature you've been waiting for engineering to build for you.

Starting point is 00:15:08 The question is, can you submit a PR, and should the PR be accepted by engineering? I'm going to ask you first, and then I'll tell you what I think. Well, I think you can always submit a PR, and engineering will tell you why it can either be accepted, if it's good, or what it's missing. May that be conceptual, may it be coding standards and all those things. But at least, you know, and this feedback for them, they're seeing what this person wants to do. I think it should be a great thing.

Starting point is 00:15:38 Yes, I'll say yes with one caveat. And the caveat is the problem with vibe coding today is you might generate 10,000 lines of code for a very simple application feature, and now the burden is on engineering to read the 10,000 lines. So what we say is, yes, you can submit a PR, but you have to understand the code you are writing. Writing yourself. Exactly. Before you submit it. Because I can do the same thing and I can say, hey, I'm going to write a blog post, generate 20 pages and then send it to you and be like, hey, edit this and post it on your blog.

Starting point is 00:16:13 And now you have the burden of having to edit it. Now, if I said I read it all and it is my voice and I agree with it, then you might be more likely to read it. The interesting thing is that we're probably going to see this problem with open source projects, because it's so easy to use AI tools and they're going to get a lot of these things. I think policies like this are interesting. Now, I wanted to get your take on one thing, which is going around on social media. There's a lot of people predicting it's the end of SaaS because people can create their own SaaS. Now, what you told me is some of your non-technical people

Starting point is 00:16:40 are creating like small solutions. For themselves. For themselves. How do you see this potentially changing SaaS vendors? Are you, and you're also someone who actually buys a lot of SaaS vendors. You understand them. Do you think this changes anything or not really? This is just more personal software.

Starting point is 00:16:57 Yeah, I'd like to think we buy less SaaS than those companies, only because we try to consolidate down. That being said, there is this notion. We're in this middle zone right now. The middle zone is basically this idea that you can vibe code something maybe for yourself. It's unclear whether you can vibe code a platform or vibe code something that gets into the infrastructure layer of what you're building, because you do really need to understand, again, like you mentioned earlier, like you might be putting yourself in a very precarious situation.

Starting point is 00:17:24 If you're building on top of something and building on top of something, you don't understand what you're building. Maybe it's a prototype, that's fine, or a proof of concept. But if you're building infrastructure for the internet, you likely want to understand how that's being built. So we're not there yet. Is it coming? Yes. It's for sure coming where someone can vibe code something in, you know, Anthropic or OpenAI or Gemini or who cares whose model, and build something

Starting point is 00:17:46 that's actually architecturally elegant and the right architecture for what you need. We're not there today. So where we are is this notion of still human in the loop and still like start over and like this is not what I meant and like prompt engineering English as the programming language. And so I do think you can get yourself very far, but I'm not worried yet about SaaS.

Starting point is 00:18:06 I don't know we should be worried, by the way, because let's take another view. How much software do we think should there be in the world? Probably 10,000 times as much as there is now, 100,000? There's a lot of software that should be in the world, and we are satisfying 0.00s or 1% of the demand. So now that everybody can generate software, we should welcome them into the software world.

Starting point is 00:18:29 We believe everyone should be writing software, just like my sales team is, and everybody becomes more productive. I'm not yet worried about the software industry. I still believe this is Jevons paradox. The more we get, the more we want. Yeah, a really good analogy that has stuck with me is what Simon Willison was saying,

Starting point is 00:18:43 how has the filming industry changed with this? Everyone has, this used to cost like $5,000 or $10,000 a few years ago, this camera, everyone has one in their pockets, and there's still a professional movies industry, but now there's a bunch of, you know, there's YouTube, there's TikTok, there's all these things. So it has become bigger, but the barrier for the, the purpose has not really changed.

Starting point is 00:19:05 Dune and some of these amazing movies are so hard to make. Jevons paradox. The more you have, the more you want. We all want to be filmmakers now. We can be filmmakers, and then there's still going to be people at the high end and the low end. I would say the controversial, maybe controversial, spicy take is that these phones actually benefited the experts more,

Starting point is 00:19:26 meaning the camera, you know, like being able to pull out your iPhone and take a video actually benefited those who were really, really good at it already. And the same thing is maybe potentially true of vibe coding. It's like the AI agents are going to help the best engineers more than the mediocre engineer. So there was a memo that went around in the media about a month ago. Tobi sent a memo. It was meant for internal consumption.

Starting point is 00:19:50 It was about reflexive AI usage and it got, I guess it got leaked and then Tobi posted the whole thing. And this is about saying that, hey, everyone is expected to use AI at Shopify. Can you tell me why you felt that memo was sent out, and what was the response internally? And has it changed anything, or was it just kind of stating what was there already?

Starting point is 00:20:11 Yeah, it was a fundamental statement that this is something new, and it was meant to get everyone to move from passive to active, right? So yes, there's AI in the world, and ChatGPT, and all these things existed, but we started building infrastructure internally to make it easier to use,

Starting point is 00:20:26 and we basically said the following, you don't have to use AI in your workflow, but we're gonna expect that you have this plethora of tools available, to you such that the expectation is you're going to be, like, your impact is going to be evaluated as if you had the tool. So for example, you know, imagine not having like Excel or Google Sheets and you had to like, you know, like do a complicated analysis. You don't have to use those tools. You could use a pen and paper, but we're going to expect that you have these tools available

Starting point is 00:20:50 and we expect the level of analysis to be treated as if you had these tools. And so by us doing it, what changed internally was actually probably more of a change outside of R&D, like I mentioned. R&D was already leaning in. Now, of course, there were some folks who were, waiting for, I don't know. R&D is engineering. R&D is engineering, product data, and design. They were more likely to use these tools because they tend to be more forward-looking

Starting point is 00:21:12 and wanting to use the latest and greatest. However, some people were waiting. It changed their shape because we said, hey, look, we're going to expect that you use these. But outside of R&D, we saw it even more. Again, sales, finance, like customer success, help centers, all those roles started using AI as well. And we, again, gave them the infrastructure to use that.

Starting point is 00:21:30 So we have like an LLM proxy. It has all of the APIs and all the models available. You don't have to be worried that your personal information is being leaked to a model. You can kind of use those tools. Let's talk about these things. Let's talk about the tools that you either create or adopted early.

Starting point is 00:21:44 I heard, can you talk a bit more about this LLM proxy? And I heard something about MCP. Yeah. So we are, when I talk to people, let me start with the LLM proxy. The problem at the beginning was people would go to ChatGPT or Gemini or Claude and they would like,

Starting point is 00:21:59 you don't want to put customer data there. You don't want to put customer data there or even employee data. If you want to use it to write your employee reviews, you don't want to put, like, you know, Simon, you know, is having problems with this and, like, put their first name and last name, and here's the project they worked on, and the code names leak. So we wanted to quickly have an internal LLM proxy, plus we built on top of LibreChat, which is the open source chat product, and we're core contributors to that as well. And with the LLM proxy,

Starting point is 00:22:23 you're sure to be using the enterprise APIs. We can check to see which, you get a token, you ask for a token, we can see which teams are using how much, like what cost is being incurred by that team or by person. We have a leaderboard where we actively celebrate the people who use the most tokens. Oh, really? Because we want to make sure that they are, if they're doing great work.

Starting point is 00:22:42 Of course, if they script it, that's not what I mean. I mean, they're doing great work with AI. I want to see how they spent $1,000 a month in credits in Cursor. Maybe that's something. They're building something great, and they have an agent workforce underneath them. So we have this proxy.
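The bookkeeping side of the proxy described here (hand out a token, attribute usage to a team and a person, rank the heaviest users) might look something like the sketch below. Everything in it is an assumption for illustration: the class and method names, the model names, and the prices are hypothetical, and a real proxy would also forward requests to the vendors' enterprise APIs, which this sketch leaves out.

```python
import secrets
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical prices in dollars per 1,000 tokens; real prices vary by model and vendor.
PRICE_PER_1K_TOKENS = {"model-a": 0.01, "model-b": 0.03}


@dataclass
class UsageLedger:
    """Tracks which proxy token belongs to whom and how much each person and team uses."""
    token_owner: dict = field(default_factory=dict)  # proxy token -> (team, person)
    tokens_used: dict = field(default_factory=lambda: defaultdict(int))    # person -> tokens
    cost_by_team: dict = field(default_factory=lambda: defaultdict(float))  # team -> dollars

    def issue_token(self, team, person):
        """Create a proxy token tied to a team and a person."""
        token = secrets.token_hex(16)
        self.token_owner[token] = (team, person)
        return token

    def record(self, token, model, tokens):
        """Attribute one request's token count and cost; unknown tokens raise KeyError."""
        team, person = self.token_owner[token]
        self.tokens_used[person] += tokens
        self.cost_by_team[team] += tokens / 1000 * PRICE_PER_1K_TOKENS[model]

    def leaderboard(self, top=3):
        """People ranked by total tokens used, highest first."""
        ranked = sorted(self.tokens_used.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:top]
```

Keying usage on a proxy-issued token rather than a vendor API key is what makes per-team attribution possible without anyone signing up for personal keys.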

Starting point is 00:22:56 And again, it's used to build anything that you want to build, and you can ask for a token and use it. And then we have, we are very early on. MCP, we're on the MCP steering committee, we are fans of MCPing all the things. And so when I talk to people about how to get access to some piece of data inside the company, we quickly will spin up an MCP endpoint for them so that they can just use it. So for example, our internal wiki is called the Vault. It has information about all of our projects that are happening.

Starting point is 00:23:24 It has information on every internal talk that has been done. It takes the transcripts of all those. So you could say something like, when did the point of sale at Shopify launch? And it will come back and it'll say, oh, in 2012, there was a hackathon where the point of sale was started. Based on wiki docs and in the dog presentation. And then in the 2013 Q1 board letter, it was mentioned, and it would tell you all of the history of what happened in Shopify. And so we do that by just having, all you do is put up an MCP and now LibreChat has access. It's like an internal Perplexity or something like that.
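To make "put an MCP in front of it" a little more concrete, here is a heavily simplified sketch of the request handling involved. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are the protocol's method names for discovering and invoking tools; everything else is an assumption for illustration, including the `search_wiki` tool, the sample documents, and the keyword search. The sketch skips the initialization handshake and the transport layer, so it is not a complete or compliant server.

```python
import json

# Hypothetical documents standing in for a crawled internal wiki.
DOCS = [
    {"title": "Point of sale history", "text": "The point of sale project started at a hackathon."},
    {"title": "Wi-Fi runbook", "text": "Notes on access points and switches for the summit."},
]

# Tool description in the shape a client receives from tools/list.
TOOLS = [{
    "name": "search_wiki",
    "description": "Search internal wiki documents by keyword.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}]


def search_docs(query):
    """Naive keyword match; a real system would use a proper search index."""
    words = query.lower().split()
    return [d for d in DOCS
            if any(w in (d["title"] + " " + d["text"]).lower() for w in words)]


def handle(request_json):
    """Answer one JSON-RPC request string with one JSON-RPC response string."""
    req = json.loads(request_json)
    method = req.get("method")
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call" and req["params"]["name"] == "search_wiki":
        hits = search_docs(req["params"]["arguments"]["query"])
        text = "\n".join(f'{d["title"]}: {d["text"]}' for d in hits) or "No matches."
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"),
                           "error": {"code": -32601, "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "result": result})
```

Because the chat client only needs the tool list and a way to call tools, the same endpoint can serve a chat interface, an editor, or an agent without changes.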

Starting point is 00:23:52 Exactly. And so we basically, any internal document is crawled by this thing and now you can have access to it by MCP. And so any new system comes online, we put an MCP in front of it and, you know, now everybody has access to it. This episode is brought to you by Sonar, the creators of SonarQube, the industry standard for integrated code quality and code security that is trusted by over 7 million developers at 400,000 organizations around the world. Human written or AI generated, SonarQube reviews all code and provides actionable code intelligence that developers can use to improve quality and security early in the development process. Sonar recently introduced SonarQube Advanced Security, enhancing Sonar's core security capabilities with software composition analysis,

Starting point is 00:24:32 and advanced static application security testing. With Advanced Security, development teams can identify vulnerabilities in third-party dependencies, ensure license compliance, and generate software bills of materials. It's no wonder that SonarQube has been rated the best static code analysis tool for five years running. So join millions of developers from organizations like Microsoft, Nvidia, Mercedes-Benz, Johnson & Johnson, and eBay,

Starting point is 00:24:55 and supercharge your developers to build better, faster with Sonar. Visit sonarsource.com slash pragmatic security to learn more. And how many MCP servers do you have, give or take? Probably two dozen. Two dozen, yeah. And they're growing quickly. They're growing because people are building new things. Again, there's one for Salesforce.

Starting point is 00:25:12 There's ones for, you know, Figma. There's an MCP in front of anything. But I tell anybody who's looking at building AI systems internally, the way to make this easy is start putting MCP in front of everything. And now your data is accessible. We happen to use LibreChat. You might use something else. But you don't even have to be in Cursor, right?

Starting point is 00:25:29 You just use the chat interface. Yeah, you can do chat. You can do the LLMs. And I like how you spun up basically a platform team to build the infra, to do monitoring, to worry about things like security, decide on models, all these things. Does this team have a name? Is it an AI platform?

Starting point is 00:25:45 Or is it just a platform team? The infra team just took it over. Data platform. I'm trying to figure it out, because they're inside the data platform team. I don't know if they have a name outside of that. I call them LLM proxy, but... And was this more like you or someone at the top said,

Starting point is 00:25:57 like, hey, we need to do this? Or were they just like, hey, this is a good idea? That's kind of bottom-up. It was more top-down because we felt like this was going to grow internally. We wanted to make sure people weren't signing up for their own personal tokens. And it's also a good way for us to see, hey, this team is growing. Is the usage growing? What's happening there?

Starting point is 00:26:14 This person's usage is... Yeah, because the number one pushback I get from people, or even engineering directors working at companies, is the data security issue. And obviously when you host it yourself, it's good, but then you need the infrastructure. So at what point did you decide? Was this months ago? Was this more than a year ago, to, like, actually invest? Probably, I mean, I would say every week it's getting better, right?

Starting point is 00:26:34 But I would say probably a year ago we started thinking about it this way. We had the LLM proxy. When ChatGPT came out, we actively had a warning if you went to ChatGPT that said, hey, by the way, you can go to the internal version. That was there for a long time. You mentioned cost, and I am hearing stories of CTOs and engineering leaders saying, okay, we were paying for GitHub Copilot. I don't know, it's $10 or $20 per month per engineer,

Starting point is 00:26:58 but we have all these seats, or maybe we have it for everyone. And they're telling me Cursor feels a bit too expensive because they're now charging 30 or 35 and we're not going to go there. And generally, there's a big pushback on cost. People are saying, look, AI should make engineers more productive, but it's kind of ridiculous to spend like $1,000 on an engineer. You think about cost differently. Tell me about it.

Starting point is 00:27:18 Yeah, I did a tweet storm on this because I think that people are looking at it differently. So think about it this way: if I could give you a tool that could make your engineering team more productive by even 10%, would you pay for it? The answer is yes. Would you pay $1,000 a month for it? That's the question. And my hypothesis here is $1,000 a month is too cheap. Like, it's true.

Starting point is 00:27:40 If I can get 10% more, it is way too cheap. Like, we should be at $5,000 a month. Like, it's insane that people are looking at this like $100 a month, whatever. Now, I did meet somebody who said their engineers were spending $10,000 a month per engineer. And I said, I want to talk to them because I want to see whether they're doing something very, very smart

Starting point is 00:27:56 or very, very stupid. But I want to meet them. And so I literally said, please introduce me, because I'm unclear how you can use that much. Like, either they've got fleets of agents building amazing things or something is going very, very wrong. But I would like to understand, because there's something interesting happening. And you should not be penny-pinching on AI tools, because actually the productivity gains are there. We don't know what they are, because we actually don't know how to gauge developer productivity. But it is clear that there's something happening there, and you should be very open with your pocketbook there.
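The cost argument above is essentially a break-even calculation, and a rough sketch can make it concrete. The snippet below is only an illustration: the $200,000 fully loaded annual cost per engineer is a hypothetical figure chosen for the example, not a number from the conversation, and the function names are made up. It computes what a given productivity gain is worth per month, and what gain is needed to justify a given monthly tool spend.

```python
# Hypothetical figure for illustration only; not from the conversation.
ASSUMED_ANNUAL_COST_PER_ENGINEER = 200_000  # fully loaded cost, USD per year


def monthly_value_of_gain(annual_cost: float, productivity_gain: float) -> float:
    """Monthly dollar value of a fractional productivity gain (0.10 means 10%)."""
    return annual_cost * productivity_gain / 12


def break_even_gain(annual_cost: float, monthly_tool_spend: float) -> float:
    """Fractional productivity gain at which the tool spend pays for itself."""
    return monthly_tool_spend * 12 / annual_cost


if __name__ == "__main__":
    value = monthly_value_of_gain(ASSUMED_ANNUAL_COST_PER_ENGINEER, 0.10)
    print(f"A 10% gain is worth about ${value:,.0f} per engineer per month")

    # Spend levels mentioned in the conversation: $1,000, $5,000, $10,000 a month.
    for spend in (1_000, 5_000, 10_000):
        gain = break_even_gain(ASSUMED_ANNUAL_COST_PER_ENGINEER, spend)
        print(f"${spend:,} a month breaks even at a {gain:.0%} gain")
```

Under this assumed cost, $1,000 a month pays for itself at a gain well below 10%, while higher spend levels need a correspondingly larger gain. The break-even point moves with the cost figure you plug in, and as the conversation notes, the gain itself is hard to measure.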

Starting point is 00:28:27 And so, again, we celebrate those who spend more. And, of course, we talked to them and what did they do? So they were getting the price. Do I understand that even though you cannot measure it right now or maybe not as accurately, you believe that there's productivity gains and you believe that people need to use it to get these? So you're kind of treating it as an investment right now. Right. For a year, for two years, whatever that may be.

Starting point is 00:28:46 Probably forever. Probably forever. Like, I mean, again, you wouldn't turn off spell check and grammar check in G Suite. Like, you wouldn't turn off these tools, right? I mean, there's probably a debate on whether you turn off Slack, because that could be a productivity gain. But you're seeing the gains, and I don't think that we should be penny-pinching.

Starting point is 00:29:05 Now, over time, we might realize there's an equilibrium and, like, oh, this is the amount of tokens people should be using. Actually, if anybody's using Cursor here, the number one thing I'll tell you is, please move away from the default model, because the default model is nowhere close to, like, actually using a sophisticated model, and by default, Cursor puts you there.

Starting point is 00:29:21 And so we tell people, please move off the small models, move to something bigger. We'll even say things like, please use o1 Pro for this project, or please use o3 or o3 Pro, because those... They're more expensive. They're more expensive. Actually, I'll go even further. Mikhail, the CTO, will say something like the following.

Starting point is 00:29:38 If you don't pay personally for o1 Pro or Gemini Ultra or Claude, whatever it's called, advanced, the $200-a-month one, you are crazy, because you can afford it. And actually, you are missing out on all of the progress happening in LLMs because you are just stuck on the $10-a-month or $20-a-month ChatGPT. Like, that's how far he would go. One interesting thing that I've heard when I talk with engineers, they're saying leadership is leading by example.

Starting point is 00:30:05 Tobi is hacking, and if you go on his Twitter or social media, you're going to see all the stuff that he was doing even a year ago. I want to ask you, like, what are you doing on the side or experimenting or playing with, and what tools are you excited about? Yeah, so when I was pairing with that engineer at Anthropic, I was building something for myself. Oh, nice. Yeah, I was building a couple of things.

Starting point is 00:30:27 It's a great way to use company time, I guess. Yeah, exactly. Well, I was like, let me see what they can build, because we didn't want to go super deep into the Shopify API. Yeah. But we do build a lot of things that, you know, one was actually a commerce flow. I was trying to build a commerce flow via, like, Operator

Starting point is 00:30:39 where you can actually record the browser session. And I wanted to see how the session was storing, like, credit card information or secure tokens as we were going through a commerce flow. And I was trying to build almost like a subscription product without using our subscription product from Shopify. Yeah.

Starting point is 00:30:57 And so it was an interesting experiment where we were trying to see how we store those credentials, especially when you want to store, like, a credit card. Is it stored in the cloud already? Is it stored by the vendor? Are we storing it in the browser session? That's how I kind of stay as close to it as possible. I use Cursor pretty often to build my own workflows.

Starting point is 00:31:13 In this case, it was Claude Code. We actually even use Gumloop. I don't know if you've heard of Gumloop. Gumloop is an automation platform that is very interesting at its own layer. Oh, here's one example. I did something super dumb. I was like, let's figure out when the next SanDisk

Starting point is 00:31:28 Extreme Pro 8-terabyte drive comes out. We tried to build it in Claude Code. It took us an hour. We did not get there, because it actually tried to do something at the wrong layer. It was trying to read the JSON, and it was trying to get the image from the SanDisk site and figure out if the 8-terabyte drive is out. It was actually quite complicated. We built it in 2 minutes in Gumloop.

Starting point is 00:31:47 Not because... they're different. What does Gumloop do? So Gumloop is basically, like, it was browser-based. We said, go to this website, go through the search box, like, using English commands. And it came back right away and said, there's no 8-terabyte drive available. And I said, okay, cool, run it every week and send me an email. It was just, there are different tools for different jobs. And an hour with Claude.

Starting point is 00:32:06 Again, nothing wrong with Anthropic's Claude Code. I use it all the time. Yeah. It was not meant to do this kind of, like, web scraping, because what does it do? It tries to build a web scraper, versus Gumloop, which already knows how to scrape websites. So it's actually just a different layer of the stack. And what I would encourage people to do is figure out what you're trying to do.

Starting point is 00:32:21 And we have a lot of automation happening in Gumloop now because people have to do something. I want to scrape a LinkedIn profile. I want to find out what platform this company is using behind the scenes. I want to scrape a website. All of these things are, it's built for the right thing. You have a lot of bigger tools, such. Exactly. So you should figure out where you want to work.

Starting point is 00:32:39 Right? If you're writing code, Claude Code. If you're doing a web scraper, use Gumloop. So far it was AI, AI. And when this letter came out, externally there was a lot of speculation saying, is this a sneaky way for Shopify to freeze headcount, because it says you shouldn't hire unless you can check if you can do it with AI.

Starting point is 00:32:58 But then you told me something interesting. I'm not sure how big Shopify is, but you told me that you're planning to hire 1,000 interns. Can you put that in context with the size of Shopify and tell me why you are hiring interns when a lot of companies are saying, let's not hire entry-level people because we have AI? Yeah.

Starting point is 00:33:16 What are you seeing that others are not seeing? Yeah, so a couple of things. One is there's this real notion of like generations as they come up in different, like, graduate from different programs out of university or even coming from high school. And you really want to be close to the next generation of people. One, they're like, you know, they know things that you don't know. And so we restarted our intern program in full thrust this year. So last year we had like 25 interns a term.

Starting point is 00:33:41 I convinced Tobi to hire a thousand interns this year, mostly because I felt that they would be more AI reflexive than everybody else. That was our hypothesis. So my tweet actually said, I want you to come to work with an LLM and a brain, not one or the other. The idea was bringing in these, like, centaurs, AI centaurs. They know how to work with an LLM

Starting point is 00:34:03 to solve their whatever workflow it is, whether it's a finance person or an engineering intern. And we really focused on them coming in and changing our internal culture. And so a thousand engineers over the year meant like 350 a term. And what we saw there was, one, they're excited, they're hardworking,

Starting point is 00:34:18 We also have them as a cohort, meaning they come to the office. We're a remote company, but we have the interns come in because we felt like the younger people need to have younger people around them to work together, versus just being remote in their condo or whatever. We're changing the culture that way as well. And we did see this. They come in, they're AI reflexive, kind of. We saw the same thing in mobile.

Starting point is 00:34:35 You were in mobile as well, and so was I in my last decade of work. And I hired lots of interns because they grew up with mobile phones. And so this next generation is growing up with the Internet. And they're growing up with phones and they're growing up with LLMs. So we wanted them to come in and change our culture around that axis. So do you feel like people are learning from... We always were.

Starting point is 00:34:54 Interns are the secret weapon. You always learn more from the interns. Like people think about internships as like community service. Oh, that's so great. Shopify, you're helping the next generation. We don't do it for that. We do it to learn from the interns. That is always the reason.

Starting point is 00:35:05 So if you don't have an intern program, an early talent program, I encourage you, you should be bringing those people in, because you will learn from them more than you will get out of them. You will get a net positive benefit from them. And of course, for those who, like, tweet me and say, please don't hire interns,

Starting point is 00:35:18 you have to pay them. I'm like, we pay them. I don't know why people think that you don't. We pay them very well, actually. And we do hire, I think I've hired almost 100 this year from the intern pool to come into our program. And we're very much thinking that the only way to get into Shopify as an entry-level engineer is through the intern program. I'm fully with you on the paying. I never understood why companies are cheaping out on this.

Starting point is 00:35:41 Because sometimes interns will reject some of the offers because it doesn't cover the living costs, which is wild. Yes, of course. You have to pay them. Actually, one of the best interns, it was another company's loss. In Amsterdam, they offered 500 euros per month, no housing, and that intern would have loved to work there. Would have had the experience, but they applied to Uber because they said it's just too low. And then we gave them a new grad salary.

Starting point is 00:36:01 They could afford housing. They could afford to travel. Of course. And that intern could have done an amazing thing there. And now this person is still at Uber, senior. You know, been there for five years. So, you know, their loss. So think about 350 interns to about 3,000 engineers,

Starting point is 00:36:16 So about 10% at any one point, yeah. So if I joined Shopify as either a software engineer or an engineering manager, can you tell me how stuff gets done, first as an engineer? Like, what tools do people use? I'm assuming it's no longer just, you know, VS Code, do code review. There's going to be, like, AI tools here and there. Yeah, well, first off, we have a very specific way of writing software, to the point that we built our own project management system.

Starting point is 00:36:45 Like, we don't... Oh, really? We don't use Jira, we don't use Linear, we don't use, yeah. And the reason is because we believe in that line, like, first you make the tool and then the tool makes you. And so we believe that. And so we make the tool so that it can make us. Versus, it's nothing against those amazing tools.

Starting point is 00:37:02 I'm a big fan, like, Linear is beautiful. And, like, you know, I've known Jira for a long time. I'm a Pivotal Tracker guy myself, being an ex-Pivotal person. Yeah. But we do believe that we have a specific way of building software. And if we use one of those tools, we would be adopting someone else's way. And so instead we have a tool called GSD, which stands for Get Shit Done. And it is a tool that allows us, it's a program management tool, like a product management tool,

Starting point is 00:37:26 which allows us to think about what are we building, who's on the team, what's the latest weekly update. It has metrics. It actually pulls in PR reviews. So you can see, like, it pulls in PRs and you can see there's activity happening on this project. Here are the core contributors. Here's how long it's been going on for. This tool is how you kind of get work done. And you're forced to write an update every week.

Starting point is 00:37:44 And now actually we have an AI. Here's an interesting one for you. We have an AI tool which will pull in the latest PRs and the latest conversations from the Slack channel, and it'll write the update for you. And then you can look at it and say, this looks good. And you can also ask it, what did I not think about? And say, please emphasize this. And it'll help you write the project management update every week. So this is interesting.

Starting point is 00:38:03 I want to push you a little bit on this because I think there's two schools of thought here. One says your weekly update, like, I appreciate that you're doing it. The point of the weekly update should be for you to look back on what you did. Obviously, everyone hates doing this, by the way. And by AI doing it, would you not lose that kind of reflection? Well, this is why we pair you with the AI. So we do auto-publish it if you don't do anything.

Starting point is 00:38:29 But if you auto-publish it and don't do anything, and we look at the stats, there is a little bit of loss for you, because you are now losing out on the context that you were supposed to gain as the champion of the project by looking deeply. However, we expect everyone, just like with a PR written by AI, to have read the code and read the update. Yeah. So again, you're responsible. You're responsible. And so we're early days in this. We're, like, literally three weeks in on this project. And we may revert it and say, by the way, it turns out everybody stopped looking at what was happening week to week. Yeah.

Starting point is 00:38:57 But we also, every six weeks at a company level with Tobi, go through every project in the company. And we see if we think that it's on track or not on track. Should it be kept working on? Does it have the right resources? Is it taking too long? Is it aimed in the right direction? You better know what's in there, because literally we're going to review it at the leadership level.

Starting point is 00:39:14 Yeah. I think you have no... I like how... Because I feel like with AI, like I kind of want it to automate the stuff that I really don't want to do. And ironically, it's really good at automating stuff that we like to do sometimes, which is coding.

Starting point is 00:39:28 But, like, writing an update or writing documentation, I don't think as engineers we love doing that. And so as an engineering... Again, we're trying to reduce toil. If we find it reduces actual context, we will revert. I love the experimentation and going back and forth. As an engineering manager, what additional things might people use?

Starting point is 00:39:48 Do people build their own tools to stay on top of things? Do people go deeper because of... Yeah, as an engineering manager and director, we try to surface, again, metrics so that you can see how your team is doing. So for example, we have tools that allow you to see things like focus time of your team. How often are they in meetings? AI adoption. Are they using, like, AI tools or not?

Starting point is 00:40:08 Like we're trying to surface this information. How many of your people are on a GSD project versus not? Because maybe a project ended and they're like freed up to work on something else. So we try to surface that. And engineering managers too. We try to get them to be, a lot of the best ones come from IC land. Like they started as ICs. So we try to get them to do, if they're hired outside,

Starting point is 00:40:25 to start as an IC at Shopify before they move into management. Oh, here's a fun one that I think we should talk about. You mentioned something super interesting to me. When you're hiring engineering directors and above, in the past, it was the usual interviews, you know, culture fit, strategy, all that stuff. You added a coding interview for every single engineering director and above hire. Can you tell me about this? Yeah, so it's interesting.

Starting point is 00:40:46 Like, it's maybe shocking to folks. And I know, like, Rodney's sitting there. He used to work for me. I did the pairing interview with him. And so what happens is it is shocking for VPs, especially, to be like, whoa, there's a coding interview? I'm like, yeah, because we believe that the best leaders here were ones who were not running away from coding.

Starting point is 00:41:02 They just felt like they got better leverage from running a team. They still are deeply in love with technology, and they still, in Rodney's case, code on the weekend, right? And so it worked out super well. But our whole idea is that you're not running away. You're running towards technology, and it's just a better way for you to express it. So I pair with the candidates,

Starting point is 00:41:19 and they also see that even though I'm not writing code every day, I'm still deep in the weeds of technology. I still love technology, and I still want to talk about technical topics. And so we pair, and some people believe that that's not the best for them, and there's lots of great companies out there where that's not the requirement. But at Shopify, we believe people should be as close to the details as possible, and we are such a tech nerd company. Like, again, our salespeople are vibe coding now.

Starting point is 00:41:40 We want our engineering leaders to be coding as well. And so that doesn't mean coding day-to-day. But you should understand code and how code works. And a lot of it comes back. The muscle memory of coding will come back in these pairing interviews. Yeah, and I guess especially now we have these coding tools when you can generate something as long as you can look through it. Like you're in control.

Starting point is 00:41:57 This is the best part about co-pilots is that a candidate comes. The copilot generates tons of code. And now I'm like, great. Is that good code? Bad code? I love... Hold on. So you're using AI during your interview process. Yes. Oh, you're not running away from it.

Starting point is 00:42:11 No. You know, one of the interesting stories we just learned, Cursor has banned it in their interview process. They're the AI tool company. Right. So you're embracing it. We're embracing it. Okay, how is it working? Tell me. I love it. Because what happens now is the AI will sometimes generate pure garbage. So you're screen sharing and you say, literally, use whatever you want.

Starting point is 00:42:27 And they're using, I say, Copilot or whatever. We let them use whatever they want. Here's what I'll say. If they don't use a copilot, they usually get creamed by someone who does. So they will have no choice but to use a copilot. Sometimes I will shadow an interview and do the questions myself, having never seen them, with a copilot, and send it to the interviewer and say, please mark my assignment as well against the candidate. I have not lost yet.

Starting point is 00:42:46 If they have not, if they don't have a co-pilot, they will lose. But when they do have a co-pilot, I love seeing the generated code because I want to ask them, what do you think? Is this good code? Is this not good code? Are there problems? And I've seen engineers, for example, when there's something very easy to fix, they won't fix it.

Starting point is 00:43:02 They will try to prompt to fix it. And I say, are you really an engineer? Like, I get the nuance of, like, just prompt and prompt and prompt, but sometimes it's right there and they will not fix it. I'm like, change the one character, and they won't change it. And I'm like, okay, so they're... Interesting. See, I don't want you to be 100% AI code. I want you to be like 90 or 95.

Starting point is 00:43:20 I want you to be able to go in and look at the code and say, oh yeah, there's a line that's wrong. So I want to ask you, what does a standout senior or staff engineer look like today at Shopify, and what has changed from, let's say, three years ago, when we didn't have any of these tools? I'm trying to prod here at how you think AI or AI tool usage might have changed what we expect of an amazing engineer. And think of the person who you think is, like, this fantastic person. Well, I would say the thing that's changing is I'm seeing some of my engineers now actually use these AI tools to, like, do the infrastructure work that they've always

Starting point is 00:43:52 wanted to do but never felt like they had the time for. So, like, tech debt reduction, refactoring, making things easier to read. Those are the things I'm seeing the best engineers at Shopify use AI tooling to actually do, and they just never felt like they had the time or the resources. Like, the engineer who always felt like, I wish I had, like, six months to do a code red, now can use an AI refactoring tool,

Starting point is 00:44:13 like a Claude Code or an OpenAI Codex, to, like, unleash these agents in parallel and then review all the PRs, and feel like they have a team themselves. So that's what I'm seeing, and they're not afraid of trying things that may or may not work. They might unleash something for 24 hours

Starting point is 00:44:30 in, like, a Devin-like tool, and it comes back and it's, like, utter garbage, and they delete it all, but they say it was worth trying. Yeah. And as a closing question, a lot of people are thinking, we'd love to be like Shopify. Like, I'm either an engineer, or a staff engineer, a director of engineering, et cetera. I'd love to transform my culture to be a bit more AI first. What would your advice be? How can people start? Like, not everyone has the luxury of having the CEO saying, okay, I love these things. Do you have tips that you give to your peers? I know you're in CTO groups, for example.

Starting point is 00:44:59 Yeah. So I think that the number one thing is role modeling is the best example. I haven't seen anything work better than role modeling. And the way that happens is, like, you have to do it. So if you are coding and you are showing people your workflow, and you are in those same channels saying, like, hey, I was trying to get this working and it wasn't working, or I was able to get this workflow

Starting point is 00:45:19 going with this prompt. We have a prompt library internally where you can literally grab a prompt. And say, like, you know, hey, this prompt worked for somebody. Let me try it and modify it for my use case. That's the number one way in which I see people trying to adopt AI: because they've seen other people do it. And so we try to share, like, AI use cases.

Starting point is 00:45:37 We have, again, prompt libraries. We had a hackathon, like, as a part of Shopify Summit a few weeks ago, and we leaned heavily into AI. And it wasn't just junior people learning AI. It was senior people trying to embrace Claude Code. And one guy said, I haven't coded in six weeks. I've only used Claude Code. Then I pushed him, and he said, okay, I have to make changes, like, here and there.

Starting point is 00:45:54 I said, cool, you're 95% AI, but that's the point. You're not supposed to be 100. And role modeling is the number one thing I've seen. That and sharing examples. So role modeling and then sharing. Especially because no one has figured it out. No one has, exactly. No one's there yet.

Starting point is 00:46:08 Well, this is wonderful. I love how much you're experimenting. I love how you are not afraid to, you know, like take some churn to learn more. And I love how much you're sharing internally. I think we got a little window of this, but I think it's just something I hope more people, more companies will do. So this was wonderful. Thank you so much.

Starting point is 00:46:26 Thanks for having me. I found it refreshing to see just how practical Shopify is about LLMs. It feels to me that they understand that you need to invest and experiment to get results. One surprising thing for me was understanding how Shopify expects to become more effective using AI tools by hiring interns who come in with an open mind towards AI tools and can come up with clever use cases for them. For more details on how Shopify operates, check out the deep dives in The Pragmatic Engineer, linked in the show notes below. If you've enjoyed this podcast, please do subscribe on your favorite podcast platform and on YouTube. This helps more people discover the podcast, and a special thank you if you leave a rating.

Starting point is 00:47:00 Thanks, and see you in the next one.
