The Infra Pod - Code refactoring is more than an AI problem! Chat with Jonathan at Moderne
Episode Date: December 16, 2024
Ian and Tim sat down with Jonathan Schneider, CEO of Moderne, to talk about large-scale source code refactoring and the journey of starting a company focused on automating this complex task. Jonathan shares insights from his experience at Netflix and discusses the intricacies of automatic refactoring, the importance of maintaining code consistency, and the interplay between deterministic rules and AI tools in managing code migrations. The conversation touches on the applications and challenges of using AI in code refactoring and explores the potential for hybrid solutions in the future.
00:00 Introduction to the Infra Pod
00:19 Jonathan Schneider's Background and Journey
01:07 Challenges in Code Migrations
03:16 Moderne's Unique Approach to Code Refactoring
05:13 Technical Deep Dive: Abstract Syntax Trees and Lossless Semantic Trees
15:05 AI in Code Refactoring
25:10 Future of Code Migrations and Hybrid Approaches
32:40 Spicy Takes and Industry Predictions
39:35 Conclusion and Contact Information
Transcript
Welcome to the InfraPod.
This is Tim from Essence VC, and Ian, let's go.
Hey, this is Ian, lover of DevTools, and couldn't be more excited to talk with Jonathan Schneider, CEO of Moderne today.
Jonathan, tell us a little about yourself.
Yeah, I'm Jonathan, builder of DevTools, and I guess lover of DevTools, if I'm going to build them.
I've been doing automatic refactoring,
like large-scale source code refactoring for quite a while now.
That's what we'll be talking about today.
Started that back at Netflix, almost 10 years ago now.
And doing it there because I was on engineering tools
and the team there had that freedom and responsibility culture,
which meant you couldn't impose any constraints on what product engineers did.
So it really led us down the route of automating a lot of these things
that we wanted to get folks to do.
But anyway, took a detour from there to the Spring team at Pivotal
and worked on Spinnaker for a while.
So other infra-type things, right?
Continuous delivery and then walked back into this as a result of that.
Amazing.
And what made you decide that,
hey, I'm going to go start a company around code migrations?
What was it?
Was it just like, this is a huge problem?
Was it like you had some insight from your experience?
What got you to the point to say,
I'm going to go build a company here
and try and really solve this problem?
Because I think from my technical experience,
this is a ridiculously difficult problem space. Ridiculously difficult.
And there's a lot of user interface, and there's a lot going on
in terms of how you actually get something to work here. So I'm curious, what got you over the hump
to go and do a company? I think it was two things. I mean, one,
maybe just a bit of ignorance, starting this even
at Netflix, like this problem of large-scale refactoring, I didn't exactly understand how deep this problem was going to get.
Like you said, like that's intuitive to you.
It wasn't yet intuitive to me at the time.
So I started it there, saw some early use cases around, you know, replacing one logging library with another, fixing this or that vulnerability. But as I became more and more aware of the data needed to support that kind of thing,
I started asking for more team, right?
Like, you know, so I had a skip-level meeting with Yuri Izhevsky, our VP of platform engineering
at the time, and was going to make the case for building a team around this at Netflix.
And he said, just no.
Basically, this doesn't really... it's just too much, right,
for Netflix to take on. The ecosystem's too broad.
And so I guess I had a little bit of a pause in that, and I was working on
Spinnaker continuous delivery, taking that into
enterprise environments, banks and retailers and so forth, and
trying to teach them about automated canary analysis.
And they were like, well, thanks.
You know, talk to me in a year when I'm done moving this framework version to that framework
version or this language version to that version.
And they're always super depressed about it.
And also, they were always asking for pretty much the same things because it didn't seem
to matter the business vertical or size.
Like they're all building on the same substrate of third-party and open-source stuff.
So I kind of felt pulled towards this problem
by just that experience, that repeated experience over and over.
What is Moderne working on? If you had to make the
pitch for the company and why you're different from other approaches, what is it
that's unique to your approach? Because this isn't a new problem, and there's a graveyard
of companies, some that I've put money in, that have now gone to zero, and that's
totally okay. What is Moderne's thesis or approach, or
what is it that makes you differentiated from the other folks, and
how are you approaching it in a completely different way than before?
I think, first, the principles that I had to start with at Netflix,
because of that freedom and responsibility culture, were:
if I was going to make a code change, it had to look idiomatically consistent
in the context of the repository the change is being inserted into,
because there was no stylistic consistency across repositories;
it was all over the place.
And if it produced an accurate change that didn't look
like one a developer on the project made, engineering would just reject it. And I couldn't force it.
It's almost like a lot of humble pie. Like you have to take this as like,
I could not impose a constraint on them. So I couldn't say, oh, you know, the style of your
project is wrong. Oh, you should have been using this build tool. You should have been doing that.
It's like this almost extreme form of meeting people where they are
without requiring them to make a change first.
And then what's the smallest change I can make that delivers value?
Maybe it's just replacing Netflix's Blitz4j library,
which we never should have developed in the first place,
with SLF4J for logging.
It's a relatively small change, still difficult,
but I'm not doing COBOL to Java to begin with.
I'm not doing .NET to Java or vice versa.
Let's just start with a smaller problem.
Then we just kind of built up layer after layer after layer
of sophistication over time.
I kind of generally think of this as a combination of humility,
approach the project as it is right now,
and this fierce resolve: we'll figure out a way to make this next little bit, proximal
to what we're doing right now, work. The question I have is, how does it work? I think it'd be
useful to help our audience, and even myself, better understand. You know, I need to do
a migration. I want to upgrade from React 14 to whatever the new version of React is.
I need to go upgrade Log4j to some new thing because it is terrible.
What are the typical use cases for migrations inside companies?
And then what's the workflow typically look like?
And it'd also be interesting to understand what use cases are easy
and what use cases are hard in this code migration space.
Because I'm sure there's a set of things that are actually like,
oh, this is really simple, it's a find and replace effectively.
And then there's a bunch that are like,
actually, this is crazy difficult.
And I think for a lot of people, we approach this space,
we think, oh, we should be able to do something here.
But you get into it and you're like,
oh, it's actually deeply sophisticated.
And so I'm curious to understand sort of your thought process there,
what you think is easy, what you think is hard,
and then what is it, if there's some insight you've had
about how to make the hard things easy
or maybe not easy, but possible.
Yeah, absolutely.
I think I'll start from what my initial use case was.
And then there's kind of the punchline
of what we think about the market later,
but I didn't have the insight
into what the market was going to be when I was first starting out. Back to the initial problem
of Blitz4j: it's just a logging library that looks like every other logging library, to SLF4J.
That was the first use case at Netflix, and it was almost possible to do that with find and replace.
Like, I'm looking for logger.info,
and I'm looking for a string with a certain kind of parameterization.
The parameterization mechanism is a little bit different in these two libraries,
and the order of the arguments is a little bit different.
And then it gets just brutally difficult to do even that simple thing,
which is, you know, replacing the parameterization and reordering the arguments,
because they can span different lines,
and you can have this first string argument be a binary concatenation of a string literal and a method call and
another variable, and then just wrapped in some other method invocation. So you're
certainly going to get to the point where regex isn't going to do it.
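To make that concrete, here's a hypothetical before-and-after of the kind of call site being described; the exact Blitz4j signature and the names used here are illustrative, not the real library API:

```java
// Before (Blitz4j-style): the "message" is a concatenation spanning lines,
// mixing a string literal, a method call, and a variable, and the throwable
// rides along as a separate argument.
logger.info("Request " + request.getId()
        + " failed for user "
        + userName.trim(), exception);

// After (SLF4J): the concatenation has to be unpicked into a parameterized
// format string plus an ordered argument list. No regex reliably recovers
// the argument boundaries across all the syntactic shapes the "before"
// form can take.
logger.info("Request {} failed for user {}",
        request.getId(), userName.trim(), exception);
```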
And so the next step would be to think, well, let's do abstract syntax tree manipulation. And I can understand why one would start there.
But even that wasn't sufficient, just because of the first problem we selected:
when I see a field called logger, it's potentially not visible anywhere in that source file
what type of logger it even is.
Like, I could have inherited that field from an abstract base class that came from some binary dependency, so there's no import statement
to indicate where it's coming from, and there's no type definition. And so even for that very first thing,
I needed not just the abstract syntax tree; I needed to know everything the compiler knew
about every method call and every field. And so we call that now the lossless semantic tree,
which is like an AST plus everything the compiler knows,
plus all the transitive dependencies.
And of course, the set of data we bolted onto this thing
has grown over time,
but it winds up being that the minimum information
I need to make an accurate change
(and again, an accurate change is important in this case)
was greater than what was visible in the AST.
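A minimal sketch of the situation he's describing, with hypothetical class and package names; the point is that nothing in the second file's text, imports included, reveals which logging library the field belongs to:

```java
// File 1 lives in a binary dependency: the migration tool never sees this source.
package com.example.platform;

public abstract class AbstractService {
    // Here it happens to be SLF4J, but the tool only has the compiled jar.
    protected final org.slf4j.Logger logger =
            org.slf4j.LoggerFactory.getLogger(getClass());
}

// File 2 is the repository being migrated. There is no logger-related import
// and no field declaration anywhere in this file:
package com.example.billing;

import com.example.platform.AbstractService;

public class BillingService extends AbstractService {
    void charge() {
        // An AST sees only "a call to info() on a field named logger".
        // Whether that field is Blitz4j, SLF4J, or java.util.logging is
        // exactly the type information the LST carries and the AST lacks.
        logger.info("charging");
    }
}
```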
And so I think where it's easy to start wrong is to think,
my most important thing to do initially is to support
as many languages as possible and to support them somewhat shallowly,
maybe with like an abstract syntax tree.
And then maybe later on I'll try to back my way into figuring out
what the compiler knows about it as well.
Even for a language like Java, like trying to compile an arbitrary repository
without reconfiguring every repository
is like a really deep problem.
But that's the data that we felt like we needed
from the beginning: that lossless semantic tree,
everything the compiler knows as well.
Let's come back to that, because that data, I think,
has an interesting connection
to AI now as well, in applications, but we'll put a pin in that for a moment.
So in terms of use cases, you know, at the very beginning, day one, I was thinking Blitz4j to SLF4J.
That's a fairly narrow use case.
I was thinking of replacing one kind of method call.
Then it was, oh, we'll try to fix this kind of Zip Slip vulnerability, the one that Snyk found so many years ago.
That required a little bit of local data flow analysis, but not global data flow analysis yet.
So this kind of scope has widened over the years.
In general now, we call this concept tech stack liquidity.
The idea of being able to move from state A to state B.
And that could be framework version one to framework version
two, or it could be language version one to language version two. But the use cases that
have had even more business ROI have been like, hey, I just had a vendor come back at me with
a 15 times price increase. I'm offended. I want to rip it out. I want to replace it with another
wholly similar one. But I have to go through all my code and replace all the API calls that are unique to that vendor with some other
one. And there are ones that we're probably not too scared to say publicly,
like Oracle to Postgres, right? Like, I'm trying to kill off the Oracle bill. You can
imagine the sort of desire. And right now, especially in the enterprise, that's really difficult to plan.
Those sorts of activities: if you just go to a principal engineer, how long will it take for you to replace Oracle with Postgres?
It's like, I have no idea.
How do you even put an estimate around something like that?
It just becomes very difficult to plan.
So those are the kinds of problems I'm still interested in.
It's not necessarily moving from a 40-year-old COBOL application to modern Java microservices.
I still think that's like just complete re-architect, you know, rebuild.
It's tech stack liquidity, you know: inside the same language, potentially moving from one vendor to another, one framework version to another, one language version to another.
Patching SAST-type issues, SCA-type issues,
those are all in the realm of possibility in my mind.
And I think you reference this lossless semantic tree almost everywhere on the website.
I saw you had a summit recently, and it gets mentioned there too.
I definitely want to dig into this a little bit.
So from my understanding of the technology overview page, the LST carries much richer information than an AST.
And it's good that it's intuitively easy to understand as a concept.
But what's probably harder is, why can't an AST just bolt on a bunch more data as well? What is the technically difficult
thing about having any existing
AST just add a bunch of metadata,
for example? I think you talk about a lot more
vertices, there's duplication.
I feel like there's actually quite
some nuances here. So can you talk about some of
the fundamental hard things?
Why can't you just add stuff to ASTs?
And what are the things you guys have to get
right to make sure LST is done in a way that is able to do what you guys are doing?
In many ways, this is a great intuition, actually.
Why isn't it just adding stuff to AST?
The way the compiler itself for any given language works is a multi-step process.
It's taking the text of the code.
It's tokenizing it.
It's producing an abstract syntax tree.
And then there's another phase that we'll vaguely call type attribution, where it's going back through that AST and it's trying to solve for what each thing means.
Keeping track of the symbol table and all this kind of stuff.
So that is what the compiler is doing.
And it uses that type-attributed AST ultimately to produce whatever its end result representation is.
So if you're starting, like, you know, almost with horse blinders on here, I'm just interested in Java for a moment.
I'm starting from the perspective of: I'm just going to exercise the internal guts of the Java compiler and march it through the first few phases, up to the point where
there's this type-attributed AST. And now I'm going to try to rip all that
information out and put it somewhere over here that's designed for serialization or designed for
refactoring. Of course, at that point, the compiler doesn't care about whitespace,
right? That's all been discarded. Comments have been discarded, because that's not important to it.
It's an intermediate representation on the way to whatever it's trying to get to, so it's got a lot
of information, but you have to kind of stitch other information onto it.
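To make those phases concrete, here's a minimal sketch (my illustration, not Moderne's actual code) of marching javac through parse and type attribution via the standard javax.tools and com.sun.source APIs:

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.JavacTask;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

public class ParseAndAttribute {
    public static void main(String[] args) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        try (StandardJavaFileManager fm =
                     compiler.getStandardFileManager(null, null, null)) {
            Iterable<? extends JavaFileObject> sources =
                    fm.getJavaFileObjects("BillingService.java"); // any source file
            // With javac, the returned CompilationTask is a JavacTask,
            // which exposes the compiler's phases individually.
            JavacTask task = (JavacTask) compiler.getTask(
                    null, fm, null, null, null, sources);
            Iterable<? extends CompilationUnitTree> asts = task.parse(); // tokenize + parse
            task.analyze(); // type attribution: symbols and types get resolved
            // At this point the trees carry everything the compiler knows.
            // A refactoring tool would now copy that information, plus the
            // whitespace and comments javac normally discards, into its own
            // serialization-friendly structure before moving on.
            asts.forEach(ast -> System.out.println(ast.getSourceFile().getName()));
        }
    }
}
```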
If I was instead trying to say the AST is sufficient, I think what I would start with is Tree-sitter or ANTLR or one of
these other sort of general-purpose text-to-AST tools.
And then what I would find is I have an AST
and then I have to build up
essentially the type attribution mechanism
that mirrors and does the same thing
that the compiler is doing.
And that's work I don't want to do.
I would rather rely on the language compilers
because that's obviously a very complicated problem
with a lot of edge cases.
It's a question of, you know, starting point.
Like, if I think I'm going to quickly cover a lot of languages:
whenever I look at a new refactoring technology or new code search technology,
if I see Tree-sitter in the stack, I think, I don't want it,
because it's not going to have enough information for me to do what I want.
And yet the initial quick demos are always like, oh, look at this,
I can find a JavaScript console.log.
Of course, that's sort of, you know, uniquely identifiable even from the syntax.
But that's just not the reality for most changes that we're trying to make in the code.
Very fascinating.
Okay, so it sounds like there's a fundamental difference,
not just even what you store, but even how to get to that state.
How do you get there?
Yeah, I think the smart thing is to start with the compilers,
which means you're essentially taking on the problem of
how do I exercise that compiler?
You're doing that on a per-language level.
Yeah.
There's just no shortcut to that.
Got it. And so you already alluded to it: you want to talk about AI,
and we all want to talk about AI, the whole world does.
You have AI in a couple of your sessions, so we're going to jump into AI, obviously.
And I think in the whole world of AI, even this whole code refactoring,
or just AI-generated code or AI-modified code,
a lot of people treat this almost as a language model problem, right?
Like sequence in, sequence out.
Code is just a type of thing you look at.
But obviously, I think you're...
I wasn't able to listen to your talk;
there's no recording of the session.
So we don't know exactly what you talked about,
but from the description,
it sounds like there is a pretty meaningful difference
of having this richer type of information available
to actually do AI-assisted, large-scale code refactoring.
Can you talk about the most effective way
people can think about this?
Why is it hard for general-purpose LLMs to just go after this?
Go to Cursor or something: hey, here's my code,
make it into Java 5.0.
What's the fundamentally hard thing for any potential AI
to just do that, versus your approach here,
which is, I need to have this sort of LST
and understanding and rule base?
Do you see some fundamental trade-offs
and some things that you think,
okay, I don't think AI can ever do this?
Yeah, it's such an interesting topic.
I think, first of all, I use Copilot all the time.
And by Copilot, I think you could just as easily say Cursor.
I mean, there's differences in UX a little bit,
but I think code authorship, in the sense that the cursor is blinking in a particular
place, and I'm going to predictively suggest what the next tokens are, is perfectly useful and
additive to the rule-based refactoring that IDEs are already fairly good at, depending on the
language. I think for me, in terms of the total amount that adds to my productivity, it's
questionable. I think maybe 10% or 15% in my view,
but I think it depends substantially
on the language
and how much IDE-based refactoring
there already was for that language anyway.
So if I set aside code authorship,
which is like net new code production,
and I'm instead focused on,
you know, how am I maintaining
a large body of existing code?
I think when you're looking at code as text, or even as an AST,
as some AI-based refactoring tools are starting to,
go back to the original problem statement I had with Blitz4j to SLF4J.
Would the model be able to understand whether a particular field of type Logger
is a Blitz4j logger or an SLF4J logger or a JUL logger or some other thing?
And the answer is there's just not enough data for it to understand what that type is by looking at that data structure.
And so I think the temptation is to say, well, I didn't accurately replace all the loggers, but the models will get better, right?
I think in this case, there's just not enough information. The model right now is actually good enough,
I think, to make a lot of these determinations or to summarize how something's used if it's
provided with that richer information. I've got another example other than the logger thing,
because it's kind of like beating a dead horse on that one. I think there's this interesting
question that came out of one of our insurance customers: what is the set of all REST endpoints in my code
that produce a particular piece of PII, like a last name or something like
that? Find me every REST endpoint that has last name somewhere in it. And when I look at the code,
look at this pet clinic application that has a REST endpoint that returns a list of
veterinarians. In the code, it says list of veterinarians.
It doesn't say last name. But when you look at the type attribution for what Veterinarian means,
well, Veterinarian extends from Person, and Person has a name, and the name has a last name
on it.
And like, so I can drill down through that information in the LST.
There's no situation in which the code as text or the AST is going to enable a model to tell whether that thing has a last name on it or not.
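A hypothetical, pet-clinic-shaped sketch of that situation; the class and route names are illustrative. Nothing in VetController's text mentions a last name; only the resolved type hierarchy does:

```java
import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

class Person {
    String firstName;
    String lastName; // the PII lives two hops away from the endpoint
}

class Veterinarian extends Person {
    String specialty;
}

@RestController
class VetController {
    // Syntactically this is just "a method returning List<Veterinarian>".
    // Only with type attribution can a tool drill Veterinarian -> Person ->
    // lastName and flag this endpoint as producing that piece of PII.
    @GetMapping("/vets")
    List<Veterinarian> listVets() {
        return List.of();
    }
}
```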
I'm really curious to get your perspective.
I kind of think of it as two things, basically.
I kind of have this mental model in my brain.
LLMs, machine learning at the end of the day,
are an incredible way of applying statistics, probabilities,
to solve a problem space. It's a distribution at the end of the day.
And it's not deterministic; it's non-deterministic by nature,
and that's a feature.
And on the flip side of it, program analysis is a broad problem space,
and this code rewrite stuff you're talking about lives in that solution space. It's a deterministic
approach, right? It's rules-based at the end of the day. I assume you have
some type of query that you run against this broad, enhanced AST that you
generate, and through that query you can find things.
And then if I want to make a mutation, I generate a series
of other queries and mutations upon the graph, and then reduce that
broader tree, or that graph, back down,
and basically rematerialize the resulting code from it.
That's my assumption, anyway.
That's exactly how it works.
Where do you think, among the problem spaces that are hard that we talked about,
are there places where the
probabilistic systems, the
LLMs, are good and
the deterministic ones are hard, and
vice versa? Is there overlap?
Is there a world where the hybrid solution is the right
approach? I'm very curious
how you think about where AI sits:
how a thing that is non-deterministic
fits into this world of determinism,
and how that allows you to potentially solve new problems
you couldn't solve before.
I think there's two sides to this,
and they're equal and opposite.
Maybe that's not going to be surprising in the end.
But one is, how do we supply deterministic data
to a probabilistic system?
I think the interesting change recently
has been the introduction of
tool or function calling to the large foundation models. I see this as almost like the inversion-of-control
moment for LLM architectures, where previously I would take a RAG-based approach
and effectively stuff some data into the prompt before I asked a
question. But by asking a question and also being able to supply some tools,
I'm leaving it to the LLM to decide whether or not to
invoke one or more tools to supply itself with more information.
And I see recipes,
recipes being programs that either make a change or produce
columnar or row-based data by
extracting information
out of the LST, as ideal tools to supply an LLM with information.
So one tool that I could provide is the example I just gave, like, where are my sensitive
API endpoints that produce last name?
But then I could provide another tool that says, you know, what's using this transitive
dependency?
What's using this kind of API,
even if it's not visible in the text?
Where are the methods being called that are deprecated?
And there's, of course, hundreds or thousands of these.
I think it could be very interesting.
An LLM supplied with hundreds or thousands
of such deterministic tools
may choose to recombine them
in ways that are hard to anticipate.
It may grab a little bit of information from here and a little bit of
information from there.
And so synthesize some understanding about the code base that would be hard
for me to get to by writing a recipe-like ETL pipeline.
So that's one side of it: providing data to them is important.
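Here's a minimal sketch of that recipes-as-tools shape, with hypothetical names rather than Moderne's actual API; the recipe stays fully deterministic, and the model only decides when to call it:

```java
// Hypothetical interfaces, not Moderne's actual API.
record ToolSpec(String name, String description) {}

interface RecipeTool {
    ToolSpec spec();          // advertised to the model with every request
    String run(String args);  // deterministic query over the LST; returns rows
}

class FindPiiEndpoints implements RecipeTool {
    @Override
    public ToolSpec spec() {
        return new ToolSpec(
                "find_pii_endpoints",
                "List REST endpoints whose transitively resolved response "
                        + "types contain a lastName field");
    }

    @Override
    public String run(String args) {
        // The deterministic LST traversal would run here; the model never
        // inspects code directly, it only sees this structured result.
        return "[{\"endpoint\":\"/vets\",\"path\":\"Veterinarian.lastName\"}]";
    }
}
// At chat time, every spec() goes to the model alongside the user's question.
// When the model emits a tool call, run() executes and its output is appended
// to the conversation, inverting control relative to prompt-stuffing RAG.
```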
And then on the other side, I think you've got a deterministic recipe that's operating on some narrow part of the code.
And it's difficult to make a rule-based decision about what to do.
And the example I like to give of this, I've actually literally seen only one of these so far in all of our custom recipe developments over the years. And that's that there was a particular bank
that was writing most of its code in French,
which French, of course, has particular kind of accent marks
on their characters.
And it just so happens that when a file moves
between UTF-8 and ISO-8859,
it can mangle some of these accented characters. I don't really know why; I guess there's not clean alignment
between the character tables, but at any rate, it does.
And if that happens in the source code, it breaks compilation and somebody fixes
the thing. But if it happens in a documentation comment, nobody notices
because it doesn't break compilation. But in the Java space in particular,
one mangled character in one documentation comment
meant you no longer got a documentation artifact,
like a Javadoc artifact, out of that repo.
And so it so happened that over the years,
it basically had no Javadoc artifacts anymore,
because it only took like one occurrence somewhere in a repo
to break this thing.
And so here you can imagine a recipe that's going down the tree,
and a documentation comment is just one type of tree node in that tree.
And I can be looking at this particular documentation comment and say,
I see a mangled character in here.
Now, I don't even know what the original word was because it had gotten mangled.
But I'll give the whole comment to an LLM and say,
basically, try to fill in what you think the original sentence was. It supplies that output back, and now I'm
surgically inserting that where that documentation comment was in the tree. And then, like you say, I
can rematerialize the source code from that afterwards. There was zero probability that the LLM made some other change in the code that would cause some
problem. And so even though this is an LLM-based suggestion, I'm comfortable merging this or
mass committing this to thousands of repositories because I've sufficiently narrowed the problem,
you know, or the sort of like point of injection. You know, a deterministic system can use an LLM.
An LLM, I think, can use a deterministic system in different ways.
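As a sketch of that narrowing, here's roughly the shape such a recipe could take; the types and the visitor method are hypothetical stand-ins, not the real OpenRewrite API. The property that matters is that the model sees, and can replace, exactly one comment node:

```java
// Hypothetical stand-ins: LlmClient and DocComment are illustrations only.
interface LlmClient {
    String complete(String prompt);
}

record DocComment(String text) {
    DocComment withText(String newText) {
        return new DocComment(newText);
    }
}

class RepairMangledDocComments {
    private final LlmClient llm;

    RepairMangledDocComments(LlmClient llm) {
        this.llm = llm;
    }

    // Called once per documentation-comment node while walking the tree.
    DocComment visitDocComment(DocComment comment) {
        if (comment.text().indexOf('\uFFFD') < 0) {
            return comment; // no mangled characters: node (and file) unchanged
        }
        String repaired = llm.complete(
                "Restore the original wording of this comment, changing only "
                        + "characters that look encoding-mangled:\n" + comment.text());
        // Only this node is replaced; the reprinted source differs from the
        // original in exactly this comment, so a bad completion cannot touch
        // executable code.
        return comment.withText(repaired);
    }
}
```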
So broadly, you believe the future is hybrid.
I think certainly in my own use case...
I've been playing with Cursor;
it's a pretty great IDE in terms of, if you want to have a chat with an LLM
while you're writing some code. The experience is nice.
But broadly, you get to a level of depth that the LLM can't deal with anymore. And so I'm kind of curious
where you see these more deterministic rules-based systems fitting
into solving that. For example, I had a particularly annoying
problem where I was messing around with
basically metaprogramming in Go
using reflect. Has anyone ever
played around with Go reflect? You try to do
anything semi-complicated and you basically
just want to throw your computer out the window
and question why you
ever decided to do something complicated.
But as
an example,
any semi-sophisticated Go code base
will have reflect use someplace.
It's just by virtue of necessity,
because the language doesn't give you...
it does now have some form of generics,
but you're missing some language features,
and the solution was always reflect,
especially pre-generics.
My point is, using the LLMs,
they fall apart in that type of situation
because of the complexity.
A predict-the-next-token,
attention-based model is great
for very surface-level issues,
but anything that actually requires a deep-level representation of the code,
which is any sophisticated code base,
or any code base at scale
along whatever axis,
they start to fall apart on.
So I'm kind of curious, where do you think we are in terms of our journey around
copilots?
And where do you think, from your perspective, your vantage point coming from, you know,
the world of program analysis, where do these two start to merge?
And where do you think we're going to see the upside first and how we get to the next
step?
I'm both like an avid user and not the most optimistic person on this.
And I hate to be a downer, but I actually did my own measurement of this for a couple of days, where I just
had, essentially, key-logging recording of every character that was
being typed in my code for a while.
And the question I was trying to answer is how many characters did I type
versus how many characters did IDE-based refactorings or, you know, completions do, versus how many did Copilot do? And the numbers were not so good
in my favor. It was like 13% of characters were written by me, but it was like 70% of the
characters were written by IDE-based completions or rule-based refactors. And then the remainder was written by Copilot.
And so I think those places where Copilot made a suggestion that I accepted are naturally things
that IDE-based completion or rule-based refactoring would never do. It would be like suggesting a
documentation comment on the top of something, or providing little blocks of code that I may or may not accept.
And so am I better as a result?
Sure.
Like I almost think of it like it's like I'm an Olympic swimmer.
And, you know, this year, you know, you're supposed to wear a long swimsuit or a short swimsuit or whatever the trend is at that point in time.
And if I don't, I'm going to obviously not win the gold medal.
But if I do, I might win the gold medal.
And if I'm just like a couch potato, I'm not going to win a gold medal by wearing that swimsuit, you know. So I think for the best developers, it makes them better. I believe folks when they say it makes junior developers
learn faster. But I think the most effective applications of this find ways to hybridize
these approaches, and that's why I haven't really personally had a great experience with Cursor. I can't get it to work on most of my complex code bases,
because it's eschewed rule-based determinations so much in favor of AI-based determinations. But then I hear people
say, oh, I spun up this new sort of CRUD app very quickly. I believe that's true
as well. So perhaps the greenfield stuff gets quicker to write. The existing code doesn't benefit from it
as much. And that doesn't mean that code is wrong. It just reaches a sufficient level of complexity
and it just doesn't work as well in that situation.
And that kind of builds on the point that, to unlock a broader depth of field,
or let's say to find broader value in the enterprise,
or even any semi-sophisticated situation,
you need some type of hybrid approach.
Which is great for you.
And I'm curious.
It should never be surprising.
I mean, so it is with every tech trend.
It's usually some hybrid approach
that winds up being the right way, ultimately.
Exactly.
That takes advantage of the pros,
but substitutes for the cons.
I'm very curious to get your take on where you think we're at in terms of the concept
of agents in relation to codebases.
I think we're hearing a lot.
You'll listen to OpenAI; they talk about the future of AI as agents: this agent's
going to go and plan out your whole trip to the Maldives for you and book everything,
and your flights, and they've pre-booked your excursions and the reservations at all the
restaurants, and it's wonderful, and you did nothing
except type some things in text.
The foundations of the conversation we're having
are also the foundations of what needs to be true
in order for an agent to work in the context
of a code base.
So you could have not just a copilot,
but something that's minimally semi-autonomous
or fully autonomous.
So I'm kind of curious,
how do you see the future of agents in relation to code generation or modification or maintenance in the context of the conversation we're having today? I think even for, you know, here I am with a rule-based system that's supposed to be deterministic in its output and to some extent provably accurate.
And what do I find?
I find that I still have to force a human into the loop
or the work doesn't get done.
And some of that's like social resistance
from the human themselves.
Like I've noticed one other sort of like trend
of folks starting on like automatic remediation
at any scale.
The most obvious delivery mechanism
is mass pull requests.
So the system will usually be like, I'll issue pull requests for you.
And what I found over time is like, if I, as a central team, say a security team, say
like a platform team, issue mass pull requests to product teams downstream of me, the merge
rate will be pretty low, like 30%, less than 40% certainly.
And I think of it like advice coming from an in-law or something.
I need to have a better analogy here, but it could be great advice; you're just looking for a reason
to reject it. It's just something about it coming from an external party, you're just ready to
resist it. But when we develop an experience where somebody can take that same recipe and issue PRs
for their own code, the acceptance rate is much higher. It's like, the
change was literally the same. But if I pull it as the person that's ultimately responsible for
that code, I will do it. If it's pushed to me, I won't do it. I could be 100% confident about a
particular recipe and know that I have to put a human in the loop in order for them to action it.
You know, I have an AI discussion coming up later today because, of course, whenever we say AI on anything, it triggers additional questions, you know, from prospective customers.
And the last question on one I got just last night, and I'll just read it verbatim, was: where do you see challenges with black-box decisioning?
How do you feel confident that you have full insight into what the models are doing?
Is there end-to-end explainability?
Whether there's gaps, what controls are in place?
And my answer is always the same.
Like, we'll never make a decision that we're not putting in front of a human first.
And if I didn't have that answer, honestly, I don't know how I would answer that.
And I think this is a perfect segue to what we call spicy future.
And the future is complicated.
I like your last note because it's not going to be all rosy.
We'd like to hear what your perspective is.
Give us your spicy hot take.
I think this is going to be somewhere related to our own coding and stuff like that.
But what is your hot take about this space that you think most people don't believe yet? I think I'll reserve the spicy take on AI for someone else, because I think you've heard a lot of those, or maybe you've
heard enough from what I've said today. But, and I mean this to be collaborative, I think Accenture
and Deloitte and the like will get an order of magnitude cheaper. And the reason why is that they will sooner or later adopt this idea of building rules or recipes,
whatever you call it, to solve problems horizontally across a business unit,
as opposed to doing the same operation laboriously, like one repository at a time. And I think it'll be good for them
because there's just,
far from limiting their billable hours,
there's just no end to this kind of work right now.
The types of moves people want to make,
there's just not enough resources to do it.
It's long been my belief that engineering worldwide
is supply-side constrained, not demand-side constrained.
We want more features.
We want more technology.
We just don't have enough people to build it.
Actually, this is super fascinating
that you picked on Accenture.
Because I think Accenture is not just the company itself.
It's almost like the type of industry overall.
And how do you see this play out
maybe within your own touch points of your
customers? Because the
companies you sell to, I think,
are employing a lot of these
Accenture-type companies all the time.
And so you're probably seeing,
maybe close to firsthand,
what's happening. Where are
you seeing the places,
even within your view,
where it's like, okay, this type of work,
maybe this kind of thing, clearly should be replaced by AI?
Do you see places where you think, okay, this is probably the first thing that's
going to come?
Some examples would be great, you know?
Yeah. And in my specific case, I'm thinking partially of AI,
again, enabled by that deep data structure, to answer questions or plan these sorts of
efforts.
Yes, that's the tool or function calling into recipes, I think.
You know, so I understand what the scope of an Oracle to Postgres migration is going to
look like. More specifically, for the actual transformation,
I think the skill that those organizations will employ is one of building automation to accomplish that change, as opposed to doing the work in a rote, manual way, one repository at a time,
which is what they've had to do previously. And as far as what happens right now,
that's already happened on some level. We employ a sort of deep bench of consultants and
freelancers that have already developed that skill, and that we'll put in at particular customers to
accomplish specific modernization objectives. And the whole point there is I'm doing it at a much
lower unit cost than I would have done otherwise. And I think naturally, like anything, that kind
of trend starts with smaller, more forward-thinking organizations that want to compete, and it works its way up that vertical ladder.
I'm really curious what you think about the future is for vendors in general.
If you look at a lot of... Tim and I, I do a lot of angel investing. Tim does a lot of seed,
pre-seed stage infrastructure investing.
Most infrastructure, if you really want to make it work,
you got to go find out how you do the swap.
How do I get you to go from old-school Kafka to whatever new-school thing I replace Kafka with?
This is a huge issue for the adoption of new technologies.
It may be more cost-effective to run this thing,
or it may be more productive to build a new thing,
but a lot of the world's workload is running on this old thing.
So I'm really curious to understand,
what do you think the future of migration is? But more importantly, what do you think the future of code swap,
the ability for people to swap components in and out of their stack, actually is?
A lot of it depends upon how cheap and easy you can make a migration.
Absolutely.
One way of allowing swappability is to provide
an API-compatible interface
and then swap out the component underneath it.
We've seen a lot of that.
But the other part is just,
we'll go back to the code and make the change
so that you don't need that compatibility layer in between.
And again, I don't want to claim that OpenRewrite is the only way this is going to work.
I think there are ultimately going to be multiple competing technologies that do this. I expect vendors to provide recipe-like things to more quickly expand their usage inside of a customer.
And that's a good thing for both the vendor and the customer, because they want to use the same product, obviously. There's an advantage to a framework author in providing version-to-version recipes, because
they can iterate faster and therefore compete better against alternatives in their
ecosystem. It may be a survival characteristic for framework authors, because as somebody selecting a
framework, if I'm choosing one that has a well-paved, automated migration path, I know I'm not going to hurt next year when you make changes, versus one that
doesn't.
I think there's this interesting dynamic that we've seen play out,
which is a lot of like framework authors are kind of owned by a larger
organization.
So you can think of the Spring team,
which is now at Broadcom, or Quarkus, which belongs to Red Hat.
And so you have a customer that uses that framework heavily and is a customer
at the same time of like a Broadcom or a Red Hat. I mean, I don't think I need to suggest this very
often for them to go, hey, I can put spend at risk to like compel them to write the recipes
because they keep making breaking changes that impact me and I'm tired of it. And curiously,
the largest organizations, the largest enterprises, which you usually think of as having the least impact on
forward technology versus, like, a Google or whatever, have the most impact here, because
they're the ones that can put the most spend at risk. It's just like a thousand-pound
gorilla that can be like, no, Red Hat, I need this recipe, right? Or no,
LaunchDarkly, or
Datadog, or whoever it is
that's making breaking changes for them.
It's a benefit to everyone.
Awesome.
Tell us, for folks interested
in learning about Moderne or OpenRewrite,
the places we should
find you and your company.
Yeah, I hope you don't find this ironic.
You can find it at moderne.ai now
because we do have to talk to people like you somewhat, Tim.
And that.ai helps a little bit to get that initial conversation.
Yeah, I heard getting .ai domains increases your valuation.
That's right, exactly.
Within seconds.
Yes, yes.
So it doesn't hurt to have one.
That's right. That's what I see.
There's a whole auxiliary technology there.
Yeah. Well, thanks for being on the pod, sir.
It's great to have you. Appreciate it.
Awesome. Thanks so much.