The Changelog: Software Development, Open Source - Machine powered refactoring with AST's (Interview)

Episode Date: September 19, 2019

Amal Hussein (Engineering Manager at npm) joined the show to talk about AST’s (aka abstract syntax trees). Amal is giving a talk at All Things Open on the subject so we asked her to give us an early preview. She’s on a mission to democratize the knowledge and usage of AST’s to push legacy code and the web forward.

Transcript
Starting point is 00:00:00 Bandwidth for Changelog is provided by Fastly. Learn more at Fastly.com. We move fast and fix things here at Changelog because of Rollbar. Check them out at Rollbar.com. And we're hosted on Linode cloud servers. Head to Linode.com slash changelog. This episode is brought to you by Linode, our cloud server of choice. It's so easy to get started with Linode.
Starting point is 00:00:21 Servers start at just five bucks a month for your big ideas. Head to Linode.com slash changelog. Choose your flavor of Linux that works for you. Then pick a location that's right for you. London, Tokyo, Dallas, and many other places in the world. They've got you covered. Go from having that amazing shower idea to a hosted website in just minutes. Start small.
Starting point is 00:00:40 Expand as your idea blossoms into a huge hit. And we trust Linode because they keep it fast. They keep it simple. Check them out at Linode.com slash changelog. All right. Welcome back, everyone. This is The Changelog, a podcast featuring the hackers, the leaders, and the innovators of software development. I'm Adam Stacoviak, Editor-in-Chief here at Changelog.
Starting point is 00:01:07 On today's show, we're talking to Amal Hussein about ASTs, aka abstract syntax trees. Amal is giving a talk at All Things Open on the subject, so we asked her to give us an early preview of it. She's on a mission to democratize the knowledge and usage of ASTs to push legacy code and the web forward. And by the way, we'll be at All Things Open. We're hosting a live JS Party on stage. Plus, Jared is giving a talk on Svelte for a radical new approach to building user interfaces.
Starting point is 00:01:36 And as a special thanks from the team behind All Things Open, we're giving away five free passes to the conference. And all you have to do is tweet, I want a free pass to All Things Open because, and state your reason why. And copy at changelog and at All Things Open in the tweet. We'll send DMs to each winner next Friday, September 27th. Good luck and enjoy the show. Oh, one more thing for those who don't want to wait and just want 20% off your pass right now,
Starting point is 00:02:02 use the code changelog20 when you buy your tickets. The code is unlimited, so tell your friends. Head to allthingsopen.com to learn more and register. So, Amal, thanks for joining us. First of all, congratulations on your first week as engineering manager at npm. It's bittersweet. Tell us what's new here.
Starting point is 00:02:22 Thanks so much, Jared and Adam. So, hi, everyone. My name is Amal Hussein. I am a new engineering manager at npm. It's my first week. And I came to npm via Bocoup, where I was an open web engineer working on some pretty awesome stuff in terms of web conformance suite testing with browser interoperability, as well as working most recently on GameBender, which is a Scratch-based game console, which
Starting point is 00:02:55 uses computer vision and all this other cool stuff, all open web APIs, to teach kids how to code creatively. So that's what you were doing at Bocoup, or that's what you're doing now at npm? That's what I was doing at Bocoup. I was doing a lot of work around products. I would say product engineering. And really it became very clear to me
Starting point is 00:03:21 that I needed to kind of boss up a little bit, because I was, you know, consistently, like, just really, I think, strong at managing up, sideways, and down, and for a pretty large project. I was a tech lead for that project. And I'm stepping into my love of product by, you know, being an engineering manager, which combines, I think, for me, the best of both worlds: you're able to be hands-on with the team and drive technical strategy, and you're also able to work with all of the stakeholders that are involved in the
Starting point is 00:04:00 software delivery process. And it's something that I really enjoy doing. I've consistently been the go-to person at every team, at every company for a variety of things. It was a really difficult decision to make, if I'm honest. It was very, very difficult. I identify as a woman and as a person of color. For me to walk away from the full-time responsibilities of delivering
Starting point is 00:04:29 software, just that aspect, it was a very difficult decision. But I realized that there's even less of me, you know, in engineering leadership. And so, you know, that's where I think I get some kind of solace, which is that I'm giving folks an opportunity to have a woman of color as a manager, which is a very rare thing for most people in our industry. Well, that's awesome. I'll say congratulations and good luck, because you're just getting started
Starting point is 00:05:01 and I hope you have a lot of success there. Boss up. It was time to boss up. I like that. It was time to boss up. And own my bossiness too. You just have to take a step back and realize, you know what?
Starting point is 00:05:15 Hey, I can do this. It's quite simply that. And I think a lot more folks from our industry need to make the hard decision that I made, because there's a ton of really bad managers, and folks who really don't focus enough on mentoring or don't focus enough on, you know, just kind of the overall technical strategy. So, yeah. Tell you what, being a leader is one of the toughest positions, because you get criticized, scrutinized,
Starting point is 00:05:45 not only by yourself, which is where it usually begins, but then also from the externals. You know, people who don't even know you will criticize you, and then people who really know you will also criticize you. So being a leader is tough. Yeah, it's a really tough position. And that one in particular that you mentioned, you know, with interfacing with so many stakeholders, it really requires somebody who's very empathetic, right? Someone who can see all sides, kind of put themselves in everyone else's position to sort of, like, drive the ball forward, and take nothing personally, or at least try to. Yeah, I agree wholeheartedly with your analysis there. There's a great, great quote,
Starting point is 00:06:30 heavy lays the head who wears the crown or something like that. And there's a lot of, I think, freedom you get in a leadership role where there's a lot of autonomy, you're able to kind of drive decisions and really make an impact for good or for bad. But with that comes a lot of responsibility, and one of those is taking responsibility for failures
Starting point is 00:06:55 or missed opportunities. And I think what's interesting at npm about this is I've always had a dream of being a toolmaker. Tooling, that's kind of like my stuff. That's like my jam. I've kind of always been into architecture, infrastructure, how things connect. I'm very much like an in-between person.
Starting point is 00:07:19 When I worked on server-side code, middleware was something that was interesting to me because of the intersectional nature of it. And so at npm, in many ways I'm fulfilling my lifelong dream of being a toolmaker. And I think as an engineer that's a toolmaker, we have the toughest customers because people are relying on us to then do their jobs and make their magic happen.
Starting point is 00:07:51 There's this extra layer of not only scrutiny, but also we're the toughest customers, software engineers. We're the toughest customers because we can make the thing that we're using, if we really sat down. Sometimes you do. Sometimes you make your own thing because somebody else's thing isn't good enough, right? You got two things, right? Is anybody else's thing ever good enough? Let's be honest, you know. So, yeah, if you could wave a magic wand and, like, you know, have the skills to write your own IDE in a day or a week, I bet you would, you know, because you want it your way. And so, you know, there's an arrogance and there's a pickiness in our industry. And much of that, I think, is to be expected.
Starting point is 00:08:49 We have really hard jobs because ultimately the engineers that kind of criticize you as a toolmaker that serves them, those same engineers are also criticized by their users and customers. So ultimately they're also being judged. So it's like an exponential judging chain. What's interesting there is that contentment is often the enemy of progress, right? So like, if you're content, you tend to not want to progress and get better, you know? So then you have this idea of discontentment sort of like becoming a norm in our industry, where in some cases discontentment is sort of frowned upon right
Starting point is 00:09:25 like, to be discontent means you, it's just, like, not a good position to be in, I suppose, because it breeds envy and jealousy, right? Right. And so, you know, as an industry, just based on the desire to progress, which we all want to, because that means that our tooling gets better, our software gets better, et cetera, you know, if we have to live lives of discontentment, I wonder how that really impacts us psychologically in our industry. Yeah, I think that is a topic that I would like to dive into, like, not right now, right here, but definitely in the future, because I think there's the intersection of psychology and all of the pressures that are on
Starting point is 00:10:11 us as engineers and the continuous improvement, continuous change. I wish we had more cultural anthropologists that were studying technologists, because I think there's a lot of really insightful behavior, and just insightful things in general, you know, that are probably very unique to our industry, and how those things kind of play out on our lives outside of the terminal, you know, I think that's another really interesting story. I like that. Yeah, beyond the terminal. Yeah, buy that domain now. Sounds like a podcast. Since you mentioned your desire for this, it's something, a podcast we're actually creating
Starting point is 00:10:51 called Brain Science. That's true. Oh, we have. That's dope. In the pre-call thing we mentioned. Maybe I should be on your podcast. You know, we're actually taking guests sometime soon. We want to dive into this. We're exploring the inner workings of the human brain to understand things like behavior change, habit formation, mental health, and basically what it means to be human. So, brain science applied: not just what we know about the brain, but how can we apply what we know about the brain to sort of transform our lives and better our lives. And some of that is this anthropologist-type approach towards our industry. Yeah, I'm really happy to hear that. There was a major at my college that was called Society, Technology, and Policy.
Starting point is 00:11:40 And I thought that if I was like 20 years older when I went to school, like I feel like that's what I would have done. Because I find I would have probably done that as like a double major because it's, you know, for me, I consider myself like a very intersectional human because of a variety of things. Like, you know, not just my family background and life experiences, but even just my interests within the industry. I'm an engineering manager. That job is hugely intersectional. And so I think that's a super relevant thing to explore and what the effects of that are moving forward
Starting point is 00:12:29 as we progress in this new and uncharted territory of the digital age. I'll add one more layer to that. We often look at the internet as, in so many years, like being a teenager. I think, what, it's about 20 years old now? I remember a couple years ago it got its driver's license. So I think, okay, yeah, drinking age, like 21 in the U.S. So I know that software's been around longer than that, but that would mean
Starting point is 00:12:55 that in a similar way engineers in that era are similar in their maturity level not so much individually but corporately. Meaning that we've been doing this internet thing for the same amount of time the internet's been around? Basically, yeah. So we can assume, I would say to some degree, that our awareness of how to best drive the thing is predicated on how old the thing is.
Starting point is 00:13:24 Definitely a young industry. Right. So we're still learning. We make mistakes. And that's human. And a changing industry, right? Physicists, there's a lot to learn beyond astrophysicists, right? But the basics of physics
Starting point is 00:13:40 are the same as they've been. And so that's a thing that I go back to the idea of civil engineering. How to build a bridge in a structurally sound manner is a tried and true science. Right, right. It doesn't change every year. You could have written that book 100 years ago
Starting point is 00:13:57 and it'd be slightly different now, but it'd be pretty much the same foundations. Whereas we're kind of figuring out this software engineering network-based industry where we live our lives and we have our jobs and they're kind of like in the same milieu and like all that kind of stuff is we're very much living it out as we're trying to develop it and we're making mistakes that impact people that we don't even know, et cetera, et cetera. So it's very young and therefore I feel like we really don't understand
Starting point is 00:14:25 what all the implications are at this point. Yeah. I want to go a little deep, a little, like I would say a dollar store philosophy maybe on y'all, which is here's my dollar store philosophy. So what's really interesting about the web is not only how young it is, but also the impact that it's had in the amount of time, right? And just how exponential it is in so many ways. And then you look at the under-the-hood experience with developers and just how much change we've had and how actually developing for the web
Starting point is 00:15:05 is an extremely hostile thing. In what other industry do you know where we create and we're like, well, hope this works. Ship this and I hope this works. It's really interesting to watch the transitions that we've had where 15 years ago or more, it was like,
Starting point is 00:15:30 hey, actually probably about 15 years ago, a user comes to a website and the server's like, hey, tell me who you are. And it's like, Netscape. And then it's like, okay, here's your code for Netscape, right? It's like, we've come a long way, even just in that where we've kind of, we're now driven by features, you know, more of like progressive enhancement, but it's still very hostile, right?
Starting point is 00:15:57 Because there's a ton of variability now. It's a different type of variability. It's not so much that browsers have a really low interoperability score. It's that browsers are just so much more powerful. And there's a bunch of other capabilities. There's assumptions that you can make on the device size. There's assumptions that you can or can't make on the capabilities that are enabled. It's like the matrix is growing and the problems are changing.
Starting point is 00:16:28 It's really interesting. I kind of think of it like quantum computing style. There's just so many things happening. Some of this might even lead into the bigger topic we're here to talk about too, which is ASTs and legacy code and stuff like that. Maybe a smaller topic, actually. Well, something I want to say is yesterday's choices are today's consequences. So yesterday's choices, and we're talking about our maturity level in terms of an industry and people and even as an internet, that we're still learning. But yesterday's choices are today's consequences.
Starting point is 00:16:57 And that's kind of where we get legacy code from and this need to transpile it into new ways and take care of tech debt and all these things that come along with building software. Good segue, Adam. Yeah, great segue. I saw that segue coming because I'm a podcaster myself. I was like, we're getting there. This is a long-winded introduction to a talk on ASTs. I saw it coming as well, and it was so smooth
Starting point is 00:17:20 that I decided to call it out and make it completely not smooth and destroy this. That's right. I actually just, I killed the segue. You just janked it up. That's fine, Jared. It's okay. We forgive you.
Starting point is 00:17:32 Thank you for the forgiveness. But yes, change, change. The internet is change, right? So it's all about change. And that's what we're here to talk about, because I'm really excited. I'm going to be talking about ASTs at All Things Open this fall. And, yeah, I'm here to answer all of your questions, Jared and Adam.
Starting point is 00:17:56 Give us the rundown from the uninitiated standpoint. So what are ASTs? Who uses them? Why do they use them? What's their purpose, et cetera. Sure. So when we write software, nowadays it's really high level. Var equals foo.
Starting point is 00:18:15 It's human-readable words that are high level. And in order for those things to be fed into a machine and for your code to get turned into ones and zeros, there's a series of steps that it goes through in a compiler engine. And so one of the first steps is taking your code and tokenizing it. Tokenizing is a process where the valid syntax items get picked out. So in JavaScript, a triple equals (===) is a token, const is a token.
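The tokenizing step can be sketched in a few lines. This is only an illustration under stated assumptions, not the tooling discussed in the episode: it uses Python's standard-library `tokenize` module as a stand-in for a JavaScript tokenizer (so it runs without any third-party parser), and the Python statement `jared = "awesome"` stands in for a JavaScript declaration. The helper name `tokens_of` is hypothetical.

```python
import io
import tokenize

# Python stand-in for a JavaScript declaration such as: const jared = "awesome";
source = 'jared = "awesome"\n'


def tokens_of(code):
    """Return (token kind, token text) pairs, skipping layout-only tokens."""
    layout = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(io.StringIO(code).readline)
        if tok.type not in layout
    ]


tokens = tokens_of(source)
for kind, text in tokens:
    print(kind, repr(text))
```

Each token records what kind of thing it is and the exact text it came from. The parser then builds the tree from this flat list.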
Starting point is 00:18:54 All of these things are kind of parsed. And then a tree, so it's tokenized, and then a tree is generated from the structure of your code. And so that tree is called an abstract syntax tree. It's not limited to JavaScript. It's, you know, every programming language uses abstract syntax trees to kind of feed into the compiler engine, which, you know, translates all that stuff down to bytecode.
Starting point is 00:19:26 And the abstract syntax trees are extremely useful in programming because they give us a predictable data structure, which helps us understand our code. And so if you're looking at a variable declaration, for example: const jared = "awesome"; That's what I was going to say. That one line of code, including the semicolon, gets translated into a tree that has a predictable structure.
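To make the idea of a predictable structure concrete, here is a small sketch that parses one assignment and inspects the resulting tree. It assumes Python's built-in `ast` module as a stand-in for a JavaScript parser, so the node names it prints (`Module`, `Assign`, `Name`, `Constant`) are Python's; a JavaScript parser would report its own node types for `const jared = "awesome";`, but the shape of the idea is the same.

```python
import ast

# Python stand-in for the spoken example: const jared = "awesome";
source = 'jared = "awesome"'
tree = ast.parse(source)

# Print the whole tree (the indent argument needs Python 3.9 or newer).
print(ast.dump(tree, indent=2))

statement = tree.body[0]       # the program body is a list of statements
target = statement.targets[0]  # the name being assigned to
value = statement.value        # the value being assigned

# Every node says what it is, so nothing has to be guessed from the text.
print(type(statement).__name__, target.id, repr(value.value))
```

The same three facts the speaker lists (it is a declaration, the name is `jared`, the value is a string) can be read straight off the node types and fields.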
Starting point is 00:20:03 The first thing is, you know, it's a JSON tree that has a type of Program. It has a body that's an array, and that body has declarations, which is an array of objects, so it's an object-type tree. And so it gives you this lovely output, which is like a programmatic walkthrough of your code. And the kind of secret sauce to ASTs here is that there's a structure for you to understand what something is, right? So you can understand, like, const jared =
Starting point is 00:20:46 "awesome". You know, I know that it's a variable declarator, that the identifier is jared, and I know that the value, awesome, is a string. And so, you know, there's no guesswork, right? And if you think about things like regular expressions that we've used to kind of really parse and understand our code, to find matches, there's inherently a conflict between, you know, trying to find something with regex versus using something like a tree that has a lot more detail and metadata, because the regular expressions are really good for analyzing static code. But they're really not good at understanding the nuances about the differences
Starting point is 00:21:49 in your code. So I'll give you an example. So if you have something that's commented out, that's a variable declaration, versus something that isn't. If you have a function that uses the same name as the variable, right? So if you're trying to find matches for that thing, like it's very difficult for you. You can do it.
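The contrast between text matching and tree matching can be sketched as follows. This is a hedged illustration that again uses Python's standard `ast` module in place of a JavaScript parser; the sample source and the name `total` are made up for the example.

```python
import ast
import re

# Made-up sample: a commented-out assignment, a real assignment,
# and a function that reuses the same name.
source = '''
# total = 0
total = 1

def total():
    return 2
'''

# A simple regex matches every line that mentions the name,
# including the comment and the function definition.
regex_hits = re.findall(r"^.*\btotal\b.*$", source, flags=re.MULTILINE)

# The tree contains only real code, and each node states what it is.
tree = ast.parse(source)
assignments = [
    node
    for node in ast.walk(tree)
    if isinstance(node, ast.Assign)
    and any(isinstance(t, ast.Name) and t.id == "total" for t in node.targets)
]
functions = [
    node
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef) and node.name == "total"
]

print(len(regex_hits), len(assignments), len(functions))
```

The regex could be tightened to exclude comments and `def` lines, but each special case makes the pattern harder to trust, whereas the tree query simply asks for the node type it wants.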
Starting point is 00:22:11 Technically, you can. It's just an extremely complicated set of regex that you would have to write, you know, in order to make sure that the thing that you are looking for is a function. So what this tree allows us to do is it basically opens up a whole body of being able to really query your code, and query it in a way that is extremely precise and granular. So you can say, I want to find all of the functions that contain these conditions, you know, the conditions maybe being things that are longer, you know, that have,
Starting point is 00:22:53 you know, more than 10 variable declarations, things that have, like, more than, you know, four if statements, functions with more than one, you know, I want to find promises that don't have a catch, like, they don't have error handling, right? And so it enables us to do a multitude of things in order to understand our code programmatically and deterministically. And then the flip side of that is using tools that allow us to take ASTs and transform the code so that we can actually do an in-place replacement. You can now not only programmatically understand your code and find things, but also you can use that to do safe, in-place
Starting point is 00:23:48 refactoring of your code. This episode is brought to you by GitPrime. GitPrime helps software teams accelerate their velocity and release products faster by turning historical Git data into easy-to-understand insights and reports. Because past performance predicts future performance, GitPrime can examine your Git data to identify bottlenecks, compare sprints and releases over time, and enable data-driven discussions about engineering and product development. Ship faster because you know more, not because you're rushing. Get started at gitprime.com slash changelog.
Starting point is 00:24:32 That's G-I-T-P-R-I-M-E dot com slash changelog. Again, gitprime.com slash changelog. So the title of your talk is Machine-Powered Refactoring, Leverage ASTs to Push Your Legacy Code and the Web Forward. You just described what ASTs are and what's interesting about them. I think historically ASTs have been very much the playground or the domain of people who are writing languages or thinking about programming languages and have to have parsers that produce ASTs in order to take a syntax and turn it into a thing a machine can understand.
Starting point is 00:25:19 It sounds like what you're arguing for is that there's a much more mainstream use case for ASTs where lots of developers should know what they are and be able to use them because they provide this metadata and this structure. And we can use them not just to write programming language, but to actually refactor, which is, I've never thought of this before. Can you expand on how you've done this, how it works? And is this something that lots of people should be using? It's really important for me to kind of democratize this knowledge because most developers don't realize
Starting point is 00:25:53 that they are actually already using ASTs every day in their workflows if they use things like Babel, Prettier, or ESLint. All of these tools, we allow these tools to programmatically create code for us and change code for us. And we trust them because of the precision that comes from leveraging ASTs. And so there's a whole domain of tools, I think, as well as some domain areas in our industry, ASTs being one of them, that are kind of locked away, esoteric, in library author land. Yeah, library author land, for sure. Right. And what happens with library author land is, you know, folks are really busy.
Starting point is 00:26:46 They're maintainers for really large projects, you know, and they're already overburdened. And, you know, getting good documentation is a challenge that, like, most folks have out of their projects. So kind of taking the step to democratize the power of this has kind of been left on, like, I would say, us as a wider community. And so I've been able to leverage ASTs, actually. I worked on a project at Bocoup where we were working with the Edge team, this is a while ago, to modernize, like, thousands of tests that were actually written for IE but that were valid. So these tests were valid because, you know, with the web platform, we don't break the web. And so when we implement the CSS feature,
Starting point is 00:27:47 when we implement this API, it's typically stable. We just usually enhance it. And so there's thousands of tests that were written for IE that were still valid for the web platform because they were testing open web standard APIs. But it was using an outdated harness. It was using a bunch of proprietary stuff, you know, et cetera. So we needed to modernize it and get those tests ready to be shared with the entire world via Web Platform Tests,
Starting point is 00:28:17 which is a project where all of the browser engineers contribute and now have a shared test suite. And so there were a lot of similar patterns, but there were also a ton of conditions. And so I was able to leverage ASTs to help me power through a bunch of refactoring for, like, thousands of tests. And I was able to kind of make those changes safely, and had I done that work manually, it would have been, like, just X number of days more,
Starting point is 00:28:55 and not only that, but it's just, yeah, error prone and, like, not a good use of a human brain. And so I'm very pro automating repetitive work, also using automation to kind of limit your risk, but also to make it easier for you to rinse and repeat and iterate fast. And when you use automated refactoring, what you're able to do is, you know, build up a set of transforms. You're able to change, like, thousands of files at once. And if you did something wrong, you just redo it, you know, you just git checkout, change your transform, and then,
Starting point is 00:29:43 you know, run your refactoring again. And so that type of quick feedback loop is necessary to be productive in 2019 and beyond. And so, you know, we really need to examine what types of best practices we as a community have, because we are entering, you know, an age where we have a ton of aging code and infrastructure, because our standards are changing so fast. And npm dependencies are great. That's a good case study for looking at change. So if a library author changes an API or if you have an internal private module
Starting point is 00:30:33 and you want to deprecate something, you can use ASTs to upgrade to a newer version of the API safely. You can also use ASTs to write your own custom linting rules around, hey, I don't want anyone adding new versions of this. Like I'm going to, I have a count, a hard-coded count of all of the instances of this thing, and I don't want any new things added into my test. I don't want any new instances of this deprecated module being used
Starting point is 00:31:05 and, you know, so you can make that decision binary, and you can enforce those things for your team in a way that's binary and where, you know, you're not having folks having unproductive discussions, right? So I'm a huge fan of, like, no nits. Like, in our code reviews we shouldn't be arguing over things that are team conventions or previously agreed upon things. Brain power is expensive. If you make it binary, you'll have more productive discussions in code review.
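The "hard-coded count" style of lint rule described here can be sketched like this. It is a minimal illustration under stated assumptions: Python's standard `ast` module stands in for a JavaScript linter's parser, and the names `old_api`, `ALLOWED_COUNT`, `count_calls`, and `check` are all hypothetical.

```python
import ast

DEPRECATED_NAME = "old_api"  # hypothetical deprecated function
ALLOWED_COUNT = 2            # hypothetical baseline: usages that already exist


def count_calls(code, name):
    """Count direct calls to `name`; comments and string contents are ignored."""
    return sum(
        1
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


def check(code):
    """Return an error message if usage grew past the baseline, else None."""
    found = count_calls(code, DEPRECATED_NAME)
    if found > ALLOWED_COUNT:
        return f"{DEPRECATED_NAME} used {found} times; baseline is {ALLOWED_COUNT}"
    return None


existing = "old_api(1)\nold_api(2)\n# old_api(3) is only a comment\n"
print(check(existing))                    # within the baseline
print(check(existing + "old_api(4)\n"))   # one new usage pushes it over
```

Because the check either passes or fails, the question of whether a new usage is acceptable never has to come up in review; lowering `ALLOWED_COUNT` as usages are removed ratchets the code base toward zero.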
Starting point is 00:31:44 Let's not talk about linting, let's not talk about this. And lastly, what I'll say is that using ASTs is one way to really, I think, add a resilience layer to your code base, right? Because, you know, if you're fixing a bug, the first thing you should ask yourself is, all right, I fixed this bug, could I have avoided this with a linting rule? And if the answer is no, the next question is, okay, could I have avoided this with an integration test? Sorry, a unit test. And if the answer is no, then integration test. For me, it's like writing your own custom
Starting point is 00:32:24 linting rules or custom transforms, and all of these things are like a first layer of defense for a lot of things in code bases that are easy. Do ASTs typically get written in the language that you're testing against? Where do you begin? What language are they written in? Are they a separate project?
Starting point is 00:32:39 Do they live inside the monorepo? What's the landscape? Yeah, great question. So I've only worked with JavaScript in terms of using tools around ASTs. And so what you need is a parser. And there are projects like Babel that have their own parser. Esprima, Recast, there's tons of different JavaScript parsers. And the differences are, like, really nuanced, because they all have the same general structure, but then they have some additional information.
Starting point is 00:33:18 Like, sorry, the AST trees that they output have different information based on, you know, what the preferences are of that tool in terms of how they want to traverse their trees, et cetera. But typically, it's a three-step process. So the first thing you need is, actually, a diagram here. I was going to say, should I share my screen? But this is a podcast, so we're going to have to talk through a diagram. So you need a parser, a transformer, and a generator. And so the parsing tool basically just creates a tree for the input code. And then you have a transform that basically lets you query the generated tree.
Starting point is 00:34:09 And then you can say, oh, here, I found the thing in the tree that I want. Now let me create a new AST. Let me create a new structure for what I want to replace. If I want to change the value of something, or if I want to, you know, remove something or whatever, let me make that change in the tree. And basically, a new tree gets generated from, you know, all the transforms. And then that tree that gets generated now needs to go back into code. So that's the third step, right? So we need a generator. So that's the reverse of the parser. It takes a tree and then it makes code. And so those are kind of the three
Starting point is 00:34:50 things. But typically, depending on what tool you're using, you know, you're kind of chaining together a parser, a traverser, you know, a transformer, a generator, or you're using something that, like, does everything for you altogether. jscodeshift is what I really like to use because it's a wrapper for Recast, which uses Esprima from Mozilla. It's a parser from Mozilla. So jscodeshift wraps Recast and gives it a very nice jQuery-style declarative API. So it's just really nice to write. And the folks at Facebook are behind jscodeshift. But, you know, you can also use Recast, which is great.
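The three-step flow described above (parse, transform, generate) can be sketched end to end. This is not jscodeshift or Recast: it assumes Python's standard `ast` module as a stand-in so the example runs with no dependencies, and the names `RenameCall`, `refactor`, `old_api`, and `new_api` are hypothetical.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Transform step: rewrite calls to one function name into another."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Call(self, node):
        self.generic_visit(node)  # visit nested calls and arguments first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            replacement = ast.Name(id=self.new, ctx=ast.Load())
            node.func = ast.copy_location(replacement, node.func)
        return node


def refactor(code, old, new):
    tree = ast.parse(code)                   # 1. parser: text -> tree
    tree = RenameCall(old, new).visit(tree)  # 2. transformer: tree -> new tree
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)                 # 3. generator: tree -> text (Python 3.9+)


before = "result = old_api(old_api(1), label='old_api')"
after = refactor(before, "old_api", "new_api")
print(after)
```

Both calls are renamed while the string `'old_api'` is left alone, which is the kind of precision a text search and replace cannot offer. One caveat about the stand-in: `ast.unparse` regenerates the code from scratch and drops comments and original formatting, so a real refactoring tool would want a generator that preserves the untouched parts of the file.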
Starting point is 00:35:41 I just enjoy the declarative nature of using a tool like jscodeshift. But you're using JavaScript to write all those things, and there's an API that usually comes with whatever tool you're using so that you can query, but then you can also create. And then there's the last step, which is, okay, now that I've queried and I've created, I now kind of generate the tree and do an in-place replacement. And so in theory, like, when you Babel, Babelify, or whatever, or when you run ESLint, you know, if you use --fix
Starting point is 00:36:19 to make the change, in theory, like, the whole thing actually changes, but Git only shows the diff. So you only see the diff. So the whole tree, the whole file got replaced in place. So if we just take a simple example, maybe walk it through these three steps. If we had a simple example of refactoring, let's change all of our vars to const, for example. So I have all these var
Starting point is 00:36:45 statements. I want to use const instead, and I'm going to use an AST in order to do that. So the first step would be: take my file or my chunk of code that has the vars in it, pass it through the parser, right? So I have raw text, I'm passing it to a parser. The parser then generates the AST for me, returns an AST. Can I read that AST with my eyes, or is it a blob? You can read that AST. You can print it, you can log it, or you can use an awesome tool that I like to use, which is really, I think, kind of the standard around this.
Starting point is 00:37:27 It's astexplorer.net. It's a site which allows you to just place, you know, just drop code, pick your parser, pick your language, and, you know, you can view the tree. And so the really great thing is, you know, you can use this tool to visualize a tree. So there's no memorization here. Like, I don't need to know what the tree structure is for a function with, you know, that has a return value of this.
Starting point is 00:37:51 Like I can just drop it in and see the tree and then I can write the code for what I want to change it to and then see what that tree is. And so you can do reverse engineering to basically say, this is what I want to find and this is what I want to change it to. You have both versions and you can use that to drive how you build your transforms. And I think the best part about it is, this is all written in JavaScript. So these are Node scripts that are running. And you can basically do anything you want in the middle of a transform. If you want, you can say, oh, find me this static array of, I don't know, images from some cloud server. And you can run a transform
Starting point is 00:38:45 and then in your transform you can do an API request, get an updated list, do an in-place replacement. So you can do dynamic evaluations of your code, so that even though your code is static, it can actually be dynamic. You can use transforms to even change your code or do pre-evaluations and things like that. So it's very interesting. That is interesting.
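The "reverse engineering" workflow mentioned above, comparing the tree for the code you have against the tree for the code you want, can be sketched as follows. The trees are written by hand in a simplified ESTree shape (position data and the `raw` field are omitted, and a Babel-based parser would label the number as `NumericLiteral` rather than `Literal`). The `declaration` and `diff` helpers and the `answer` variable are hypothetical names for this example.

```javascript
// Hand-written, simplified ESTree-shaped trees, similar to what a tree viewer displays.
function declaration(kind) {
  return {
    type: "Program",
    body: [
      {
        type: "VariableDeclaration",
        kind,
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "answer" },
            init: { type: "Literal", value: 42 },
          },
        ],
      },
    ],
  };
}

const before = declaration("var");  // tree for: var answer = 42;
const after = declaration("const"); // tree for: const answer = 42;

// Recursively collect the paths at which two trees differ.
function diff(a, b, path = "") {
  if (a === b) return [];
  const bothObjects =
    a !== null && b !== null && typeof a === "object" && typeof b === "object";
  if (!bothObjects) return [{ path, from: a, to: b }];
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  return [...keys].flatMap((key) =>
    diff(a[key], b[key], path ? `${path}.${key}` : key)
  );
}

// You can print it, you can log it.
console.log(JSON.stringify(before, null, 2));
console.log(diff(before, after));
```

For this pair of trees the only difference is the `kind` field of the declaration node, which is exactly the field a var-to-const transform would need to change.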
Starting point is 00:39:08 So yeah, AST Explorer, I'll definitely recommend. I'm pulling it up here. It's a link in the show notes if you want to quick click on it. I think part of the ASTs is there's like this, like you said, you're trying to democratize this knowledge. There is like a mystical aspect of once you get below source code,
Starting point is 00:39:23 you're like, okay, we're now at a machine-generated thing. That's scary. Can I view it? It just seems like a little bit more nebulous, a little bit more vague. Abstract maybe might be a good term. But this does a good job, I think, just looking at the example. And I'm sure as you put in your own code into something like this,
Starting point is 00:39:44 it probably does a good job of demystifying some of that and saying, you know what, this is not all that unapproachable, and it's something that is very valuable if you can get past maybe a little bit of that abstractness. So, making it more concrete now: once I have my AST, like you said, you can transform it. So the transformer operations, do they depend on the transforming tool that you are using? You mentioned a couple different tools, and one has a jQuery-style syntax. What would it be like if I was like,
Starting point is 00:40:14 take all my vars and make them const? Obviously you don't have to type out the code to us, but what kind of a transformer would that be? Well, if you're on astexplorer.net, for example, you can pick jscodeshift as your transform tool. It uses a declarative jQuery-style API. So your first thing is, you're looking at the file source, and then you're saying .find, I'm looking for an identifier, so I'm looking for a variable name or a function name. So .find,
Starting point is 00:40:55 you know, Identifier, and then .forEach, right? So it's just JavaScript: iterate on all of the identifiers that you find, and then you can have a matching condition. So you can say, if that node name is Jared, replace the value to be awesome. And that's it. And then .toSource, which prints the transformed tree
Starting point is 00:41:25 back to the same file. And so it's as simple as that. It's actually mind-blowingly easy. On the JavaScript complexity metric, this ranks really low. This is way below TypeScript, in my opinion, for example. People look at TypeScript and they're like, I don't understand this, and then like a week
Starting point is 00:41:49 later they're like, oh my god, I'm converted forever. For me, the barrier to entry when I teach folks about ASTs is even lower than that. As soon as I show them an example, three minutes later they're like, I'm sold. I'm basically looking at this example right here and I'm pretty much sold as well, because this is way more simple than I would expect it to be. I figured it would be a bigger buy-in. At least to get started, it seems like it's pretty straightforward. The tooling has made it really easy. When do you reach for something like this in terms of complexity? Because with the simple example of change my vars to const, in my text editor I can
Starting point is 00:42:26 basically, you know, hit Command-Shift-A and just type in find all var, replace with const. So there are certain things where our IDEs or our editors make those kinds of refactorings pretty straightforward, like a find-all-and-replace. But then when do you know it's a little bit too complex? Maybe it's just case by case, you'll just know it when you need it. Or I guess maybe the better question is, is there enough of a barrier where you don't reach for this right away, but you kind of upgrade to it
Starting point is 00:42:51 when it gets to a certain level of complexity? Yeah, I think that's a great question. I think it's about understanding what your needs are and what type of change you're trying to make. If you're trying to make something that's really simple and self-contained, something that you can just do with a find-and-replace, great. But anytime your change is conditional, or anytime your change is more than one line, right, so if it's a multi-line change, that's where you really, you know, moving around function
Starting point is 00:43:21 parameters or, I would say, deleting code, things like that. I would say that for the true needs of what we would do as developers to refactor a set of hairy code that's widespread, that's when I would use a transform. So I would say scale, right? So if something is repeated in multiple areas, if there's something that's a clear pattern,
Starting point is 00:43:58 if you're updating something where it can be really hard for a regex to pick up on the differences between things. Like, for example, modules, when they're being imported, I can also use the star syntax to change the name of something, right? So import star as foo, right, there's lots of little nuances there, and you can use ASTs to make sure that you're changing the thing that you need to change and you're not
Starting point is 00:44:33 going to accidentally change something else. Maybe the first time your regex fails you. You've gotten so far with a regular expression and now it just missed a case. And you're like, instead of sitting here and iterating on that regex and just keep on tweaking it for these different cases, stop right there.
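The regex failure mode described above can be seen with the var-to-const example from earlier in the conversation. What follows is a minimal sketch, assuming jscodeshift is installed and the module is saved as a hypothetical `transform.js`. It uses the find / forEach / toSource chain discussed in the episode. It is deliberately naive: a `var` that is reassigned later would need `let`, which this sketch does not check.

```javascript
// transform.js: a jscodeshift transform module, meant to be run by the jscodeshift CLI.
// Naive on purpose: it does not check whether a variable is reassigned later.
function transformer(fileInfo, api) {
  const j = api.jscodeshift;
  return j(fileInfo.source)
    .find(j.VariableDeclaration, { kind: "var" }) // query: only real `var` declarations
    .forEach((path) => {
      path.node.kind = "const"; // change the matched node in the tree
    })
    .toSource(); // generate code from the transformed tree
}

module.exports = transformer;

// For contrast, a text-level replace cannot tell code from strings or comments.
const sample = 'var label = "var is old"; // var in a comment';
console.log(sample.replace(/\bvar\b/g, "const"));
// -> const label = "const is old"; // const in a comment
```

The regex rewrites the string contents and the comment along with the declaration, while the AST query only matches declaration nodes. A transform like this is typically run with something like `npx jscodeshift -t transform.js src/`, and the CLI's `--dry` and `--print` options let you preview the result before any file is written.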
Starting point is 00:44:49 Now maybe it's time for an AST because you probably saved time that direction. Exactly. And I think the ramp up here, which is maybe your deeper question, really, I'm advocating for developers to have this in their tool chain, the same way they have like a linting support and running tests, right? So we should have an easy way for
Starting point is 00:45:11 folks to write transforms. We should just take the day or two that it takes to set that up, get that into the project with some examples, and make it so that folks have a path for doing those things. And that can be twofold. You can use that as an opportunity to create a bunch of custom linting tools and, while you're doing that, add infrastructure for how to write transforms if you need to. But ultimately, if this is in our projects,
Starting point is 00:45:49 even if folks don't use it to check in code, even if they use it just while they're developing something, to find what they need, it's a way to, I think, level up the playing field for everybody. Because the stakes are getting higher. We have bigger code bases. Front ends are huge, right? We don't only have thick clients, we have thick servers. And so I also think the culture of let's throw everything away and start over
Starting point is 00:46:23 is a really expensive one that isn't a good thing we should be promoting. Folks should feel comfortable with refactoring code and they should feel proud about it, because you're able to still drive value for your product and your business while pushing your code forward. So, you know, I'm personally sick of seeing front end teams start over from scratch every 12 to 14 months. So let's just not do that. TeamCity is a continuous integration and delivery server developed by JetBrains that helps you build, test, and release your software faster. It supports all popular build tools, test frameworks, version control systems, issue trackers, and cloud platforms out of the box with no plugins required. TeamCity visualizes your build, test, deploy pipelines, collects statistics on each step, pinpoints the root cause of failures, and suggests which commits might have caused the build failure.
Starting point is 00:47:26 The professional version of TeamCity is free even for commercial use and lets you set up up to 100 builds and run up to three builds in parallel. For large organizations out there, JetBrains offers TeamCity Enterprise and right now they're extending a special offer to our listeners.
Starting point is 00:47:41 Get additional build agents and new licenses of certain enterprise versions with a 50% discount. Head to teamcity.com slash changelog to learn more. Again, teamcity.com slash changelog. So, Amal, you're obviously passionate about this particular subject. It is somewhat dry. You have to convince people to pay attention to somewhat arcane knowledge like abstract syntax trees. But there's huge value that can come out of doing these refactorings and really allowing yourself to refactor better, faster, stronger.
Starting point is 00:48:28 Is this a tough sell in engineering teams? Or do you find it's pretty easy to convince people to institutionalize this kind of a tool in their toolbox? Yeah, that's a great question. I have to say that I think there's a few different things happening in our industry right now. One is like our kind of, there's like a dopamine hit that we get from new tools and new things. Fresh starts. Fresh starts and there's a problem with consistently working on new things,
Starting point is 00:48:59 which is there's a set of challenges for developing software that you just don't even get to really explore if you're constantly starting over your to-do Hello World app or your Create React app or whatever the hell else. Great to do that every once in a while. I'm not sure it's healthy to be creating new projects all the time in the sense that there's some real good engineering challenge that you get from having to
Starting point is 00:49:27 understand how to drive value, how to make change while still shipping to production. How do you maintain, how do you refactor safely? How do I refactor a billion hit a month code base while still pushing to production? And understanding how to do that safely, responsibly, what are the nuances of that in terms of testing?
Starting point is 00:49:56 There's so many interesting things. There's a class of problems that you just never get exposed to. So for me, the heroes in our industry are really the folks who are working on legacy applications and still driving them forward and continuing to chip at them. And I think some of my philosophical ideology comes from Martin Fowler, who has a really great article, which I think we're going to link in the show notes. I just sent that to you all. It's "Strangler Fig Application."
Starting point is 00:50:31 So it's basically, he was on vacation somewhere, I think in New Zealand. But there's this tree where it's growing roots and it's slowly kind of strangling the thing. It's growing new roots, but it's slowly strangling the old ones. Basically, the idea here, the pattern, is that you can refactor your application module by module, bit by bit, while still driving value forward.
Starting point is 00:51:02 I'm personally sick of seeing the next gen team versus the old gen team. At so many companies, you have a group of people that are working on something that is not shipping to production for six months, 12 months, 14 months, 17 months, you get the drift, right? So ultimately, you're building a whole set of things where you're not getting that feedback loop from your customers on what's working and what's not. You're developing the new version of your thing
Starting point is 00:51:35 in a complete silo. And so I think a really interesting problem that I had to solve a few years ago, actually, was, I was new on a team and
Starting point is 00:51:48 I was hired to re-architect all of the UI, get us off of the legacy code. And, you know, it's really funny. I've never actually talked about this story. So I'm realizing now that maybe this is the origin story for me.
Starting point is 00:52:03 But, you know, it was a Backbone application and they wanted to switch to React. And I was like, we're not going to get rid of all of these Backbone views. The best part about React is it's just a library. And so maybe we just build infrastructure so that this whole new view, this new set of functionality that we're adding, maybe that's React.
Starting point is 00:52:22 And we were able to kind of push forward having all of our new views be React components while still leveraging the Backbone components. Those two things lived in one ecosystem. It was a little more work, but we were able to slowly replace everything while still driving value, while getting feedback from customers in the wild
Starting point is 00:52:45 and that's the type of challenge, for me, that makes a senior engineer, that's what makes an architect, that's what makes for somebody who really understands the challenges and nuances of our craft. And so, you know, we have more code now than ever. Forget our code, most of our code is actually third-party dependencies. I think Google just did a study on that. And it's like, out of every 10 lines of JavaScript, it's one line of code that belongs to the application. That's a shocking number. But if you think about it, it's no surprise because the open source model is working.
Starting point is 00:53:32 That was what it was designed to do. We don't want to be reinventing the wheel. We want to be standing on the shoulders of giants. But at the same time, we need to understand, we need to be able to move quickly and shift, you know. And so if I need to, if I want to switch dependencies, like I want to be able to do so in a way that isn't going to set me back, or I want to be able to do so in a way that's, you know, safe. And it's not just changing dependencies, it's about
Starting point is 00:54:03 upgrading and all kinds of things. And so there's a culture now with some of the larger frameworks, Angular being one of them, where they'll give you a set of transforms with the version bump, you know? So they're like, all right, new major release, sorry for the breaking changes. But we're now going to give you a command to run so that you can migrate from five to six, six to seven. And so the bar is getting, yeah, yeah. So this is great. This is like when browsers compete for security and speed and all these other things.
Starting point is 00:54:39 These big libraries are now competing on user experience and DX more so actually, developer experience. So the bar is getting higher because the stakes are getting higher. We can start adopting those practices in our own code bases as application developers. And that's my pitch. I like that pitch. I know we have this shared metaphor that I'll just,
Starting point is 00:55:06 I'm not introducing either of you to it, but we have this metaphor of technical debt and this idea that you are taking on debt in order to gain somewhere else. And eventually, you know, the debt collector is going to come unless you manage that over time. And, you know, in finance, we have ways out. We can declare bankruptcy.
Starting point is 00:55:26 Of course, if you do it like Michael Scott, it doesn't quite work where he just walks out and says, bankruptcy. I don't know if you saw that episode, but it's one of my favorites. You can't just say the word out loud, Michael. He just goes out into the office and he just declares bankruptcy.
Starting point is 00:55:41 Who is that, Oscar the accountant? That's not how it works. You can't just declare bankruptcy. Anyways, off topic. But we have a lot of people declaring bankruptcy with their technical debt, is where I'm trying to get to, because maybe it's part of the tie-in
Starting point is 00:55:58 with the Silicon Valley mindset, the startup mindset of you have to have a bunch of people spin up new things and then they die, and then here comes a unicorn out of that, right? Like a thousand failures, here comes one success. Maybe that mindset is tied in with the technological advances. And we get to this point where it's like, well, a new thing has to begin. I'm with you very much so on maintaining legacy code, and really, the software that provides value over a series of years
Starting point is 00:56:30 is de facto legacy, right? The reason why it's still around is because it's providing real value to real people. But is there a point where you've come across any code where it's like, you know what? You guys didn't manage the technical debt here. I like the idea of pushing the thing forward, but sometimes you're like pushing up against a wall.
Starting point is 00:56:49 Are there limits to this? Yeah, are there limits to this ideology or can we refactor, you know, all things? I'm sure there are cases, although I think they're very rare, where you have to completely just abandon ship for the entire project. But with the kind of module-by-module approach,
Starting point is 00:57:13 the idea here is that you're taking one vertical segment and replacing it and then throwing away the code that you don't want. Right. Instead of throwing the whole thing out. Or you're refactoring in place. So either one. But I think for me, an acknowledgement that we don't make enough in our industry, and I think you're totally right about your kind of analysis on, maybe it's Silicon Valley culture,
Starting point is 00:57:46 maybe there's some kind of culture bleeding here with just a race to the top, right? But we don't acknowledge, like, I feel like enterprise code is like, it's its own beast in our community, you know? So you're either enterprise versus small medium versus the create React app world. And so these three kind of paradigms where I think
Starting point is 00:58:15 nobody wants to be enterprise. I think we even coined the term Enterprise Dude in the team that I was on. Enterprise Dude always ruins everything for everybody. Enterprise Dude is always relying on the least supported version of something and is holding back people from being able to upgrade things. Anyways, but enterprise. So real software, software that's been out in the wild and has had multiple developers work on it, you know, just
Starting point is 00:58:45 applications at scale have cruft. I've yet to see applications at scale that don't use multiple languages, that don't have just arcane stories behind why this weirdo thing exists. You know, it's like, all right, when you open this file, you're gonna have to turn around three times and tap your nose once. It's just the most hilarious stories. But applications are living, breathing. They have cruft. It's normal. So I want to normalize weirdness
Starting point is 00:59:21 because that's just how applications evolve over time with multiple people. And so it's okay. There has to be some uncomfortableness in our code bases because ultimately you have to have something to be pushing forward as a team. I envy the folks who are really happy about everything and congratulations to them. Maybe this talk isn't for them. But this talk is for the 99% of us that are remaining that have hashtag real problems. I think it's Mike Tyson said, everybody has a plan until they get punched in the face.
Starting point is 00:59:59 And that's when everybody's plan goes out the window, basically. He knows that pretty well because he's punched a lot of people in the face. I think code is kind of like that. We all have this beautiful, perfect, pristine code until it hits production. It hits the real world. And once that happens, it hits the fan and you've got to make changes. And so the longer it's been in the real world, the more craggly it's going to look.
Starting point is 01:00:19 I'm looking at this picture on Martin Fowler's blog of the Strangler fig application. I'm thinking, that tree, that's an abstract some kind of tree. That tree is crazy looking. It's crazy looking. At the very minimum, you always have the CEO button. If your code is perfect, I challenge you to find one decision that wasn't the CEO button decision where it's just like, just put it there, make it happen, ship it now.
Starting point is 01:00:53 Thanks, CEOs. Well, now your talk is the first day of the conference, right? So you're on day one, that's October 14th. The conference actually happens October 13th through 15th. There's some workshops, et cetera, going on. If you are planning to go to this conference, which I would suggest you do because, hey, we're going to be there. As a matter of fact, we're planning to have a live JS Party at All Things Open. And I might be a future panelist or a future guest panelist on JS Party.
Starting point is 01:01:21 So hopeful there at least. Yeah. See Amal day one. I'm not sure which day our live thing is, but it's definitely going to be there. All Things Open is happening in Raleigh, North Carolina, October 13th through 15th this year. And if you are thinking of registering, I would say that right now, through the end of the month, their
Starting point is 01:01:40 mid-tier pricing is still active. So October 1st, it goes a little higher. It's still a very inexpensive conference. Even its most expensive ticket period is $279. So not a very expensive conference to go to. Amazing speakers. Amal, you'll be there, of course. Jared and KBall will be on stage doing something. I'm not sure.
Starting point is 01:01:59 What is the plan, Jared? Do you have a plan? The plan will be revealed when the plan is revealed. Yeah, it's a fantastic conference. It's just incredible speakers, and I think it attracts an audience that is really diverse and also has just an interesting breadth of problems. And so I highly recommend it. I'm really excited to be speaking there this year. I want to give a quick shout out too to Todd Lewis,
Starting point is 01:02:29 the organizer of that conference. He does such hard work to make that conference happen each year. Every time I talk to him, he's always moving. He's always moving. He's never still. He's always going. So Todd, great work on this conference.
Starting point is 01:02:44 Looking forward to being there. Our first time there was in 2016, so we're glad to be back. And Amal, thank you so much for your time today and sharing your wisdom. You are welcome back. Thank you so much, and it was fun talking to you today. Thank you so much for having me. It's been a pleasure. All right, thank you
Starting point is 01:03:00 for tuning in to this episode of The Changelog. Hey, guess what? We have discussions on every single episode now. So head to changelog.com and discuss this episode. And if you want to help us grow this show, reach more listeners and influence more developers, do us a favor and give us a rating or review in iTunes or Apple Podcasts. If you use Overcast, give us a star. If you tweet, tweet a link. If you make lists of your favorite podcasts, include us in it.
Starting point is 01:03:28 Also, thanks to Fastly, our bandwidth partner, Rollbar, our monitoring service, and Linode, our cloud server of choice. This episode is hosted by myself, Adam Stachowiak, and Jared Santo. And our music is done by Breakmaster Cylinder. If you want to hear more episodes like this, subscribe to our master feed at changelog.com slash master. Or go into your podcast app and search for Changelog Master. You'll find it. Thank you for tuning in this week. We'll see you again soon. Bye.
