The Changelog: Software Development, Open Source - Machine powered refactoring with AST's (Interview)
Episode Date: September 19, 2019
Amal Hussein (Engineering Manager at npm) joined the show to talk about ASTs, aka abstract syntax trees. Amal is giving a talk at All Things Open on the subject so we asked her to give us an early preview. She’s on a mission to democratize the knowledge and usage of ASTs to push legacy code and the web forward.
Bandwidth for ChangeLog is provided by Fastly.
Learn more at Fastly.com.
We move fast and fix things here at ChangeLog because of Rollbar.
Check them out at Rollbar.com.
And we're hosted on Linode cloud servers.
Head to Linode.com slash ChangeLog.
This episode is brought to you by Linode, our cloud server of choice.
It's so easy to get started with Linode.
Servers start at just five bucks a month for your big ideas.
Head to Linode.com slash changelog.
Choose your flavor of Linux that works for you.
Then pick a location that's right for you.
London, Tokyo, Dallas, and many other places in the world.
They've got you covered.
Go from having that amazing shower idea to a hosted website in just minutes.
Start small.
Expand as your idea blossoms into a huge hit.
And we trust Linode because they keep it fast.
They keep it simple.
Check them out at Linode.com slash changelog.
All right.
Welcome back, everyone.
This is The Changelog, a podcast featuring the hackers, the leaders, and the innovators of software development.
I'm Adam Stachowiak, Editor-in-Chief here at Changelog.
On today's show, we're talking to Amal Hussein about ASTs,
aka Abstract Syntax Trees.
Amal is giving a talk at All Things Open on the subject,
so we asked her to give us an early preview of it.
She's on a mission to democratize the knowledge and usage of ASTs
to push legacy code and the web forward.
And by the way, we'll be at All Things Open. We're hosting a live JS party on stage. Plus,
Jared is giving a talk on Svelte for a radical new approach to building user interfaces.
And as a special thanks from the team behind All Things Open, we're giving away five free passes
to the conference. And all you have to do is tweet, I want a free pass to All Things Open because,
and state your reason why.
And copy at changelog and at All Things Open in the tweet.
We'll send DMs to each winner next Friday, September 27th.
Good luck and enjoy the show.
Oh, one more thing for those who don't want to wait
and just want 20% off your pass right now,
use the code changelog20 when you buy your tickets.
The code is unlimited, so tell your friends.
Head to allthingsopen.com to learn more and register.
So, Amal, thanks for joining us.
First of all, congratulations on your first week
as engineering manager at NPM.
It's bittersweet.
Tell us what's new here.
Thanks so much, Jared and Adam.
So, hi, everyone.
My name is Amal Hussein.
I am a new engineering manager at NPM.
It's my first week.
And I came to NPM via Bocoup, where I was an open web engineer working on some pretty awesome
stuff in terms of web conformance suite testing with browser interoperability, as
well as working most recently on GameBender, which is a Scratch-based game console, which
uses computer vision and all this other cool stuff, all open web APIs, to teach kids
how to code creatively.
So that's what you were doing at Bocoup,
or that's what you're doing now at NPM?
That's what I was doing at Bocoup.
I was doing a lot of work around products.
I would say product engineering.
And really it became very clear to me
that I needed to kind of boss up a little bit
because I was consistently, I think, really strong at
managing up, sideways, and down for a pretty large project.
I was a tech lead for that project.
And I'm stepping into my love of product by
being an engineering manager, which combines, for me, the best of both worlds:
you're able to be hands-on with the team and drive technical strategy.
And you're also able to work with all of the stakeholders that are involved in the
software delivery process.
And it's something that I really enjoy doing.
I've consistently been the go-to person at every team,
at every company for a variety of things.
It was a really difficult decision to make, if I'm honest.
It was very, very difficult.
I identify as a woman and as a person of color.
For me to walk away from the full-time responsibilities of delivering
software, just that aspect, it was a very difficult decision. But I realized that
there's even less of me in engineering leadership. And so that's where I think I get some kind of solace,
in that I'm giving folks an opportunity
to have a woman of color as a manager,
which is a very rare thing for most people in our industry.
Well that's awesome.
I'll say congratulations and good luck
because you're just getting started
and I hope you have a lot of success there.
Boss up.
It was time to boss up.
I like that.
It was time to boss up.
And own my bossiness too.
You just have to take a step back
and realize, you know what?
Hey, I can do this.
It's quite simply that.
And I think a lot more folks from our industry
need to make the hard decision that I
made, because there's a ton of really bad managers, and folks who really don't focus
enough on mentoring or don't focus enough on the overall technical
strategy.
Yeah, I tell you what, being a leader is one of the toughest positions because
you get criticized and scrutinized,
not only by yourself, which is where it usually begins, but then also from the externals. You know,
people who don't even know you will criticize you, and then people who really know you will also
criticize you. So being a leader is tough. It's a really tough position. And that
one in particular that you mentioned, interfacing with so
many stakeholders, it really requires somebody who's very empathetic, right? Who can see all sides,
put themselves in everyone else's position to sort of
drive the ball forward, and take nothing personal, or at least try to.
Yeah, I agree wholeheartedly with your analysis there.
There's a great, great quote,
heavy lays the head who wears the crown
or something like that.
And there's a lot of, I think, freedom you get
in a leadership role where there's a lot of autonomy,
you're able to kind of drive decisions
and really make an impact for good or for bad.
But with that comes a lot of responsibility,
and one of those is taking responsibility for failures
or missed opportunities.
And I think what's interesting at NPM about this is
I've always had a dream of being a toolmaker, tooling,
and that's kind of like my stuff.
That's like my jam.
It's kind of always into architecture,
infrastructure, how things connect.
I'm very much like an in-between person.
When I worked on server-side code,
middleware was something that was interesting to me
because of the intersectional nature of it.
And so at NPM, in many ways I'm fulfilling my lifelong dream
of being a toolmaker.
And I think as an engineer that's a toolmaker,
we have the toughest customers because people are relying on us to then do their jobs
and make their magic happen.
There's this extra layer of not only scrutiny,
but also we're the toughest customers, software engineers.
We're the toughest customers because we can make the thing that we're using
if we really sat down.
Sometimes you do. Sometimes you make your own thing because somebody else's
thing isn't good enough, right?
Right, you've got to. Is anybody else's thing ever good enough? Let's be honest.
So yeah, if you could wave a magic wand and have the skills to write your own
IDE in a day or a week, I bet you would, because you want it your way. And so
there's an arrogance and there's a pickiness in our industry. And much of that, I think, is to be expected.
We have really hard jobs because ultimately the engineers that kind of criticize you as a toolmaker that serves them,
those same engineers are also criticized by their users and customers.
So ultimately they're also being judged.
So it's like an exponential judging chain.
What's interesting there is that contentment is often the enemy of progress, right? So like,
if you're content, you tend to not want to progress and get better, you know? So then
you have this idea of discontentment sort of like becoming a norm in our industry,
where in some cases discontentment is sort of frowned upon, right?
To be discontent is just not a good position to be in, I suppose, because
it breeds envy and jealousy.
Right, right.
And so as an industry, just based on the desire
to progress, which we all want because that means that our tooling gets better, our software
gets better, etc., if we have to live lives of discontentment, I wonder how that really impacts
us psychologically in our industry.
Yeah, I think that is a topic that
I would like to dive into. Not right now, right here, but definitely
in the future, because I think there's the intersection of
psychology and all of the pressures that are on
us as engineers and the continuous improvement, continuous change.
I wish we had more
cultural anthropologists that were studying technologists, because I think
there's a lot of really insightful behavior, and just insightful things in general, that are probably very
unique to our industry. And how those things play out in our lives outside of
the terminal is, I think, another really interesting story.
I like that. Yeah, beyond
the terminal. Buy that domain now. Sounds like a podcast.
Since you mentioned your desire for this, it's something we're actually creating a podcast about,
called Brain Science.
That's true.
Oh, that's dope.
In the pre-call thing we mentioned it.
Maybe I should be on your podcast.
You know, we're actually taking guests sometime soon. We want
to dive into this. We're exploring the inner workings of the human brain to understand things like behavior change, habit formation, mental health, and basically what it
means to be human. So brain science applied: not just what we know about the brain, but how can
we apply what we know about the brain to sort of transform our lives and better our lives.
And some of that is this anthropologist-type approach towards our industry.
Yeah, I'm really happy to hear that.
There was a major at my college that was called Society, Technology, and Policy.
And I thought that if I was like 20 years older when I went to school, like I feel like that's what I would have done.
Because I find I would have probably done that as like a double major because it's, you know, for me, I consider myself like a very intersectional human because of a variety of things.
Like, you know, not just my family background and life experiences,
but even just my interests within the industry.
I'm an engineering manager.
That job is hugely intersectional.
And so I think that's a super relevant thing to explore
and what the effects of that are moving forward
as we progress in this new and uncharted territory
of the digital age.
I'll add one more layer to that.
We often look at the internet as in so many years,
like being a teenager.
I think, what, it's about 20 years old now?
I remember a couple years ago it got its driver's license. So I think, okay, yeah, drinking
age, like 21 in the U.S. So I know that software's been around longer than that, but that would mean
that in a similar way engineers in that era are similar in their maturity level, not so much
individually but corporately.
Meaning that we've been doing this internet thing
for the same amount of time the internet's been around?
Basically, yeah.
So we can assume, I would say to some degree,
that our awareness of how to best drive the thing
is predicated on how old the thing is.
Definitely a young industry.
Right. So we're still learning.
We make mistakes.
And that's human.
And a changing industry, right?
Physicists, there's a lot to learn
beyond astrophysicists, right?
But the basics of physics
are the same as they've been.
And so that's a thing
that I go back to the idea of civil engineering.
How to build a bridge in a structurally sound manner
is a tried and true science.
Right, right.
It doesn't change every year.
You could have written that book 100 years ago
and it'd be slightly different now,
but it'd be pretty much the same foundations.
Whereas we're kind of figuring out
this software engineering network-based
industry where we live our lives and we have our jobs and they're kind of like in the same milieu
and like all that kind of stuff is we're very much living it out as we're trying to develop it and
we're making mistakes that impact people that we don't even know, et cetera, et cetera. So
it's very young and therefore I feel like we really don't understand
what all the implications are at this point. Yeah. I want to go a little deep, a little,
like I would say a dollar store philosophy maybe on y'all, which is here's my dollar store
philosophy. So what's really interesting about the web is not only how young it is,
but also the impact that it's had in the amount of time, right?
And just how exponential it is in so many ways.
And then you look at the under-the-hood experience with developers
and just how much change we've had
and how actually developing for the web
is an extremely hostile thing.
In what other industry do you know
where we create and we're like,
well, hope this works.
Ship this and I hope this works.
It's really interesting to watch the transitions
that we've had where 15 years ago
or more, it was like,
hey, actually probably about 15 years ago,
a user comes to a website and the server's like,
hey, tell me who you are.
And it's like, Netscape.
And then it's like, okay, here's your code for Netscape, right?
It's like, we've come a long way, even just in that where we've kind of, we're now driven
by features, you know, more of like progressive enhancement, but it's still very hostile,
right?
Because there's a ton of variability now.
It's a different type of variability.
It's not so much that browsers have a really low interoperability score.
It's that browsers are just so much more powerful.
And there's a bunch of other capabilities.
There's assumptions that you can make on the device size.
There's assumptions that you can or can't make on the capabilities that are enabled.
It's like the matrix is growing and the problems are changing.
It's really interesting. I kind of think of it like quantum computing style.
There's just so many things happening.
Some of this might even lead into the bigger topic we're here to talk about too, which is ASTs and legacy code and stuff like that.
Maybe a smaller topic, actually.
Well, something I want to say is yesterday's choices are today's consequences.
So yesterday's choices, and we're talking about our maturity level in terms of an industry and people
and even as an internet, that we're still learning.
But yesterday's choices are today's consequences.
And that's kind of where we get legacy code from and this need to transpile it into new ways
and take care of tech debt and all these things that come along with building software.
Good segue, Adam.
Yeah, great segue.
I saw that segue coming because I'm a podcaster myself.
I was like, we're getting there.
This is a long-winded introduction to a talk on ASTs.
I saw it coming as well, and it was so smooth
that I decided to call it out and make it completely not smooth
and destroy this.
That's right.
I actually just, I killed the segue.
You just janked it up.
That's fine, Jared.
It's okay.
We forgive you.
Thank you for the forgiveness.
But yes, change, change.
The internet is change, right?
So it's all about change.
And that's what we're here to talk about because I'm really excited.
I'm going to be talking about ASTs at All Things Open this fall.
And yeah, I'm here to answer all of your questions,
Jared and Adam.
Give us the rundown from the uninitiated standpoint.
So what are ASTs?
Who uses them?
Why, why do they use them?
What's their purpose, et cetera.
Sure. So when we write software,
nowadays it's really high level.
Var equals foo.
It's human-readable words that are high level.
And in order for those things to be fed into a machine
and for your code to get turned into ones and zeros,
there's a series of steps that it goes through a compiler engine.
And so one of the first steps is taking your code and tokenizing it.
Tokenizing is a process where the valid syntax items get identified,
so in JavaScript a triple equals
is a token, const is a token.
All of these things are kind of parsed.
So it's tokenized,
and then a tree is generated
from the structure of your code.
And so that tree is called an abstract syntax tree.
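As a rough illustration of the tokenizing step described here, the following is a minimal hand-rolled sketch, not how any real JavaScript engine or parser tokenizes. The `tokenize` function, its token type names, and its tiny keyword and punctuator lists are assumptions made up for this example, and it only handles a small slice of JavaScript.

```javascript
// Minimal tokenizer sketch (hypothetical, illustration only).
// Handles keywords, identifiers, integers, double-quoted strings
// without escapes, and a handful of punctuators.
const KEYWORDS = new Set(["const", "let", "var", "function", "return", "if"]);
// Longer punctuators come first so "===" wins over "==" and "=".
const PUNCTUATORS = ["===", "!==", "==", "=", ";", "(", ")", "{", "}", ","];

function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) {
      i += 1; // skip whitespace
      continue;
    }
    if (/[A-Za-z_$]/.test(ch)) {
      // keyword or identifier
      let j = i + 1;
      while (j < source.length && /[A-Za-z0-9_$]/.test(source[j])) j += 1;
      const word = source.slice(i, j);
      tokens.push({ type: KEYWORDS.has(word) ? "Keyword" : "Identifier", value: word });
      i = j;
      continue;
    }
    if (/[0-9]/.test(ch)) {
      // integer literal
      let j = i + 1;
      while (j < source.length && /[0-9]/.test(source[j])) j += 1;
      tokens.push({ type: "Numeric", value: source.slice(i, j) });
      i = j;
      continue;
    }
    if (ch === '"') {
      // string literal, kept with its quotes
      const end = source.indexOf('"', i + 1);
      if (end === -1) throw new SyntaxError("Unterminated string");
      tokens.push({ type: "String", value: source.slice(i, end + 1) });
      i = end + 1;
      continue;
    }
    const punct = PUNCTUATORS.find((p) => source.startsWith(p, i));
    if (!punct) throw new SyntaxError(`Unexpected character: ${ch}`);
    tokens.push({ type: "Punctuator", value: punct });
    i += punct.length;
  }
  return tokens;
}

const tokens = tokenize('const jared = "awesome";');
console.log(tokens.map((t) => `${t.type}:${t.value}`).join(" "));
// Keyword:const Identifier:jared Punctuator:= String:"awesome" Punctuator:;
```

A parser would consume a flat list like this and build the tree from it; the sketch stops at the token list.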
It's not limited to JavaScript.
It's, you know, every programming language uses abstract syntax trees to kind of feed into the compiler engine,
which, you know, translates all that stuff down to bytecode.
And the abstract syntax trees are extremely useful in programming
because they give us a predictable data structure,
which helps us understand our code.
And so if you're looking at a variable declaration, for example,
const Jared equals the string "awesome".
That's what I was going to say.
That one line of code, including the semicolon, gets
translated into a tree that has a predictable structure.
The first thing is, it's a JSON tree that has a type of Program. It has a body that's an array,
and that body has declarations, which is an array of objects, each with its own type.
And so it gives you this lovely output,
which is like a programmatic walkthrough of your code.
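For reference, this is roughly what that tree looks like for the one-line example. The object below is written out by hand in the ESTree shape that parsers such as Esprima and Acorn emit (Babel's parser uses slightly different node names, for example `StringLiteral`). Real parser output also carries position data, which is omitted here, and the lowercase variable name `jared` is an assumption about how the spoken example would be typed.

```javascript
// Hand-written ESTree-style tree for:  const jared = "awesome";
// (location info such as start/end/loc omitted for brevity)
const ast = {
  type: "Program",
  sourceType: "script",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "jared" },
          init: { type: "Literal", value: "awesome", raw: '"awesome"' },
        },
      ],
    },
  ],
};

// Because the structure is predictable, reading it needs no guesswork.
const declaration = ast.body[0];
const declarator = declaration.declarations[0];
console.log(declaration.kind); // const
console.log(declarator.id.name); // jared
console.log(typeof declarator.init.value); // string
```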
And the kind of secret sauce to ASTs here
is that there's a structure for you to understand
what something is, right?
So with const Jared equals
the string awesome, I know that it's a variable
declarator, and the identifier is Jared, and I know that the value, awesome, is a string. And so there's no guesswork,
right? And if you think about things like regular expressions that we've used to kind of
really parse and understand our code, to find matches,
there's inherently a conflict between trying to find something with regex
versus using something like a tree that has a lot more detail and metadata,
because regular expressions are really good for analyzing static code,
but they're really not good at understanding the nuances about the differences
in your code.
So I'll give you an example.
So if you have something that's commented out, that's a variable declaration, versus
something that isn't.
If you have a function that uses the same name as the variable, right?
So if you're trying to find matches for that thing,
like it's very difficult for you.
You can do it.
Technically, you can.
It's like just an extremely complicated set of regex
that you would have to write, you know,
in order to make sure that the thing that you are looking for is a function.
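The commented-out declaration and same-named function cases mentioned above can be sketched like this. The source text, the hand-written ESTree-style tree for it, and the `walk` helper are all assumptions made for illustration; a real workflow would get the tree from a parser. Comments are not part of the tree's `body`, which is why the tree query is not fooled by them.

```javascript
// Source we want to search for a *variable* named "total".
const source = [
  "// var total = 0;",
  "function total() { return 1; }",
  "var subtotal = 2;",
].join("\n");

// Hand-written ESTree-style tree for the source above (comments are not nodes).
const ast = {
  type: "Program",
  body: [
    {
      type: "FunctionDeclaration",
      id: { type: "Identifier", name: "total" },
      params: [],
      body: {
        type: "BlockStatement",
        body: [
          { type: "ReturnStatement", argument: { type: "Literal", value: 1, raw: "1" } },
        ],
      },
    },
    {
      type: "VariableDeclaration",
      kind: "var",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "subtotal" },
          init: { type: "Literal", value: 2, raw: "2" },
        },
      ],
    },
  ],
};

// Hypothetical helper: visit every object that looks like an AST node.
function walk(node, visit) {
  if (node === null || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
    return;
  }
  if (typeof node.type === "string") visit(node);
  Object.values(node).forEach((child) => walk(child, visit));
}

// Text search: matches the commented-out line, a false positive.
const regexHits = source.match(/\bvar\s+total\b/g) || [];

// Tree search: only real variable declarators named "total" count.
const astHits = [];
walk(ast, (node) => {
  if (node.type === "VariableDeclarator" && node.id.name === "total") astHits.push(node);
});

console.log(regexHits.length, astHits.length); // 1 0
```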
So what this tree allows us to do is it basically opens up a whole body of being
able to really query your code and query it in a way that is extremely precise and scalpel-like.
So you can say, I want to find all of the functions that contain these conditions,
the conditions maybe being things that have
more than 10 variable declarations, things that have more than
four if statements, functions with more than one... you know, I want to find promises that don't have a catch,
like they don't have error handling, right?
And so it enables us to do a multitude of things in order to understand our code programmatically
and deterministically.
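One of the queries mentioned above, promises without a catch, might be sketched as follows. The tree is built by hand with small builder helpers (`id`, `call`, `member`, `statement`) that are hypothetical conveniences producing ESTree-shaped nodes, and `loadUser`, `render`, and `report` are made-up names. The check is deliberately simplistic: it only looks at top-level expression statements and ignores cases such as a second argument to `then`.

```javascript
// Hypothetical builders for ESTree-shaped nodes.
const id = (name) => ({ type: "Identifier", name });
const call = (callee, args = []) => ({ type: "CallExpression", callee, arguments: args });
const member = (object, property) => ({
  type: "MemberExpression",
  object,
  property: id(property),
  computed: false,
});
const statement = (expression) => ({ type: "ExpressionStatement", expression });

// Tree for:
//   loadUser().then(render);
//   loadUser().then(render).catch(report);
const ast = {
  type: "Program",
  body: [
    statement(call(member(call(id("loadUser")), "then"), [id("render")])),
    statement(
      call(member(call(member(call(id("loadUser")), "then"), [id("render")]), "catch"), [
        id("report"),
      ])
    ),
  ],
};

// Collect method names along a call chain such as a().then(b).catch(c),
// walking from the outermost call inward.
function chainedMethodNames(node) {
  const names = [];
  let current = node;
  while (current && current.type === "CallExpression") {
    const callee = current.callee;
    if (callee.type !== "MemberExpression" || callee.computed) break;
    names.push(callee.property.name);
    current = callee.object;
  }
  return names;
}

// A statement is flagged when its chain has a then() but no catch().
const unhandled = ast.body.filter((node) => {
  if (node.type !== "ExpressionStatement") return false;
  const names = chainedMethodNames(node.expression);
  return names.includes("then") && !names.includes("catch");
});

console.log(unhandled.length); // 1
```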
And then the flip side of that is using tools that allow us to take ASTs and transform the
code so that we can actually do an in-place replacement.
You can now not only programmatically understand your code and find things, but you can also use that to do safe, in-place
refactoring of your code.
This episode is brought to you by GitPrime.
GitPrime helps software teams accelerate their velocity and release products faster
by turning historical Git data into easy-to-understand insights and reports.
Because past performance predicts future performance, GitPrime can examine your Git data to identify bottlenecks,
compare sprints and releases over time, and enable data-driven discussions about engineering and product development.
Ship faster because you know more, not because you're rushing.
Get started at gitprime.com slash changelog.
That's G-I-T-P-R-I-M-E dot com slash changelog.
Again, gitprime.com slash changelog.
So the title of your talk is Machine-Powered Refactoring,
Leverage ASTs to Push Your Legacy Code and the Web Forward.
You just described what ASTs are and what's interesting about them.
I think historically ASTs have been very
much the playground or the domain of people who are writing languages or thinking about
programming languages and have to have parsers that produce ASTs in order to take a syntax
and turn it into a thing a machine can understand.
It sounds like what you're arguing for is that there's a much more mainstream use case
for ASTs where lots of
developers should know what they are and be able to use them because they provide this metadata
and this structure. And we can use them not just to write programming language, but to actually
refactor, which is, I've never thought of this before. Can you expand on how you've done this,
how it works? And is this something that lots of people should be using?
It's really important for me to kind of democratize this knowledge
because most developers don't realize
that they are actually already using ASTs every day in their workflows
if they use things like Babel, Prettier, or ESLint.
All of these tools, we allow them to programmatically create code for us
and change code for us. And we trust them because of the precision that comes from leveraging
ASTs. And so there's a domain of tools, as well as some
domain areas in our industry, ASTs being one of them, that are kind of locked away, esoteric,
in library-author land.
Yeah, library-author land, for sure.
Right. And what happens with library-author
land is, you know, folks are really busy.
They're maintainers for really large projects, and they're already overburdened.
And getting good documentation is a challenge that most folks have with their projects. So taking the step to democratize the power of this has kind of been left to, I would say,
us as a wider community.
And so I've been able to leverage ASTs, actually.
I worked on a project at Bocoup
where we were working with the Edge team,
this is a while ago, to modernize thousands of tests that were actually written for IE but that were still valid. These tests were valid because,
you know, with the web platform, we don't break the web. And so when we implement a CSS feature,
when we implement an API,
it's typically stable. We just usually enhance it.
And so there were thousands of tests that were written for IE
that were still valid for the web platform
because they were testing open web standard APIs.
But they were using an outdated harness.
They were using a bunch of proprietary stuff, et cetera. So we needed to modernize
them and get those tests ready to be shared with the entire world via Web Platform Tests,
which is a project where all of the browser engineers contribute and now have a shared test suite.
And so there were a lot of similar patterns, but there were also a ton of conditions.
And so I was able to leverage ASTs to help me power through a bunch of refactoring
for like thousands of tests.
And I was able to kind of make those changes safely
and had I done that work manually
it would have been like
just X number of days more
and not only that, but it's just,
yeah, error-prone and
not a good use of a human brain.
And so I'm very pro automating repetitive work, and also using
automation to limit your risk, but also to make it easier for you
to rinse and repeat and iterate fast. And when you use automated refactoring, what you're able to do is build up a set of
transforms. You're able to change thousands of files at once. And if you did
something wrong, you just redo it. You just git checkout, change your transform, and then
run your refactoring
again. And so that type of quick feedback loop is necessary to be productive in 2019
and beyond. And so we really need to examine what types
of, I would say not architecture, what types of best practices we as a community have, because we are entering an age where we have a ton
of aging code and infrastructure because our standards are changing so fast. And NPM dependencies
are great. That's a good case study for looking at change.
So if a library author changes an API
or if you have an internal private module
and you want to deprecate something,
you can use ASTs to upgrade to a newer version of the API safely.
You can also use ASTs to write your own custom linting rules around,
hey, I don't want anyone adding new versions of this.
Like I'm going to, I have a count,
a hard-coded count of all of the instances of this thing,
and I don't want any new things added into my test.
I don't want any new instances of this deprecated module being used
And so you can make that decision binary, and you can enforce those
things for your team in a way that's binary, where you're not having folks
having unproductive discussions, right? So I'm a huge fan of no nits. Our code
reviews shouldn't be us arguing over things that are team conventions
or previously agreed upon things.
Brain power is expensive.
If you make it binary,
you'll have more productive discussions in code review.
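The hard-coded count idea for deprecated modules could look something like the sketch below. This is not an ESLint rule and does not use ESLint's API; it is a standalone check over a hand-built ESTree-style tree. The module name `legacy-http`, the `ALLOWED_LEGACY_USES` baseline, and the helper names are all hypothetical.

```javascript
// Hypothetical names: "legacy-http" stands in for a deprecated module, and
// ALLOWED_LEGACY_USES is the hard-coded baseline a team would agree on.
const DEPRECATED_MODULE = "legacy-http";
const ALLOWED_LEGACY_USES = 1;

// Hypothetical helper: visit every object that looks like an AST node.
function walk(node, visit) {
  if (node === null || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
    return;
  }
  if (typeof node.type === "string") visit(node);
  Object.values(node).forEach((child) => walk(child, visit));
}

// Count require("<moduleName>") calls anywhere in the tree.
function countRequires(ast, moduleName) {
  let count = 0;
  walk(ast, (node) => {
    if (
      node.type === "CallExpression" &&
      node.callee.type === "Identifier" &&
      node.callee.name === "require" &&
      node.arguments.length === 1 &&
      node.arguments[0].type === "Literal" &&
      node.arguments[0].value === moduleName
    ) {
      count += 1;
    }
  });
  return count;
}

// Binary answer: either the code base is within the baseline or it is not.
function checkDeprecatedUsage(ast) {
  const found = countRequires(ast, DEPRECATED_MODULE);
  return { found, ok: found <= ALLOWED_LEGACY_USES };
}

// Builds the tree for:  const <name> = require("<moduleName>");
const requireDecl = (name, moduleName) => ({
  type: "VariableDeclaration",
  kind: "const",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name },
      init: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "require" },
        arguments: [{ type: "Literal", value: moduleName, raw: JSON.stringify(moduleName) }],
      },
    },
  ],
});

const ast = {
  type: "Program",
  body: [
    requireDecl("a", "legacy-http"),
    requireDecl("b", "legacy-http"),
    requireDecl("fs", "node:fs"),
  ],
};

const result = checkDeprecatedUsage(ast);
console.log(result); // { found: 2, ok: false }
```

Wired into CI, a failing `ok` would block the change, which is what takes the argument out of code review.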
Let's not talk about linting, let's not talk about this. And lastly, what I'll say is that
using ASTs is one way to really, I think, add a resilience layer to your code base, right? Because,
you know, if you're fixing a bug, the first thing you should ask yourself is, all right,
I fixed this bug, could I have avoided this with a linting rule?
And if the answer is no, the next question is, okay, could I have avoided this
with an integration test? Sorry, a unit test.
And if the answer is no, then integration test.
For me, it's like writing your own custom
linting rules or custom transforms, and all of these things
are like a first layer defense for a lot of things
in code bases that are easy.
Do ASTs typically get written in the language
that you're testing against?
Where do you begin?
What language are they written in?
Are they a separate project?
Do they live inside the monorepo?
What's the landscape?
Yeah, great question. So I've only worked with
JavaScript in terms of using tools around ASTs. And so what you need is a
parser. And there are projects like Babel that have their own parser.
Esprima, Recast, there's tons of different JavaScript parsers.
And the differences are really nuanced because they all have the same general structure,
but then they have some additional information.
Sorry, the AST trees that they output have different information based on what the preferences are of that tool in terms of how they want to traverse their trees, etc.
But typically, it's a three-step process.
So the first thing you need is, actually, there's a diagram here.
I was going to say, should I share my screen?
But this is a podcast, so we're going to have to talk through a diagram.
So you need a parser, a transformer, and a generator.
And so the parsing tool basically just creates a tree for the input code.
And then you have a transform that basically lets you query the generated tree.
And then you can say, oh, here, I found the thing in the tree that I want.
Now let me create a new AST.
Let me create a new structure for what I want to replace.
If I want to change the value of something, or if I want to remove something or whatever, let me make that change in the tree.
And basically, a new tree gets generated from all the transforms. And then that
tree that gets generated now needs to go back into code. So that's the third step, right? So we need
a generator. So that's the reverse
of the parser. It takes a tree and then it makes code. And so those are kind of the three
things. But depending on what tool you're using, you're
kind of chaining together a parser, a traverser, a transformer, a generator,
or you're using something that does everything for you altogether. jscodeshift is what I really like to use because it's a wrapper for Recast,
which uses Esprima from Mozilla. It's a parser from Mozilla. So jscodeshift wraps Recast
and gives it a very nice jQuery-style declarative API.
So it's just really nice to write.
And the folks at Facebook are behind jscodeshift.
But you can also use Recast, which is great.
I just enjoy the declarative nature of using a tool like jscodeshift.
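The transform and generate steps described above can be sketched without any library. In this example the parse step is skipped: the input tree is written by hand in ESTree shape, standing in for what a parser would return. The `transform` and `generate` functions are hypothetical and support only the handful of node types used here, so this is a toy version of what tools like Recast do, not their implementation.

```javascript
// Stand-in for parser output: tree for  const jared = "awesome";
const inputAst = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "jared" },
          init: { type: "Literal", value: "awesome", raw: '"awesome"' },
        },
      ],
    },
  ],
};

// Transform step: find the node we care about and change it in a copy of the tree.
function transform(ast) {
  const copy = JSON.parse(JSON.stringify(ast)); // leave the input tree untouched
  const literal = copy.body[0].declarations[0].init;
  literal.value = "very awesome";
  literal.raw = JSON.stringify(literal.value);
  return copy;
}

// Generate step: the reverse of parsing, tree in, source text out.
function generate(node) {
  switch (node.type) {
    case "Program":
      return node.body.map(generate).join("\n");
    case "VariableDeclaration":
      return `${node.kind} ${node.declarations.map(generate).join(", ")};`;
    case "VariableDeclarator":
      return node.init ? `${generate(node.id)} = ${generate(node.init)}` : generate(node.id);
    case "Identifier":
      return node.name;
    case "Literal":
      return node.raw;
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const output = generate(transform(inputAst));
console.log(output); // const jared = "very awesome";
```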
But you're using JavaScript to write all those things,
and there's an API that usually comes with whatever tool you're using
so that you can query, but then you can also create.
And then there's the last step, which is, okay, now that I've queried
and I've created, now kind of generate
the tree and do an in-place replacement. And so in theory, when you run Babel,
Babelify or whatever, or when you run ESLint, if you use dash dash
fix to make the change, in theory the whole thing actually changes, but Git only shows the diff.
So you only see the diff.
So the whole tree, the whole file got replaced in place.
So if we just take a simple example,
maybe walk it through these three steps.
If we had a simple example of refactoring,
let's change all of our vars to const, for example.
So I have all these var
statements. I want to use const instead, and I'm going to use an AST in order to do that. So the
first step would be take my file or my chunk of code that has the vars in it, pass it through the
parser, right? So I have raw text, I'm passing it to a parser. The parser then generates the AST for me, returns an AST.
Can I read that AST with my eyes, or is it a blob?
You can read that AST.
You can print it, you can log it,
or you can use an awesome tool that I like to use,
which is really, I think, kind of the standard around this.
It's astexplorer.net. It's a site which allows you to just place,
you know, just drop code, pick your parser,
pick your language, and, you know, you can view the tree.
And so the really great thing is, you know,
you can use this tool to visualize a tree.
So there's no memorization here.
Like, I don't need
to know what the tree structure is for a function with, you know, that has a return value of this.
Like I can just drop it in and see the tree and then I can write the code for what I want to
change it to and then see what that tree is. And so that, you know, you can, you can do reverse
engineering to basically say, this is what I want to find and this is what I want to change it to.
Or, you know, you have both versions and you can use that to drive how you build your transforms.
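To make the "drop it in and see the tree" idea concrete, here is a minimal sketch of what such a tree can look like. It assumes an ESTree-style shape for the source text `var answer = 42;`, typed out by hand and trimmed so the example has no dependencies. Exact node names vary between parsers, and the `outline` helper is hypothetical, written only for this illustration.

```javascript
// Hand-written, trimmed ESTree-style tree for the source text: var answer = 42;
// A real parser would produce this for you, usually with extra fields.
const tree = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "var",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "answer" },
          init: { type: "Literal", value: 42 },
        },
      ],
    },
  ],
};

// Hypothetical helper: list every node type, indented by depth in the tree.
function outline(node, depth = 0, lines = []) {
  lines.push("  ".repeat(depth) + node.type);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      // Anything carrying a string "type" field is treated as a child node.
      if (child && typeof child.type === "string") {
        outline(child, depth + 1, lines);
      }
    }
  }
  return lines;
}

console.log(outline(tree).join("\n"));
```

The printed outline is roughly the nested view a tree explorer shows: a declaration node containing a declarator, which in turn holds an identifier and a literal.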
And I think the best part about it is, this is all written in JavaScript. These are Node scripts that are running, and you
can basically do anything you want in the middle of a transform. If you want, you can say,
find me this static array of, I don't know, images from some
cloud server. And then in your transform you can do an API request,
get an updated list, and do an in-place replacement. So you can do
dynamic evaluations of your code so that,
even though your code is static, it can actually be
dynamic. You can use transforms to even change your code
or do pre-evaluations and things like that.
So it's very interesting.
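The "fetch an updated list and replace it in place" idea might look something like the sketch below. Everything here is an assumption made for illustration: the tree is a hand-written, trimmed ESTree-style shape for `const images = ["old-1.png", "old-2.png"];`, and `fetchImageList` is a hypothetical stand-in that returns fixed data instead of calling a real server.

```javascript
// Hand-written, trimmed ESTree-style tree for:
//   const images = ["old-1.png", "old-2.png"];
const tree = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "images" },
          init: {
            type: "ArrayExpression",
            elements: [
              { type: "Literal", value: "old-1.png" },
              { type: "Literal", value: "old-2.png" },
            ],
          },
        },
      ],
    },
  ],
};

// Hypothetical stand-in for the API request described above.
// A real transform would ask a server for the current list here.
function fetchImageList() {
  return ["new-1.png", "new-2.png", "new-3.png"];
}

// Find the `images` array and swap its elements for freshly built literal nodes.
const declarator = tree.body[0].declarations[0];
if (
  declarator.id.name === "images" &&
  declarator.init.type === "ArrayExpression"
) {
  declarator.init.elements = fetchImageList().map((value) => ({
    type: "Literal",
    value,
  }));
}

const names = declarator.init.elements.map((element) => element.value);
console.log(names);
```

The source file stays static, but the transform decides at run time what the array should contain.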
That is interesting.
So yeah, AST Explorer, I'll definitely recommend.
I'm pulling it up here.
It's a link in the show notes
if you want to quick click on it.
I think part of the ASTs is there's like this,
like you said, you're trying to democratize this knowledge.
There is like a mystical aspect of
once you get below source code,
you're like, okay, we're now at a machine-generated thing.
That's scary.
Can I view it?
It just seems like a little bit more nebulous,
a little bit more vague.
Abstract maybe might be a good term.
But this does a good job, I think, just looking at the example.
And I'm sure as you put in your own code into something like this,
it probably does a good job of demystifying some of that and saying, you know what, this is not all that unapproachable,
and something that is very valuable if you can get past maybe a little bit of that
abstractness. So, making it more concrete now: once I have my AST, like you said, you
can transform it. So the transformer operations,
is that depending on the transforming tool that you are using?
You mentioned a couple different tools,
and one has a jQuery-style syntax.
What would it be like if I was like,
take all my vars and make them const?
Obviously you don't have to type out the code to us,
but what kind of a transformer would that be?
Well, if you're on astexplorer.net, for example, you can pick jscodeshift
as your transform tool. It uses a declarative, jQuery-style
API. So your first thing is, you're looking at the file source, and then you're
saying .find. I'm looking for an identifier, so I'm
looking for a variable name or a function name. So .find
identifier, and then .forEach, right? So it's just JavaScript:
iterate on all of the identifiers that you find, and then you can have a matching.
So you can say,
if that node name is Jared,
replace the value to be awesome.
And that's it.
And then .toSource,
which prints the transformed tree
back to the same file.
And so it's as simple as that.
It's actually mind-blowingly easy.
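The find, loop, match, print flow described here can be sketched without any tooling. This is not the jscodeshift API: `find` and `toSource` below are hypothetical helpers that only mimic the shape of the `.find`, `.forEach` and `.toSource` steps, and the tree is a hand-written, trimmed ESTree-style shape for two small statements.

```javascript
// Hand-written, trimmed ESTree-style tree for:
//   var Jared = 1;
//   var total = Jared;
const tree = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "var",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "Jared" },
          init: { type: "Literal", value: 1 },
        },
      ],
    },
    {
      type: "VariableDeclaration",
      kind: "var",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "total" },
          init: { type: "Identifier", name: "Jared" },
        },
      ],
    },
  ],
};

// Hypothetical helper: collect every node of a given type, depth-first.
function find(node, type, found = []) {
  if (node.type === type) found.push(node);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child.type === "string") find(child, type, found);
    }
  }
  return found;
}

// Hypothetical mini printer: handles only the node types used in this tree.
function toSource(node) {
  switch (node.type) {
    case "Program":
      return node.body.map(toSource).join("\n");
    case "VariableDeclaration":
      return node.kind + " " + node.declarations.map(toSource).join(", ") + ";";
    case "VariableDeclarator":
      return toSource(node.id) + (node.init ? " = " + toSource(node.init) : "");
    case "Identifier":
      return node.name;
    case "Literal":
      return JSON.stringify(node.value);
    default:
      throw new Error("Unsupported node type: " + node.type);
  }
}

// Query, match, and mutate: rename every identifier called Jared.
find(tree, "Identifier").forEach((node) => {
  if (node.name === "Jared") node.name = "awesome";
});

const output = toSource(tree);
console.log(output);
```

Both the declaration and the later reference are renamed because the query matches identifier nodes rather than text.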
On the JavaScript complexity metric,
this ranks really low.
This is way below TypeScript, in my opinion, for example.
People look at TypeScript and they're like, I don't understand this. And then a week
later they're like, oh my god, I'm converted forever. For me, the barrier to entry when I teach
folks about ASTs is even lower than that. As soon as I show them an example, three
minutes later they're like, I'm sold.
I'm basically looking at this example right here and I'm pretty
much sold as well because this is way more simple than I would expect it to be.
I figured it would be a bigger buy-in.
At least to get started, it seems like it's pretty straightforward.
The tooling has made it really easy.
When do you reach for something like this in terms of complexity? Because with the simple example of change my vars to const, in my text editor I can
basically, you know, hit command shift A and just type in find all var and replace with
const. So there's certain things where our IDEs or our editors make those kind of refactorings pretty
straightforward, like a find all and replace. But then when do you know it's a little bit too
complex? So maybe it's just kind of case by case, you'll just know it when you need it. Or
I guess maybe the better question is,
is there enough of a barrier
where you don't reach for this right away,
but you kind of like upgrade to it
when it gets to a certain level of complexity?
Yeah, I think that's a great question.
I think it's about understanding what your needs are
and what type of change you're trying to make.
If you're trying to make something
that's really simple and self-contained, something that you can just do with a find and replace, great. But anytime
your change is conditional, or anytime your change is more than one line, right, so
if it's a multi-line change, like moving around function
parameters or, I would say, deleting code, things like that.
There's, I would say that for the,
like the true needs of like what we would do as developers
to kind of refactor a set of hairy code that's widespread,
that's when I would use a transform.
So I would say that scale, right?
So if something is repeated in multiple areas,
if there's something that's a clear pattern,
if you're updating something
where it can be really hard for
kind of a regex to kind of pick up on the differences between things.
Like, for example, modules, when they're being imported. I can also use the star syntax to change the name of something, right?
So import star as foo. There's lots of little nuances there, and you can use ASTs to make sure
that the change that you're trying to make is
you're changing the thing that you
need to change and you're not
going to accidentally change something else.
Maybe the first time your regex
fails you. You've gotten so far
with a regular expression and now it just
missed a case. And you're like, instead of
sitting here and iterating on that regex
and just keep on tweaking it for these different cases,
stop right there.
Now maybe it's time for an AST
because you probably saved time that direction.
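A small sketch of the kind of case where a regular expression changes more than intended. The regex half is plain JavaScript. The tree half assumes a hand-written, trimmed ESTree-style shape for the same line, and `printDeclaration` is a hypothetical printer that covers only this one statement shape.

```javascript
const source = 'var message = "var is old";';

// Text-level replace: it also rewrites the word inside the string literal.
const viaRegex = source.replace(/\bvar\b/g, "const");

// Hand-written, trimmed ESTree-style tree for the same source text.
const tree = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "var",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "message" },
          init: { type: "Literal", value: "var is old" },
        },
      ],
    },
  ],
};

// Tree-level change: only declaration nodes are touched.
// (A real codemod would also check whether the variable is reassigned
// before choosing const; this sketch skips that check.)
for (const statement of tree.body) {
  if (statement.type === "VariableDeclaration" && statement.kind === "var") {
    statement.kind = "const";
  }
}

// Hypothetical printer covering only simple `kind name = literal;` statements.
function printDeclaration(declaration) {
  const parts = declaration.declarations.map(
    (d) => d.id.name + " = " + JSON.stringify(d.init.value)
  );
  return declaration.kind + " " + parts.join(", ") + ";";
}

const viaTree = tree.body.map(printDeclaration).join("\n");

console.log(viaRegex); // const message = "const is old";
console.log(viaTree); // const message = "var is old";
```

The regex could be tightened to skip strings, but each new edge case needs another tweak, whereas the tree already separates the keyword from the string contents.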
Exactly.
And I think the ramp up here,
which is maybe your deeper question,
really, I'm advocating for developers
to have this in their tool chain, the same way
they have like a linting support and running tests, right? So we should have an easy way for
folks to write transforms, we should just, you know, take the day or two that it takes to set
that up, get that into the project with some examples, and make it so that, you know, folks
have a path for doing those things.
And that can be twofold. You can use that as an opportunity to create a bunch of custom linting tools
and, while you're doing that, add
infrastructure for how to write transforms if you need to.
But ultimately, if this is in our projects,
even if folks don't use it to check in code, even if they use it just
while they're developing something, to find what they need,
it's a way to, I think, level up the playing field for everybody.
Because the stakes are getting higher.
We have bigger code bases.
Front ends are huge, right?
We're not only thick clients, we have thick servers.
And so I also think the culture of let's throw everything away and start over
is a really expensive one that isn't a good thing we should be promoting. Folks should feel
comfortable with refactoring code, and they should feel proud about it, because you're
able to still drive value for your product and your business while pushing your
code forward. So, you know, I'm personally sick of seeing front end teams start over from scratch every 12 to 14 months.
So let's just not do that.
TeamCity is a continuous integration and delivery server developed by JetBrains that helps you build, test, and release your software faster.
It supports all popular build tools, test frameworks, version control systems, issue trackers, and cloud platforms out of the box with no plugins required.
TeamCity visualizes your build, test, deploy pipelines, collects statistics on each step,
pinpoints the root cause of failures, and suggests which commits might have caused the build failure.
The professional version of TeamCity
is free even for commercial use
and lets you set up up to 100 builds
and run up to three builds in parallel.
For large organizations out there,
JetBrains offers TeamCity Enterprise
and right now they're extending
a special offer to our listeners.
Get additional build agents
and new licenses
of certain enterprise versions
with a 50% discount. Head to teamcity.com slash changelog to learn more. Again, teamcity.com slash
changelog.
So, Amal, you're obviously passionate about this particular subject.
It is somewhat dry.
You have to convince people to pay attention to somewhat arcane knowledge like abstract syntax trees.
But there's huge value that can come out of doing these refactorings and really allowing yourself to refactor better, faster, stronger.
Is this a tough sell in engineering teams?
Or do you find it's pretty easy to convince people to institutionalize this kind of a tool in their toolbox?
Yeah, that's a great question.
I have to say that I think there's a few different things happening in our industry right now.
One is like our kind of, there's like a dopamine hit
that we get from new tools and new things.
Fresh starts.
Fresh starts and there's a problem with consistently working on new things,
which is there's a set of challenges for developing software
that you just don't even get to really explore
if you're constantly starting over
your to-do Hello World app
or your Create React app or whatever the hell else.
Great to do that every once in a while.
I'm not sure it's healthy to be creating new projects all the time
in the sense that there's some real good engineering challenge that you get from having to
understand how to drive value, how to make
change while still shipping to production.
How do you maintain, how do you refactor safely?
How do I refactor
a billion hit a month code base
while still pushing to production?
And understanding how to do that safely, responsibly,
what are the nuances of that in terms of testing?
There's so many interesting things.
There's a class of problems that you just never get exposed to.
So for me, the heroes in
our industry are really the folks who are working on legacy applications and still driving them
forward and continuing to chip at them. And I think some of
my philosophical ideology comes from Martin Fowler, who has a really great article, which I think we're going to link in the show notes.
I just sent that to you all.
It's Strangler Fig Application.
So it's basically, he was on vacation somewhere,
I think in New Zealand.
But there's this tree where it's growing roots
and it's slowly kind of strangling the thing.
It's growing new roots, but it's slowly strangling the old ones.
Basically, the idea here, the pattern,
is that you can refactor your application module by module,
bit by bit, while still driving value forward.
I'm personally sick of seeing the next gen team versus
the old gen team. At so many companies, you have a group
of people that are working on something that is not shipping to production for six months,
12 months, 14 months, 17 months, you get the drift, right? So ultimately, you're building
a whole set of things where you're not even getting that daily,
you're not getting that feedback loop from your customers
on what's working and what's not.
You're developing the new version of your thing
in a complete silo.
And so I think a really interesting problem
that I had to solve a few years ago, actually:
I was new on a team, and
I was hired to re-architect all of the UI,
get us off of the legacy code.
And, you know,
it's really funny.
I've never actually talked about this story.
So I'm realizing now that
maybe this is the origin story for me.
But, you know,
it was a Backbone application and they wanted to switch to React.
And I was like, we're not going to get rid of React.
We're not going to get rid of all of these Backbone views.
The best part about React is it's just a library.
And so maybe we just build infrastructure so that this whole new view,
this new set of functionality that we're adding, maybe that's React.
And we were able to kind of push forward
having all of our new views be React components
while still leveraging the Backbone components.
Those two things lived in one ecosystem.
It was a little more work,
but we were able to slowly replace everything
while still driving value,
while getting feedback from customers in the wild
and that's the type of challenge, for me, that's what makes a senior engineer,
that's what makes an architect, that's what makes for somebody
who really understands the challenges and nuances of our craft. And so, you know, we have more code now than ever. Forget our code,
most of our code is actually third party dependencies. I think Google just did a study on
that. And it's like, out of every 10 lines of JavaScript, it's one line of code that
belongs to the application.
That's a shocking number.
But if you think about it, it's no surprise because the open source model is working.
That was what it was designed to do.
We don't want to be reinventing the wheel.
We want to be standing on the shoulders of giants.
But at the same time,
we need to understand, we need to be
able to move quickly and shift, you know. And so if I want to switch dependencies,
I want to be able to do so in a way that isn't going to set me back, or I want to be able
to do so in a way that's, you know, safe. And it's not just changing dependencies, it's about
upgrading and all
kinds of things. And so there's a culture now with some of the larger frameworks, Angular being one
of them, where, you know, they'll give you a set of transforms with the version bump, you know?
So they're like, all right, like new major release, sorry for the breaking changes. But,
you know, we're now going to give you a command
to run so that you can migrate from five to six, six to seven.
And so the bar is getting, yeah, yeah. So this is great.
This is like when browsers compete for security and speed and all these other things.
These big libraries are now competing on user experience and DX
more so actually, developer experience.
So the bar is getting higher because the stakes are getting higher.
We can start adopting those practices in our own code bases
as application developers.
And that's my pitch.
I like that pitch.
I know we have this shared metaphor that I'll just,
I'm not introducing either of you to it,
but we have this metaphor of technical debt
and this idea that you are taking on debt
in order to gain somewhere else.
And eventually, you know, the debt collector is going to come
unless you manage that over time.
And, you know, in finance, we have ways out.
We can declare bankruptcy.
Of course, if you do it like Michael Scott,
it doesn't quite work where he just walks out
and says, bankruptcy.
I don't know if you saw that episode,
but it's one of my favorites.
You can't just say the word out loud, Michael.
He just goes out into the office
and he just declares bankruptcy.
Who is that, Oscar the accountant?
That's not how it works.
You can't just declare bankruptcy.
Anyways, off topic.
But we have a lot of people
declaring bankruptcy with their technical debts
which is where I'm trying to get to,
because maybe it's part of the tie-in
with the Silicon Valley mindset,
the startup mindset where you have to have
a bunch of people spin up new things
and then they die, and then here comes a unicorn out of that, right? Like a thousand failures, here comes
one success. Maybe that mindset is tied in with the technological advances. And we get to this
point where it's like, well, a new thing has to begin. I'm with you very much so on maintaining legacy code, and really the software
that provides value over a series of years
is de facto legacy, right?
The reason why it's still around
is because it's providing real value to real people.
But is there a point where you've come across any code
where it's like, you know what?
You guys didn't manage the technical debt here.
I like the idea of pushing the thing forward,
but sometimes you're like pushing up against a wall.
Are there limits to this?
Yeah, are there limits to this ideology
or can we refactor, you know, all things?
I'm sure there are cases,
although I think they're very rare,
where you have to completely just abandon ship for the entire project.
But with the kind of module-by-module approach,
the idea here is that you're taking one vertical segment
and replacing it and then throwing away the code that you don't want.
Right.
Instead of throwing the whole thing out.
Or you're refactoring in place.
So either one.
But I think for me, an acknowledgement that we don't make enough in our industry,
and I think you're totally right about your kind of analysis on,
maybe it's Silicon Valley culture,
maybe there's some kind of culture bleeding here
with just a race to the top, right?
But we don't acknowledge, like,
I feel like enterprise code is like,
it's its own beast in our community, you know?
So you're either enterprise versus small medium versus
the create React app world.
And so these three kind of paradigms where I think
nobody wants to be enterprise. I think we even coined the term
enterprise dude in the team that I was on.
Enterprise dude always ruins everything for everybody.
Enterprise Dude is always relying on the least supported version of something
and is holding back people from being able to upgrade things.
Anyways, but enterprise. So real software,
software that's been out in the wild and has had multiple developers work on it,
applications at scale... I've yet to see applications at
scale that don't use multiple languages, that don't have just arcane stories
behind why this weirdo thing exists. You know, it's like, all right, when you open this file
you're gonna have to turn around three times and tap your nose once.
It's just the most hilarious stories.
But applications are living, breathing.
They have cruft, it's normal.
So I want to normalize weirdness
because that's just how applications evolve over time with multiple
people. And so it's okay. There has to be some uncomfortableness in our code bases because
ultimately you have to have something to be pushing forward as a team. I envy the folks who
are really happy about everything and congratulations to them.
Maybe this talk isn't for them.
But this talk is for the 99% of us that are remaining that have hashtag real problems.
I think it's Mike Tyson said,
everybody has a plan until they get punched in the face.
And that's when everybody's plan goes out the window, basically.
He knows that pretty well
because he's punched a lot of people in the face.
I think code is kind of like that.
We all have this beautiful, perfect, pristine code until it hits production.
It hits the real world.
And once that happens, it hits the fan and you've got to make changes.
And so the longer it's been in the real world, the more craggly it's going to look.
I'm looking at this picture on Martin Fowler's blog of the Strangler fig application.
I'm thinking, that tree, that's an abstract
some kind of tree. That tree is crazy looking.
It's crazy looking.
At the very minimum, you always have the CEO button.
If your code is perfect, I challenge you to
find one decision that wasn't the CEO button decision
where it's just like, just put it there, make it happen, ship it now.
Thanks, CEOs.
Well, now your talk is the first day of the conference, right?
So you're on day one, that's October 14th.
The conference actually happens October 13th through 15th.
There's some workshops, et cetera, going on.
If you are planning to go to this conference, which I would suggest you do so because, hey, we're going to be there.
As a matter of fact, we're planning to have a live JS party at All Things Open.
And I might be a future panelist or a future guest panelist on JS party.
So hopeful there at least.
Yeah.
See Amal day one, but I'm not sure
which day our live thing is, but it's definitely
going to be there. All Things Open is happening
in Raleigh, North Carolina, October 13th through 15th this year.
And if you are thinking of registering, I would say
that between now and the end of the month, their
mid-tier pricing is still active. On October 1st, it goes a little
higher. It's still a very inexpensive conference.
Even in its most expensive ticket period, it's $279.
So not a very expensive conference to go to.
Amazing speakers.
Amal, you'll be there, of course.
Jared and Cable will be on stage doing something.
I'm not sure.
What is the plan, Jared?
Do you have a plan?
The plan will be revealed when the plan is revealed
Yeah, it's a fantastic conference. It's just incredible speakers,
and I think it attracts an audience that is really diverse and also
has just an interesting breadth of problems. And so I highly recommend it.
I'm really excited to be speaking there this year.
I want to give a quick shout out too to Todd Lewis,
the organizer of that conference.
He does such hard work
to make that conference happen each year.
Every time I talk to him, he's always moving.
He's always moving.
He's never still.
He's always going.
So Todd, great work on this conference.
Looking forward to being there. Our first time there was
in 2016, so we're glad to be back.
And Amal, thank you so much for your time today
and sharing your wisdom. You are welcome
back.
Thank you so much, and it was fun
talking to you today. Thank you so much for having me.
It's been a pleasure.
All right, thank you
for tuning in to this episode of the Changelog.
Hey, guess what? We have discussions on every single episode now.
So head to changelog.com and discuss this episode.
And if you want to help us grow this show, reach more listeners and influence more developers,
do us a favor and give us a rating or review in iTunes or Apple podcasts.
If you use Overcast, give us a star. If you tweet, tweet a link.
If you make lists of your favorite podcasts, include us in it.
Also, thanks to Fastly, our bandwidth partner, Rollbar, our monitoring service, and Linode, our cloud server of choice.
This episode is hosted by myself, Adam Stachowiak, and Jared Santo.
And our music is done by Breakmaster Cylinder. If you want to hear more episodes like this, subscribe to our master feed at changelog.com slash master.
Or go into your podcast app and search for Changelog Master.
You'll find it.
Thank you for tuning in this week.
We'll see you again soon. Bye.