The Changelog: Software Development, Open Source - The PHP Language Specification (Interview)

Episode Date: November 11, 2014

Adam and Jerod talk with Sara Golemon about her work at Facebook, The PHP Language Specification, and making PHP awesome....

Transcript
Starting point is 00:00:00 Welcome back everyone, this is The Changelog and I'm your host Adam Stacoviak. This is episode 129. Jerod and I talked to Sara Golemon about her awesome work at Facebook and making PHP fast, awesome, and specced. This entire conversation is about getting the PHP spec out there, Facebook leading the way, but more importantly, Sara leading the way on that front. This show is significantly delayed. Sara, you're awesome. I'm really sorry. Please accept my apology.
Starting point is 00:00:40 This show is sponsored by DigitalOcean, CodeShip, and Toptal. We'll tell you a bit more about CodeShip and Toptal later in the show, but first, our friends at DigitalOcean: simple cloud hosting built for developers. In 55 seconds, you can have a cloud server with full root access, and it just doesn't get any easier than that. Pricing plans start at only $5 a month for half a gig of RAM, 20 gigs of SSD drive space, one CPU, and one terabyte of transfer. They've got locations all over the world.
Starting point is 00:01:12 New York, San Francisco, Amsterdam, Singapore, and now their newest location, London. And you can easily migrate your data in between any of those regions, making sure that your data is always closest to your users. Use the promo code changelognovember in all lowercase. Again, changelognovember, all lowercase, very important, to get a $10 hosting credit when you sign up. Head to DigitalOcean.com right now to get started. And now, on to the show.
Starting point is 00:01:41 We're joined today by Sara Golemon. She is, man, Sara, I'm so impressed with what you're doing. You work at Facebook, so that's kind of a big deal. But not only do you work there, you also make Facebook fast, which I think has been like the mantra of Facebook, to be fast since the beginning. So today we're joined by my managing editor, Jerod Santo, and also Sara Golemon from Facebook to talk about some cool stuff happening in the PHP world, specifically the PHP spec that's brand new. So
Starting point is 00:02:12 Sara, welcome to the show. Thanks for having me. So I guess the best way to start navigating this conversation might be to tee up the post that you shared on the PHP mailing list, which was sort of the announcement. It was kind of at OSCON, and I'm not sure if it's O-S-con or OSS-con. I kind of wasn't sure. I've never been there, so I've never heard anybody actually say it until just now. So is it O-S-con or is it OSS-con? You know, I always say OSS-con, but that doesn't mean that I'm right. What do you think, Jerod? I'm going to go with OSS-con. I think OSS-con too.
Starting point is 00:02:46 Okay, so I wish I didn't say that at all then now because I feel like an idiot for thinking it's OSCON. Why would it be OS? Well, now that you actually say it out loud, it does seem like it should be OSCON. It's open source con, so that would be, I don't know. That's what I was thinking. Who says I should write some OS software?
Starting point is 00:03:02 Yeah, right? Good point. That's true. This is a heated debate. So this post was on Tuesday, July 22nd, which wasn't too long ago, but long enough ago that a lot of stuff's happened between now and then. So help us. And Jerod, I don't want to speak for you, but I know that I'm pretty much a PHP novice. Like I've done some stuff with WordPress.
Starting point is 00:03:21 I've never written anything to the extent that Sara has. So I'm totally a novice in the room just asking questions. I would consider myself an intermediate. An intermediate. Yeah. Okay. Not a pro, but I have some experience. So hold our hands along the way, Sara.
Starting point is 00:03:38 Yes, please do. Please do. But tee this up. What happened? What does this mean for the PHP community? Well, I mean, so PHP has been around for like 18 years now. And just sort of grasp that in your mind for a second. And in that 18 years, it's gone completely as an organic growth, right?
Starting point is 00:03:57 It's sort of, Rasmus wanted something to display his resume better. So he put together some scripts to do that. And then that kind of turned into more of a compiled program to turn some HTML with a few little bits of code into something real. And it's all been organic ever since then. Even when Andi and Zeev got involved to build PHP 3 with more of a real engine, like you would find in any kind of sensible language, it was still organic because they were just trying to scratch their itch. And it's been a whole bunch of itch scratching. And what you wind up with is what got popularly described as a fractal of bad design. And, you know, a lot of us kind of take that tongue in cheek because, well, all right,
Starting point is 00:04:40 it might be a fractal of bad design, but it runs most of the internet. So whatever. Yeah. But it's done all this without really having a clear picture of itself. It doesn't know how do you define what is proper PHP? All of the really serious languages like C, C++, they have these massive documents that describe what syntax should look like, what's valid grammar, that sort of thing. And we've been talking about it. I'm sorry, I'm going to say we and us in a lot of different contexts today. I'm going to try and keep track of which context that is. We, the PHP community, have been talking for a lot of years about how we kind of need to formalize what the language is. We need to say, all right, these are the behaviors you should
Starting point is 00:05:27 expect from the parser and what a script, a well-written script, actually looks like, as opposed to having two different ways of doing if statements that look completely different or whatever it happens to be. So it's always been like, yeah, we should do that. We should do that. We should do that. But who wants to write documentation, right? None of the programmers. I don't want to write documentation. So fast forward years and years and years, Facebook's got this HHVM thing that we've built for running Facebook code very fast and hopefully other people's PHP code very fast. And we're thinking, well, what can we do to give back, really? Because Facebook was built on PHP.
Starting point is 00:06:09 It was built on the public version of PHP. You know, Zuck sitting in his dorm room putting together the first Facebook.net or whatever was just running regular PHP. Funnily enough, probably some code that I wrote in there. That's kind of cool. Yeah, I know. He, that's kind of cool. Yeah. And then we saw a Bosco figure.
Starting point is 00:06:27 Um, so what can we do to, to give back and show that we're serious about taking the PHP language seriously? You know, we want PHP to be seen as a better language instead of the fractal of bad design. So we said, well, here's something that not only has the community sort of been asking for this and hoping that they can put together a spec properly, but this will actually help HHVM at the same time because we want to be able to write a parser that is fully compliant with PHP. But how do we do that if we don't know what PHP is apart from looking at the source code? So it's not a completely selfless gesture either. So let's, can we pause there for just a second? Maybe for those listening, kind of catching up,
Starting point is 00:07:10 real quick mention, what is HHVM? Oh, of course, I'm sorry. HHVM stands for HipHop Virtual Machine. It's basically the third generation of a compiler that Facebook's been working on to run PHP code. It's ostensibly PHP syntax compatible. The problem we ran into about five years ago or so at this point is that our PHP code base is massive and we have a couple of users. So we need to be able to run that PHP as fast as possible. Changing to another language is possible, but it is obviously a large task.
Starting point is 00:07:53 We have something like 10 to the 7th lines of code. That's not a small project. Very big. Wow. Yeah, very big. I remember reading about your choice of Mercurial versus Git too, and the choice between those two version control systems was also based on how large your code base was
Starting point is 00:08:09 and how many developers you have committing to it on a daily basis too. Yeah, no. So our main code base of PHP, I don't touch it often. I'm mostly touching C++ code, but sometimes I go and touch the PHP repo. And if I'm doing the checkout on Git, because we're still supporting both modes at the moment, I can say git pull and then I'll walk away, you know, go down, have lunch, check myself in the mirror, come back. A long time is the point. Yeah. I do it
Starting point is 00:08:41 on Mercurial and I just say hg update and now it's done. It is blazingly faster. We might need to earmark that topic just for the listeners' sake, because I know we covered that on The Changelog. I know it's a big deal anytime Facebook makes choices, and it sort of provides this rift for others to follow in the community because of your sheer size and also because of your engineering team and the talent you have. You know, you obviously tend to have a pretty good opinion, a pretty definitive opinion, that sort of provides this divide to the community. And we covered just quickly your choice of Mercurial over Git. And I thought it was just enlightening, the reasons why you chose it. Yeah. And there's more reasons than just speed. And I'm not going to go into all those
Starting point is 00:09:26 because that's actually not my area of expertise and I'll probably get some things wrong. I do just want to say that I have a lot of love for Git. I don't want to poop on Git about saying it's slower than Mercurial in all cases. It was a decision that Facebook made because our code base particularly needed speed to get developer efficiency up.
Starting point is 00:09:52 And that's developer efficiency is one of our watchwords when it comes to what we want to focus on. Focus on 10 to the seventh lines of code. That is just astounding. Yeah. You know, you have a large app when you consider, you know, reworking the underpinnings less work than actually rewriting it in a separate language. Well, really. I mean that's what it comes down to. What's going to be easier, rewriting in another language or making the language better? Can you give us maybe a snapshot too of the importance of HHVM to Facebook?
Starting point is 00:10:20 Because I remember reading, and help me piece this together. This is totally, you know, off the cuff here, but I remember reading a blog post about, and I can't remember the names of who's involved, so you could probably even name them if you'd like to, but it was basically like down to the wire of getting this done or you'd have to do something massive to get this just-in-time virtual machine in place to kind of read PHP code and, from what I can understand, basically decompile that down to binary or some other way of doing it. It was like this big deal and it was like down to the minute
Starting point is 00:10:50 and a five-year-long project and finally you had cracked it. Can you kind of give us a snapshot of that moment? That might be slightly dramatized for internet effect. I'm not sure. Okay, because it would seem dramatic to me. I will certainly say that when we started building the HipHop project, which initially, by the way, was not a virtual machine or a just-in-time compiler, it was actually a PHP to C++ transpiler. When we first got that project going, we actually
Starting point is 00:11:20 were sort of hitting the limits of how much blood we could squeeze out of the PHP turnip for our code base and the number of users we had. We literally could not buy hardware fast enough to be able to serve up every user that wanted to hit the site. So in that sense, it was probably a bit of a crunch time. It was a bit of, God, what are we going to do? Do we need to train everybody to write C++ code and get this thing running at real speeds? Are we going to pick up, I don't know, compiled Python or something like that?
Starting point is 00:11:53 I don't know. Consider the undertaking when you have that many engineers working on that much code. How long is that going to take? Turned out, the process of transpiling PHP to C++ code at the very base of it wasn't all that difficult. I don't want to take away from Haiping, who wrote the first version of HipHop,
Starting point is 00:12:19 but just doing that basic bit of transpiling got us a huge performance win. I think it was like an 80% win right off the bat and it came to like a two and a half times win within like a year or something like that. That's a huge gain when you can run two and a half times fewer servers, right? Absolutely.
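To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. The fleet size of 1,000 machines is purely hypothetical (no fleet size is given in the conversation), and reading the "80% win" as a 1.8x throughput gain is an assumption as well.

```python
import math


def servers_needed(baseline_servers: int, speedup: float) -> int:
    """Servers required for the same load once each server handles `speedup` times more."""
    return math.ceil(baseline_servers / speedup)


# Hypothetical fleet size, chosen only to make the arithmetic easy to follow.
baseline = 1000

# 1.8x is one reading of "an 80% win"; 2.5x is the later figure quoted above.
for speedup in (1.8, 2.5):
    needed = servers_needed(baseline, speedup)
    saved = 100 * (1 - needed / baseline)
    print(f"{speedup}x faster: {needed} servers instead of {baseline} ({saved:.0f}% fewer)")
```

Under these assumptions, a 2.5x speedup means serving the same traffic with 40% of the original machines.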
Starting point is 00:12:42 And that just gives you that breathing room to say, oh, thank God, you know. That ultimately led to the VM project, because we looked at this transpiler option and we said, well, this has got a bunch of problems with it. Number one, our developer environment now looks nothing like a production environment, and it can't, because can you imagine, as a developer, if you make one tiny change to a little PHP file, you then have to recompile all of these millions of lines of code just to see what difference comes out on your web page? You would run screaming from that. Yes. What was the compile time, do you recall? So, yeah, I can say that number. Sorry, I was trying to decide if I could say that number. At the time that we switched off of the transpiler onto the VM, I want to say it
Starting point is 00:13:37 took about 20 minutes to build the entire site, but that's not on a single machine. That's actually on a fleet of machines, because we're using distcc to do this. Wow. I think if you tried to do this on a single machine, it would be like, you know, a day's process or something like that. It was definitely not something that developers could do. So developers for a while wound up doing just regular PHP because it's close enough, but then we started adding functionality to the language, like generators, for example, which we've had for years, and PHP just got them in version 5.5. So we had these sort of hacks in place, like HPHPi,
Starting point is 00:14:17 which was slower than regular PHP, but it worked for development purposes and things like that. And it was just kind of messy. It led to some weird inconsistencies between dev and production. Excuse me. So that led off the VM project. And we had a bunch of guys who came from Microsoft at that time. They've worked on the CLR.
Starting point is 00:14:40 So they've built, you know, just-in-time compilers before. Recently, in fact. So they brought a lot of that information to bear, and I think that kicked off somewhere around 2009, something like that, slightly before we actually released HipHop to the
Starting point is 00:14:58 world in 2010, but it didn't really hit the point of running production code until January of 2013. So it took a while to get that one right. If I can maybe do a callback to our last show too, Jerod. I want to make a note, I guess, to kind of go from where we are to talking about the PHP spec and what it's actually written in. And it's kind of an aside, but a throwback to our most recent show, which was just released today, episode 127,
Starting point is 00:15:29 talking about keeping a changelog or the project Keep a Changelog from Olivier Lacan, which I could not say correctly on the show, but it's just whatever. I can't get over it. Anyways, what you say, though, Sara, is that the first thing you'll notice is that it's written in Markdown. And that there's this slight lean towards something called reStructuredText. And it's something that I have an interface with.
Starting point is 00:15:55 Can you kind of talk a bit about your choice of what to write the spec in? Well, the original spec was actually written in MS Word. The contractor that we hired to work on the spec, he's got a lot of spec chops. He's worked on the C spec before. His name is Rex, and I'm going to butcher his last name, Jaeschke, J-A-S-C-H-E, something like that. That's how I'd say it, yeah. See, I can't pronounce last names either. He's worked on specs before, but his tool of choice is MS Word. So God bless him, let him do what he needs to do. We're not going to put that into any kind of open source collaborative editing system because that just doesn't work for that. So we had to pick something. We look at
Starting point is 00:16:38 GitHub, we say, oh, okay, Markdown is natively supported by GitHub. It seems like it's probably expressive enough for what we need to do. So let's just use that as a starting point and we can switch off after that. When I made the original announcement at OSCON and released that sort of PDF of the sample chapter, I asked for people's opinions, you know, what makes sense to you guys? You know, what formats do we want to be editing it in? And in those responses from the PHP mailing list, not from internally at Facebook, there was, of course, some bikeshedding about,
Starting point is 00:17:16 oh, maybe we should go this direction. Well, this one has this advantage. That one has that advantage. Maybe AsciiDoc is the right way to go. As is pretty typical with those kinds of forums, there were a lot of answers. Slight lean towards reStructuredText from what I could see, but nothing really definitive. At the end of the day, the guy who was actually doing the transformation from Word doc to something sensible, Joel Marcey, who I was hoping was going to be on
Starting point is 00:17:42 this podcast, but he didn't make it. Bummer, Joel. You couldn't make it, man. I miss you. Where are you, Joel? At the end of the day, he had already started migrating things into Markdown and they were looking great. So I just said, you know what? Finish the Markdown and we will fix that later. There's always time for pull requests. And sure enough, one of the first big commits that was done by somebody outside of Facebook was to take this big monolithic Markdown file and split it up into chapters, which was
Starting point is 00:18:13 something I was initially asking Joel for. And he's like, I got so much going on. I can't even think about that much. So it's great to see the PHP community has been so receptive of this. Like, I was worried that there was going to be some sort of, oh, Facebook's trying to take over the language by imposing the spec on us. Right. But it's really just been sort of like, oh gosh, thanks guys. We were looking for this. Where'd you find it? So how long has this project been in the making? Is it, I mean,
Starting point is 00:18:42 I know 20 years, the language, the kind of the story we've kind of painted here, but how long has it been on your particular mind to sort of start lifting this up and actually making it happen, even from your perspective or Facebook's? I want to say that we made the decision that we were going to write a spec and publish one somewhere around last February. I think we actually started properly working on it, sorting out Rex's contract, things like that. I want to say we probably started working on it around March or possibly April. I can't say for sure. So just this year, not very long. So it seems like specs are far more important when you have many implementations. You look at something like JavaScript.
Starting point is 00:19:24 You have all these browser implementers, and they all need a spec to conform to. Is HHVM the second major PHP implementation, or is there a more diverse ecosystem that I'm not aware of? It depends on what you mean by major. I consider it the second major implementation, but a lot of people who have worked on other implementations would certainly disagree with me. There's implementations like
Starting point is 00:19:51 Phalanger, PHC, what's the other one I'm thinking of? HippyVM, which was released very recently and has spent a lot of time comparing themselves to us. So I'm not going to say they picked their name as a bit of a gesture, but maybe. So there's a number of PHP implementations out there. I haven't seen a lot of chatter about many of them. Oh, Roadsend. I forgot to mention them. They're another implementation, but I'm pretty sure they're gone. So having a spec is definitely important to bringing all of these different implementations together. But I think that's not the only benefit that we get out of it. Because if you look actually at PHP itself, it goes through these version cycles. Four to five was a big jump. Five to seven now is going to be a big jump. By the way, we're skipping six.
Starting point is 00:20:46 Why? There's history behind six. I don't think you want me to get in there. Very much like Perl 6, yes. This is like in certain hotels, they don't have a 13th floor. You go from the 12th straight to the 14th. But come on, those people on the 14th know what floor they're really on. That's right.
Starting point is 00:21:05 You laugh, but in the discussion about what version to call it, seven was actually highlighted as a lucky number. Oh, is it? Yeah. Humans and our numbers. No, we were going to bake Unicode into the language for PHP 6, like four years ago or something like that. And the project got really far along, to the point that even books were published about it. Those of us who worked on the Unicode implementation felt sort of a, you know, a connection to that. And then the project kind of died for a number of reasons. And so there was never a 6. So a discussion came up about whether they'd pick 6 or 7. I don't want to belabor it.
Starting point is 00:21:46 Bottom line, we picked 7. Gosh, what was I talking about before we went off on a tangent? The spec and the next version, kind of. Oh, yes. So the usefulness of the spec is partially to give the PHP project something to make sure that, you know, we don't break things accidentally along the way. And we have broken things accidentally on a number of occasions. Remind me to explain to you why 0x0 plus 2 equals 4 sometimes.
Starting point is 00:22:19 It's also important for some of the revisions we're making to the language right now. There are two RFCs up on the PHP list. One for what's called uniform variable syntax. This is to make it sort of consistent when you say something like $a, square brackets, some subscript, parentheses, some function call, arrow, some method call, whatever you happen to do, piling these things together. What's the right evaluation order? Left to right, right to left, middle outwards, which is actually sort of like what it currently does and makes no sense.
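As a rough illustration of why the grouping matters, here is a minimal Python sketch. It models the two readings of a chained PHP expression such as `$foo->$bar['baz']`: strictly left to right, `($foo->$bar)['baz']`, versus resolving the subscript first, `$foo->{$bar['baz']}`. The function names and sample data are made up for this example; it is an analogy in Python, not PHP itself.

```python
from types import SimpleNamespace


def left_to_right(obj, bar, key):
    """Models ($foo->$bar)['baz']: look up the property first, then subscript the result."""
    return getattr(obj, bar)[key]


def subscript_first(obj, bar, key):
    """Models $foo->{$bar['baz']}: subscript $bar first, then use that as the property name."""
    return getattr(obj, bar[key])


# Hypothetical object standing in for $foo.
foo = SimpleNamespace(settings={"baz": "value from settings"})

# Reading 1: $bar holds a property name.
print(left_to_right(foo, "settings", "baz"))

# Reading 2: $bar holds an array whose 'baz' entry is a property name.
print(subscript_first(foo, {"baz": "settings"}, "baz"))
```

The same source text asks for two different lookups depending on the grouping, which is why pinning down a single evaluation order matters for both the language and a spec.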
Starting point is 00:22:55 Unifying that and making it make sense. Another guy, Nikita Popov, who's been really a big contributor in the PHP circles in the past few years. He's working on an abstract syntax tree for PHP, which is also another huge thing. PHP's compiler doesn't have an AST. It says, here are my parse expressions coming through. Let's just compile those straight to bytecode and don't look at the overall program at all. So he's introducing an AST, which is obviously a big opportunity to screw up the language.
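For listeners who have not worked with one, an abstract syntax tree is easiest to see in a language that already exposes it. The sketch below uses Python's standard `ast` module, not PHP's compiler, so it only illustrates the general idea: parse the whole program into a tree, inspect it, and only then compile to bytecode. The sample statement and variable names are made up.

```python
import ast

# A made-up one-line program.
source = "total = price * quantity + tax"

# Step 1: parse the source into a tree instead of emitting bytecode directly.
tree = ast.parse(source)
expr = tree.body[0].value  # right-hand side of the assignment

# The tree records precedence explicitly: an Add whose left operand is a Mult.
print(type(expr.op).__name__, type(expr.left.op).__name__)

# Step 2: with a tree in hand, a pass can look at the program as a whole.
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})
print(names)

# Step 3: only now compile the tree to bytecode.
code = compile(tree, "<example>", "exec")
```

A compiler without this intermediate tree has to emit bytecode as each expression is parsed, which is the situation described above.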
Starting point is 00:23:28 Having, again, a conformance suite and a spec helps make sure that that doesn't happen. All right, let's pause the show for just a minute. Give a shout out to our sponsor, CodeShip. CodeShip is a hosted continuous deployment service that just works. We've been working with CodeShip for quite a while now. We really, really enjoy not only the product they've built, but the people behind it. You can easily set up continuous integration for your app today in just a few steps. And CodeShip has great support for lots of languages, all the test frameworks, as well as notification services.
Starting point is 00:24:03 They easily integrate with everything you can think of, GitHub, Bitbucket. You can deploy to cloud services like Heroku, AWS, Nodejitsu, Google App Engine, or even your own service because that's the way you want to do it sometimes too. Setup only takes three minutes. It's so quick. It really is just so quick. Get started today with their free plan and make sure you use the code thechangelogpodcast.
Starting point is 00:24:25 That's really important. Use THECHANGELOGPODCAST, and when you do that, you're going to get 20% off for three months on any plan you choose. Head to codeship.io and tell them The Changelog sent you. Well, let's talk about this backlash that didn't happen. What you maybe perhaps feared is that the community would say, okay, this is Facebook trying to grab a stranglehold around PHP, the language, by introducing the spec. And I don't necessarily believe that, but could you still speak to those fears perhaps maybe from Facebook's perspective? And then maybe – like you said, we have all these different ways, you know, you represent Facebook a little bit, and then you also represent just the PHP community, and how you balance those two as well would be interesting. Well, yeah, I mean, I'll start to speak to the second part of that first, because I've actually
Starting point is 00:25:17 been working on PHP for about the past dozen years or so. So I've got a lot of skin in the game in terms of code contributed to the PHP source code and involvement with the community. I wrote pretty much the book on writing extensions for PHP. But at the same time, I'm also working here for Facebook on HHVM, largely because of that PHP work. I'm doing things like writing the actual extension API itself on the HHVM side. So I have interests on both sides of the fence. And when I come to the list, on the one hand, it's coming with the history of having time and skin in the game with PHP, but it's also coming in with this: yeah, but she's working on that other PHP thing, and how much of what she's requesting in this RFC or whatever is to improve HHVM so it can take over the world? I don't think I have to tell you that there is
Starting point is 00:26:20 there is some degree of sort of distrust about Facebook and Facebook's intentions. I mean, do any Google search and you'll get plenty of those conspiracy theories. And some of those come through because we're all people and we want to protect what we see as good. And PHP's open source philosophy, I think, is actually really good. It's a really open project. It's got no BDFL. It's got nobody saying, no, this is making this really open source now. We're making this really friendly to developers out there. And, hey, here's a spec for it. You can look at that as, gosh, PHP is seeing a resurgence.
Starting point is 00:27:24 Or you can look at it as, hmm, embrace, extend, and extinguish. So I have personally gotten some of that kickback on other posts that I've put on the mailing list, but that did not happen at all here. I think everybody sort of saw the way we released this and the way that we, you know, tried to make sure that we focused on PHP as the source of truth and said, how can I fault this? You know, it's, it's, this is just a thing that now belongs to the PHP community. Like we with Facebook hat on didn't maintain any control over this. We said, here it is, public domain license, CC zero. We're putting it into PHP's Git repository. So they completely control the documents. It's, it's completely out of Facebook's hands at this point.
Starting point is 00:28:18 And maybe that's where we can dig in just a quick bit. Because I know we talk about licensing on the show here and there, but maybe to catch up, why you chose CC0, it's in quotes, no rights reserved. Can you talk about maybe the choice of that license versus, say, GPL or some other license you may have chosen for other open source that Facebook has out there? Well, I can only speak to it so much because I didn't specifically pick the CC0 license. My personal favorite for my projects is the BSD license, because I just like the little bits of attribution. But it comes down to what your philosophy about this sort of information is. We're just talking about a document at the end of the day.
Starting point is 00:28:57 We're not even talking about software. What is going to be most useful to a project like PHP? And like I said, PHP is a really open project. And for something like PHP, it makes sense to just say, you know what, here's some information for the world. What do we have to gain by putting a more restrictive license on it? Very little. You mentioned GPL. I could see the advantage of wanting to say that if somebody else grabs this and, you know, adds to it and extends it, you know, we would want to make sure that that's open and visible to everyone. I personally don't like the GPL license.
Starting point is 00:29:44 Well, I'm not holding your feet to the fire trying to figure out why you chose this license. I just wanted to kind of get a snapshot, because mostly from the vantage point of ill will, right? When somebody does something in the world, you want to, you know, depending upon the person, obviously, you want to say that person has goodwill for me, or that entity or that organization, or, you know. So your reputation does precede you in a way, that you've done a lot for
Starting point is 00:30:09 open source. And I just want to make sure that you have a chance here clearly to, to say, we chose this license for this reason, for the reasons it's open. It's, you know, it's not ours,
Starting point is 00:30:18 it's the community's and that kind of thing. So I didn't want to, uh, hang on that too far, but get the point across. Yeah. I mean, the only thing I could say about that is just like, that's the beautiful thing about CC0. It's literally no strings attached, you know? Yeah.
Starting point is 00:30:33 And it's a simple license. It's about three lines. You don't need a law degree to understand a license like that. So maybe this is just a left-field question, but it seems kind of an obvious one to me. You know, it's just a document, you just said that. It's not like it's code. It's not like it's changing PHP, really. But what does this spec, what does having it written out, fleshed out, open source, CC0 license attached to it, what does that do?
Starting point is 00:31:08 How do you expect or desire for the community to change because of this document now being there to specify how PHP should be? It's interesting you use the phrase, it's not changing the language, because as it turns out, it actually is. One of the first payoffs that we've seen from this is as people are looking through the document, a lot of pull requests coming through for simple things like grammar fixes and things like that, whatever. A few bugs have come up. One of them that I just worked on the other day noted that the spec says switch statements may only have one default block, which I think we can all agree makes sense.
Starting point is 00:31:49 And this user had noticed at some point in his code that he wrote a switch statement with two default blocks. And it caused a weird bug for him because he couldn't understand why that first default block wasn't getting executed. And so he filed a bug report and he said, this doesn't match. PHP allows multiple default statements. And when you have multiple, it'll execute the last one, which I think we can all agree is a bit clowny. So what should we do with that? Should we fix the spec to say multiple are allowed because that's what PHP does? Well, no, we shouldn't
Starting point is 00:32:21 actually, because that's really silly code. And I put it exactly that way to the list. I said, this is silly behavior that PHP supports probably by accident. Let's fix the language so it matches the spec. So that's what we're doing. And that's the benefit of having that spec. You've got a lot of eyes looking at this, and you've got that lived experience of these developers out in the wild who are saying, that doesn't jibe with what I know. So Facebook has another, uh, language that
Starting point is 00:32:52 they're very interested in, their very own Hack language, which I think they announced, was it this year? I think it was 2014. It's a few months back. I think it was in April. Yeah, April-ish. We know HHVM compiles Hack and PHP. How does Hack fit into this landscape with Facebook? Obviously it's not going to affect the PHP spec, or will it? So, Hack, we are writing a second spec, actually. Rex is
Starting point is 00:33:16 already busy back at work writing a spec for Hack. He's got a second Word document open, huh? A second Word document, yes. Command new, or what's that? Control new. Never mind. When that's done, we're most likely going to publish that as well. Of course, that will be under the Facebook namespace on GitHub or possibly the HHVM namespace, I'm not sure.
Starting point is 00:33:36 Because it does make sense for us to own that document, at least for now. Hack is sort of, you could describe it as its own language, but I think if you know any PHP, you can look at a Hack document and immediately understand what it does because it's, it's really more like PHP++, um, which, for those of you keeping track of PHP's rules, uh, if you have the string "PHP" and you post-increment it, that would turn out "PHQ". Try and pronounce that in your head. I'll leave that to you. Um, so Hack is, uh, as I said, PHP++. Uh, it's a different open tag. It drops a whole bunch of the clownier bits of PHP, the things that we look at and we say, why is that even in the language? Um, and it can do so safely because obviously if you're writing Hack code, this is not something that was written in 1989 and still needs to function. Sorry, I meant 1999.
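The "PHP" to "PHQ" joke refers to PHP's Perl-style increment on alphanumeric strings. As a rough illustration only, here is a small Python model of that rule; `php_style_increment` is a hypothetical helper written for this transcript, not PHP's actual implementation, and it assumes non-empty ASCII letters and digits only.

```python
def php_style_increment(s: str) -> str:
    """Simplified model of incrementing an alphanumeric string, PHP/Perl style.

    Assumption: input is non-empty and ASCII alphanumeric. Real PHP has extra
    rules (numeric strings, empty strings, other characters) not modeled here.
    """
    if not s or not (s.isascii() and s.isalnum()):
        raise ValueError("this sketch only handles ASCII letters and digits")

    chars = list(s)
    # Walk from the rightmost character, carrying while we wrap around.
    for i in range(len(chars) - 1, -1, -1):
        c = chars[i]
        if c == "z":
            chars[i] = "a"      # wrap and carry to the left
        elif c == "Z":
            chars[i] = "A"      # wrap and carry to the left
        elif c == "9":
            chars[i] = "0"      # wrap and carry to the left
        else:
            chars[i] = chr(ord(c) + 1)  # no wrap: done
            return "".join(chars)

    # Every position wrapped, so the string grows by one character whose
    # kind matches the original leftmost character.
    first = s[0]
    prefix = "1" if first.isdigit() else ("a" if first.islower() else "A")
    return prefix + "".join(chars)


print(php_style_increment("PHP"))
```

The last letter simply advances by one, which is why the result is hard to pronounce; strings ending in "z" or "9" carry to the left instead.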
Starting point is 00:34:30 89 is a bit too far back. It also adds a number of things that we noticed, sort of developing our own code base, would have been really nice to have, and we're not really sure why PHP didn't add them. I know why, but that's another story. Things like scalar type hinting. Um, PHP only allows type hinting for arrays and objects, so we add type hinting for everything. We even go beyond that: parameterized type hinting. Um, the sort of workhorse of PHP, the array, that can be a vector or a map or a set or whatever, um, we actually define these specifically as a vector, a map, a set, a pair, whatever else. So you can define more specialized structures that can behave more sensibly under
Starting point is 00:35:12 the hood. If I've got a vector of int, that should literally be in memory, int, int, int, int, int, in a nice tight packed array. So there's a performance gain to be had there, but there's also a readability gain to be had. You don't have to look at $foo as an array and wonder what kind of array that is. You can look at $foo as a vector of int and know exactly what you're dealing with. That helps the static analysis type checker, and it also helps you, as a human, understand what the code's doing. So, I mentioned static analysis type checker. That's sort of the workhorse of Hack. This is an extra program that runs in the background on a developer workstation.
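To make the "vector of int, tightly packed in memory" idea concrete, here is a minimal Python sketch using the standard library's `array` module. It is an analogy, not Hack's actual `Vector` implementation; the helper names `make_int_vector` and `accepts` are made up for illustration.

```python
from array import array


def make_int_vector(values):
    """Build a packed vector of signed 64-bit integers (typecode 'q')."""
    return array("q", values)


def accepts(vec, value) -> bool:
    """Return True if `value` could be appended to `vec`, False on a type error."""
    try:
        vec.append(value)
        return True
    except TypeError:
        return False


vec = make_int_vector([1, 2, 3, 4, 5])

# The elements sit back to back: int, int, int, int, int.
print(vec.itemsize, len(vec.tobytes()))

# The element type is part of the container, so a string is rejected.
print(accepts(vec, 6), accepts(vec, "seven"))
```

A general-purpose list or PHP array has to store a tagged value per slot; knowing the element type up front is what allows the tight layout and also tells the reader what the container holds.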
Starting point is 00:35:49 And it reads all of your code base constantly, watches for updates on the file system. And it looks at all of the code paths for data moving through your system. So it says, okay, this is coming from $_REQUEST. Obviously, it's a string because that's what comes from the user. It's going into this function, so this function apparently accepts strings. Does it accept other types elsewhere? No. Okay, we'll say this function accepts strings. It's going from there into some function elsewhere, and it goes down to other paths, it gets concatenated, whatever else. If you've got any sort of type error in that system, it's going to let you know that, hey, you should probably check this bit of code over here. We've converted 98 percent, or something like that, of our code base, of our 10 to the seventh lines of code, to using Hack by running
Starting point is 00:36:37 a program that automatically goes through and makes all those changes. So now when somebody works on Facebook code, they see this code that's fully type annotated, has all these parameterized expressions to let them know what's moving through. And we have a lot fewer problems of people saying, oh, I would refactor my little helper class that surely nobody else is using. And then finding out that the site breaks because somebody was passing the wrong kind of data and it happened to work before. So, you know, there's an old saying, a servant can't serve two masters.
Starting point is 00:37:09 It seems like PHP has, or not PHP, Facebook has generated themselves this new, maybe not a master, but maybe a new toy. And you said that 98% of your code base is now over onto it, being a subset or maybe a superset of PHP. Is it a superset? Is that fair to say? Well, it's both a sub and a superset. Yeah, it's sort of a side set. I got you.
Starting point is 00:37:33 It's in a Venn diagram or something. Right, right. So just your personal opinion, where do you see Facebook's interest lie long term? At the same time, Facebook is investing into an open source public domain PHP spec. So it seems like they have interest in both things. Where do you see that moving into the future? So there's a few pieces of that answer. So as you say, you can't serve two masters. And that's a very fair statement on it. How much attention are we really paying to the regular PHP side of things? Well,
Starting point is 00:38:05 a language is more than just its syntax, right? It's also the whole runtime that comes behind it. And PHP has a massive runtime library. Those are completely shared in common. So, you know, we're obviously taking good care of those in common. The other half of that is a lot of the extra features that go into Hack are actually just development time features. They're not necessarily used in the runtime. Some pieces of them are, but not all of them. So what works for hack works equally well for PHP. We want to make sure that we still pass the conformance suite
Starting point is 00:38:39 and we're still behaving the way PHP expects. But we can work on hack without losing sight of PHP, um, modulo those, those sort of missing things. Gotcha. Um, you know, we, we kind of rely on external users to tell us when we're doing PHP wrong at this point, because we are all hack. Um, but we do have, you know, tens of thousands of tests that run on every single diff. So hopefully we're finding most of those things ourselves. And what was the other half of your question? I've already paged out.
Starting point is 00:39:14 So did I. Oh, I think it's, I think the, the kind of maybe the, the leave behind on that one might be just that you've got kind of these two parallels you're running. And to some, it seems like maybe it's a competitor. And to some, they can clearly see what you just described there, which was this sort of parallel effort. And it's sort of like sugar on top instead of like a competitor and a squashing. Well, I mean, hack is not meant to be a completely new language. It's meant to be something that can live alongside PHP.
Starting point is 00:39:49 And in fact, in most cases, it kind of has to. One of the things Hack doesn't let you do is have any top-level code. Well, your entry point can't actually launch without top-level code. So there has to be a PHP file in there somewhere. And it's about giving the developer the opportunity to use as much or as little of that functionality as they want to. And one other thing I think that's kind of neat about hack is just, I think the hacker hack culture that Facebook has propped up and just how, I guess, how awesome it is, I guess, in a sense to
Starting point is 00:40:28 say that you, you get not only to do some really awesome stuff, um, for developers worldwide, um, but you also get to come up with a language that's kind of named after your mantra, which to me just, like, completes the world, you know? Yeah, at the end of the day, that's pretty much... So, the name of the language, that's another story. In a lot of our opinions, even internally, it's a horrible name for a language. Because how do you Google that, right?
Starting point is 00:40:56 Yeah, I was thinking. Well, that's great. Now the NSA is watching me because I've talked about hacking something. They're already watching, so... Well, they're watching us, certainly. Oh, God, somebody's going to read something into that. No, I did not mean anything by that. Tell us more.
Starting point is 00:41:12 I already started. Just kidding. I just tweeted that out. Anytime you, and I think this natural addition of Lang after whatever it is, so Foo Lang, Hack Lang, PHP Lang, that makes sense. You've got SAS Lang, you know,
Starting point is 00:41:27 all these other different Ruby Langs. So the, the addition of Lang kind of helps maybe keep the NSA at bay. Well, I mean, it certainly is the same, same problem that go ran into. How generic is the word go,
Starting point is 00:41:38 right? Right. Yeah. It's a movie. It's a drug. It's a verb. It's a game. Whoa,
Starting point is 00:41:43 there's a drug called Go? Yeah, I think. I don't know, I'm not up on the kids these days. Y'all know they're gonna read into that. Yes, we're definitely catching Echelon's attention at this point. Well, Sara, you know, the one other thing I wanted to mention, and you kind of did it a little tiny bit, and I think I have to give you a little bit of applause because you seem to be pretty humble about it, maybe due to the fact that we didn't allow you to give yourself a proper intro in the front of the show. But I think it's awesome that you've written this really awesome book, Extending and Embedding PHP.
Starting point is 00:42:18 You've been involved in the PHP community for a very long time. So you definitely have the battle scars to prove you are where you are for a reason. And obviously Facebook saw something in you because they hired you to work on making it fast, which is pretty much what everybody wants Facebook to be, right? It's what everybody wants all their sites to be. Yes. Um, yeah, that's, that's a true statement just as well. Um, I think you mentioned a couple tangential conversations we could probably have. I'm not sure if you want to bring them out or maybe take a minute or two just to touch on a couple of them. You're welcome to, but I think you mentioned uniform variable syntax and a couple others.
Starting point is 00:42:57 So feel free to riff for a minute or so. I'm not sure how much more I can say about uniform variable syntax as an example, because, I mean, that's just sort of, it was an RFC put forward as a, guys, we're doing this kind of clowny. How can we fix this without breaking all the code out there? Which is really what the consternation on that particular subject has come down to. You know, people are expecting their expressions to work a certain way because they've always worked a certain way. They may even be muttering about it and saying, why do I have to put extra parentheses, or why do I have to do weird things for this language that doesn't understand order of precedence?
Starting point is 00:43:37 At the same time, that code exists. And if we just, like, introduce that in, like, 5.7 or something like that, there would be uproar because stuff would break. Not my stuff. I put ridiculous numbers of parentheses and braces everywhere. I've been told off for using too many parentheses, in fact. But, you know, there are warts on the language, and everyone on the PHP internals list knows what those warts are because we get, you know, we get pelted with them on a regular basis.
Starting point is 00:44:07 PHP is a fractal of bad design. It's a double-claw hammer. It's a silly language, whatever it happens to be. It tends to get a bad rap, honestly. I mean, especially as, I dare to say even like this, but more modern ways or more modern things just meaning that they're newer. A lot of things happening in the JavaScript space with node and just with all sorts of other areas. Ruby is around 10 years. I think it's just, just turned 10 or just turned 15 or so now.
Starting point is 00:44:32 What rails is growing up and rails has turned 10. That's what it was. You know, so like people kind of cling to these new things, but there's been PHP for quite a while on it. And it tends to kind of get this bad rap because it's been around for so long and people almost look down upon it in some ways not that's why i really thought it would be important to have you on the show just to talk about the spec its importance and what you've been doing for the language and the community itself and then also kind of how that ties into facebook's approach to to making itself fast hhvm and everything else we've talked about so kind of
Starting point is 00:45:02 kind of neat there's a couple others do you you want to mention Abstract Syntax Tree or the other two that you've mentioned that were side conversations? Yeah, I mean, I sort of touched on both of them already. But yeah, the Abstract Syntax Tree is something, like I said, Nikita's working on. This used to never matter to me when I was working on a regular PHP engine,
Starting point is 00:45:25 because I'd look at the compiler and I'd say, well, you know, it gets the job done. It probably makes it faster not to have this intermediate representation. It's fine. We can just compile an expression. Here's a ternary statement. Okay. Make, emit the outputs for a ternary statement. Why do you need an extra abstract representation?
Starting point is 00:45:43 And then I started working on HHVM and I saw people who really understood how to write compilers. And I saw the way this abstract syntax tree got used in the process of compilation. I'm like, oh, that's why. That makes a lot of sense. We can do a lot more optimization. We can do a lot fewer hacks to make these expressions work. We can make things just function right without being inscrutable. And I look back at the Zend Engine and I say, you know, there are some parts in here that are kind of messed up. And the AST is going to help us fix that. It's not going to be anything visible to end users. Nobody's going to know what's gone in. But it's going to make our life as PHP engine developers a lot simpler.
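As a hedged illustration of why an intermediate tree makes optimization easier, here is a toy abstract syntax tree in Python with one classic pass, constant folding. The node classes `Num`, `Var`, and `BinOp` and the `fold` function are invented for this sketch and have nothing to do with the actual PHP or HHVM data structures.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str            # "+" or "*" in this toy grammar
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]


def fold(node: Node) -> Node:
    """Return an equivalent tree with constant sub-expressions pre-computed."""
    if not isinstance(node, BinOp):
        return node  # leaves are already as simple as they get
    left, right = fold(node.left), fold(node.right)
    if isinstance(left, Num) and isinstance(right, Num):
        if node.op == "+":
            return Num(left.value + right.value)
        if node.op == "*":
            return Num(left.value * right.value)
        raise ValueError(f"unknown operator: {node.op}")
    return BinOp(node.op, left, right)


# x + (2 * 3)  becomes  x + 6  before any output is emitted.
tree = BinOp("+", Var("x"), BinOp("*", Num(2), Num(3)))
print(fold(tree))
```

A compiler that emits output directly while parsing never holds the whole expression at once, so a rewrite like this has nowhere to happen; with a tree, it is a short recursive function.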
Starting point is 00:46:30 All right, let's pause the show for a minute, give a shout out to a sponsor. We've been working with Toptal for a very long time. These guys are super awesome. And I kind of wanted to take a moment and pause this for a bit. And rather than just kind of give you an ad about what they're doing and what they're about, I kind of wanted to tell you a personal story. And part of that personal story is telling you a little bit about my day job. So beyond just the Changelog and what we do here, I have a full-time job at a nonprofit called Pure Charity. And we have a Rails stack. And earlier this year, we had some developers leave the company,
Starting point is 00:47:07 and we had a big push coming for the summertime for a new feature we were working on. And it hit me that we should call upon our awesome friends at Toptal. And just to kind of give you a snapshot, Toptal is a matchmaking service for really awesome developer opportunities and developers to get started. So we had a need for some really great Ruby and Rails developers, and Toptal helped us find developers that fit not only our budget, but also our culture, our coding style, all sorts of things. And long story short, they basically perform magic because these people we work with, I'm
Starting point is 00:47:53 going to give a shout out to them real quick if you don't mind, Guillerme, Andre, and Rafael, all listeners of the Changelog too, by the way. These guys are phenomenal. Good people, good coders, and just great all around. And I have to say thanks to Toptal because they made this possible. And if you've been thinking about freelancing, if you're thinking about trying out a new technology, or you wanted some flexibility in your work-life balance, Toptal is a great place to be an elite engineer. Go to toptal.com slash developer to get started and tell them that the Changelog sent you.
Starting point is 00:48:35 And totally, I think it might be completely out of left field here, but you also wrote libssh2. Do you want to touch on that real quick before we start to tail the call? Yeah, I mean, that's really nothing particularly PHP related, except in that at the time I was working on a lot of streams work in PHP. Streams are sort of this abstraction layer we have underneath all the fopen, fread, fwrite, those sort of calls, so that you can work with different sorts of resources. So you can do something like fopen an HTTP URL, and that'll talk HTTP to the remote server, and you can fread off of that remote network resource. And it's great.
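The streams idea, one open/read interface with pluggable back ends chosen by URL scheme, can be sketched roughly as follows. This is a Python analogy, not PHP's streams API: `register_wrapper`, `stream_open`, and the `mem://` scheme are all hypothetical names, and an in-memory dictionary stands in for a remote server so the example runs without a network.

```python
import io
from typing import BinaryIO, Callable, Dict

# scheme -> function that turns the rest of the URL into a readable stream
_wrappers: Dict[str, Callable[[str], BinaryIO]] = {}


def register_wrapper(scheme: str, opener: Callable[[str], BinaryIO]) -> None:
    """Plug a new kind of resource into stream_open()."""
    _wrappers[scheme] = opener


def stream_open(url: str) -> BinaryIO:
    """Open `url` for reading, dispatching on its scheme."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        return open(url, "rb")  # no scheme: treat it as a local file path
    if scheme not in _wrappers:
        raise ValueError(f"no wrapper registered for scheme {scheme!r}")
    return _wrappers[scheme](rest)


# A stand-in "remote" resource so the sketch needs no network access.
_fake_remote = {"greeting": b"hello from a pretend server"}
register_wrapper("mem", lambda name: io.BytesIO(_fake_remote[name]))

with stream_open("mem://greeting") as stream:
    print(stream.read(5))
```

The calling code only ever sees a file-like object, which is what makes it tempting to add new schemes such as SFTP or SCP behind the same calls.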
Starting point is 00:49:14 I thought, gosh, how cool would that be if I could do that with like SFTP files. Sorry, it's been a while since I even touched this. SFTP files or SCP resources, or just even be able to SSH into a system and send a command to it. You know, how cool would that be? Well, I looked at OpenSSL and said, can I actually, you know, pull a library out of this? Oh God. Oh God, no. Oh God, look away. OpenSSL is a lovely piece of software, but it's also got a very interesting code base. So I ended up just going to IETF and I said, where's the RFCs for secure shell? Oh, here they are. Let's start implementing a transport. Let's start implementing a few channels. Let's start implementing this. Next thing I know, I've got this entire, you know,
Starting point is 00:50:11 client side library for connecting to SSH servers so that I can shove it into PHP and then promptly not use it because while it's cool, it's actually not that, you know, practically useful for anything that I'm working on. It was just sort of, I was working for the university at the time. And the thing about working for public institutions is that you have very relaxed goals and extra time on your hands, which is actually how I ended up getting involved in PHP in the first place. It sure seems like you enjoy diving in deep and getting into the nitty gritty. Is that fair to say?
Starting point is 00:50:50 Well, I like understanding what I'm working with. I will search Google for how-tos and documentations with the best of them. But if I'm really going to do something with something, I really want to understand how it works underneath. On the HHVM project right now, my main job is to make PHP a good open source project, which really means I don't have to look at much of the code at all, theoretically. I can work on the build system, some of the runtime library APIs, things like that, but I don't need to get down into the JIT and start issuing machine code instructions to do what I need to do for my job.
Starting point is 00:51:27 But, gosh, I'd actually like to understand how that stuff actually operates, wouldn't I? So I have. So I've got commits down in there. And I now, for no further use in my life probably, have the ABIs for Intel x64 architectures and ARMv8. I know that the first six integer arguments of a function call go to RDI, RSI, RDX, RCX, R8, and R9. The first eight SIMD registers go into XMM0 through XMM7, and then everything else goes on the stack. Will I use that again? Probably not.
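The register order recited above matches the System V AMD64 calling convention used on Linux and macOS. A deliberately simplified Python sketch of that assignment rule follows; `assign_arguments` is a hypothetical helper, and it only handles plain integer and floating-point arguments, ignoring structs, variadic calls, and stack alignment.

```python
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FP_REGS = [f"xmm{i}" for i in range(8)]  # xmm0 through xmm7


def assign_arguments(arg_kinds):
    """Map each argument ('int' or 'float') to a register or a stack slot.

    Simplifying assumption: every argument fits in one register, and
    overflow arguments are numbered stack[0], stack[1], ... in order.
    """
    locations = []
    ints_used = floats_used = stack_used = 0
    for kind in arg_kinds:
        if kind == "int" and ints_used < len(INT_REGS):
            locations.append(INT_REGS[ints_used])
            ints_used += 1
        elif kind == "float" and floats_used < len(FP_REGS):
            locations.append(FP_REGS[floats_used])
            floats_used += 1
        elif kind in ("int", "float"):
            locations.append(f"stack[{stack_used}]")
            stack_used += 1
        else:
            raise ValueError(f"unsupported argument kind: {kind!r}")
    return locations


# Seven integer arguments: six go in registers, the seventh spills.
print(assign_arguments(["int"] * 7))
```

Integer and floating-point arguments are counted independently, so a call mixing the two can pass up to fourteen arguments without touching the stack.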
Starting point is 00:52:07 But it was fun to write the code that actually used it, and it shortened the compile time of one of our files from 100 seconds down to 10. So that was good. That was good. That was good. That was really good. Yeah, that was really good.
Starting point is 00:52:21 We were using these recursive variadic templates, which, you know, God bless C++11. It's a beautiful extension to the C++ language. But, oh, it hurt my head to read that thing. Like, reading assembly was easier than reading this. That's saying a lot. Well, after listening to you talk for a while, I'm sure there might be people out there
Starting point is 00:52:43 to whom you're becoming their programming hero because you seem to have a lot of skills and a lot of knowledge. I want to turn that on you and ask as we wrap up here, who's a programmer that you look up to and that you would consider your programming hero? Well, I'm glad you said
Starting point is 00:52:59 look up to, because the word hero is a really heavily loaded term for me. I'm not going to say I have programming heroes. I definitely have people that I admire. A couple of people on my team that I just want to give shout-outs to. Mark Williams, he's been on the project for a very long time. He understands everything about repo authoritative mode in our system and a bunch of the weirdly arcane bits of our system.
Starting point is 00:53:26 When somebody has a question, they go to Mark because Mark knows it top to bottom. He's a really good compiler designer. And he's actually really friendly in his responses. He's very generous with his information. Similarly, Jordan DeLong, I want to call out because this man knows the C++ spec by heart. He probably listens to it on tape every night. And he, like, when I come to him and I say, you know, I'm trying to solve this particular problem
Starting point is 00:53:55 and I need to achieve these two things, but I just don't see how they fit together. He'll just be like, oh, well, here. And he'll scribble something on a piece of paper and he'll hand it to me and say, try something like that. I mean, he'll explain it as well. It's not as though he's just, you know, throwing a piece of paper at me. But, like, he'll actually sketch out an implementation while we're sitting there and say, you could try something like this. This will probably do what you
Starting point is 00:54:18 want. You may have to, you know, check the other thing over there. Um, he smiles a lot. He's a really friendly guy. So I definitely want to call those guys out. Um, heroes, gosh. You know, honestly, anyone who looks at a piece of open source software that they use, that they make their living on, that they, uh, that they care about at all, and says, I want to make this better. I want to give back. I want to do something that's not going to profit me immediately at all. Those are my heroes, man. Like just open source developers in general.
Starting point is 00:54:53 Like I love that there is this community out there. And I had a conversation on the last night of OSCON with a guy I've known for a long time, John Coggeshall. He's very concerned that some of our culture is getting lost. Some of our, like, collectively, our commitment to open source and real open source is getting sort of sucked up by the corporate machine. He actually made a bet with me that night. We were standing out in an intersection in Portland at like two o'clock in the morning shouting at each other.
Starting point is 00:55:28 He made a bet with me. He said, I'll bet you $20. Facebook never actually lets go of the spec and never actually makes it a properly community open source thing. And he emailed me after the spec actually got released on PHP's Git server. He says, all right, I'll owe you $20. That's fast. That's a conversation I think we've kind of had here and there on this show, too,
Starting point is 00:55:52 just this sort of dissent towards corporations and their takeover of open source and what true open source is. We've had to kind of call that. Yeah, you called it corporate source. Right, yeah. We had Chad Whitacre on, who runs Gittip, and he's obviously pretty prolific in that. He's an open company kind of person. We had some deep conversations both on the show and then after the show as well with him on that.
Starting point is 00:56:13 So we've kind of danced around that quite a bit. I think that's just a natural fear when it comes to profit and source code. They just – then you've got things like Bounty Source and people wanting, and there's legitimate reasons for people wanting to raise money to build something. I'm thinking like Tim Caswell.
Starting point is 00:56:33 I don't know if you listened to that show or not, but he did some pretty cool stuff. And he's just really interested in building infrastructure code, not really building products on it. And he's trying to find ways to do that full-time and make it completely open source and i think that's just naturally it's something we want to support but it's it's neat to see the contrast of like corporates taking over and what you call real
Starting point is 00:56:57 open source. I'm curious to know what he meant by that. But one last question we have is a call to arms. It's a call to arms to, like, the PHP spec, you know, whatever you can think of really that you're spending your days on. How can the community wrap themselves around whatever you think is most important? What's some good guidance to the PHP community as it relates to what you're working on? Well, I mean, the first piece of guidance I would give, no matter what project we're talking about, whether it's PHP or anything else, you know, don't feel afraid to get involved in an open source project just because you don't think your coding skills are up to par or because you think that, you know, somebody's not going to like your ideas. You might get yelled at a couple of times because people are kind of jerks. Sometimes, yes. But not everyone's a jerk. Most people aren't jerks all the time.
Starting point is 00:57:51 And you can also pick something that you feel comfortable with. If that means documentation, God, you will get loved for writing documentation. You want to keep people from yelling at you? Write documentation, and they will love you for life. Something like the spec. We knew there were grammatical and spelling mistakes in the spec when we released it. And we're like, you know what? We're okay with that because that's a nice low-hanging fruit that somebody can come along and just say, hey, here's a pull request.
Starting point is 00:58:22 And the next thing you know, you've got somebody who's involved in the project who has this feeling of stakeholdership over it. Even if it's just, I got them to use the right spelling of the word to, you know, that I've done that before. That's something. Yeah. I mean, and the next thing that person's going to do is they're going to actually start writing some real documentation in there.
Starting point is 00:58:43 And then the next thing they're going to do is they're going to fix some little runtime function that is a nice, easy little tweak of code. My first patch to PHP, I should say, by the way, I did not go to school. Well, not to college anyway. I don't have any formal training in any language. I've learned C just kind of by jumping in and trying it out. My first patch to PHP with very little C experience was just to take the log function and give
Starting point is 00:59:12 it a second parameter so you can get logs in an arbitrary base. It was a really easy patch to do. It was a very tiny one. I sent it to the mailing list. They said, this is formatted wrong. Do it again. Oh, okay. And then I reformatted it.
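For a sense of how small such a patch can be, a logarithm in an arbitrary base is just a ratio of natural logs: log_b(x) = ln(x) / ln(b). The following Python sketch shows the idea with an optional second parameter; `log_with_base` is an illustrative name and is not the code from the original PHP patch.

```python
import math


def log_with_base(x: float, base: float = None) -> float:
    """Natural log of x, or log of x in `base` when a base is given."""
    if base is None:
        return math.log(x)
    if base <= 0 or base == 1:
        raise ValueError("base must be positive and not equal to 1")
    # Change-of-base identity: log_b(x) = ln(x) / ln(b)
    return math.log(x) / math.log(base)


print(log_with_base(8, 2))
```

Keeping the base optional means existing one-argument callers behave exactly as before, which is part of what makes a change like this easy to accept.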
Starting point is 00:59:25 I sent it and they said, oh, this looks ugly. Thank you. Here, would you like some karma to commit some more patches in the future? Like that's literally all it takes to get involved in open source. And if you're sitting there and if you're thinking, gosh, I'd like to work on some project, but I'm just not up to it. You're wrong. Just do it. Just do it. Yeah. You're not going to get fired. I think the barriers are even lower now with the way that coding has become social with GitHub. I think back when, you know, in the karma days, it might have been a little different and higher barriers. And now it's even lower barriers. Oh, GitHub has done wonderful things for just bringing everybody out of the woodwork because you can find your project so fast. You can fork it with a single button press. You can make a little branch,
Starting point is 01:00:10 publish it to your own version of it. You don't have to find some place to host your code. It's just right there next to the project. People can even discover your fork of it through the UI. It's fantastic. Love GitHub. Love GitHub. Well, Sara, it definitely has been quite a blast having this chat with you. Thank you so much for taking the time you have taken to step away from what you do at eight in the morning, your time, to have this chat with us. I'm sorry for making you get up maybe a little bit earlier, or at least talking for this long and this excitedly about what you do at eight in the morning. It's just probably not your, maybe it's your norm. I don't know.
Starting point is 01:00:46 But I usually wake up about an hour and a half from now. Okay. So she woke up earlier just to have the conversation. So we really appreciate you taking the time and just your passion for, you know, for open source and even, you know, your hero statement. There was like anybody who commits to open source with a generous heart and just really wants to see it grow and not so much gain profit from it. And I really appreciate you sharing all that you have shared today on the show. And as best you can, keep in touch with us.
Starting point is 01:01:15 We'll do whatever we can to help mention whatever you do in the future. And maybe we can get someone back on the show again. I like the conversation we had there at the end, so I'll ping you via email and see if we can't extend some conversations we had here today. But I do want to mention three of our sponsors, DigitalOcean, CodeShip, and Toptal, for helping make this show possible. They are awesome.
Starting point is 01:01:36 5by5 is awesome. If you don't listen to any other shows on 5by5, go to 5by5.tv right now and check some other shows out. The Changelog is on there at changelog. We broadcast every week live: myself, Jerod, and awesome guests like Sara. So at this time, everyone, let's
Starting point is 01:01:53 say goodbye. Bye. Goodbye. We'll see you next time.
