The Changelog: Software Development, Open Source - The PHP Language Specification (Interview)
Episode Date: November 11, 2014
Adam and Jerod talk with Sara Golemon about her work at Facebook, The PHP Language Specification, and making PHP awesome....
Transcript
Welcome back everyone, this is The Changelog and I'm your host, Adam Stacoviak.
This is episode 129.
Jerod and I talked to Sara Golemon about her awesome work at Facebook and making PHP
fast, awesome, and specced.
This entire conversation is about getting the PHP spec out there,
Facebook leading the way, but more importantly, Sara leading the way on that front.
This show is significantly delayed.
Sara, you're awesome. I'm really sorry. Please accept my apology.
This show is sponsored by DigitalOcean, CodeShip, and Toptal.
We'll tell you a bit more about CodeShip and Toptal later in the show,
but first, our friends at DigitalOcean: simple cloud hosting built for developers.
In 55 seconds, you can have a cloud server with full root access,
and it just doesn't get any easier than that.
Pricing plans start at only $5 a month for half a gig of RAM,
20 gigs of SSD drive space, one CPU, and one terabyte of transfer.
They've got locations all over the world:
New York, San Francisco, Amsterdam, Singapore,
and now their newest location, London.
And you can easily migrate your data between any of those regions,
making sure that your data is always closest to your users.
Use the promo code changelognovember, in all lowercase.
Again, changelognovember, all lowercase, very important, to get a $10 hosting credit
when you sign up. Head to DigitalOcean.com right now to get started.
And now, on to the show.
We're joined today by Sara Golemon.
She is, man, Sara, I'm so impressed with what you're doing.
You work at Facebook, so that's kind of a big deal.
But not only do you work there, you also make Facebook fast,
which I think has been the mantra of Facebook since the beginning: to be fast.
So today we're joined by my managing editor, Jerod Santo,
and also Sara Golemon from Facebook to talk about
some cool stuff happening in the PHP world, specifically the PHP spec that's brand new. So
Sara, welcome to the show.
Thanks for having me.
So I guess the best way to start navigating this conversation might be to tee up the post that you
shared on the PHP mailing list, which was sort of the
announcement. It was kind of at OSCON, and I'm not sure if it's OSCON or OSCON. I kind of
wasn't sure. I've never been there, so I've never heard anybody actually say it until just now. So
is it OSCON or is it OSCON?
You know, I always say OSCON, but that doesn't mean that I'm right.
What do you think, Jerod?
I'm going to go with OSCON.
I think OSCON too.
Okay, so I wish I didn't say that at all then now
because I feel like an idiot for thinking it's OSCON.
Why would it be OS?
Well, now that you actually say it out loud,
it does seem like it should be OSCON.
It's open source con, so that would be, I don't know.
That's what I was thinking.
Who says I should write some OS software?
Yeah, right?
Good point.
That's true.
This is a heated debate.
So this post was on Tuesday, July 22nd, which wasn't too long ago, but long enough ago that a lot of stuff's happened between now and then.
So help us.
And Jerod, I don't want to speak for you, but I know that I'm pretty much a PHP novice.
Like, I've done some stuff with WordPress.
I've never written anything to the extent that Sara has.
So I'm totally a novice in the room just asking questions.
I would consider myself an intermediate.
An intermediate.
Yeah.
Okay.
Not a pro, but I have some experience.
So hold our hands along the way, Sara.
Yes, please do.
Please do.
But tee this up.
What happened?
What does this mean for the PHP community?
Well, I mean, so PHP has been around for like 18 years now.
And just sort of grasp that in your mind for a second.
And in that 18 years, its growth has been completely organic, right?
It's sort of, Rasmus wanted something to display his resume better.
So he put together some scripts to do that. And then that kind of turned into more of a compiled program to turn some HTML with a few little bits of code into
something real. And it's all been organic ever since then. Even when Andi and Zeev got involved
to build PHP 3 with more of a real engine, like you would find in any kind of sensible language,
it was still organic because
they were just trying to scratch their itch. And it's been a whole bunch of itch scratching. And
what you wind up with is what got popularly described as a fractal of bad design.
And, you know, a lot of us kind of take that tongue in cheek because, well, all right,
it might be a fractal of bad design, but it runs most of the internet. So whatever.
Yeah.
But it's done all this without really having a clear picture of itself.
It doesn't know: how do you define what is proper PHP?
All of the really serious languages like C, C++, they have these massive documents that describe what syntax should look like, what's valid grammar,
that sort of thing. And we've been talking about it. I'm sorry, I'm going to say we and us in a lot of different contexts today. I'm going to try and keep track of which context that is.
We, the PHP community, have been talking for a lot of years about how we kind of need to
formalize what the language is. We need to say, all right, these are the behaviors you should
expect from the parser and what a script, a well-written script actually looks like,
as opposed to having two different ways of doing if statements that look completely different or
whatever it happens to be. So it's always been like, yeah, we should do that. We should do that.
We should do that. But who wants to write documentation, right? None of the programmers do. I don't want to write documentation.
So fast forward years and years and years, Facebook's got this HHVM thing that we've built
for running Facebook code very fast, and hopefully other people's PHP code very
fast. And we're thinking, well, what can we do to give back, really?
Because Facebook was built on PHP.
It was built on the public version of PHP.
You know, Zuck sitting in his dorm room
putting together the first Facebook.net or whatever
was just running regular PHP.
Funnily enough, probably some code that I wrote in there.
That's kind of cool.
Yeah, I know.
And then we saw a Bosco figure.
So what can we do to give back and show that we're serious about taking the PHP
language seriously?
You know, we want PHP to be seen as a better language instead of the fractal of bad design.
So we said, well, here's something that not only has the community sort of been asking for this and hoping that they can put together a spec properly, but this will actually help HHVM at the same time because we want to be able to write a parser that is fully compliant with PHP.
But how do we do that if we don't know what PHP is apart from looking at the source code?
So it's not a completely selfless gesture either.
So let's, can we pause there for just a second?
Maybe for those listening, kind of catching up,
real quick mention, what is HHVM?
Oh, of course, I'm sorry.
HHVM stands for HipHop Virtual Machine.
It's basically the third generation of a compiler
that Facebook's been working on to run PHP code.
It's ostensibly PHP syntax compatible. The problem we ran into about five years ago or so at this
point is that our PHP code base is massive and we have a couple of users. So we need to be able to run that PHP as fast as possible.
Changing to another language is possible, but it is obviously a large task.
We have something like 10 to the 7th lines of code.
That's not a small project.
Very big.
Wow.
Yeah, very big.
I remember reading about your choice of Mercurial versus Git, too,
and it was the choice between those two version controls
was also based on how large your code base was
and how many developers you have committing to it on a daily basis too.
Yeah, no.
So our main code base of PHP, I don't touch it often.
I'm mostly touching C++ code,
but sometimes I go and touch the PHP repo.
And if I'm doing the checkout on Git, because we're still
supporting both modes at the moment, I can say git pull and then I'll walk away, you know, go down,
have lunch, check myself in the mirror, come back. A long time is the point.
Yeah.
I do it on Mercurial and I just say hg update and now it's done.
It is blazingly faster.
We might need to earmark that topic just for the listeners sake because I know we covered that on the changelog.
I know it's a big deal anytime Facebook makes choices and it sort of provides this rift for others to follow in the community because of your sheer size and also because of your engineering team and the talent
you have, you know, you obviously tend to have a pretty good opinion, a pretty definitive opinion
that sort of provides this divide to the community. And we covered just quickly your choice of
Mercurial over Git. And I thought it was just enlightening the reasons why you chose it.
Yeah. And there's more reasons than just speed. And I'm not going to go into all those
because that's actually not my area of expertise
and I'll probably get some things wrong.
I do just want to say that I have a lot of love for Git.
I don't want to poop on Git
by saying it's slower than Mercurial in all cases.
It was a decision that Facebook made
because our code base particularly needed speed
to get developer efficiency up.
And developer efficiency is one of our watchwords when it comes to what we want to focus on.
Focus on 10 to the seventh lines of code.
That is just astounding.
Yeah. You know, you have a large app when you consider, you know, reworking the underpinnings less work than actually rewriting it in a separate language.
Well, really.
I mean that's what it comes down to.
What's going to be easier, rewriting in another language or making the language better?
Can you give us maybe a snapshot too of the importance of HHVM to Facebook?
Because I remember reading, and help me piece this together.
This is totally, you know,
off the cuff here, but I remember reading a blog post about it, and I can't remember the names of who was
involved, so you could probably even name them if you'd like to, but it was basically like down to
the wire of getting this done or you'd have to like do something massive to get this just-in-time
virtual machine in place to kind of read PHP code and, from what I can understand, basically
decompile that down to binary or some other way of doing it.
It was like this big deal and it was like down to the minute
and a five-year-long project and finally you had cracked it.
Can you kind of give us a snapshot of that moment?
That might be slightly dramatized for internet effect.
I'm not sure.
Okay, because it would seem dramatic to me.
I will certainly say that when we started building the HipHop project,
which initially, by the way, was not a virtual machine or a just-in-time compiler, it was
actually a PHP-to-C++ transpiler. When we first got that project going, we actually
were sort of hitting the limits of how much blood we could squeeze out of the PHP turnip
for our code base and the number of users we had. We literally could not buy hardware fast
enough to be able to serve up every user that wanted to hit the site. So in that sense,
it was probably a bit of a crunch time. It was a bit of, God, what are we going to do?
Do we need to train everybody to write C++ code
and get this thing running at real speeds?
Are we going to pick up, I don't know,
compiled Python or something like that?
I don't know.
Consider the undertaking when you have that many engineers
working on that much code.
How long is that going to take?
Turned out, the process of transpiling PHP to C++ code
at the very base of it wasn't all that difficult.
I don't want to take anything away from Haiping,
who wrote the first version of HipHop,
but the basics of just doing that bit of transpiling
got us a huge performance win.
I think it was like an 80% win right off the bat
and it came to like a two and a half times win
within like a year or something like that.
That's a huge gain when you can run
two and a half times fewer servers, right?
Absolutely.
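To put rough numbers on that gain, here is a minimal sketch in Python (not anything from Facebook's tooling). It assumes a hypothetical fleet of 1,000 servers and reads the "two and a half times win" as a 2.5x throughput gain per server; both are assumptions for illustration, and `servers_needed` is a made-up helper name.

```python
import math


def servers_needed(baseline_servers: int, speedup: float) -> int:
    """Servers required for the same load after a per-server speedup.

    Assumes load is spread evenly, so the count scales as 1 / speedup.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.ceil(baseline_servers / speedup)


# Hypothetical fleet size, chosen only to make the arithmetic easy to read.
baseline = 1000
after = servers_needed(baseline, 2.5)
print(after, f"servers, {after / baseline:.0%} of the original fleet")
```

Under those assumptions the same traffic fits on 40% of the original hardware, which is the breathing room described next.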
And that just gives you that breathing room to say,
oh, thank God, you know?
That ultimately led to the VM project, because we looked at this transpiler option and we said,
well, this has got a bunch of problems with it. Number one, our developer environment now looks
nothing like a production environment, and it can't, because can you imagine, as a developer, if you make one tiny change to a little PHP file, you then have to recompile all of these millions
of lines of code just to see what difference comes out on your web page? You would run screaming from
that.
Yes. What was the compile time, do you recall?
So, yeah, I can say that number. Sorry, I was trying to decide if I could say that number.
At the time that we switched off of the transpiler onto the VM, I want to say it
took about 20 minutes to build the entire site, but that's not on a single machine. That's actually
on a fleet of machines, because we're using distcc
to do this.
Wow.
I think if you tried to do this on a single machine, it would be like, you know,
a day's process or something like that. It was definitely not something that developers could
do. So developers for a while wound up doing just regular PHP because it's close enough, but
then we started adding functionality to the language,
like generators, for example, which we've had for years, and PHP just got them in version 5.5.
So we had these sort of hacks in place, like HPHPi,
which was slower than regular PHP,
but it worked for development purposes and things like that.
And it was just kind of messy.
It led to some weird inconsistencies between dev and production.
Excuse me.
So that led off the VM project.
And we had a bunch of guys who came from Microsoft at that time.
They've worked on the CLR.
So they've built, you know, just-in-time compilers before.
Recently, in fact.
So they brought a lot of that information to bear,
and I think that kicked off somewhere
around like 2009,
something like that, slightly before we actually
released HipHop to the
world in 2010.
But it didn't really hit the
point of running production code until
January of 2013.
So it took a while to get that one right.
If I can maybe do a callback to our last show too, Jerod.
I want to make a note, I guess, to kind of go from where we are to talking about the PHP spec and what it's actually written in. And it's kind of an aside, but a throwback to our most recent show,
which was just released today, episode 127,
talking about keeping a changelog or the project Keep a Changelog
from Olivier Lacan, which I could not say correctly on the show,
but it's just whatever.
I can't get over it.
Anyways, what you say, though, Sara,
is that the first thing you'll notice is that it's written in Markdown.
And that there's this slight lean towards something called reStructuredText.
And it's something that I have an interface with.
Can you kind of talk a bit about your choice of what to write the spec in?
Well, the original spec was actually written in MS Word. The contractor that we hired to work
on the spec, he's got a lot of spec chops. He's worked on the C spec before. His name is Rex,
and I'm going to butcher his last name, Jash, J-A-S-C-H-E, something like that.
That's how I'd say it, yeah. See, I can't pronounce last names either.
He's worked on specs before, but his tool of choice is MS Word. So God bless him,
let him do what he needs to do. We're not going to put that into any kind of open source collaborative
editing system because that just doesn't work for that. So we had to pick something. We look at
GitHub, we say, oh, okay, Markdown is natively supported by GitHub. It seems like it's probably
expressive enough for
what we need to do. So let's just use that as a starting point and we can switch off after that.
When I made the original announcement at OSCON and released that sort of PDF of the sample chapter,
I asked for people's opinions, you know, what makes sense to you guys? You know, what formats do we want to be editing it in?
And in those responses from the PHP mailing list,
not from internally at Facebook,
there was, of course, some bikeshedding about,
oh, maybe we should go this direction.
Well, this one has this advantage.
That one has that advantage.
Maybe AsciiDoc is the right way to go.
As is pretty typical with those kinds
of forums, there were a lot of answers. Slight lean towards reStructuredText from what I could
see, but nothing really definitive. At the end of the day, the guy who was actually doing the
transformation from Word doc to something sensible, Joel Marcey, who I was hoping was going to be on
this podcast, but he didn't make it. Bummer, Joel. You couldn't make it, man.
I miss you.
Where are you, Joel?
At the end of the day, he had already started migrating things into Markdown and they were looking great.
So I just said, you know what?
Finish the Markdown and we will fix that later.
There's always time for pull requests. And sure enough, one of the first big commits that was done by somebody outside of
Facebook was to take this big monolithic markdown file and split it up into chapters, which was
something I was initially asking Joel for. And he's like, I got so much going on. I can't even
think about that much. So it's great to see the PHP community has been so receptive to this.
Like, I was
worried that there was going to be some sort of like, oh,
Facebook's trying to take over the language by imposing the spec on us.
Right. But it's really just been sort of like, oh gosh, thanks guys.
We were looking for this. Where'd you find it?
So how long has this project been in the making? Is it, I mean,
I know 20 years, the language, the kind of the story we've kind of painted here, but how long has it been on your particular mind to sort of start lifting this up and actually making it happen, even from your perspective or Facebook's?
I want to say that we made the decision that we were going to write a spec and publish one somewhere around last February.
I think we actually started properly working on it,
sorting out Rex's contract, things like that.
I want to say we probably started working on it around March or possibly April.
I can't say for sure.
So just this year, not very long.
So it seems like specs are far more important when you have many implementations. You look at something like JavaScript.
You have all these browser implementers,
and they all need a spec to conform to.
Is HHVM the second major PHP implementation,
or is there a more diverse ecosystem that I'm not aware of?
It depends on what you mean by major.
I consider it the second major implementation,
but a lot of people who have
worked on other implementations would certainly disagree with me. There's implementations like
Phalanger, phc, what's the other one I'm thinking of? HippyVM, which was released very recently and
has spent a lot of time comparing themselves to us. So I'm not going to say they picked their name as a bit of a gesture, but maybe. So there's a number of PHP
implementations out there. I haven't seen a lot of chatter about many of them. Oh,
Roadsend. I forgot to mention them. They're another implementation, but I'm pretty sure
they're gone. So having a spec is definitely important to bringing all of these different implementations
together. But I think that's not the only benefit that we get out of it. Because if you look
actually at PHP itself, it goes through these version cycles. Four to five was a big jump.
Five to seven now is going to be a big jump. By the way, we're skipping six.
Why?
There's history behind six.
I don't think you want me to get in there.
Very much like Perl 6, yes.
This is like in certain hotels, they don't have a 13th floor.
You go from the 12th straight to the 14th.
But come on, those people on the 14th know what floor they're really on.
That's right.
You laugh, but in the discussion about what version to call it, seven was actually highlighted as a
lucky number.
Oh, is it?
Yeah. Humans and our numbers.
No, we were going to build Unicode into the
language for PHP 6, like four years ago or something like that. And the project got really far along, to the point that even books were published about it.
Those of us who worked on the Unicode implementation felt sort of a, you know, a connection to that.
And then the project kind of died because of a number of reasons.
And so there was never a 6.
So a discussion came up about whether they'd pick 6 or 7.
I don't want to belabor it.
Bottom line, we picked 7.
Gosh, what was I talking about before we went off on a tangent?
The spec and the next version, kind of.
Oh, yes.
So the usefulness of the spec is partially to give the PHP project something to make sure that, you know,
we don't break things accidentally along the way.
And we have broken things accidentally on a number of occasions.
Remind me to explain to you why 0x0 plus 2 equals 4 sometimes.
It's also important for some of the revisions we're making to the language right now.
There are two RFCs up
on the PHP list. One for what's called uniform variable syntax. This is to make it sort of
consistent when you say something like $a, square brackets, some subscript, parentheses, some
function call, arrow, some method call, whatever you happen to do, piling these things together.
What's the right evaluation order?
Left to right, right to left, middle outwards, which is actually sort of like what it currently
does and makes no sense.
Unifying that and making it make sense.
Another guy, Nikita Popov, who's been really a big contributor in the PHP circles in the
past few years.
He's working on an abstract syntax tree for PHP, which is also another huge thing.
PHP's compiler doesn't have an AST.
It says, here are my parse expressions coming through.
Let's just compile those straight to bytecode and don't look at the overall program at all.
So he's introducing an AST, which is obviously a big opportunity to screw up the language.
Having, again, a conformance suite and a spec helps make sure that that doesn't happen.
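As an analogy for what an AST gives a compiler, here is a small sketch using Python's own standard `ast` module. It is not PHP's implementation and has nothing to do with the actual AST patch; `fold` is a hypothetical helper name. It shows a tree for a whole expression being inspected and simplified before any code is generated, which a compiler that emits bytecode straight from each parse expression cannot easily do.

```python
import ast
import operator

# Operators this sketch knows how to fold; anything else is left alone.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _is_number(node: ast.expr) -> bool:
    """True for a numeric literal node (bools are deliberately excluded)."""
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    )


def fold(node: ast.expr) -> ast.expr:
    """Recursively replace constant arithmetic subtrees with one constant."""
    if isinstance(node, ast.BinOp):
        left, right = fold(node.left), fold(node.right)
        op = _OPS.get(type(node.op))
        if op is not None and _is_number(left) and _is_number(right):
            return ast.Constant(op(left.value, right.value))
        return ast.BinOp(left=left, op=node.op, right=right)
    return node


tree = ast.parse("1 + 2 * 3", mode="eval")
# The tree records precedence: Add at the root, Mult nested on the right.
print(ast.dump(tree.body))
print(ast.dump(fold(tree.body)))
```

The same tree-walking shape is also where a rewrite can go wrong, which is why a spec and a conformance suite matter when the compiler's internals change.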
All right, let's pause the show for just a minute.
Give a shout out to our sponsor, CodeShip.
CodeShip is a hosted continuous deployment service that just works.
We've been working with CodeShip for quite a while now.
We really, really enjoy not only the product they've built, but the people behind it.
You can easily set up continuous integration for your app today in just a few steps.
And CodeShip has great support for lots of languages, all the test frameworks, as well as notification services.
They easily integrate with everything you can think of, GitHub, Bitbucket.
You can deploy to cloud services like Heroku, AWS, Nodejitsu, Google App Engine,
or even your own service because that's the way you want to do it sometimes too.
Setup only takes three minutes.
It's so quick.
It really is just so quick.
Get started today with their free plan and make sure you use the code
THECHANGELOGPODCAST.
That's really important.
Use THECHANGELOGPODCAST, and when you do that, you're going to get 20% off for three months on any plan you choose.
Head to codeship.io and tell them The Changelog sent you.
Well, let's talk about this backlash that didn't happen. What you maybe perhaps feared is that the community would say, okay, this is Facebook trying to grab a stranglehold around PHP, the language, by introducing the spec.
And I don't necessarily believe that, but could you still speak to those fears perhaps maybe from Facebook's perspective?
And then maybe, like you said, there are all these different "we"s: you represent Facebook a little bit, and then you also
represent just the PHP community, and how you balance those two as well would be interesting.
Well, yeah, I mean, I'll start to speak to the second part of that first, because I've actually
been working on PHP for about the past dozen years or so. So I've got a lot of skin in the game in terms of code contributed to the PHP
source code and involvement with the community. I wrote pretty much the book on writing extensions
for PHP. But at the same time, I'm also working here for Facebook on HHVM largely because of that
PHP work. I'm doing things like writing the actual extension API itself on the
HHVM side. So I have interests on both sides of the fence. And when I come to the list, you know,
it's on the one hand, it's coming with the history of, like, having time and skin in the
game with PHP, but it's also coming in with this, yeah, but she's working on that other PHP thing, and how much of what she's requesting in this RFC or whatever is to improve
HHVM so it can take over the world? I don't think I have to tell you that there
is some degree of sort of distrust about Facebook and Facebook's intentions.
I mean, do any Google search and you'll get plenty of those conspiracy theories.
And some of those come through because we're all people and we want to protect what we see as good. And PHP's open source philosophy, I think, is actually really good.
It's a really open project.
It's got no BDFL. It's got nobody saying, no, this is making this really open source now.
We're making this really friendly to developers out there.
And, hey, here's a spec for it.
You can look at that as, gosh, PHP is seeing a resurgence.
Or you can look at it as, hmm, embrace, extend,
and extinguish. So I have personally gotten some of that kickback on other posts that I've put on
the mailing list, but that did not happen at all here. I think everybody sort of saw the way we released this and the way
that we, you know, tried to make sure that we focused on PHP as the source of truth and said,
how can I fault this? You know, this is just a thing that now belongs to the PHP
community. Like, we, with the Facebook hat on, didn't maintain any control over this. We said, here it
is, public domain license, CC0. We're putting it into PHP's Git repository. So they completely
control the documents. It's completely out of Facebook's hands at this point.
And maybe that's where we can dig in just a quick bit. Cause I know we talk about licensing
on the show here and there, but maybe to catch up why you chose CC0, it's in quotes, no rights reserved. Can you talk about maybe the choice of that license
versus, say, GPL or some other license you may have chosen for other open source that Facebook has out
there? Well, I can only speak to it so much because I didn't specifically pick the CC0 license.
My personal favorite for my projects is BSD license,
because I just like the little bits of attribution.
But it comes down to what your philosophy about this sort of information is.
We're just talking about a document at the end of the day.
We're not even talking about software.
What is going to be most useful to a project like PHP? And like I said, PHP is a
really open project. And for something like PHP, it makes sense to just say, you know what,
here's some information for the world. What do we have to gain by putting a more restrictive
license on it? Very little. You mentioned GPL.
I could see the advantage of wanting to say that if somebody else grabs this and, you know, adds to it and extends it, you know,
we would want to make sure that that's open and visible to everyone.
I personally don't like the GPL license.
Well, I'm not holding you to the fence trying to figure out why you chose
this license. I just wanted to kind of get a snapshot, because mostly from the vantage point of
ill will, right? When somebody does something in the world, you want to, you know, depending
upon the person, obviously, you want to say that person has goodwill for me, or that entity, or
that organization.
So your reputation does precede you, in a way, that you've done a lot for
open source.
And I just want to make sure that you have a chance here clearly to
say,
we chose this license for this reason,
for the reasons it's open,
it's, you know, not ours,
it's the community's, and that kind of thing.
So I didn't want to
bang on that too far,
but get the point across.
Yeah. I mean, the only thing I could say about that is just like, that's the beautiful thing about CC0.
It's literally no strings attached, you know?
Yeah.
And it's a simple license.
It's about three lines.
You don't need a law degree to understand a license like that.
So maybe this is just a left-field question, but it seems kind of an obvious one
to me. You know, it's just a document, you just said that. It's not like it's code, it's not like
it's changing PHP really. But what does this spec, what does having it written out, fleshed out,
open source, CC0 license attached to it,
What does that do?
How do you expect or desire for the community to change
because of this document now being there to specify how PHP should be?
It's interesting you use the phrase, it's not changing the language,
because as it turns out, it actually is.
One of the first payoffs that we've seen from this is as people are looking through the document, a lot of pull requests coming through for simple things like grammar fixes and things
like that, whatever. A few bugs have come up. One of them that I just worked on the other day
noted that the spec says switch statements may only have one default block, which I think
we can all agree makes sense.
And this user had noticed at some point in his code that he wrote a switch statement
with two default blocks.
And it caused a weird bug for him because he couldn't understand why that first default
block wasn't getting executed.
And so he filed a bug report and he said, this doesn't match. PHP
allows multiple default statements. And when you have multiple, it'll execute the last one,
which I think we can all agree is a bit clowny. So what should we do with that? Should we fix
the spec to say multiple are allowed because that's what PHP does? Well, no, we shouldn't
actually, because that's really silly code. And I put it exactly that way to the list.
I said, this is silly behavior that PHP supports probably by accident.
Let's fix the language so it matches the spec.
So that's what we're doing.
And that's the benefit of having that spec.
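The mismatch described here can be sketched as a tiny conformance check. The snippet below is Python, not PHP, and models a switch statement simply as a list of labels. `runtime_default` mimics the last-default-wins behavior reported in the bug, and `check_spec` applies the spec's one-default rule. Both function names and the list model are assumptions for illustration only.

```python
from typing import List, Optional

DEFAULT = "default"  # marker for a default label in this toy model


def runtime_default(labels: List[str]) -> Optional[int]:
    """Index of the default block that would run: the last one, if any."""
    indexes = [i for i, label in enumerate(labels) if label == DEFAULT]
    return indexes[-1] if indexes else None


def check_spec(labels: List[str]) -> None:
    """Reject a switch with more than one default, as the spec requires."""
    if labels.count(DEFAULT) > 1:
        raise SyntaxError("switch statements may only contain one default block")


labels = ["case 1", DEFAULT, "case 2", DEFAULT]
# The earlier default (index 1) is silently skipped by the runtime model.
print("runtime picks default at index", runtime_default(labels))
try:
    check_spec(labels)
except SyntaxError as err:
    print("rejected:", err)
```

Running both checks over the same input is the point: wherever the two disagree, either the spec or the language has to change.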
You've got a lot of eyes looking at this,
and you've got that lived experience of these developers out in the wild
who are saying, that doesn't jibe with what I know.
So Facebook has another language that
they're very interested in, their very own Hack language, which I think they announced,
was it this year? I think it was 2014. It's a few months back. I think it was in April.
Yeah, April-ish.
We know HHVM compiles Hack and PHP. How does Hack
fit into this landscape with Facebook? Obviously
it's not going to affect the PHP spec, or will it?
So,
Hack, we are writing
a second spec, actually. Rex is
already busy back at work writing
a spec for Hack.
He's got a second Word document open, huh?
A second Word document, yes.
Command-N, or what's that?
Control-N. Never mind.
When that's done, we're most likely going to publish that as well.
Of course, that will be under the Facebook namespace on GitHub
or possibly the HHVM namespace, I'm not sure.
Because it does make sense for us to own that document, at least for now.
Hack is sort of, you could describe it as its own language, but I think if you know any PHP, you can look at a Hack document and immediately understand what it does, because it's really more like PHP++, which, for those of you keeping track of PHP's rules, if you post-increment the string PHP, would turn out PHQ.
Try and pronounce that in your head.
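The PHQ aside refers to PHP's Perl-style increment on alphanumeric strings. As a rough illustration only, here is a minimal Python sketch of that rule, assuming a non-empty string made solely of ASCII letters and digits. The function name `increment_string` is hypothetical, and the sketch ignores PHP's separate handling of numeric strings.

```python
def increment_string(s: str) -> str:
    """Perl-style increment of a non-empty ASCII alphanumeric string.

    Assumption: this mirrors the carry rule only; numeric strings,
    which PHP treats as numbers, are not special-cased here.
    """
    if not s or not all(c.isascii() and c.isalnum() for c in s):
        raise ValueError("sketch only handles non-empty ASCII alphanumeric strings")

    chars = list(s)
    i = len(chars) - 1
    while i >= 0:
        c = chars[i]
        if c == "z":
            chars[i] = "a"      # wrap and carry to the left
        elif c == "Z":
            chars[i] = "A"      # wrap and carry to the left
        elif c == "9":
            chars[i] = "0"      # wrap and carry to the left
        else:
            chars[i] = chr(ord(c) + 1)  # no carry needed, done
            return "".join(chars)
        i -= 1

    # Every position wrapped: grow the string by one character whose
    # kind matches the original leftmost character.
    first = s[0]
    prefix = "1" if first.isdigit() else ("a" if first.islower() else "A")
    return prefix + "".join(chars)


print(increment_string("PHP"))
```

The last character P simply becomes Q, which is where the joke comes from. The carry branch only matters for strings ending in z, Z, or 9.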
I'll leave that to you. So Hack is, as I said, PHP++. It's got a different open tag. It drops a whole
bunch of the clownier bits of PHP, the things that we look at and we say, why is that
even in the language? And it can do so safely because obviously if you're writing Hack code,
this is not something that was written in 1989 and still needs to function.
Sorry, I meant 1999.
89 is a bit too far back.
It also adds a number of things that we noticed, sort of developing our own code base,
would have been really nice to have, and we're not really sure why PHP didn't add them.
I know why, but that's another story.
Things like scalar type hinting. PHP only allows type hinting for arrays and objects, so we add type hinting for everything. We even go beyond that with parameterized type hinting. The sort
of workhorse of PHP is the array, which can be a vector or a map or a set or whatever. We actually
define these specifically as a vector, a map, a set, a pair,
whatever else. So you can define more specialized structures that can behave more sensibly under
the hood. If I've got a vector of int, that should literally be in memory int, int, int, int, int,
in a nice tight packed array. So there's a performance gain to be had there, but there's
also a readability gain to be had. You don't have to look at $foo as an array and wonder what kind of array that is.
You can look at $foo as a vector of int and know exactly what you're dealing with.
That helps the static analysis type checker, and it also helps you, as a human, understand what the code's doing.
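As a loose analogy only, and not Hack itself, the difference between an anything-goes array and a vector of int can be sketched in Python with the standard `array` module, which stores fixed-width integers contiguously. The helper `try_append` is a hypothetical name for illustration. Note that Hack's checker would flag the mistake statically, whereas this sketch only catches it at runtime.

```python
from array import array

# An ordinary list accepts anything, so a reader cannot tell what it holds.
loose = [1, "two", 3.0]

# array("q", ...) holds only signed 64-bit integers, packed back to back.
packed = array("q", [1, 2, 3])


def try_append(vec: array, value) -> bool:
    """Append value if the container accepts its type; report success."""
    try:
        vec.append(value)
        return True
    except TypeError:
        # Non-integer values are rejected by the typed container.
        return False


print(try_append(packed, 4))       # an int fits
print(try_append(packed, "five"))  # a string does not
# The packed buffer is exactly itemsize bytes per element, with no gaps.
print(len(packed.tobytes()) == packed.itemsize * len(packed))
```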
So, I mentioned static analysis type checker.
That's sort of the workhorse of Hack.
This is an extra program that runs in the background on a developer workstation.
And it reads all of your code base constantly, watches for updates on the file system.
And it looks at all of the code paths for data moving through your system.
So it says, okay, this is coming from $_REQUEST.
Obviously, it's a string because that's what comes from the user.
It's going into this function, so this function apparently accepts strings. Does it accept other types elsewhere? No? Okay, we'll say this function accepts strings. It's going
from there into some function elsewhere, and it goes down to other paths, it gets concatenated,
whatever else. If you've got any sort of type error in that system, it's going to let you know that, hey, you should probably check this bit of code over here.
We've converted 98 percent or something like that of our code base, of our 10 to the seventh lines of code, to using Hack by running
a program that automatically goes through and makes all those changes. So now when somebody
works on Facebook code, they see this code that's fully type annotated, has all these parameterized expressions to let them know what's moving through.
And we have a lot fewer problems of people saying, oh, I would refactor my little helper
class that surely nobody else is using.
And then finding out that the site breaks because somebody was passing the wrong kind
of data and it happened to work before.
So, you know, there's an old saying,
a servant can't serve two masters.
It seems like PHP, or rather Facebook, has this new,
maybe not a master, but maybe a new toy.
And you said that 98% of your code base is now over onto it,
it being a subset or maybe a superset of PHP.
Is it a superset? Is that fair to say?
Well, it's both a sub and a superset.
Yeah, it's sort of a side set.
I got you.
It's in a Venn diagram or something.
Right, right.
So just your personal opinion, where do you see Facebook's interest lie long term?
At the same time, Facebook is investing into an open source public domain
PHP spec. So it seems like they have interest in both things. Where do you see that moving
into the future? So there's a few pieces of that answer. So as you say, you can't serve
two masters. And that's a very fair statement on it. How much attention are we really paying to
the regular PHP side of things? Well,
a language is more than just its syntax, right? It's also the whole runtime that comes behind it.
And PHP has a massive runtime library. Those are completely shared in common. So, you know,
we're obviously taking good care of those in common. The other half of that is a lot of the
extra features that go into Hack are actually just development time features.
They're not necessarily used in the runtime.
Some pieces of them are, but not all of them.
So what works for Hack works equally well for PHP.
We want to make sure that we still pass the conformance suite
and we're still behaving the way PHP expects.
But we can work on Hack without losing sight of PHP,
modulo those sort of missing things.
Gotcha.
We kind of rely on external users to tell us when we're doing PHP wrong at this point,
because we are all Hack. But we do have, you know, tens of thousands of tests that run on every single diff.
So hopefully we're finding most of those things ourselves.
And what was the other half of your question?
I've already paged out.
So did I.
Oh, I think it's, I think the, the kind of maybe the, the leave behind on that one might
be just that you've got kind of these two parallels you're running.
And to some, it seems like maybe it's a competitor.
And to some, they can clearly see what you just described there, which was this sort of parallel effort.
And it's sort of like sugar on top instead of like a competitor and a squashing.
Well, I mean, Hack is not meant to be a completely new language.
It's meant to be something that can live alongside PHP.
And in fact, in most cases, it kind of has to.
One of the things Hack doesn't let you do is have any top-level code.
Well, your entry point can't actually launch without top-level code.
So there has to be a PHP file in there somewhere. And it's about giving the developer the opportunity to use as much
or as little of that functionality as they want to.
And one other thing I think that's kind of neat about Hack is just
the hacker, hack culture that Facebook has propped up
and just how awesome it is, in a sense, to
say that you get not only to do some really awesome stuff for developers
worldwide, but you also get to come up with a language that's kind of named after your mantra,
which to me just, like, completes the world, you know?
Yeah, at the end of the day, that's pretty much...
So the name of the language, that's another story.
In a lot of our opinions, even internally, it's a horrible name for a language.
Because how do you Google that, right?
Yeah, I was thinking.
Well, that's great.
Now the NSA is watching me because I've talked about hacking something.
They're already watching, so...
Well, they're watching us, certainly.
Oh, God, somebody's going to read something into that.
No, I did not mean anything by that.
Tell us more.
I already started.
Just kidding.
I just tweeted that out.
Anytime you, and I think this natural addition of Lang
after whatever it is, so Foo Lang, Hack Lang, PHP Lang,
that makes sense.
You've got Sass Lang, you know, all these other different Ruby Langs.
So the addition of Lang kind of helps maybe keep the NSA at bay.
Well, I mean, it certainly is the same problem that Go ran into.
How generic is the word go, right?
Right.
Yeah. It's a movie. It's a drug. It's a verb. It's a game.
Whoa, there's a drug called go?
Yeah, I think. I don't know. I'm not up on the kids these days.
Y'all know they're going to read into that.
Yes, we're definitely catching Echelon's attention at this point.
Well, Sara, the one other thing I wanted
to mention, and you kind of did it a little tiny bit, and I think I have to give you a little bit
of applause, because you seem to be pretty humble about it, or maybe it's the fact that we didn't allow you to give yourself a proper intro at the front of the show.
But I think it's awesome that you've written this really awesome book, Extending and Embedding PHP.
You've been involved in the PHP community for a very long time.
So you definitely have the battle scars to prove you are where you are
for a reason. And obviously Facebook saw something in you because they hired you to
work on making it fast, which is pretty much what everybody wants Facebook to be, right?
It's what everybody wants all their sites to be.
Yes. Yeah, that's a true statement just as well. I think you mentioned a couple of tangential conversations we could probably have.
I'm not sure if you want to bring them out or maybe take a minute or two just to touch on a couple of them.
You're welcome to, but I think you mentioned uniform variable syntax and a couple of others.
So feel free to riff for a minute or so.
I'm not sure how much more I can say about uniform variable syntax as an example, because, I mean, that was just an RFC put forward as: guys, we're doing this kind of clowny.
How can we fix this without breaking all the code out there?
Which is really what the consternation on that particular subject has come down to.
You know, people are expecting their expressions to work a certain way because they've always worked a certain way.
They may even be muttering about it and saying, why do I have to put extra parentheses or
why do I have to do weird things for this language that doesn't understand order of
precedence?
At the same time, that code exists.
And if we just introduced that in, like, 5.7 or something like that, there would be
uproar because stuff would break.
Not my stuff.
I put ridiculous numbers of parentheses and braces everywhere.
I've been told off for using too many parentheses, in fact.
But, you know, there are warts on the language, and everyone on the PHP internals
list knows what those warts are because we get pelted with them on a regular basis.
PHP is a fractal of bad design.
It's a double-claw hammer.
It's a silly language, whatever it happens to be.
It tends to get a bad rap, honestly.
I mean, especially as, I dare to say it like this, more modern ways or more modern things, just meaning that they're newer.
A lot of things are happening in the JavaScript space with Node and just with all
sorts of other areas. Ruby is around 10 years.
I think it just turned 10, or just turned 15 or so now.
Well, Rails is growing up, and Rails has turned 10. That's what it was.
You know, so people kind of cling to these new things,
but PHP has been around for quite a while.
And it tends to kind of get this bad rap because it's been around for so long,
and people almost look down upon it in some ways. That's why I really thought it would
be important to have you on the show, just to talk about the spec, its importance, and what you've been
doing for the language and the community itself, and then also how that ties into Facebook's
approach to making itself fast, HHVM, and everything else we've talked about. So,
kind of neat there. There's a couple of others. Do you want to mention the abstract syntax tree
or the other two that you've mentioned
that were side conversations?
Yeah, I mean, I sort of touched on both of them already.
But yeah, the Abstract Syntax Tree
is something, like I said, Nikita's working on.
This used to never matter to me
when I was working on a regular PHP engine,
because I'd look at the compiler and I'd say, well, you know, it gets the job done.
It probably makes it faster not to have this intermediate representation.
It's fine.
We can just compile an expression.
Here's a ternary statement.
Okay.
Make, emit the outputs for a ternary statement.
Why do you need an extra abstract representation?
And then I started working on HHVM and I saw people who really understood how to write compilers. And I saw the way this abstract
syntax tree got used in the process of compilation. I'm like, oh, that's why that makes a lot of
sense. We can do a lot more optimization. We can do a lot fewer hacks to make these expressions work. We can make things just function right without being inscrutable.
And I look back at the Zend Engine and I say, you know, there are some parts in here that are kind of messed up.
And the AST is going to help us fix that.
It's not going to be anything visible to end users.
Nobody's going to know what's gone in.
But it's going to make our life as PHP engine developers a lot simpler.
All right, let's pause the show for a minute, give a shout out to a sponsor.
We've been working with Toptal for a very long time.
These guys are super awesome.
And I kind of wanted to take a moment and pause this for a bit.
And rather than just give you an ad about what they're doing and what they're about, I wanted to tell you a personal story. And part of that
personal story is telling you a little bit about my day job. So beyond just the Changelog and what
we do here, I have a full-time job at a nonprofit called Pure Charity. And we have a Rails stack. And
earlier this year, we had some developers leave the company,
and we had a big push coming for the summertime for a new feature we were working on.
And it hit me that we should call upon our awesome friends at Toptal.
And just to give you a snapshot, Toptal is a matchmaking service for
really awesome developer opportunities and developers to get started. So we had a need
for some really great Ruby and Rails developers, and Toptal helped us find developers that fit not
only our budget, but also our culture, our coding style, all sorts
of things.
And long story short, they basically perform magic, because these people we work with, I'm
going to give a shout-out to them real quick if you don't mind, Guillerme, Andre, and Rafael,
all listeners of the Changelog too, by the way.
These guys are phenomenal.
Good people, good coders, and just great all around.
And I have to say thanks to Toptal because they made this possible. And if you've been
thinking about freelancing, if you're thinking about trying out a new technology, or you want
some flexibility in your work-life balance, Toptal is a great place to be an elite engineer.
Go to toptal.com slash developer to get started and tell them that the Changelog sent you.
And this might be completely out of left field here, but you also wrote libssh2.
Do you want to touch on that real quick before we start to
tail off the call? Yeah, I mean, that's really nothing particularly PHP related, except in that at the
time I was working on a lot of streams work in PHP. Streams are sort of this abstraction layer
we have underneath all the fopen, fread, fwrite, those sort of calls, so that you can work with
different sorts of resources. So you can do something like fopen an HTTP URL, and that'll talk HTTP to the remote server,
and you can fread off of that remote network resource.
And it's great.
I thought, gosh, how cool would that be if I could do that with, like, SFTP files.
Sorry, it's been a while since I even touched this.
SFTP files or SCP resources, or just even be able to SSH into a system and send a command to it.
You know, how cool would that be? Well, I looked at OpenSSH and said, can I actually, you know, pull a library out of this? Oh God. Oh God, no. Oh God, look away. OpenSSH is a lovely piece of software,
but it's also got a very interesting code base. So I ended up just going to the IETF and I said,
where are the RFCs for Secure Shell? Oh, here they are. Let's start implementing a transport. Let's
start implementing a few channels. Let's start implementing this. Next thing I know, I've got this entire, you know,
client-side library for connecting to SSH servers so that I can shove it into PHP and then promptly
not use it, because while it's cool, it's actually not that, you know, practically useful for anything
that I'm working on.
It was just sort of, I was working for the university at the time.
And the thing about working for public institutions is that you have very relaxed goals and extra time on your hands,
which is actually how I ended up getting involved in PHP in the first place.
It sure seems like you enjoy diving in deep and getting into the nitty gritty.
Is that fair to say?
Well, I like understanding what I'm working with.
I will search Google for how-tos and documentations with the best of them.
But if I'm really going to do something with something, I really want to understand how it works underneath.
On the HHVM project right now, my main job is to make PHP a good open source project,
which really means I don't have to look at much of the code at all, theoretically.
I can work on the build system, some of the runtime library APIs, things like that,
but I don't need to get down into the JIT and start issuing machine code instructions
to do what I need to do for my job.
But, gosh, I'd actually like to understand how that stuff actually operates, wouldn't I?
So I have.
So I've got commits down in there.
And I now, for no further use in my life probably, have the ABIs for Intel x64 architectures and ARMv8.
I know that the first six integer arguments of a function call go to RDI, RSI, RDX, RCX, R8,
and R9. The first eight SIMD registers go into XMM0 through XMM7, and then everything else goes
on the stack. Will I use that again?
Probably not.
But it was fun to write the code that actually used it,
and it shortened the compile time of one of our files
from 100 seconds down to 10.
So that was good.
That was good.
That was good.
That was really good.
Yeah, that was really good.
We were using these recursive variadic templates,
which, you know, God bless C++11.
It's a beautiful extension to the C++ language.
But, oh, it hurt my head to read that thing.
Like, reading assembly was easier than reading this.
That's saying a lot.
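The register order recited above matches the System V AMD64 calling convention used on Linux and macOS. As a simplified sketch, assuming only scalar integer and floating-point arguments (no structs, no variadic calls), the assignment can be modeled in Python like this. The names `assign_arguments`, `INT_REGS`, and `FLOAT_REGS` are hypothetical and exist only for this illustration.

```python
# Argument registers in System V AMD64 order.
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FLOAT_REGS = [f"xmm{i}" for i in range(8)]  # xmm0 through xmm7


def assign_arguments(kinds):
    """Map each argument kind ('int' or 'float') to a register or stack slot.

    Simplification: each class of argument consumes its own register
    file in order; anything left over goes to the stack in call order.
    """
    next_int = 0
    next_float = 0
    next_stack = 0
    locations = []
    for kind in kinds:
        if kind not in ("int", "float"):
            raise ValueError(f"unsupported argument kind: {kind!r}")
        if kind == "int" and next_int < len(INT_REGS):
            locations.append(INT_REGS[next_int])
            next_int += 1
        elif kind == "float" and next_float < len(FLOAT_REGS):
            locations.append(FLOAT_REGS[next_float])
            next_float += 1
        else:
            # Registers of this class are exhausted.
            locations.append(f"stack[{next_stack}]")
            next_stack += 1
    return locations


print(assign_arguments(["int", "float", "int"]))
```

Integer and floating-point arguments draw from separate pools, so a mixed signature can use up to fourteen registers before anything spills to the stack.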
Well, after listening to you talk for a while,
I'm sure there might be people out there
to whom you're becoming their programming
hero, because you seem to
have a lot of skills and a lot of knowledge.
I want to turn that on you and ask
as we wrap up here,
who's a programmer that you look up to
and that you would consider your programming hero?
Well, I'm glad you said
look up to, because the word hero is a really
heavily loaded term for me.
I'm not going to say I have programming heroes.
I definitely have people that I admire.
A couple of people on my team that I just want to give shout-outs to.
Mark Williams, he's been on the project for a very long time.
He understands everything about repo authoritative mode in our system and a bunch of the weirdly arcane bits of our system.
When somebody has a question, they go to Mark because Mark knows it top to bottom.
He's a really good compiler designer.
And he's actually really friendly in his responses.
He's very generous with his information.
Similarly, Jordan DeLong, I want to call out because this man knows the C++ spec by heart.
He probably listens to it on tape every night.
And he, like, when I come to him and I say,
you know, I'm trying to solve this particular problem
and I need to achieve these two things,
but I just don't see how they fit together.
He'll just be like, oh, well, here.
And he'll scribble something on a piece of paper
and he'll hand it to me and say, try something like that. I mean, he'll explain it as well. It's not as though he's just,
you know, throwing a piece of paper at me, but he'll actually sketch out an implementation while
we're sitting there and say, you could try something like this. This will probably do what you
want. You may have to, you know, check the other thing over there. He smiles a lot. He's a really friendly guy. So I definitely
want to call those guys out. Heroes, gosh. You know, honestly, anyone who looks at a piece of
open source software that they use, that they make their living on, that they
care about at all, and says, I want to make this better.
I want to give back.
I want to do something that's not going to profit me immediately at all.
Those are my heroes, man.
Like just open source developers in general.
Like I love that there is this community out there.
And I had a conversation on the last night of OSCON
with a guy I've known for a long time, John Coggeshall.
He's very concerned that some of our culture is getting lost.
Some of our, like, collectively, our commitment to open source, and real open source, is getting sort of sucked up by the corporate machine.
He actually made a bet with me that night.
We were standing out in an intersection in Portland
at like 2 o'clock in the morning shouting at each other.
He made a bet with me.
He said, I'll bet you $20.
Facebook never actually lets go of the spec
and never actually makes it a properly community open source thing.
And he emailed me after the spec actually got released on PHP's Git server.
He says, all right, I'll owe you $20.
That's fast.
That's a conversation I think we've kind of had here and there on this show, too,
just this sort of descent towards corporations
and their takeover of open source and what true open source is.
We've had to kind of call that.
Yeah, you called it corporate source.
Right, yeah.
We had Chad Whitacre on, who runs Gittip, and he's obviously pretty prolific in that.
He's an open company kind of person.
We had some deep conversations both on the show and then after the show as well with him on that.
So we've kind of danced around that quite a bit.
I think that's just a natural fear when it comes to profit and source code.
Then you've got things like Bountysource
and people wanting,
and there's legitimate reasons
for people wanting to raise money
to build something.
I'm thinking like Tim Caswell.
I don't know if you listened to that show or not,
but he did some pretty cool stuff.
And he's just really interested
in building infrastructure code,
not really building products on it.
And he's trying to find ways to do that full-time and make it
completely open source, and I think that's just naturally something we want to support. But
it's neat to see the contrast of, like, corporations taking over and what you call real
open source. I'm curious to know what he meant by that. But one last question we have is a call to arms, for the PHP spec or whatever you can think of, really, that you're spending your days on. How can the community wrap themselves around whatever you think is most important? What's some good guidance to the PHP community as it relates to what you're working on?
Well, I mean, the first piece of guidance I would give,
no matter what project we're talking about, whether it's PHP or anything else, you know,
don't feel afraid to get involved in an open source project just because you don't think
your coding skills are up to par or because you think that, you know, somebody's not going to
like your ideas. You might get yelled at a couple of times because people are kind of jerks.
Sometimes, yes.
But not everyone's a jerk.
Most people aren't jerks all the time.
And you can also pick something that you feel comfortable with.
If that means documentation, God, you will get loved for writing documentation.
You want to keep people from yelling at you?
Write documentation, and they will love you for life.
Something like the spec.
We knew there were grammatical and spelling mistakes in the spec when we released it.
And we're like, you know what?
We're okay with that because that's a nice low-hanging fruit that somebody can come along and just say, hey, here's a pull request.
And the next thing you know, you've got somebody who's involved in the project who has this
feeling of stakeholdership over it.
Even if it's just, I got them to use the right spelling of the word to, you know, that I've
done that before.
That's something.
Yeah.
I mean, and the next thing that person's going to do is they're going to actually start writing
some real documentation in there.
And then the next thing they're going to do is they're going to fix some little runtime function
that is a nice, easy little tweak of code.
My first patch to PHP, I should say, by the way,
I did not go to school.
Well, not to college anyway.
I don't have any formal training in any language.
I've learned C just kind of by jumping in and trying it out.
My first patch to PHP with very little C experience was just to take the log function and give
it a second parameter so you can get, get logs in an arbitrary base.
It was a really easy patch to do.
It was a very tiny one.
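A logarithm in an arbitrary base follows from the change-of-base identity, log_b(x) = ln(x) / ln(b). The sketch below is a Python illustration of that identity only; it is not the original C patch, and the function name `log_with_base` is hypothetical.

```python
import math


def log_with_base(x: float, base: float = math.e) -> float:
    """Logarithm of x in the given base; natural log when base is omitted."""
    if x <= 0:
        raise ValueError("x must be positive")
    if base <= 0 or base == 1:
        raise ValueError("base must be positive and not equal to 1")
    # Change of base: log_b(x) = ln(x) / ln(b).
    return math.log(x) / math.log(base)


print(log_with_base(8, 2))
print(log_with_base(1000, 10))
```

Defaulting the second parameter keeps existing one-argument callers working unchanged, which is presumably part of why a change like this is easy to accept.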
I sent it to the mailing list.
They said, this is formatted wrong.
Do it again.
Oh, okay.
And then I reformatted it.
I sent it and they said, oh, this looks ugly. Thank you. Here, would you like some karma to
commit some more patches in the future? Like that's literally all it takes to get involved
in open source. And if you're sitting there and if you're thinking, gosh, I'd like to work on some
project, but I'm just not up to it. You're wrong. Just do it. Just do it. Yeah.
You're not going to get fired. I think the barriers are even lower now with the way that
coding has become social with GitHub. I think back when, you know, in the karma days, it might
have been a little different and higher barriers. And now it's even lower barriers. Oh, GitHub has
done wonderful things for just bringing everybody out of the woodwork because you can find your project so fast. You can fork it with a single button press. You can make a little branch,
publish it to your own version of it. You don't have to find some place to host your code. It's
just right there next to the project. People can even discover your fork of it through the UI.
It's fantastic. Love GitHub. Love GitHub. Well, Sara, it definitely has been quite a blast
having this chat with you. Thank you so much for taking the time you have taken to step away from
what you do at eight in the morning, your time to have this chat with us. I'm sorry for making you
get up maybe a little bit earlier, or at least talking for this long and this excitedly about
what you do at eight in the morning. It's just probably not your, maybe it's your norm.
I don't know.
But I usually wake up about an hour and a half from now.
Okay.
So she woke up earlier just to have the conversation.
So we really appreciate you taking the time and just your passion for, you know, for open
source and even, you know, your hero statement.
That was, like, anybody who commits to open source with a generous heart and just really wants to see it grow and not so much gain profit from it.
And I really appreciate you sharing all that you have shared today on the show.
And as best you can, keep in touch with us.
We'll do whatever we can to help mention whatever you do in the future.
And maybe we can get someone back on the show again.
I like the conversation we had there at the end, so I'll ping you via email and see if we can't
extend some conversations we had here today.
But I do want to mention three of our sponsors,
DigitalOcean, CodeShip, and Toptal,
for helping make this show possible.
They are awesome.
5by5 is awesome.
If you don't listen to any other shows on 5by5,
go to 5by5.tv right now and check some other shows out.
The Changelog is on there at changelog.
We broadcast every week live: myself, Jerod, and
awesome guests like Sara. So at this
time, everyone, let's
say goodbye.
Bye.
Goodbye. We'll see you next time.