The Changelog: Software Development, Open Source - Burnout, open source, Datasette (Interview)
Episode Date: May 9, 2018
Adam is on location at ZEIT Day talking with Jessica Rose about burnout, Henry Zhu about his passions and pursuit of open source, and Simon Willison about data and his passion for interesting datasets... in the world.
Transcript
Bandwidth for Changelog is provided by Fastly. Learn more at fastly.com. We move fast and fix
things here at Changelog because of Rollbar. Check them out at rollbar.com and we're hosted
on Linode servers. Head to linode.com slash changelog. This episode is brought to you by
Rollbar. Move fast and fix things. Resolve errors in minutes and deploy with confidence.
Head to rollbar.com slash changelog,
request a demo, and get started today. It's loved by developers, trusted by enterprises, and most of all,
we use it here at Changelog. Move fast and fix things with Rollbar. Once again, rollbar.com slash
changelog.
All right, welcome back, everybody.
This is the ChangeLog, a podcast featuring the hackers, leaders, and innovators of open source.
I'm Adam Stachowiak, editor-in-chief of ChangeLog.
On today's episode, I'm on location at ZEIT Day 2018 in San Francisco, talking with Jessica Rose about burnout, Henry Zhu about his passions and pursuit of open source.
And finally, Simon Willison about data, data sets and his passion for interesting data sets in the world.
I personally connected pretty deeply with your talk because I almost said podcast because it's on the brain.
No worries.
But because I think everybody's experienced some level of burnout, whether they admit it or not, you know, and I, you know, I kind of empathize with you because you said you're getting older.
And I'm also realizing that I'm not immortal and I am getting older.
So I realize that, you know, things start to hurt or it's harder to wake up and be excited, even though I run the thing.
Like I'm in control of what I'm doing, aside from not really being in control.
Yeah, it is one of those things where as we're getting a little bit older and I think part of that isn't just getting older,
but I think it's getting either the space within the industry or sort of the backbone ourselves to say, oh, no, wait, I am tired. Oh,
no, wait, this is hurting me. So the folks I see really impacted by burnout and swept off their
feet are the youngins who feel like they're invincible. Like I said, oh, no, I'm going to
be fine. I can work these 80 hour weeks. Weren't you there though? I was there at one point in my
life. I was the young one who was invincible. I've been old for forever. And I can remember pulling all-nighters often.
I remember having insomnia and not being able to sleep just because
and still getting up and doing the work.
I can't say I was the best at it all the time,
but I remember days where I felt young and invincible.
And I think that's a really important point.
When you look at doing meaningful work, your ability to do meaningful work ends after six or seven, eight, if you want to go
by, like, standard workday hours. But, like, that ends a lot earlier, and people keep working.
Yeah. And I saw a really interesting, someone mentioned, someone said something really valuable
to me. I think they got it off the internet where they say, if you're doing, if you're, if you're going at 150%, you're really just taking out a loan from your future self.
Yeah.
Can you talk about your very visual, so it's probably difficult for a podcast,
but this visual process of memory load.
Can you just remind me what the term was for it?
So I'm really into pop psychology, and I'm really into pop psychology from the 60s because I'm a very specific kind of nerd.
And cognitive psychology was sort of a branch of psych.
Someone is going to hear this and fuss at me for what I got wrong.
No, I'm so I'm delighted for it.
I'm pretty, pretty laid back about being wrong.
Cognitive psychology was a period where it was a specific branch,
a study of psychology, and folks started thinking about
and talking about the brain's organic processes
and psychological processes in computer terms.
Right.
So when you talk about working memory from cognitive psychology...
Working memory, that's right.
It's exactly what you think it is.
It's the amount of memory, it's the amount of space you have available for mental processes.
Whereas cognitive load is the amount of working memory you have booked up.
It's how many mental processes you're doing and how much of that cognitive load they eat.
So I often like to ask people to visualize having too many browser tabs open.
Like that's eating up too much of your browser's working memory.
Right, right.
Last time I actually had a bout of this.
I opened up Slack with the intention of going to a particular person's private conversation
to pull back essentially what email is.
You know, very similar.
Like, I'm going into Slack to get some information.
I know they share with me.
I've got to do a couple scrolls.
I knew I had to do that. But before I knew it was five minutes
later, something else grabbed my attention. I'm in Slack and I'm like, what am I doing?
I didn't even do the thing I was supposed to be doing. So I was pretty tired last night. I
flew in from Houston to San Francisco. I went to bed early when I normally don't go to bed very early, maybe 10, 11, 12 or
whatever. But I mean, 12 being going to bed early... if my wife's listening to this, she knows
I'm lying, because it's more like one or two. So there I am. I'm stretching a little bit.
I'm just trying to, like, not embarrass myself. It's one of those things, isn't it? Where you're
like, do as I say, not as I do. To be like, oh, yeah, take care of yourselves, kids. But I'll stay up
all night and worry.
My wife's like, do you have to work tonight?
I'm like, I don't have to, but I probably could do one or two things to make tomorrow easier.
And it only makes it, it probably makes it a little easier, but it still probably makes it just as hard.
I love how listeners can't possibly catch this.
But as soon as you said that, you made clearly the face your wife would have made, which is just a gentle eye roll.
Yes, yes.
And she realizes that I try my best to do balance. It's tough. It's not always easy. Um, but I can feel like I teeter
on burnout and I kind of say it's okay in seasons. Have you ever heard this term?
I have not heard this except being burnt out for a season. And then, cause I know I got a lot going
on, say the first quarter of the year, the half a year, or some kind of commitment where I say it's okay for this length of time.
But then once I get there, I'm starting to do what you said in your talk, just starting to say no to
things, cutting things out in positive ways that don't hurt you. So that's... I'm making it about
me. I want to get back to you and parts of your talk. But something you said in your talk
was like, and I think what a good message might be to share is like, it's okay to be overloaded, but to be
realizing it and then say, don't just like eject all the things. Right. And then screw your life
up, you know, pull out things that you said it in such a way, basically that it seemed like don't
hurt yourself in the process of like saying no to things and removing things that are bad for you to do
and taking up too much of your time in your life?
Well, even when you're critically burned out,
there are some things that you have to do to sort of save your future self.
So for me, when stuff got really bad,
internet grocery shopping is absolutely my savior.
Buying a bunch of quote-unquote healthy frozen burritos,
getting an accountant to make sure
I was paying all the things that needed to be paid.
And even outsourcing some stuff,
like having some folks help with cleaning and laundry,
which is a massive privilege
and I feel so lucky to have been able to do.
But just either getting other people to help
or me myself making sure that the stuff I'm dropping
isn't gonna make my life much more difficult in two months.
But I think a lot of our dear listeners may not have sort of the funds or the flexibility to do it.
The advice I would still give is when everything seems overwhelming, I've got to do this and the other thing and that thing's due.
Really triage what's going to doom you.
Well, everything feels like doom when you're overloaded,
but what's going to make your life more difficult if you don't solve it,
you probably don't have to call your aunt back. You probably do have to pay your tax bill. Like,
yeah, just triage where you can. Yeah. There's a book by Brian Tracy that may not be the best
prescription for this, but I think it's called
eat that frog or eat the frog or something like that. Have you heard of that? And essentially,
you know, it's not a great to-do list type of thing, but essentially do the
thing you have to do that day first. You know, it doesn't mean, you know, literally eat a
frog. It just means that if you've got this one big thing to do today,
and today's a success because you did that thing, do that thing. Don't wait two or three days to do
that thing and call your aunt back or whatever. Instead of taking my approach where it's like,
oh, I'm really dreading this. I'll just put it over in the corner. Do the dread thing first.
That's his basic advice. I'm not sure if it's the best advice. It's difficult to do that. But
I think what he means is to say is just don't put the things off that matter most too far.
Do them pretty quickly.
You shared stories, too, about people may not realize this, but right now it sounds like you may be in a case of burnout.
Not so much from our conversation, but from your talk.
No, I was really critically burned out seven, eight months ago.
Okay.
And I got really lucky.
Being able to successfully work through burnout is really rare.
But I'm working over at FutureLearn now.
We're like an ed tech platform.
And it's the most reasonable place I think I've ever worked.
So I'm managing a team of really brilliant engineers.
I go home right at five and I don't answer emails, and I don't do Slack.
And I get to fuss at my team, too, to go home right at 5 or at 6.
And not do – yeah, the only thing I ever think I fussed folks out for is, like, I saw you were emailing on a weekend.
Yes, I like that.
We've been doing that.
I'm not the best at it personally.
However, I do like it when my team
is that way because it lets me know they care about balanced life. And if I don't ever want
to make them feel like they have to do something that's for us outside of like normal hours. It's
just like, I just don't want you to feel that way. But an important part of leading in that way is
making sure you don't do stuff that they say. I know I'm kind of a bad example. I'm working on
it. So that's part of burnout too, is like, you may not be able to get back to like perfect you
in this burnout stage to like get back to some, you know, equal balance, you know, balancing.
I think it takes time. I'm a big fan of iteration. I realized that today's delivery may not be
perfect, but it's good enough. It's enough to put out there and get a feedback loop going on to
learn on how to better course correct for the next iteration. To me, that's a pretty profound thing.
So what does burnout feel like? What do you think, what would you describe burnout as? So
if someone's listening to this and they're thinking, kind of sounds like it might be burnout.
What do you think burnout is? What was it to you?
Fantastically, there's a ton of research around this that says burnout is effectively
indistinguishable from clinical depression.
Yay, it's just great.
That's pretty common though, right?
Being depressed.
Or clinical depression.
Is it different?
Clinical depression is just diagnosable depression, I believe.
So it's really challenging because the symptoms of occupational burnout both mirror depression and other mental illness issues and can trigger them. So it's like a one-two whammy of misery. If you are listening and you're
concerned, like, wow, maybe I'm burned out, the Mayo Clinic has a really fantastic checklist,
which you can search for online and go through it. A checklist. I'm assuming if I took that
checklist right now, they'd probably say yes. If you were in my talk, if you said yes to a couple of things, I was, I was side-eyeing.
I wasn't, uh, I didn't want to say yes to any of those things, but I couldn't help but doing so.
And, you know, there's times when I get up, I love what I do. And it's a shame because I do
really enjoy what I do, but not every day. I don't always enjoy it the first day in the morning. And I was, I was, uh, really listening closely when you said, you know, you have this, this bug
or this, this thing you're dealing with, you know, with, with a code problem or whatever.
And, oh, by the way, I've got to take care of laundry. I've got to help with this or like,
that's what happens to me sometimes is that I heard myself in your talk today, basically.
And it's a lot of like everybody's brain works really, really differently.
And like neurodiversity is one of the most exciting things about the way people engage with the world around them.
And like mental process and social process are so cool.
I think wedding self-care into the way we view our own mental processes and approach the world is absolutely
critical. What is that? What do you mean by that? So for me, when I was burned out and coming
through recovery, one of the hardest things was you aren't running on all cylinders cognitively.
So I had very, very serious memory issues, which is quite common. I was consistently despairing
and miserable, and I had a really hard time
seeing success in my own work. So I'd do something, I'd do a measurably good job and
they'd be like, well, I guess that's okay. And for me, if you've heard me wandering around the
halls today, you've probably heard me sort of morosely go, everything is fine.
To yourself?
To other people as well, where they're like, oh, this
thing of, like, everything. So I had to kind of, not quite home CBT, but, like, work to build new pathways
where it's like, this is terrible and everything's bad, but, like, you know, it's probably fine.
It reminds me of something that happened on my trip here. I had to drop off my car in parking and take the, I guess it's some sort of
bus over to the departures. And I was in a rush. She was only supposed to take C, but she decided
to pick up A and B as well. And then everybody started piling on. And then I had to scoot over
and share half of my seat with somebody. Long story short, it was just like, she thought I was
complaining, but I wasn't. But I liked her response to her thinking I was like being an abrasive person. I may have been, but I
don't think I was trying. I definitely wasn't trying to be. But she says, it's a beautiful day.
And I was like, you just defused... I'm not a bomb, but you just defused a bomb. You know, because like
you can say that thing, you say it to yourself. It's not like you say it to other people too, but it's that one thing you can say that's like, you know,
everything's fine. It's a beautiful day. It's just, you know, it's just something you could
say to sort of like take away the stress and crazy, the chaos that might be coming.
What do you think? I was going to say, I've got something that sounds very nice and placid,
but actually means piss right off. And I wasn't supposed to be doing that. Sure. You can do that. Cause sometime. Yeah. One of my favorite things is, oh, well,
perhaps you know best, which does mean like, um, please... She's mouthing something.
We try to remove the explicit tag for our podcast for our young hacker fans out there,
but she's, uh... I think that, yeah, if I'm ever in a
situation with somebody being terribly unpleasant, be like, oh, well, I suppose you know best.
Folks know what you mean, I hope. Yeah. The one thing I would say about burnout, or because burnout
so closely mirrors depression is that if anybody listening to this thinks, oh man, wow, maybe that
is an issue for me. Yeah. One thing I'd encourage people to do almost immediately, but immediately is seek
some kind of healthcare. So go and talk to your doctor, talk to them about how you're feeling
just because it could be burnout. It could be depression. It could be one of a dozen physical
or mental health issues. Just going and making sure that you're not listening to a podcast and going, oh, well, Jess said, just balance my... No, no, go see your
doctor, please. Right, right. This is not medical advice. This is only two people talking about
sharing war stories of burnout. Jess, you said that in your talk, you mentioned that your doctor
had actually written you a prescription to quit your job. Can you talk about that?
It was very jokingly. So that wasn't real?
No, it was... It did happen. She did. Okay. Can you share the story? So I went in to talk to my GP and
I've been working as a contracting consultant for, for a while. Um, and I went into chat with her
because I was having a really hard time sleeping and I was lethargic. I, I was just pretty miserable
and my, my GP, my general doctor, is just absolutely glorious.
She's very, very sarcastic and absolutely delightful.
She's like, okay, okay, okay.
Well, I can give you this antidepressant, this antidepressant, or this antidepressant,
but you should probably just quit your job and we can skip all of those.
I was like, are you telling me that's your medical advice?
She was like, yeah, yeah, look.
And then wrote me out, grabbed a prescription pad and just wrote, please quit your job.
Please quit your job.
Well, I mean, technically she did write on paper that was her, you know, her medical pad.
I don't know what you call those things, prescription pads?
It's a legit thing, right?
It was just a bit of stationery.
I don't think it was a proper prescription.
Oh, it wasn't?
Yeah.
Okay, cool.
I was taking it a little further, but.
No, you want it to be like a, a sticky
pad where they tear off at the top. No, not, not quite so romantic. Okay. But so did you
quit your job? Uh, it was a contract. So I got to the end of my contract and then yeah,
I got into something that was less stressful, which isn't fair to the contract. It was a
completely reasonable job and completely reasonable people,
but had 60, 70% travel and early stage startups, you know?
Yeah.
Just unhealthy for you.
Not so much unhealthy in general, like to other, somebody else who could do the role.
Just not the best for you.
Developer relations folks talk about there being, like, an 18, 20... you can do it maybe two years,
and I think at the time I'd been doing it four or five. You know, there's actually several
people I know that have been in, like, head of something or dev evangelist or relations, whatever
name you want to apply to that role. Essentially telling people about products to encourage you
to use them through developer means, whether it's how to use something or a demonstration or meetups or whatever.
There's a lot of people I've seen do that job that eventually do for sure burn out.
Oh, yeah.
We just drop off the map for a year and a half, two years, go to Thailand.
Folks do tend to take a chunk of time off.
Well, it's fun to do.
You may go into it thinking, won't be me.
You know, I can conquer this job.
I can travel the world and not get burnt out.
I could put in 15-hour days, seven days a week or whatever the time constraint is.
What's a normal day-to-day?
Like is it 10-hour days, 15-hour days?
It really depends.
So I was doing a lot.
I'm such a ham.
I was doing a lot of travel and speaking.
So there was, I think I was in maybe 50 countries in two years.
I was doing maybe two or three conference talks a week.
My husband switched to being a house husband just because I would come home and need somebody to swap out laundry and run my life admin.
I don't necessarily...
I love DevRel.
I absolutely recommend doing it. I don't necessarily recommend that level of travel.
I absolutely recommend
house husbands though, they're just glorious.
Yeah, is that right?
Everyone deserves one
Well, how do you get one?
I suppose you marry
someone who
is more caring than job focused.
Okay.
Yeah.
Okay.
Interesting.
Part of good advice.
That's always good.
Like what's on the horizon for you?
Where are you going?
What's next for you?
What's some good advice to give back to people?
How can we tell this out?
Oh dear, that's so many questions.
So for me, I'm doing a lot less traveling and speaking.
And instead I've been focusing on stuff that's scalable and I can do from my flat.
So I run a podcast myself, the Pursuit Podcast.
Which you've got to scroll down a bit because there's also a lot of church podcasts with similar names.
The Pursuit Podcast.
Yeah, I guess so, right?
So if you're searching for it, you may find a bunch.
But if you go to...
It's the blue one.
What's the website?
If you go to pursuit.tech, you'll find us there.
Nice. .tech. It's a good domain. That is pursuit.tech. If you want to fuss at me directly,
I'm at jessica.tech. Always very on brand. Okay. What is the podcast about? So we tend to focus on
a different topic each month. And we just talk to folks in that space and about their experiences
and about what kind of advice
they could give. So we talked about wearables this past month. And then next month we're going to be
talking next. Oh dear. That's like tomorrow, isn't it? In May, we'll be talking about different types
of fundings. We'll be talking to somebody about bootstrapping, talking to somebody about working
with VCs. Yeah, I'm pretty excited for it. When you read the summary of it, what's the promise of it?
Oh, man.
I wrote that like two years ago.
So I think it's your guide to getting the things you wanted.
Oh, dear.
Oh, no.
It says some stuff.
It says some stuff.
Okay.
Well, listen to Jess's podcast and learn some stuff.
Yeah, there we go.
Pursuit podcast.
Is it pursuitpodcast.net?
Or sorry, .tech. .tech. .tech. Or pursuitpod on Twitter. Pursuitpod on Twitter. Okay, easy enough.
And parting advice. Parting advice is, oh, absolutely fall in love with saying no.
The way I got so burned out was, oh yeah, I would love to come to your conference. I'd love to help
with this project. Of course I'll work overtime. If you learn to say no judiciously earlier, you can make your life a
lot easier. Saying no is the toughest two-letter word in the English language. It's so tough. But
I can agree with you that saying no has its fruits because I've said no to a lot of stuff
and have seen the rewards of that. So it's definitely a good thing for it.
And once you get used to it, like being a little bit burned out, I've fallen in love with it.
Are you familiar with Derek Sivers by any chance?
I'm not.
So Derek Sivers wrote several blog posts.
He started CD Baby.
He's got a really interesting life.
He's a real interesting person.
Let's just say that.
And he has this way of saying it: when he makes a decision to do something, for him, it's either heck yes or no.
It's like he's really excited about it or it's definitely no.
Because if he's not really excited about it, he doesn't have time.
Almost like in your talk where you don't have enough hex for something.
It's like that for him.
It's like, I don't have enough to do this all in,
like I do things, so then I said no.
That makes quite a bit of sense.
That's his way.
I kind of like that one.
I might politely disagree a little bit
because some of the stuff I've found most rewarding
is stuff I never thought I should do or that I'd enjoy.
I can kind of agree with that too
because there's been times I've said yes to things I'm like,
ain't going to work out.
It's not going to be fun.
It worked out and it was fun.
It was great, yeah.
You know, so.
Kind of the night out principle.
When you're going for a night out with your friends,
the nights where you're like, yeah, this is going to be great,
are often duds.
And the times where you're like, oh, I'm not sure I want to go out.
Best time of your life.
Best time of your life.
I can agree with that.
This episode is brought to you by Linode, our cloud server of choice.
It's so easy to get started.
Head to linode.com slash changelog.
Pick a plan, pick a distro, and pick a location, and in minutes, deploy your Linode cloud server.
They have drool-worthy hardware, native SSD cloud storage, 40 gigabit network, Intel E5 processors, simple, easy control panel, 99.9% uptime guarantee.
We are never down.
24-7 customer support, 10 data centers, 3 regions, anywhere in the world they got you covered.
Head to linode.com slash changelog to get $20 in hosting credit.
That's 4 months free.
Once again, linode.com slash changelog.
So, Henry, you're in the middle of an experiment,
which I think is, to some might seem risky, but you've got it kind of fleshed out.
You've been down this road.
You've been doing open source for a while.
You've even said you accidentally became a maintainer at one point, I believe, either on the React podcast or request for somewhere.
I've heard it.
I'm not really sure where.
Where's this leading you to?
Like you're in open source full time now, but solo.
Not for like a company's not paying you to do it.
You're doing it yourself.
Right.
The reason why I left, part of it was that I don't think a company would pay me to do the things
I would want to explore and experiment with.
Mostly they're going to pay you to, I guess,
work on features or just the things that they need
for their business.
And if you talk about like, you know, how do we build community or bring in contributors,
they want that, but they're not going to pay you specifically to do that.
Or what does it mean to do mentorship when it's an intern,
but they want to hire them to work on other things, not open source.
And all these companies don't want to take ownership of open source or Babel for me.
And it's like, where can I actually do that?
And I feel like the only way to do that is doing it on my own then.
I mean, it makes sense, right?
A business has to have some sort of value from the exchange.
At some point, you'll be your own business.
You may have to have similar considerations of how you spend your time, right?
Is it valuable?
Does it help me meet?
Or you may actually have to make a choice of being able to earn money
or serve open source. What do you think about that?
Right, because the things I want to do aren't things that people would normally pay for.
So it's like either I have to make money doing freelance, which I don't want to do,
or get enough money from doing maybe consulting or support contract
or workshops for Babel such
that I can do the other work. And this is why I made the Patreon because then people can support
me for whatever I'm doing. They want to support the person rather than the project. But I have
found that like for most companies, they only want to support the project, right? Because it makes,
maybe it's just better for them from sort of
like a business point of view or just talking about to their boss. They don't want to support
a person, but individuals would rather support a person because they understand who you are
and you're, you know, you're literally supporting their livelihood. And I think that's more like
something that they want to like contribute to rather than a project where it's like,
it could be a lot of people.
We don't know how you're spending it.
You know, what are they, what are they doing with this money?
And learning to figure out how to spend the money for a project is really difficult.
Like, what does that conversation look like, and stuff.
So to not assume that the listening audience is familiar with your story, give me a quick
version of like, not so much you're getting started, but like this transition, you worked at Behance, you decided to make the shift, kind of give us
that and we'll go further from there. So first I got my job at Behance because of open source. Like
I got involved in the project and then they emailed me, Hey, do you want to move to New York
on this team? And so I was already wanting to do open source more. So I was like, if they found me
through it, then I think that means they care about it. And so I was kind of to do open source more. So I was like, if they found me through it,
then I think that means they care about it.
And so I was kind of doing it,
but I was working on the product team
because Behance, Adobe, they're mostly focused on that.
And so it didn't really make sense for them
to like have someone working on it,
like just for open source.
But, you know, halfway, like, I guess,
not halfway through, like a year later,
I'm working on Babel in my free time. We use it at work. Why are we... like, we shouldn't wait for me to work on it
if we need something. So, like, well, why don't we, like, let me work on it at work then? And so
I think I asked for full time and they gave me half, which was already amazing, right?
And so I kind of just was like, oh, that's cool. Okay. I'm good. And
I kind of just started working on that. But the problem was that that's always going to be hard.
Like, what does it mean to do 50%? Is it like half of my sprints are Babel and the other half is...
How did that work out, actually, practically? Did it work out? Were you still... the same amount
of responsibility, it's just like, oh, now half of it really is this, but it's not?
Yeah, I think it's kind of like that.
It was an experiment for them, too.
Yeah.
Because they've never done that with other people.
I mean, kudos to them, too, for trying.
I mean, you can't knock, I mean, obviously there's no hard feelings, but it's just like, that's risky for a business to like, that's brand new for them, even.
I think it's just that a lot of the people there have, like, previous experience in culture and open source and another product like jQuery and other projects, and so they understand what that's like.
And they see where my passions are and they want to support that,
because I'm still working on other things, and the things I'm doing only help us too. So
they were able to justify it. It's just that in the end, I would have this,
my own guilt of not feeling good
about doing open source,
even though it is my job.
And like, you know,
we have deadlines.
And so it's like,
I would always tend to,
like I wanted the open source,
but I would be doing my other work.
And it's just like a weird situation.
And, you know,
everyone was really supportive.
It's just like mentally,
I don't know if I could take, like, doing it and then doing this other thing. And so in the end I was like, I think I need to just
figure out, do I actually want to do this? I had lots of conversations with my boss about,
you know, what is it? Do I actually want... do I just say I want to do open source because it sounds
like a good idea, like, full-time? And, you know, the reality is a lot of people don't want to do
it full-time, because it should be just like something that's fine, and you shouldn't have to worry
about your whole day, like, every day, and dealing with community and all these other things
which are not just purely coding. But then I realized that's the part that I like. I like
working on those non-technical parts. Anyway, they were totally on board when I said, oh, I want to leave.
When I made the decision, they knew I was sure
and that they knew I wasn't just making it up or something.
They're like, okay.
My boss was like, if anyone can do it, I think you can do it.
Were you concerned in those moments?
I can just imagine me, so I'm not you obviously.
I would be nervous.
I would be like, gosh, man, what are they going to think?
Are they going to like, you know,
is the world going to think I'm an idiot for doing this?
Like, is this a wrong move?
Like, take us through some of the, you know, like the morning of.
Like, I know today I'm going to give my resignation.
How's it going to go?
So I don't think I ever thought people would think it's dumb or bad.
I actually never thought that. It was just more like, what's going to happen to me? And like, am I just running
away? Or, not running away, I'm seeking after something, and taking the risky
road instead of just having a safe, comfortable, you know, job and working on it half-time. Like,
maybe I should just be okay with working on it half-time, which is already better than a lot of people.
But I guess in the end,
I was so, I felt so convicted to want to do it that much
that I decided that was the right thing to do.
And when I went there,
I felt pretty confident in going in and doing it.
And in the end, I didn't really have any issues
when I got to that point.
But before, it was a lot of just not even wanting to think about it in the first place.
Because you're like, that's such a dream, or that's such an impossible thing,
that I never just thought about it seriously. And so trying to do that, you know, thinking about the
taxes and insurance and all that, was just to try to make it seem more realistic.
Not in terms of whether it's easy or hard, just that those are things I need to think about if I'm going to do this.
Maybe I'm just missing it.
But what's a day like in your life?
Not so much Henry's life, you know, like get up, brush your teeth kind of thing.
But like, you know, what is being in open source to you?
You said you get excited about community, the things that they weren't willing to pay for.
What are the intangibles that are not so much code level as human level? What's that for you?
Yeah, I think so. I kind of talked about this in my talk. It's something I
always wanted to do, but I never really did it. Like, I would kind of half do it, because
it's like you have a picture in your mind of what a maintainer should be, and usually it's about code.
And so even if I wanted to do something else,
I knew like,
Oh no,
I'm in like this,
I have like this box where it's like,
Oh,
this is what a maintainer should be.
So I was just focused on code,
but now it's like,
okay,
I don't really know what it looks like,
but I want to figure out those things.
And so that means, like,
right now I'm only just focusing on getting funding and that kind of thing.
But I do want to spend my time like, okay, should we make a Babel meetup where, instead of giving talks, it's just a meetup where we come and I'm there
and I'll teach you about how to contribute to open source or get involved in our project?
Instead of waiting for people to randomly show up in our GitHub making a PR or an issue,
when obviously everyone's very intimidated.
So if I'm there, then people will feel at least willing to show up and see, like, what's this about? You know, maybe we
make a podcast, maybe we do live streaming, or we start doing more video chat instead of just
talking on Slack or Twitter. Just things that make it so that I seem more accessible and human instead of just some person, and
also so that people know that there's a few people working on this, not a huge company
and a huge team or something like that. It's a few volunteers, and yet you're using it.
And then people feel like, oh, I'm not good enough. But I got started in the same way, not knowing anything, right, accidentally, and I'm here.
And I want to help other people get to that if they want to.
So in a lot of ways, you're the on-ramp to Babel
slash future open source work.
Yeah, I think so.
And that on-ramp is defined by either face-to-face meetups, workshops,
intro to open source, first contribution to Babel or XYZ project,
or here's how you use GitHub, or you name it.
It's hard because then if you're doing the one-on-one
or that kind of thing, it doesn't really scale.
But I think maybe we get caught up in this idea
of just wanting to tell everyone about open source
and everyone should do it, everyone should be a maintainer.
But I don't think it means that people can't. It's just that I don't know if they
know what they're getting into. And so we want to make sure that they, it's not like a prepare,
it's just like,
In a lot of cases, you're warning signs.
Yeah.
Not in a bad way, but it takes, you have to be committed. It's not that the road's bad or hard. It's just a matter of, like, you gotta be committed to do that kind of job.
Yeah.
And,
and I think it's also fine if like people want to do those kind of,
I guess we call them, like, drive-by PRs or whatever.
Like people like,
maybe you just don't have contributors.
Yeah.
Yeah.
You don't,
maybe you don't have time and you don't have,
or you can't do it at work or you have kids or your family,
like all these other priorities,
you still might want to do open source, and there should be a way for you to
contribute, like maybe once a month, or once, or whatever. And that's fine. It's just
that if you go about it in a certain way, you might find that you're taking on too much.
You don't realize, and you're like, oh wow, why did I suddenly, you know, make my own project and
now millions of people are using it?
How do you think Babel will benefit from you now being essentially, are you, is there anybody
else full-time on Babel? Is there anyone full-time on Babel at all for the project?
No. The other main contributor, Logan, he isn't full-time, but he left his job
a while ago, and he's my other partner in doing all this, and everyone else is a volunteer. So
we don't have any full-time people. And I mean, it's weird. I say
full-time, but it's full-time for the project, not full-time for the code.
Right now I'm basically not coding anything. I'm just trying to raise awareness or get
funding, which are like sales, I guess, and all these other things.
Business development.
Yeah, business development. That's still part of it. It's just not what people normally think.
And I'm mostly just doing reviewing and very meta-level things that I hope are long-term things that we would never have considered before.
Like before, when you're at work or you have limited time, you're always thinking about
putting out fires or these short-term things.
But it's like if you're full time, well, then why aren't we doing these longer term goals
like getting contributors, investing in people so that maybe some of them will be committed
instead of just asking 100 people to make a good first PR.
So you decided to go with the Patreon route.
And I know I'm not trying to get into politics on the different routes you could go to seek funding.
So I'm not asking you to share the advantages of this platform versus another.
But what is it about, not so much Patreon, but how are you incentivizing people and or corporations or companies to support you?
Who is the customer that you're seeking?
If you're doing business development, you're seeking out who can fund you, who can believe in this mission, who can help sustain you to make it happen.
Like, who is that person and/or company?
I should probably have a better, more specific group of people,
but there's a lot of different people that it could be. So first I'll say that we have
an Open Collective, which is our other crowdfunding site, and that's for the project itself. So I can take
money from there which is good because i'm not making enough from Patreon right now. The reason why I picked Patreon is because it's a way for certain people to be able to invest in a
person rather than a project. I think there's going to be people that would just rather donate
to a person. Other people would rather donate to a company. Maybe companies would rather donate to
the Open Collective or Babel itself. In terms of incentives, I don't really want to go out of my way to make
incentives, and I think that's true for anyone that does Patreon, because it becomes a job in itself.
Yeah.
Of, like, getting people to, like, and then the Kickstarter where you say, I'll just give you a
bunch of free stuff.
And, well, just give me the thing. Yeah, I don't want the swag. I just want the thing.
Yeah, and you're like, am I going to ship people stickers and stuff? That's a lot of
work that you have to do. So with all of my incentives, I've kind of tried to make it funny, actually. So I have
one incentive that's like 11 bucks a month, and it's the ping pong tier, because in ping
pong you play to 11. And so it's like, if you donate to this, I'll play you in ping pong. It
doesn't mean I'm not going to play ping pong with you otherwise. It's just a reason for you to pick this.
Yeah, pick that.
I did one for video games like Mario Kart, like I had a 50cc tier. So I was just trying to get creative. You don't really get
anything per se. It's just, if you want to donate that much, you can. And I did one for
board games, just the things I'm interested in. Maybe they're interested in that too. So those
are just individual people. And I guess it's going to be difficult to get that, because it's more of an
awareness thing, right? I feel like once people know, they might decide to do it if they know me.
But for the higher tiers, you know, I might have like a hundred, five hundred, or a thousand dollars
a month, and that is basically emulating what we did in Open Collective,
where it's more like,
we'll put your,
your logo on our website.
It's more of like an advertising thing.
And then that is kind of,
it's not risky.
It's just,
I'm going to have to keep pitching because once someone's like,
Oh,
we're not getting enough out of this logo for a thousand dollars,
we don't need you anymore.
And then we're not going to donate.
So what happened was, a few days ago I had like three thousand a month, and then suddenly two people
dropped out. They both were giving a thousand for this next month, so now I'm already down to
like a thousand. And that's because, you know, these two companies, they're probably not that big. They're
not like Google. It's just like two people or something.
The goodwill, or charity, almost.
Yeah. And then maybe they want to support me for a month, or they have their own money issues. So
I don't expect them to do that, but it does suck when you're living off
of this number where, oh, one day it's all gone. And then you're like, okay, now I've got to
think about that.
The ups and downs of, you know, I don't want to say crowdfunded, I couldn't think
of any other way to say it, but donation-based living. You have
that number. You look at your own Patreon, you see what's coming in, or what you think is going to
come in, or what's expected, and you probably plan life around that. It should also be said that you
live in New York City, right? It's not exactly cheap to live there. Even if
you trimmed your expenses, you'd be like, what size is your apartment? No, I don't want to
say get a smaller one, because maybe you're in the smallest one already. I don't know.
Yeah, it's not enough to cover the rent, and then I have to pay for insurance too. So even the base of that is
already very high. But the good thing is that our Open Collective has a good amount of money there now, so I can at least use that.
And then the other plan is to reach out to companies, which I've done, and hopefully those go through soon. One idea is a support contract, which is something that Webpack's doing, where you pay a certain amount, say a thousand dollars a month, and then we'll
give you two hours of time to help you with whatever. And that means I can work a lot
fewer hours and still get paid. And if you can convince them to do that,
that would be really good, and you only have to do that for like 10 companies. But the
problem is, we have none. It feels like there's a huge potential.
But right now there's none.
So it's like, where is that?
So I would say after hearing that, I would say maybe you could treat the human side of this equation for you.
That these companies are also humans, right?
Like companies aren't just companies.
There's humans behind that.
Users of Babel.
Potentially even contributors of Babel, right?
Potentially, who knows?
And that maybe this is a growth hack that you're using to, one, keep in touch with your constituents, right?
And also potentially get some, for lack of better terms, sales.
Yeah, I think it's definitely a good way in.
And it doesn't have to be two hours.
It could be more and they can just pay more.
I can have a lot of other ideas around
what if they want to be contributors?
Well, you can pay me to help you get into open source.
Although I'll probably still have the meetup for free.
But then for companies, it's like, well, we can charge them.
Come to a workshop. How to be open source.
Before, I was like, what kind of workshop would I do? And I realized there's a few I could do. One would
be teaching people about JavaScript itself. I think it's pretty easy to market: the maintainer of
Babel teaches you JavaScript. And then there's how to get involved in open source and maintaining
open source, if someone wants to do that for their own open source. And then the last thing is,
how do you contribute to Babel itself? Some companies would be interested in talking about that.
I mean, I think most would just be like, oh, can you give a talk at our company?
Yeah. I mean, well, you just put it into numbers for me, so that's what made me think about that, which is, if you can
say, well, I need, for lack of better terms, to have 10 companies committed to doing two hours because it
equates to this number, you know,
cause then you can start to determine it's going to be there and start to live like it's
going to be there and have some certainty that you don't have now.
I think, well, how do you reverse that is, is, you know, have some relationships like
what I've learned from our business and you probably know this already, but you know,
the biggest part that makes us successful is that we care deeply about
not just our listeners, but also our sponsors. We really call them partners, but the
industry-accepted term is sponsor. People who work with us
that sponsor our content are our partners. We work with them. We form a great relationship
and it's only beneficial if we add value to them.
Like sponsoring our podcast isn't charity.
They get value and it's up to us to help them understand how they get value and help them to get value, not just read an ad, you know?
So in your case, maybe the relationship side of the equation is that you invest
a little in those people to get them to invest in you a little.
Yeah.
That's just one way though. That's just one of the things I'm just thinking about.
That's interesting though, man. So the uncertainty, do you sleep well? I mean,
given, I'm not saying you're in a bad situation because you're not, but just,
you know, there's some uncertainty. Do you fret? Do you get upset? Are you, how's life for you right now?
It's interesting.
I guess I'm not, I don't know, oddly, I'm not like too worried.
I mean, in the end, I do have, the backup is just I get another job and it's fine.
And that's totally.
It's an experiment.
Yeah.
And it's just that I guess I'm willing to like take it as far as I can.
And I don't really want to think of that as an option at all, really.
Do you have what they call runway?
Are you like a mini Henry Zhu startup?
You got runway?
I have some savings, so I think I'm good.
But unfortunately, there's not that much rest
in terms of thinking about what's coming up.
And yeah, there's definitely uncertainty.
I don't know.
I guess I'm very optimistic.
This is your twist, though, right?
You chose to do this.
So clearly you can't be that upset about your circumstance because you chose the circumstances.
Right.
Someone didn't force me to do this.
People would probably tell me not to.
And I have to go out of my way to be like, okay, I truly believe this is a thing. And maybe writing the blog post or giving this talk helps me to
form my thoughts better, to think, okay, this is something I want to pursue, and a goal I want,
not just for me, but for other people too.
Okay, other people too. So are you in a position
yet to advocate for others to follow in your footsteps? Or are you still in an experimental stage, to the
point where you're like, let me try this out for a bit and then I'll let you know? I mean, I know that Feross Aboukhadijeh is also
doing it. There's several others. We had a small list of people we actually want to do a panel
show with, and you were one of them, around this topic of, you know, going full-time from a
crowdfunded open source maintainer perspective.
So, actually, I don't think people should do it.
Okay. Really?
Well, for me, I spent a whole year thinking about whether this should
be a thing or not. I didn't want to be rash and just quit my job because I don't want
to do this anymore, and just do my thing. I wanted to really know that this is what I
want before I do it. But I didn't figure it all out before making the decision,
because I felt like then it would take forever.
We had to at some point take a leap.
Yeah, and that's what I did.
And I don't think my goal with doing it
is to convince everyone else to quit their job
and do full-time open source.
That's definitely there.
I just don't see that as that viable at the moment.
It's more like I would rather help people
to be more aware of open source in general
and how we can support people that are doing open source,
whether they're full-time or not.
And probably the most important thing is like,
how do we get companies to either sponsor projects they use
or allow their employees to work on open source at work?
And that's probably the best way
because they're usually unwilling to give money,
but they're willing to let their employees
work on the things that they want
if they're going to ask for it.
But the problem is that a lot of employees don't.
And there's a lot of reasons,
whether it's like being intimidated
or not knowing what to contribute to
or not knowing how to say it.
But I feel like there's a lot of people that would want to do that, especially if they're busy outside of work.
Like, not everyone has the privilege or even wants to work on code outside of work. You're coding all day and
then you go home and you're going to code again? It makes sense that you wouldn't want to do
it.
You may have answered some of this there, but I'm not going to ask you for the five-year plan.
I'm going to ask you to project for me a time range. Like, this is how I kind of work.
What is success, right? A year down the road, a year and a half, what does success look like to
you? Like you've done this, it's successful. You can advocate others to try it. Given some
circumstances, for example, maybe have put in some time, have a relationship with the community,
not just jump ship, of course. But, you know, what do you think success is to you a year, a year and a half down the road, or
just whatever time frame makes sense for you?
Well, one measure of success that I might think of is
the idea that I could just go away and leave the project and everything's good.
And so I could leave it now. I mean, I kind of talked about this in the talk: you have this sense of pride where it's like, if I leave, everything's going to go bad and no
one's going to take it up. But that's not the case. I'm not that irreplaceable. But obviously I still feel
like I'm needed, or whatever that means. And it's like, when do we get to a point where we have
enough contributors, where, you know, I don't want to say the bus factor.
My boss, he said there's the lottery factor.
That's like a better way of putting it.
Yeah.
I haven't heard this one.
Yeah.
It's really cool.
It's like, oh, those people won the lottery.
Now they're not going to work on this anymore because they can do whatever they want.
Okay.
So they didn't die.
They just hit it big.
Yeah.
Okay.
So how do we increase that?
So like people can leave and I don't actually, it doesn't have to be I can leave the project entirely.
It's just more like we all feel comfortable.
Everyone feels comfortable to just take a break.
Not just because mentally they were able to overcome their issues with that,
but that everyone feels good that, oh, there's a lot of people working on this
and everything seems to be going well, that kind of state.
Which maybe, like, I don't know what that looks like.
So it sounds like, if I understand you correctly,
you think success, or you're projecting that success, is Babel being in a state where you can
leave without any concerns of it imploding, for lack of better terms.
Yeah, that's one way to say it. And when I say that, it doesn't mean I don't want to work on it anymore. I still really want to. It's just, I think it's a good attitude to have when we're doing this kind of thing. And I think that is a good level of success, because it's saying that I don't have to be around and it still moves on.
Yeah. And maybe then you won't have anxiety about whatever the issues are. And maybe you can work on other open source.
I think for me, I really like working on Babel, but another thing I like is just thinking about open
source in general, and it doesn't have to be tied to this project. I just happen to really appreciate
what this project does, because it's unique.
That's interesting. I mean, I guess my
perspective so far has been that you being full-time on open source means Babel involvement. It sounds like that's not exactly it.
Not that you have plans, but just that your mind is open to one day Babel not needing you, that you
may have skills and abilities to be put elsewhere, whether it's another project or just
a new thing or whatever.
I mean, for any open source project, it's going to go away eventually.
In Babel's case, it's not really true in the same
way as most, because the funny thing is
that with most projects, maybe it's just
people stop using it. But with Babel, the whole point
is that you're implementing syntax
that goes with JavaScript. So as long as people
still use JavaScript and we have new
syntax, then there's always going to be a need
for Babel for the people that want to use it.
So in that sense, it's kind of like always going to be there.
But people get bored or whatever.
I don't think I'm going to get bored with it.
But the reason why I want to look into other things
is just it will help me think about open source differently.
And maybe it's just I wouldn't leave the project.
It's just like I would maybe look into how other people are doing things
or get involved, and that would help me do it better in what I'm doing.
I always get this
guilt-free kind of perspective when I think about it like bands, right? You know, sometimes a
band might tour with another band, or the lead singer might go to this other band and, you know,
do a crossover. It's never like they're leaving their band, unless they actually do leave their
band. But it gives you the freedom to sort of cross-pollinate. And there's a lot of underappreciated opportunities in cross-pollination.
Yeah, and I think at least I should be trying to look, like, if I have the time, then I
can look into what are the projects that are related to Babel.
It doesn't even have to be, like, a different language or something crazy like that.
It's just like, oh, Webpack is used with Babel, so maybe I should learn more about how they
do things and we can work together, or Vue, or React.
I think that's a good way to kind of naturally
look into other projects.
I had this idea in my mind of what does it look
like to be a JavaScript
ecosystem maintainer? So not just
one project, but a lot of what that is.
I mean, that seems really
burdensome, but
I think that would help
coordinate things or at least talk to people
instead of just, like, we're all in our own little isolated bubble or something.
Yeah. Anything I didn't ask you that you want to share, that I just missed?
I don't know. I guess
my talk today was just about how we can think differently about open source. Not just
getting things for free, but actually helping and serving people.
And I think the values that are there,
maybe we don't really emphasize much,
and we kind of get influenced by how non-open source works.
So whether it's adding transactions or thinking about things in a very robotic way,
when open source has its own views on things.
And we kind of just take those things
and we copy them, because that's just the way we do things. And that's probably why we
keep seeing all this behavior. And we can change the medium in which we do open source such
that it is, you know, building up people, it is bringing community and those kinds of
things. And I think we should be thinking about
that more. I don't really have an answer, and it's going to be really hard, because this is all
non-technical versus the code itself. But yeah, I'd like to think about what are the
habits or things that we can create such that we can enforce, not enforce, but reinforce the ideas that we want,
like in our mind,
the ones we want,
but in reality,
we don't act those out.
This episode is brought to you by our friends at GoCD.
GoCD is an open source continuous delivery server built by ThoughtWorks.
Check them out at GoCD.org or on GitHub at GitHub.com slash GoCD.
GoCD provides continuous delivery out of the box with its built-in pipelines, advanced traceability, and value stream visualization.
With GoCD, you can easily model, orchestrate, and visualize complex workflows from end to end with no problem.
They support Kubernetes and modern infrastructure with elastic on-demand agents and cloud deployments.
To learn more about GoCD, visit gocd.org slash changelog.
It's free to use, and they have professional support and enterprise add-ons available from ThoughtWorks.
Once again, gocd.org slash changelog.
Sitting down talking to Simon Willison here, getting in the groove, talking about data. You are, more than anybody I've ever met, excited about data, bro. Like, earlier, I didn't
tell you this because I was just enjoying the moment, but you got really excited about some data.
Oh yeah. So back in 2009, 2010,
I was working for the Guardian newspaper in London
where it was basically,
I was hanging out with journalists
helping solve data problems.
So it was in the world of data journalism.
The Guardian.
At the Guardian, yeah.
This is like the mecca of like,
you're from the UK, right?
Yeah.
It's a very, very high quality UK newspaper.
So if you're working for a newspaper there
or anything journalistic, the Guardian is a good name to have on your resume.
Yeah, that's definitely true.
Okay.
And so when I joined The Guardian, they had this fascinating onboarding process where they basically set you up on coffee dates with people from all sorts of different departments around the newspaper.
So you have coffee with the sub-editor and then you have coffee with somebody who's involved in the print presses and so on and so forth.
And those people will talk to you and they'll introduce you to other people. And after a
few of these meetings, a bunch of people said, you know, you really need to talk to this guy, Simon
Rogers, one of the journalists. Because Simon Rogers was the journalist at the Guardian
responsible for the infographics. When you publish a graph in the newspaper, somebody has to phone
up a bunch of government agencies and gather the data, and he'd been doing this for years and was really good at it. He could get data on anything. And all
of this data, I asked him where it was, and he said, oh, I've got it in Excel spreadsheets on the computer
under my desk. And I'm like, this is gold. So we got together, we started putting things out. We ended
up thinking, okay, the easiest thing we can do is start a blog. We'll start the Guardian
data blog and we'll publish these things as Google Spreadsheets. And we did this, and the Guardian
still has a data blog today where they're publishing data behind the stories. Wow.
And that was really exciting, except Google Spreadsheets always felt like a bit of a
shortcut to me. I mean, it worked, and all credit to it, that was fine. But I wanted to do something
better. So last year, I was
still thinking about this problem. I've moved on from journalism and had all sorts of other
adventures since then, and I realized that the combination of SQLite as a database format and
ZEIT Now as an immutable hosting platform was actually a really interesting opportunity for
publishing data. Because if you can take any data at all,
you can always wrap it up into a relational database.
They're really good at that.
If you wrap that up into a SQLite database file
and then publish it, you can build APIs on the top,
you can build an interface on the top,
you can start sharing data that way.
So essentially, I've been building the software
that I wish I'd had back in 2010
when I was working at The Guardian.
Really good for databases that don't change.
Exactly.
Meaning like, this is data that's never going to change.
In stone, it's done.
It's facts about the world.
Some of the examples I showed in the talk today,
one of my all-time favorite data sets
is the list of
trees in San Francisco. It's maintained
by the Department of Public Works.
They publish it through their open data portal.
It's basically a CSV file with 190,000 trees.
And each tree, it's got the species and the location
and often when it was planted and who looks after it.
And it's just sat there, this gorgeous data.
So I took that, I turned it into a SQLite database
and I've published that as an API
and then started building things on top of it.
I've got a website, sf-trees.com,
which is a search engine for trees.
So you type cherry and click a button,
and it'll show you all of the cherry trees in San Francisco.
It turns out that's the website I always wanted to build,
and I just never knew until the data was in front of me.
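The workflow described here (take a CSV file, load it into a SQLite database, then search it) can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the code behind the site mentioned above: the column names (`tree_id`, `species`, `address`, `plant_date`) and the three sample rows are invented placeholders, since the transcript does not give the real file's schema.

```python
import csv
import io
import sqlite3

# Invented placeholder rows standing in for the real CSV export.
# The column names are assumptions made for this sketch.
SAMPLE_CSV = """tree_id,species,address,plant_date
1,Cherry Plum,100 Example St,2001-03-14
2,London Plane,200 Example St,
3,Flowering Cherry,300 Example St,1998-11-02
"""


def load_csv_into_sqlite(csv_text, conn):
    """Create a `trees` table and insert every CSV row into it."""
    conn.execute(
        "CREATE TABLE trees (tree_id INTEGER, species TEXT, "
        "address TEXT, plant_date TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders are filled from each row's dictionary keys.
    conn.executemany(
        "INSERT INTO trees VALUES (:tree_id, :species, :address, :plant_date)",
        rows,
    )
    conn.commit()


def search_species(conn, term):
    """Substring search on species; LIKE ignores ASCII case by default."""
    cursor = conn.execute(
        "SELECT tree_id, species, address FROM trees "
        "WHERE species LIKE ? ORDER BY tree_id",
        (f"%{term}%",),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
load_csv_into_sqlite(SAMPLE_CSV, conn)
matches = search_species(conn, "cherry")
for tree_id, species, address in matches:
    print(tree_id, species, address)
```

In a real deployment the database would be written to a file rather than `:memory:` so it can be shipped alongside the code, and an HTTP layer would sit in front of a query like `search_species`.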
The question about the data, I guess, is that, you know,
if the data doesn't so much change from the past, but there's new data,
how do you deal with new data?
So when you're deploying with static hosting like Zyte,
like Zyte now, you just deploy a new copy.
So the tree data is actually updated pretty frequently.
So every now and then I'll pull a new copy of the CSV file,
I'll turn it into a SQLite database,
and I'll publish, I'll just overwrite the old thing
with the new thing, and it works.
I think it's like about 80 megabytes when you deploy it,
which is small enough that it doesn't really matter,
and you can just keep on shipping new versions.
So for the uninitiated out there who are not very familiar with Zeit Now,
you mentioned it was immutable.
That means that you can't write back to a database.
So databases are immutable.
So Zeit Now, it's hosting where everything is fire and forget.
You bundle up your code and other assets.
It turns it into a Docker container.
You fire it up to them.
They deploy it, and they give it a URL which will never change.
So it will always live in that URL.
And in 10 years' time, if somebody hits that URL,
they'll spin it up, and they'll start serving requests from it again.
I think Guillermo said it was called immutable deploys.
Is that right?
Immutable deploys, exactly.
And then you can also set an alias.
So you can say, you know what?
Okay, that URL is going to stay the same,
but I'm going to point sf-trees.com
at whatever the current version is.
So you get atomic deploys as well.
You deploy a new version, you test it,
and then you switch the alias over
once you're certain that it's going to work.
And it's a really nice way of working.
The downside is that you can't do,
it doesn't give you a regular database
that you can post writes to.
You can get those,
but you have to get those from another vendor
like compose.io or Heroku Postgres
or something like that.
But if your data doesn't change,
if it's say a list of 190,000 trees in San Francisco,
you can package that up in a SQLite database
and deploy it alongside the rest of your code
just as part of that regular deploy process.
And that, it turns out, is a really
cunning trick for doing all sorts of exciting
things with semi-static data.
The demo you showed earlier,
we did it before your talk,
but the demo you did was
around, it was fun, possums.
No, that was the...
That was a slightly different project.
Okay, yeah.
Yeah, a slightly different project,
but you're trying to essentially take data sets,
deploy them now,
and be able to perform searches on those.
And in this case,
you wanted to auto-deploy a new site
based on search parameters, essentially.
I've built a few different tools.
They're all open source.
They're available on GitHub.
The first one is this script called csvs-to-sqlite,
which is just a command-line tool that takes CSV files
and turns them into a SQLite database.
And that's all it does.
It's like a one-shot pony.
It can do a few extra things.
You can tell it extract this column out into a separate table.
You can tell it to make things indexable
using SQLite's full-text search.
But essentially, you run a command,
and CSV goes in one end,
and a .db file comes out the other.
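A minimal sketch of that "CSV in, .db out" step, using only Python's standard library. This is not the csvs-to-sqlite implementation; the function name, the table name, and the sample rows are made up for illustration, and every column is stored as text.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real CSV export.
SAMPLE_CSV = "species,address\nCherry,100 Example St\nOak,200 Sample Ave\n"


def csv_to_sqlite(csv_text, conn, table):
    """Load CSV text (with a header row) into a new SQLite table.

    Returns the number of columns created. Every column is typed TEXT.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Quote identifiers so headers containing spaces still work.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    # The remaining rows of the reader are inserted one by one.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return len(header)


if __name__ == "__main__":
    # Use a file path instead of ":memory:" to get a .db file on disk.
    connection = sqlite3.connect(":memory:")
    csv_to_sqlite(SAMPLE_CSV, connection, "trees")
    print(connection.execute("SELECT * FROM trees").fetchall())
```

Passing a file path to `sqlite3.connect` instead of `":memory:"` would leave a single database file that can be shipped alongside the rest of a deploy.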
Right.
The bigger tool is this tool I've built called Datasette. And Datasette is a couple of things. Firstly, it's a
little web server that, given one or more SQLite databases, gives you an HTML user interface for
browsing that data. So you can click on it and reorder by columns and filter it and so forth.
And it also gives you a JSON API. So anything that you can
see on screen, you can get back as JSON. And this means that you can take a CSV file, turn it into
a database file, and then turn that database into a JSON API that you can run queries against from
JavaScript. The final piece of this is the datasette publish command, which is a command-line tool.
It'll take that database, that SQLite database, publish it on the internet with Zeit, add the Datasette application itself,
and essentially in one go wrap the whole thing up and turn it into data that you can access as a URL with a JSON API and an HTML interface.
And then on top of that, I built another tool called Datasette Publish, which is a web app that does all of this for you,
so you don't have to install any software on your computer at all. You go to publish.datasettes.com,
you upload some
CSV files, click a button, and it will
deploy to your own Zeit account
that
combined database, all of the
code, and just get things up and running.
So the idea is that if you're
somebody who isn't comfortable
installing Python command line tools
you can still use this suite to take your data
and turn it into something that's browsable and explorable.
You mentioned in your talk that you're really passionate,
I think either it was in the past
or you're still currently passionate about this,
but like data journalism.
You mentioned the Guardian in your past.
This is something where journalists are not typically programmers.
They're not typically familiar with these tools.
Some of them are.
Data journalism is a very specific sort of subset of journalism where
you've got journalists who are all about Python and R and Jupyter and they're very familiar with
this stuff. And they have conferences where they all get together and share tips. And these are my
people. I'm a huge fan of what they're doing. And that intersection of skills, I think, is really interesting.
But most newspapers can't afford to hire people who have that software engineering background.
You know, the really big papers, the New York Times, the Washington Post, they're all doing
this stuff.
But if you're like a little local newspaper, the chances that you can hire somebody with
software engineering and journalism skills is pretty slim.
What does it mean then, since you're building these kind of tools?
I mean, one, you're having fun, you're here at Zeit Day, you're showing off some cool
tools that you've built, obviously showcasing the performance and abilities of Zeit, or
Zeit Now, as a matter of fact.
When you start to look at how this applies, how some of the tooling you're building applies
to data journalists or just people who are curious about data, essentially taking insights
from data, what are you trying to build for them with what you're doing?
So the starting point, I think, is that CSV is actually a pretty awful format for sharing data.
But it's what everyone uses because...
It's a standard.
It's a, well, a semi-standard.
Once you get into quoting rules and stuff.
It is ubiquitous.
And Excel can produce it and consume it and so forth.
So that's fine.
But it's not the most useful format.
And actually, when you look at CSV files, where they really fall apart
is when you are dealing with something a bit more relational.
You know, you get CSV files where a bunch of the columns are duplicated hundreds of times
because they didn't have the ability to bundle it with a second CSV file
that's just got one set of the data in.
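As a rough illustration of the duplication problem described here, the sketch below pulls a repeated text column out into its own lookup table. It is only an assumption about how such an "extract this column" step could work, not how csvs-to-sqlite does it; the function name, the `_lookup` and `_id` suffixes, and the sample rows are hypothetical.

```python
import sqlite3


def extract_column(conn, table, column):
    """Copy the distinct values of a repeated column into a lookup table.

    Adds an integer "<column>_id" column to the original table that points
    at the lookup row. The original text column is left in place.
    """
    lookup = f"{column}_lookup"
    conn.execute(
        f'CREATE TABLE "{lookup}" (id INTEGER PRIMARY KEY, value TEXT UNIQUE)'
    )
    # One lookup row per distinct value, however often it was repeated.
    conn.execute(
        f'INSERT INTO "{lookup}" (value) SELECT DISTINCT "{column}" FROM "{table}"'
    )
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}_id" INTEGER')
    # Fill the new id column by matching each row's text to its lookup row.
    conn.execute(
        f'UPDATE "{table}" SET "{column}_id" = '
        f'(SELECT id FROM "{lookup}" WHERE value = "{table}"."{column}")'
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE trees (species TEXT, caretaker TEXT)")
    connection.executemany(
        "INSERT INTO trees VALUES (?, ?)",
        [("Cherry", "City"), ("Oak", "City"), ("Elm", "Private")],
    )
    extract_column(connection, "trees", "caretaker")
    print(connection.execute("SELECT * FROM caretaker_lookup").fetchall())
```

The repeated value is then stored once, and the main table carries a small integer reference, which is the structure a single flat CSV file cannot express.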
SQLite databases, I think, are the perfect format
for this kind of stuff. SQLite itself is crazy stable. A SQLite database from 10 years ago can
still be read today. It's ubiquitous. It's one of the best tested pieces of software I've ever seen.
And it's in everything. My mobile phone runs SQLite. My watch, it turns out, has a SQLite database in it that tracks my step counts,
which I can't get access to because I have to jailbreak my phone
in order to get the database out of my watch, which is very frustrating.
Come on now.
But, you know, it's everywhere.
And so if I can help show people that if you're going to share data,
sharing it as a SQLite database is actually a much more powerful
and efficient way of doing it than just as regular CSV files.
That's fantastic, especially if I can provide a toolkit
that will take that database and spit CSV back out of it.
So you're not losing anything by using SQLite.
You're gaining stuff, and you still get that CSV export as well.
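Getting CSV back out of a SQLite database needs very little code. The sketch below is one possible approach using Python's standard library, not a description of any existing export feature; the function name and sample table are invented for the example.

```python
import csv
import io
import sqlite3


def query_to_csv(conn, sql):
    """Run a SELECT statement and return the result as CSV text with a header row."""
    cursor = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per result column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE trees (species TEXT, planted INTEGER)")
    connection.executemany(
        "INSERT INTO trees VALUES (?, ?)", [("Cherry", 1998), ("Oak", 2005)]
    )
    print(query_to_csv(connection, "SELECT species, planted FROM trees ORDER BY species"))
```

Because the export runs off an arbitrary query, the CSV can be a filtered or joined view of the data rather than only a dump of one table.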
So you got CSV to SQLite.
What about the other way around?
Not yet.
That's very high on my to-do
list is that feature. Prominent open source.
You're doing some cool stuff there.
The repo you mentioned, that is
open source, right? Yes.
csvs-to-sqlite is
open source. Datasette's open source.
They're both under the Apache 2 license.
Explain Datasette real quick, the spelling at least.
It's
spelt like the Commodore 64 tape drive.
So it's D-A-T-A-S-E-T-T-E, like a cassette, but for data.
Datasette. Cassette.
So think about that when you go there.
We'll have some show notes, so don't worry about that.
But Simon, what is it that gets you so excited about this data?
Maybe a better question might be, what is it about datasets like this that gets you excited?
The San Francisco trees is probably my favorite data set.
One of the other ones I really like is one from the USGS.
It's a data set of polar bear ear tags in Alaska.
So in 2009 through 2011, they stuck GPS trackers on a bunch of polar bears and had them wander around Alaska.
And they got back latitudes and longitudes and
battery levels and temperatures and all of this stuff and they published it online as CSV files.
So I grabbed that CSV file with 40,000 known locations of polar bears. I loaded it into
Datasette and then I used it as the demo for the first one of my Datasette visualization plugins.
So I've been building out this plugin infrastructure so we can add additional features.
And the first feature I built was one
which looks for latitude and longitudes
and then sticks them on a giant map with clustered markers
so you can map the whole lot at once.
And when you put 40,000 polar bear ear tags on a map,
and we'll definitely have a link to this in the show notes,
one thing that's interesting that shows up
is that most of the polar bears are in Alaska,
but about 200 of those trackings come from Seattle.
And that was an interesting mystery.
I'm wondering what these polar bears are doing in Seattle.
So I zoomed right in on the map where these ear tags were, correlated it to Google Maps,
and realized that there's a company called Wildlife Computers in an office park right there
who sell ear tags for scientists to track polar bears. And so evidently
they'd been testing the ear tags at the home office, and that data had ended up in the data
that was published as part of this survey.
I don't know if the scientists who published
the data would even notice, because they might not have that same visualization that shows them where
the clusters are. So they could be counting fake data.
Possibly. Who knows?
I need to figure out who to get in touch with
and see if they've spotted this.
So here's something I thought about
while you were sharing that story,
was that if you reverse the scenario,
like as a data person,
like you concerned about analyzing polar bears,
for example,
if you had started out with your desire to have the data,
they gave away or just had available
massive amounts of data that was so valuable to you
that it was just not really valuable to them?
Well, a wonderful thing about the US government
is that they release an awful lot of stuff.
I think the default within government
is often to release the data.
Right, it's open.
It's wonderful.
It's part of the OpenGov initiative.
It's that, but also,
I have an interest in zeppelins and airships,
and it turns out that the US Navy
had some really cool zeppelins
in the 1930s, and the photographs
of those are all
freely available, because every photograph
taken by somebody in the Navy
is copyrighted in a way
that it can just be released by the government. So if you want photographs of awesome airships
in the 1930s, the U.S. Navy Photo Archive has all of this stuff for free. I don't think about data
quite like you do, but it sounds like there's just a plethora of opportunity. Where do you see
open data sets, immutable data, Now, what have you, some of the things
you've done here, where can you see interesting projects that are exciting to you or
things that could be done, not so much tomorrow or next year, but over half a decade, a decade?
Where do you see some of these ideas you're shaping going, to utilize public data in
useful ways to analyze society and give back answers to make a better future?
So I feel like we've spent a lot of time as an industry sort of dreaming of the semantic web,
trying to say, okay, if we can get all of the data in one sort of standardized format,
then we could build wonderful things with it. And we've been trying that for 10 years,
and it hasn't really worked that well so far. But I think that's because that's sort of a boil the ocean approach.
Try and come up with the perfect standard for data and then get everyone to do
that.
What actually does work is publish the data in a format that people can use and then watch
people integrate with that and say, okay, well, I'm going to do a custom query against
this, something custom against this, and then combine those together and build something
myself.
So one of the things I want to do with Datasette is not so much establish a standard, but just
make it as easy as possible to put data out there in a way that people can automatically
query it, that can pull from a JSON API, and then watch what people do with it once it
becomes available to them, even if they have to do a little bit of work to clean it up
and reformat it for their purposes.
I'm just amazed that you can find such use with such a simple CSV, for example.
So much data out there available.
We've done shows in the past about open government.
We've done shows in the past about just open cities, like the city of Chicago talking about
manhole covers and stuff like that.
Just interesting.
Do they have a CSV file full of manhole covers?
I'm sure they do because they have the register.
Those things are expensive.
Yeah.
Those things are like, I don't know, 500 bucks each or more.
The manhole covers are super thick.
They're completely metal.
And if they get stolen, they got to replace it.
So think about the cost.
So they have to track them.
I remember talking about this, it was years ago. I
can't recall the exact show, but we'll put it in the show notes if we can. But just all this public
data out there, and someone like you that gets excited about it, being able to make it
useful to people who are not exactly programmers, you know, and then advocating for
ways to say, share data in CSV, share whatever you can, just to make sure that you can share it with whomever, you know, so that it can be reused in these cases.
I mean, with all of this
stuff, the real delight is when people build things that you weren't expecting. When somebody
takes data that you have made available and uses it to solve problems in an interesting way,
that's always a delight. You know, if you're building open source software, the best part
is when somebody uses your software for something that you'd never imagined it would be used for. That's certainly
something that gets me very excited. Maybe some parting advice then for anybody listening to this
segment here that is half as excited as you are about some of this stuff. Where are some good
starting places? What are some skills you've developed over the years that has made it easier
for you to work with and munge data to do some of the things you've described here?
So as a Python programmer, one of the things
I love most about Python is it's got an interactive prompt. And this is true of all of the other
programming languages that I like working with. And interactive prompts make it so much easier
to manipulate data, to suck things down, to reformat them. I've recently been learning my
way around Pandas, which is the Python library for dealing with tabular data.
And if you combine Pandas with something like Jupyter Notebooks,
it's an incredibly productive way of sucking data in,
previewing it, formatting,
and then you can write it out to a SQLite database
and publish it in that way.
So I think the data science community
has all of these tools as well,
which are constantly being developed, and so keeping
up with what's going on
over there is super useful for
those tools
for pulling things down, for cleaning up,
and for generally manipulating things.
And then the other one is just good old SQL.
It's been around since 1979, and it turns
out it's still a fantastically powerful
tool for doing data analysis.
Any help needed on your projects in open source that you can give a shout out to?
Yeah, absolutely.
So both csvs-to-sqlite and Datasette are active open source projects on GitHub
which are very keen on accepting contributions.
I've been trying to label issues with good first contribution and so on.
And so I'm very keen on having people dig in there.
But more importantly, I want people to use the software and give me feedback.
I need to know what works, what doesn't work,
what are the features that would help you solve problems that I haven't even imagined yet.
On your issues, do you share any sort of, not so much roadmap,
but bigger ideas that you don't have time to tackle that,
if someone did have time to tackle,
that you would take a contribution that way,
or do you only have the kind of issues you described?
I have a few issues that are some of those bigger ideas,
but in Datasette, the thing I'm most excited about right now
is the plugin ecosystem.
It's now possible to build plugins for Datasette
that add additional functionality,
and that means that I can be completely hands-off,
and if you want to invent a fantastic new visualization
mechanism that does
time charts against
columns automatically or whatever,
you can build that right now today without
even talking to me, and you can start
using it and shipping it and sharing it with other people.
So I'd love to see people starting to
dig in and build plugins, but also give me
feedback on what other hooks are needed
to make plugins more productive.
Who is Datasette for?
Is it for developers?
Is it for developer-like people?
Who's the user for Datasette?
My three targets for Datasette are data journalists,
anyone who's collecting interesting data
and wants to be able to share it.
I'm really interested in museums
because it turns out museums have huge amounts of metadata around their collections, which often is locked up somewhere. And I'd love
to help museums start publishing that. The Metropolitan Museum of Art in New York
has a spreadsheet with 400,000 items in it, which I've turned into a Datasette instance.
And so that kind of thing I think is really exciting. And then the third one is civic
institutions. There are all of these governments out there that are publishing data in various different formats.
If I can help them publish it more effectively and in a way that's more useful to people, that I'd find really exciting.
When we were talking earlier, not in this podcast, but like earlier today, the interface you were showing off where you can like create a database.
I think you were even talking about the museum example.
Was that Datasette?
Yes.
Because you showed me a couple of things.
That was Datasette?
Yep.
You were able to dig into a query
and then even augment the SQLite or the SQL queries, I believe.
This is one of the most interesting things
about doing things read-only,
is if you've got a read-only SQL database,
I can just open it up to select statements,
because it's not like anyone can cause any harm.
They can't modify the file on disk.
Because it's on Now.
Well, because it's on Now, and also because when I open the database, I use SQLite's
immutable option.
Okay.
So SQLite will disable any writes that could possibly happen.
Gotcha.
And so on top of that, I have a one-second time limit on SQL queries. So if you try and
do something too expensive, it won't work. But other than that, you can just be completely free with it.
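The two safeguards described here can be sketched with Python's built-in sqlite3 module: opening the file through a URI with `immutable=1`, and aborting slow queries with a progress handler. This is an illustrative sketch under those assumptions, not Datasette's actual code; the helper names, the demo table, and the runaway query are hypothetical.

```python
import os
import sqlite3
import tempfile
import time

# A query that never finishes on its own, used to exercise the time limit.
RUNAWAY_SQL = (
    "WITH RECURSIVE c(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM c) "
    "SELECT count(*) FROM c"
)


def build_demo_db(path):
    """Create a small database file to serve read-only (hypothetical data)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE trees (species TEXT)")
    conn.executemany("INSERT INTO trees VALUES (?)", [("Cherry",), ("Oak",)])
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open the file via a URI with immutable=1, so SQLite treats it as read-only."""
    return sqlite3.connect(f"file:{path}?immutable=1", uri=True)


def run_with_time_limit(conn, sql, limit_seconds=1.0):
    """Run a query, aborting it if it runs longer than limit_seconds."""
    deadline = time.monotonic() + limit_seconds
    # SQLite calls the handler every 1000 virtual machine instructions;
    # a nonzero return value interrupts the running statement.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 1000
    )
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.set_progress_handler(None, 1000)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        db_path = os.path.join(workdir, "demo.db")
        build_demo_db(db_path)
        connection = open_read_only(db_path)
        print(run_with_time_limit(connection, "SELECT species FROM trees"))
        try:
            run_with_time_limit(connection, RUNAWAY_SQL, limit_seconds=0.1)
        except sqlite3.OperationalError as error:
            print("query stopped:", error)
        connection.close()
```

With both in place, arbitrary SELECT statements from untrusted users can neither change the file nor tie up the server for long, which is the property the read-only design relies on.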
So it's kind of like JavaScript developers get very excited about GraphQL
because it lets them specify exactly what they want to get back from the server.
With Datasette, you can do that with SQL.
You can write SQL in your JavaScript,
selecting the exact columns you want, joining against different things.
And it works, and it's fast, and it gives you back that data as JSON.
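To make the "SQL in, JSON out" idea concrete, the sketch below builds the kind of URL a client would request. It assumes the instance exposes a `/<database>.json?sql=...` endpoint; the host name, database name, and query are placeholders, and no network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sql_api_url(base_url, database, sql):
    """Build a JSON API URL that carries a SQL query in the "sql" parameter.

    Assumes an endpoint of the form <base>/<database>.json?sql=...
    """
    query = urlencode({"sql": sql})
    return f"{base_url.rstrip('/')}/{database}.json?{query}"


if __name__ == "__main__":
    # example.invalid is a placeholder host, not a real deployment.
    url = sql_api_url(
        "https://example.invalid/",
        "trees",
        "select species, address from trees where species like '%Cherry%' limit 5",
    )
    print(url)
    # Round-trip check: the query survives URL encoding unchanged.
    print(parse_qs(urlsplit(url).query)["sql"][0])
```

A browser client would fetch a URL like this and receive only the columns and rows the query asked for, which is the comparison being drawn with GraphQL.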
You were showing me, you got really excited about this, different things. And it works and it's fast and it gives you back that data as JSON.
You were showing me, you got really excited about this, you were showing me the example of Australia and dogs.
Yes. There are eight different councils in Australia who publish
lists of dog registrations, when people go to register for a dog license, and they've got the
breed and they've got the name. So I've got one little tool where I combined all eight of those
CSV files.
I've got them all in the Datasette instance, and now I can say,
okay, for golden retrievers, what's the most common dog name?
So you can search by species, sort by the count of each name,
and see what the most popular name for different types of dogs is.
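The question "what's the most common name for a given breed" maps onto a plain GROUP BY query. The sketch below shows one way to write it with Python's sqlite3 module; the `dogs` table, its columns, and the rows are invented for the example and are not the real registration data.

```python
import sqlite3


def most_common_names(conn, breed, limit=3):
    """Return (name, count) pairs for one breed, most common first."""
    return conn.execute(
        "SELECT name, COUNT(*) AS n FROM dogs WHERE breed = ? "
        "GROUP BY name ORDER BY n DESC, name LIMIT ?",
        (breed, limit),
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE dogs (name TEXT, breed TEXT)")
    # Hypothetical rows, standing in for the combined council CSV files.
    connection.executemany(
        "INSERT INTO dogs VALUES (?, ?)",
        [
            ("Bella", "Golden Retriever"),
            ("Bella", "Golden Retriever"),
            ("Max", "Golden Retriever"),
            ("Ruby", "Pug"),
        ],
    )
    print(most_common_names(connection, "Golden Retriever"))
```

Ties on the count are broken alphabetically by name so the result order is stable from run to run.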
The answer is always Bella, it turns out.
Bella, no matter what species of dog you are,
Bella is the most common dog name in Australia.
But it's still kind of fun to see the different naming trends.
And then for Pugs, I believe it was Ruby, right?
Yeah, Ruby beat Bella, actually, for Pugs.
Only for Pugs, though.
Only for Pugs.
And I think Pugsley came fourth.
So what I find interesting about that is here's some obscure data set
that maybe nobody's paying attention to.
And because of what you've done with Datasette,
you're able to query it in these ways and find out this information. Maybe it's just for
the curious, but I find that kind of interesting, that you can just,
you know, play with this data like that, for the entire, not even country, continent of Australia.
Right. And the dog thing is kind of amusing, but not necessarily useful. A much more useful one is, it turns out members of Parliament in the UK have to register their
conflicts of interest with Parliament in the Register of Members' Interests, and it's public data.
And there's an organization called mySociety who turned that into XML. I took their XML
and I loaded that into a SQLite database, and so now I've got a tool that lets you search 1.3 million line items of MPs saying who paid them money, who invited them to
speak, who gave them a free watch. It turns out that the Sultan of Brunei hands out Christmas
hampers to MPs every Christmas, and you can see which MP has had the most Christmas hampers from
him. It's super fun, and actually this is newsworthy. If you're wondering
why does this certain MP
behave in certain ways to different countries,
you can dig through all of this stuff and say, oh, well, it turns out
they've been giving him a lot of free watches.
And it's undeniable data because
it's from the Parliament, right?
Yeah, it's official data.
They publish it as a rather ugly set of webpages,
but with a little bit of work, you can
turn that into a queryable database.
And how easy is it to refresh that work?
Like if it's so hard to, you know...
So most of the work's been done for me
by this organization, MySociety,
who've been scraping this for 10 years
and dumping the data format into these XML files.
So I just wrote the thing that turns XML
into a SQLite database and built on top of that.
Simon, I'm sure we could talk for
hours. I promised you 20.
It was a good 20 for sure. So thank you so much for
sharing your story and
where can people find you? Where do people find you at
on the internet? Mainly on Twitter.
I'm twitter.com slash SimonW
and I've got a blog at simonwillison.net
as well where I write about a lot of these
kinds of things.
All right, thanks for tuning in to this special episode of The Changelog on location at Zeit Day 2018. If you've never been to this conference, I highly recommend you attend.
Head to zeit.co slash day to learn more and see the videos from this past conference.
And of course, thank you to our sponsors Rollbar, Linode, and GoCD. Also, thanks to Fastly, our bandwidth partner.
Head to Fastly.com to learn more.
And we move fast and fix things here at ChangeLog because of Rollbar.
Head to Rollbar.com to learn more.
And we're hosted on Linode cloud servers.
Check them out at Linode.com slash ChangeLog.
This episode is edited by Tim Smith.
The music is by Breakmaster Cylinder.
And you can find more shows just like this at
changelog.com or on your favorite podcast
app or wherever.
Subscribe where you can. Thanks for tuning
in. We'll see you next week.