The Changelog: Software Development, Open Source - Burnout, open source, Datasette (Interview)

Episode Date: May 9, 2018

Adam is on location at ZEIT Day talking with Jessica Rose about burnout, Henry Zhu about his passions and pursuit of open source, and Simon Willison about data and his passion for interesting datasets... in the world.

Transcript
Starting point is 00:00:00 Bandwidth for Changelog is provided by Fastly. Learn more at fastly.com. We move fast and fix things here at Changelog because of Rollbar. Check them out at rollbar.com and we're hosted on Linode servers. Head to linode.com slash changelog. This episode is brought to you by Rollbar. Move fast and fix things. Resolve errors in minutes and deploy with confidence. Head to rollbar.com slash changelog, request a demo, get started today. It's loved by developers, trusted by enterprises, and most of all, we use it here at Changelog. Move fast and fix things with Rollbar. Once again, rollbar.com slash changelog. All right, welcome back, everybody.
Starting point is 00:00:51 This is The Changelog, a podcast featuring the hackers, leaders, and innovators of open source. I'm Adam Stachowiak, editor-in-chief of Changelog. On today's episode, I'm on location at ZEIT Day 2018 in San Francisco, talking with Jessica Rose about burnout, Henry Zhu about his passions and pursuit of open source, and finally, Simon Willison about data, data sets and his passion for interesting data sets in the world. I personally connected pretty deeply with your talk because I almost said podcast because it's on the brain. No worries. But because I think everybody's experienced some level of burnout, whether they admit it or not, you know, and I, you know, I kind of empathize with you because you said you're getting older. And I'm also realizing that I'm not immortal and I am getting older.
Starting point is 00:01:42 So I realize that, you know, things start to hurt or it's harder to wake up and be excited, even though I run the thing. Like I'm in control of what I'm doing, aside from not really being in control. Yeah, it is one of those things where as we're getting a little bit older and I think part of that isn't just getting older, but I think it's getting either the space within the industry or sort of the backbone ourselves to say, oh, no, wait, I am tired. Oh, no, wait, this is hurting me. So the folks I see really impacted by burnout and swept off their feet are the youngins who feel like they're invincible. Like I said, oh, no, I'm going to be fine. I can work these 80 hour weeks. Weren't you there though? I was there at one point in my life. I was the young one who was invincible. I've been old for forever. And I can remember pulling all-nighters often.
Starting point is 00:02:28 I remember having insomnia and not being able to sleep just because and still getting up and doing the work. I can't say I was the best at it all the time, but I remember days where I felt young and invincible. And I think that's a really important point. When you look at doing meaningful work, your ability to stop doing meaningful work ends after six or seven, eight, if you want to go by like standard workday hours, but that ends a lot earlier than people keep working. Yeah. And I saw a really interesting, someone mentioned, someone said something really valuable
Starting point is 00:03:02 to me. I think they got it off the internet where they say, if you're doing, if you're, if you're going at 150%, you're really just taking out a loan from your future self. Yeah. Can you talk about your very visual, so it's probably difficult for a podcast, but this visual process of memory load. Can you just remind me what the term was for it? So I'm really into pop psychology, and I'm really into pop psychology from the 60s because I'm a very specific kind of nerd. And cognitive psychology was sort of a branch of psych. Someone is going to hear this and fuss at me for what I got wrong.
Starting point is 00:03:36 No, I'm so I'm delighted for it. I'm pretty, pretty laid back about being wrong. Cognitive psychology was a period where it was a specific branch, a study of psychology, and folks started thinking about and talking about the brain's organic processes and psychological processes in computer terms. Right. So when you talk about working memory from cognitive psychology...
Starting point is 00:03:59 Working memory, that's right. It's exactly what you think it is. It's the amount of memory, it's the amount of space you have available for mental processes. Whereas cognitive load is the amount of working memory you have booked up. It's how many mental processes you're doing and how much of that cognitive load they eat. So I often like to ask people to visualize having too many browser tabs open. Like that's eating up too much of your browser's working memory. Right, right.
Starting point is 00:04:28 Last time I actually had a bout of this. I opened up Slack with the intention of going to a particular person's private conversation to pull back essentially what email is. You know, very similar. Like, I'm going into Slack to get some information. I know they share with me. I've got to do a couple scrolls. I knew I had to do that. But before I knew it was five minutes
Starting point is 00:04:48 later, something else grabbed my attention. I'm in Slack and I'm like, what am I doing? I didn't even do the thing I was supposed to be doing. So I was pretty tired last night. I flew in from Houston to San Francisco. I was, I went to bed early when I normally don't go to bed very early, maybe 10, 11, 12 or whatever. But I mean, I was 12 being going to bed early. If my wife's listened to this, she knows I'm lying. Cause it's more like one or two. So there, there I am. I'm stretching a little bit. I'm just trying to like not embarrass myself. It's one of those things, isn't it? Where you're like, do as I say, not as I do, to be like, oh, yeah, take care of yourselves, kids. But I'll stay up all night and worry.
Starting point is 00:05:25 My wife's like, do you have to work tonight? I'm like, I don't have to, but I probably could do one or two things to make tomorrow easier. And it only makes it, it probably makes it a little easier, but it still probably makes it just as hard. I love how listeners can't possibly catch this. But as soon as you said that, you made clearly the face your wife would have made, which is just a gentle eye roll. Yes, yes. And she realizes that I try my best to do balance. It's tough. It's not always easy. Um, but I can feel like I teeter on burnout and I kind of say it's okay in seasons. Have you ever heard this term?
Starting point is 00:05:57 I have not heard this except being burnt out for a season. And then, cause I know I got a lot going on, say the first quarter of the year, the half a year or some kind of commitment where I say it's okay for this length of time. But then once I get there, I'm starting to do what you said in your talk, just starting to say no to things, cutting things out in positive ways that don't hurt you. Um, so that's, I'm making it about me. I want to get back to you and parts of your talk. But something you said though in your talk was like, and I think what a good message might be to share is like, it's okay to be overloaded, but to be realizing it and then say, don't just like eject all the things. Right. And then screw your life up, you know, pull out things that you said it in such a way, basically that it seemed like don't
Starting point is 00:06:41 hurt yourself in the process of like saying no to things and removing things that are bad for you to do and taking up too much of your time in your life? Well, even when you're critically burned out, there are some things that you have to do to sort of save your future self. So for me, when stuff got really bad, internet grocery shopping is absolutely my savior. Buying a bunch of quote-unquote healthy frozen burritos, getting an accountant to make sure
Starting point is 00:07:07 I was paying all the things that needed to be paid. And even outsourcing some stuff, like having some folks help with cleaning and laundry, which is a massive privilege and I feel so lucky to have been able to do. But just either getting other people to help or me myself making sure that the stuff I'm dropping isn't gonna make my life much more difficult in two months.
Starting point is 00:07:28 But I think a lot of our dear listeners may not have sort of the funds or the flexibility to do it. The advice I would still give is when everything seems overwhelming, I've got to do this and the other thing and that thing's due, really triage what's going to doom you. Well, everything feels like doom when you're overloaded, but what's going to make your life more difficult if you don't solve it? You probably don't have to call your aunt back. You probably do have to pay your tax bill. Like, yeah, just triage where you can. Yeah. There's a book by Brian Tracy that may not be the best prescription for this, but I think it's called
Starting point is 00:08:06 eat that frog or eat the frog or something like that. Have you heard of that? And essentially, you know, it's not a great to do list type of thing, but essentially do the thing that's the thing you have to do the day that day first, you know, it doesn't mean, you know, literally eat a frog. It just means that if you've got this one big thing to do today, and today's a success because you did that thing, do that thing. Don't wait two or three days to do that thing and call your aunt back or whatever. Instead of taking my approach where it's like, oh, I'm really dreading this. I'll just put it over in the corner. Do the dread thing first. That's his basic advice. I'm not sure if it's the best advice. It's difficult to do that. But
Starting point is 00:08:44 I think what he means is to say is just don't put the things off that matter most too far. Do them pretty quickly. You shared stories, too, about people may not realize this, but right now it sounds like you may be in a case of burnout. Not so much from our conversation, but from your talk. No, I was really critically burned out seven, eight months ago. Okay. And I got really lucky. Being able to successfully work through burnout is really rare.
Starting point is 00:09:10 But I'm working over at FutureLearn now. We're like an ed tech platform. And it's the most reasonable place I think I've ever worked. So I'm managing a team of really brilliant engineers. I go home right at five and I don't answer emails, and I don't do Slack. And I get to fuss at my team, too, to go home right at 5 or at 6. And not do... yeah, the only thing I ever think I fussed folks out for is, like, I saw you were emailing on a weekend. Yes, I like that.
Starting point is 00:09:40 We've been doing that. I'm not the best at it personally. However, I do like it when my team is that way because it lets me know they care about balanced life. And I don't ever want to make them feel like they have to do something that's for us outside of like normal hours. It's just like, I just don't want you to feel that way. But an important part of leading in that way is making sure you don't do stuff that they say. I know I'm kind of a bad example. I'm working on it. So that's part of burnout too, is like, you may not be able to get back to like perfect you
Starting point is 00:10:07 in this burnout stage to like get back to some, you know, equal balance, you know, balancing. I think it takes time. I'm a big fan of iteration. I realized that today's delivery may not be perfect, but it's good enough. It's enough to put out there and get a feedback loop going on to learn on how to better course correct for the next iteration. To me, that's a pretty profound thing. So what does burnout feel like? What do you think, what would you describe burnout as? So if someone's listening to this and they're thinking, kind of sounds like it might be burnout. What do you think burnout is? What was it to you? Fantastically, there's a ton of research around this that says burnout is effectively
Starting point is 00:10:44 indistinguishable from clinical depression. Yay, it's just great. That's pretty common though, right? Being depressed. Or clinical depression. Is it different? Clinical depression is just diagnosable depression, I believe. So it's really challenging because the symptoms of occupational burnout both mirror depression and other mental illness issues and can trigger them. So it's like a one-two whammy of misery. If you are listening and you're
Starting point is 00:11:11 concerned, like, wow, maybe I'm burned out, the Mayo Clinic has a really fantastic checklist, which you can search for online and go through it. A checklist. I'm assuming if I took that checklist right now, they'd probably say yes. If you were in my talk, if you said yes to a couple of things, I was, I was side-eyeing. I wasn't, uh, I didn't want to say yes to any of those things, but I couldn't help but doing so. And, you know, there's times when I get up, I love what I do. And it's a shame because I do really enjoy what I do, but not every day. I don't always enjoy it the first day in the morning. And I was, I was, uh, really listening closely when you said, you know, you have this, this bug or this, this thing you're dealing with, you know, with, with a code problem or whatever. And, oh, by the way, I've got to take care of laundry. I've got to help with this or like,
Starting point is 00:11:59 that's what happens to me sometimes is that I heard myself in your talk today, basically. And it's a lot of like everybody's brain works really, really differently. And like neurodiversity is one of the most exciting things about the way people engage with the world around them. And like mental process and social process are so cool. I think wedding self-care into the way we view our own mental processes and approach the world is absolutely critical. What is that? What do you mean by that? So for me, when I was burned out and coming through recovery, one of the hardest things was you aren't running on all cylinders cognitively. So I had very, very serious memory issues, which is quite common. I was consistently despairing
Starting point is 00:12:43 and miserable, and I had a really hard time seeing success in my own work. So I'd do something, I'd do a measurably good job and they'd be like, well, I guess that's okay. And for me, if you've heard me wandering around the halls today, you've probably heard me sort of morosely go, everything is fine. To yourself? To other people as well, where they're like, oh, this thing of, like, everything. So I had to kind of, not quite home CBT, but like work to build new pathways where it's like, this is terrible and everything's bad, but like, you know, it's probably fine.
Starting point is 00:13:16 It reminds me of something that, uh, happened on my trip here. I had to drop off my car in parking and take the, I guess it's some sort of bus over to the departures. And I was in a rush. She was only supposed to take C, but she decided to pick up A and B as well. And then everybody started piling on. And then I had to scoot over and share half of my seat with somebody. Long story short, it was just like, she thought I was complaining, but I wasn't. But I liked her response to her thinking I was like being an abrasive person. I may have been, but I don't think I was trying. I definitely wasn't trying to be. But she says, it's a beautiful day. And I was like, you just defused, I'm not a bomb, but you just defused a bomb. You know, because like you can say that thing, you say it to yourself. It's not like you say it to other people too, but it's that one thing you can say that's like, you know,
Starting point is 00:14:08 everything's fine. It's a beautiful day. It's just, you know, it's just something you could say to sort of like take away the stress and crazy, the chaos that might be coming. What do you think? I was going to say, I've got something that sounds very nice and placid, but actually means piss right off. And I wasn't supposed to be doing that. Sure. You can do that. Cause sometime. Yeah. One of my favorite things is, oh, well, perhaps you know best, uh, which does mean like, um, please. She's mouthing something. We, we, we, we try to remove the explicit tag for our podcast for our young hacker fans out there, but she's, she's, uh, I think that, yeah, if, if, if I'm ever in a situation with somebody being terribly unpleasant, be like, oh, well, I suppose you know best,
Starting point is 00:14:52 folks know what you mean. I hope. Yeah. The one thing I would say about burnout or because burnout so closely mirrors depression is that if anybody listening to this thinks, oh man, wow, maybe that is an issue for me. Yeah. One thing I'd encourage people to do almost immediately, but immediately is seek some kind of healthcare. So go and talk to your doctor, talk to them about how you're feeling just because it could be burnout. It could be depression. It could be one of a dozen physical or mental health issues. Just going and making sure that you're not listening to a podcast and going, oh, well, Jess said, just balance my... No, no, go see your doctor, please. Right, right. This is not medical advice. This is only two people talking about sharing war stories of burnout. Jess, you said that in your talk, you mentioned that your doctor
Starting point is 00:15:39 had actually written you a prescription to quit your job. Can you talk about that? It was very jokingly. So that wasn't real? No, it was... It did happen. She did. Okay. Um, can you share the story? Uh, so I went in to talk to my GP and I've been working as a contracting consultant for, for a while. Um, and I went into chat with her because I was having a really hard time sleeping and I was lethargic. I, I was just pretty miserable and my, my GP, my general doctor, is just absolutely glorious. She's very, very sarcastic and absolutely delightful. She's like, okay, okay, okay.
Starting point is 00:16:11 Well, I can give you this antidepressant, this antidepressant, or this antidepressant, but you should probably just quit your job and we can skip all of those. I was like, are you telling me that's your medical advice? She was like, yeah, yeah, look. And then wrote me out, grabbed a prescription pad and just wrote, please quit your job. Please quit your job. Well, I mean, technically she did write on paper that was her, you know, her medical pad. I don't know what you call those things, prescription pads.
Starting point is 00:16:38 It's a legit thing, right? It was just a bit of stationery. I don't think it was a proper prescription. Oh, it wasn't? Yeah. Okay, cool. I was taking it a little further, but. No, you want it to be like a, a sticky
Starting point is 00:16:47 pad where they tear off at the top. No, not, not quite so romantic. Okay. But so did you quit your job? Uh, it was a contract. So I got to the end of my contract and then yeah, I got into something that was less stressful, which isn't fair to the contract. It was a completely reasonable job and completely reasonable people, but had 60, 70% travel and early stage startups, you know? Yeah. Just unhealthy for you. Not so much unhealthy in general, like to other, somebody else who could do the role.
Starting point is 00:17:18 Just not the best for you. Developer relations folks talk about there being a, like, 18, like 18, 20, you can do it maybe two years. And I think at the time I'd been doing it four or five. You know, there's actually several people I know that have been in, like, head of something or dev evangelist or relations, whatever name you want to apply to that role, essentially telling people about products to encourage you to use them through developer means, whether it's how to use something or a demonstration or meetups or whatever. There's a lot of people I've seen do that job that eventually do for sure burn out. Oh, yeah.
Starting point is 00:17:54 We just drop off the mat for a year and a half, two years, go to Thailand. Folks do tend to take a chunk of time off. Well, it's fun to do. You may go into it thinking, won't be me. You know, I can conquer this job. I can travel the world and not get burnt out. I could put in 15-hour days, seven days a week or whatever the time constraint is. What's a normal day-to-day?
Starting point is 00:18:18 Like is it 10-hour days, 15-hour days? It really depends. So I was doing a lot. I'm such a ham. I was doing a lot of travel and speaking. So there was, I think I was in maybe 50 countries in two years. I was doing maybe two or three conference talks a week. My husband switched to being a house husband just because I would come home and need somebody to swap out laundry and run my life admin.
Starting point is 00:18:42 I love DevRel. I absolutely recommend doing it. I don't necessarily recommend that level of travel. I absolutely recommend house husbands though, they're just glorious. Yeah, is that right? Everyone deserves one. Well, how do you get one?
Starting point is 00:18:59 I suppose you marry someone who is more caring than job focused. Okay. Yeah. Okay. Interesting. Part of good advice.
Starting point is 00:19:10 That's always good. Like what's on the horizon for you? Where are you going? What's next for you? What's some good advice to give back to people? How can we tail this out? Oh dear, that's so many questions. So for me, I'm doing a lot less traveling and speaking.
Starting point is 00:19:22 And instead I've been focusing on stuff that's scalable and I can do from my flat. So I run a podcast myself, the Pursuit Podcast. Which you've got to scroll down a bit because there's also a lot of church podcasts with similar names. The Pursuit Podcast. Yeah, I guess so, right? So if you're searching for it, you may find a bunch. But if you go to... It's the blue one.
Starting point is 00:19:41 What's the website? If you go to pursuit.tech, you'll find us there. Nice.tech. It's a good domain. That is a pursuit.tech. If you want to fuss at me directly, I'm at jessica.tech. Always very on brand. Okay. What is the podcast about? So we tend to focus on a different topic each month. And we just talk to folks in that space and about their experiences and about what kind of advice they could give. So we talked about wearables this past month. And then next month we're going to be talking next. Oh dear. That's like tomorrow, isn't it? In May, we'll be talking about different types
Starting point is 00:20:16 of fundings. We'll be talking to somebody about bootstrapping, talking to somebody about working with VCs. Yeah, I'm pretty excited for it. What you read the summary of it, what's the promise of it? Oh, man. I wrote that like two years ago. So I think it's your guide to getting the things you wanted. Oh, dear. Oh, no. It says some stuff.
Starting point is 00:20:38 It says some stuff. Okay. Well, listen to Jess's podcast and learn some stuff. Yeah, there we go. Pursuit podcast. Is it pursuitpodcast.net? Or sorry, .tech. .tech. .tech. Or pursuitpod on Twitter. Pursuitpod on Twitter. Okay, easy enough. And parting advice. Parting advice is, oh, absolutely fall in love with saying no.
Starting point is 00:20:58 The way I got so burned out was, oh yeah, I would love to come to your conference. I'd love to help with this project. Of course I'll work overtime. If you learn to say no judiciously earlier, you can make your life a lot easier. Saying no is the toughest two-letter word in the English language. It's so tough. But I can agree with you that saying no has its fruits because I've said no to a lot of stuff and have seen the rewards of that. So it's definitely a good thing for it. And once you get used to it, like being a little bit burned out, I've fallen in love with it. Are you familiar with Derek Sivers by any chance? I'm not.
Starting point is 00:21:37 So Derek Sivers wrote several blog posts. He started CD Baby. He's got a really interesting life. He's a real interesting person. Let's just say that. And he has this way of saying it's either when he says a decision to do something for him, it's either heck yes or no. It's like he's really excited about it or it's definitely no. Because if he's not really excited about it, he doesn't have time.
Starting point is 00:22:02 Almost like in your talk where you don't have enough hex for something. It's like that for him. It's like, I don't have enough to do this all in, like I do things, so then I said no. That makes quite a bit of sense. That's his way. I kind of like that one. I might politely disagree a little bit
Starting point is 00:22:18 because some of the stuff I've found most rewarding is stuff I never thought I should do or that I'd enjoy. I can kind of agree with that too because there's been times I've said yes to things I'm like, ain't going to work out. It's not going to be fun. It worked out and it was fun. It was great, yeah.
Starting point is 00:22:32 You know, so. Kind of the night out principle. When you're going for a night out with your friends, the nights where you're like, yeah, this is going to be great, are often duds. And the times where you're like, oh, I'm not sure I want to go out. Best time of your life. Best time of your life.
Starting point is 00:22:44 I can agree with that. This episode is brought to you by Linode, our cloud server of choice. It's so easy to get started. Head to linode.com slash changelog. Pick a plan, pick a distro, and pick a location, and in minutes, deploy your Linode cloud server. They have drool-worthy hardware, native SSD cloud storage, 40 gigabit network, Intel E5 processors, simple, easy control panel, 99.9% uptime guarantee. We are never down. 24-7 customer support, 10 data centers, 3 regions, anywhere in the world they got you covered.
Starting point is 00:23:34 Head to linode.com slash changelog to get $20 in hosting credit. That's 4 months free. Once again, linode.com slash changelog. So, Henry, you're in the middle of an experiment, which I think is, to some might seem risky, but you've got it kind of fleshed out. You've been down this road. You've been doing open source for a while. You kind of you've even said you've accidentally became a maintainer at one point, I believe, either on the React podcast or request for somewhere.
Starting point is 00:24:18 I've heard it. I'm not really sure where. Where's this leading you to? Like you're in open source full time now, but solo. Not for like a company's not paying you to do it. You're doing it yourself. Right. The reason why I left, part of it was that I don't think a company would pay me to do the things
Starting point is 00:24:34 I would want to explore and experiment with. Mostly they're going to pay you to, I guess, work on features or just the things that they need for their business. And if you talk about like, you know, how do we build community or bring in contributors, they want that, but they're not going to pay you specifically to do that. Or what does it mean to do mentorship when it's an intern, but they want to hire them to work on other things, not open source.
Starting point is 00:24:58 And all these companies don't want to take ownership of open source or Babel for me. And it's like, where can I actually do that? And I feel like the only way to do that is doing it on my own then. I mean, it makes sense, right? A business has to have some sort of value from the exchange. At some point, you'll be your own business. You may have to have similar considerations of how you spend your time, right? Is it valuable?
Starting point is 00:25:20 Does it help me meet? Or you may actually have to make a choice of being able to earn money or serve open source. What do you think about that? Right, because the things I want to do aren't things that people would normally pay for. So it's like either I have to make money doing freelance, which I don't want to do, or get enough money from doing maybe consulting or support contract or workshops for Babel such that I can do the other work. And this is why I made the Patreon because then people can support
Starting point is 00:25:52 me for whatever I'm doing. They want to support the person rather than the project. But I have found that like for most companies, they only want to support the project, right? Because it makes, maybe it's just better for them from sort of like a business point of view or just talking about to their boss. They don't want to support a person, but individuals would rather support a person because they understand who you are and you're, you know, you're literally supporting their livelihood. And I think that's more like something that they want to like contribute to rather than a project where it's like, it could be a lot of people.
Starting point is 00:26:25 We don't know how you're spending it. You know, what are they, what are they doing with this money? And learning to figure out how to spend the money for a project is really difficult. Like what's that conversation like look like, look like and stuff. So to not assume that the listening audience is familiar with your story, give me a quick version of like, not so much you're getting started, but like this transition, you worked at Behance, you decided to make the shift, kind of give us that and we'll go further from there. So first I got my job at Behance because of open source. Like I got involved in the project and then they emailed me, Hey, do you want to move to New York
Starting point is 00:26:58 on this team? And so I was already wanting to do open source more. So I was like, if they found me through it, then I think that means they care about it. And so I was kind of to do open source more. So I was like, if they found me through it, then I think that means they care about it. And so I was kind of doing it, but I was working on the product team because Behance, Adobe, they're mostly focused on that. And so it didn't really make sense for them to like have someone working on it,
Starting point is 00:27:17 like just for open source. But, you know, halfway, like, I guess, not halfway through, like a year later, I'm working on Babel in my free time. We use it at work. Why are we like, we shouldn't wait for me to work on it if we need something. So like, well, why don't we like, let me work on it at work then. And so they, I think I asked for full time and they gave me half, which was like already amazing. Right. And so I kind of like, just was like, oh, that's, that's cool. Okay. I'm good. And I kind of just started working on that. Um, but the problem was that that's always going to be hard.
Starting point is 00:27:51 Like, what does it mean to do 50%? Is it like half of my sprints are Babel and the other half is... How did that work out, actually, practically? Yeah. Did it work out? Were you still... Yeah. The same amount of responsibility, it's just like, oh, now half of it really is this, but it's not. Yeah, I think it's kind of like that. It was an experiment for them, too. Yeah. Because they've never done that with other people. I mean, kudos to them, too, for trying.
Starting point is 00:28:12 I mean, you can't knock, I mean, obviously there's no hard feelings, but it's just like, that's risky for a business to like, that's brand new for them, even. I think it's just that they, a lot of the people there have a, like, previous experience in culture and open source and another product like jQuery and other projects. And so they understand what that's like, and they see where my passions are and they want to support that, because I'm still working on other things, and then the things I'm doing only help us too. So, like, they were able to justify it. Um, it's just that in the end, I would have this, my own guilt of not feeling good about doing open source, even though it is my job.
Starting point is 00:28:50 And like, you know, we have deadlines. And so it's like, I would always tend to, like I wanted the open source, but I would be doing my other work. And it's just like a weird situation. And, you know,
Starting point is 00:29:01 everyone was really supportive. It's just like mentally, I don't know if I could take like doing it and then doing this other thing and so in the end i was like i think i need to just figure out do i actually want to do this i had lots of conversations with my boss about you know what is it do i actually want do i just say i want to do open source because it sounds like a good idea like full-time and you know the reality is like a lot of people don't want to do it full-time because it should be just like something that's fine and like you shouldn't have to like worry about your whole day like every day and dealing with like community and all these other things
Starting point is 00:29:34 which are not just purely coding and so but then I realized that's the part that I like I like working on those non-technical parts anyway they They were totally on board when I said, oh, I want to leave. When I made the decision, they knew I was sure and that they knew I wasn't just making it up or something. They're like, okay. My boss was like, if anyone can do it, I think you can do it. Were you concerned in those moments? I can just imagine me, so I'm not you obviously.
Starting point is 00:30:04 I would be nervous. I would be like, gosh, man, what are they going to think? Are they going to like, you know, is the world going to think I'm an idiot for doing this? Like, is this a wrong move? Like, take us through some of the, you know, like the morning of. Like, I know today I'm going to give my resignation. How's it going to go?
Starting point is 00:30:20 So I don't think I ever thought like people think it's going to be dumb or bad. I actually never thought that it was just more like what's gonna happen to me and like am I just running away or not running away I'm seeking after something and like not taking the like risky road instead of just having a safe comfortable you know job and working on it at halftime like maybe I should just be okay with working on halftime, which is already better than a lot of people. But I guess in the end, I was so, I felt so convicted to want to do it that much
Starting point is 00:30:51 that I decided that was the right thing to do. And when I went there, I felt pretty confident in going in and doing it. And in the end, I didn't really have any issues when I got to that point. But before it was like a lot of just not even wanting to think about it in the first place. Cause you're like, that's such a dream or that's such a like impossible thing
Starting point is 00:31:09 that I never just thought about it seriously. And so trying to do that, you know, thinking about the taxes and insurance and all that, you know, just to try to make it seem more realistic. Not that in terms of like, it's easy or hard, just that those are things I need to think about if I'm going to do this. Maybe I'm maybe I'm just missing it. But what's a day like in your life? Not so much Henry's life, you know, like get up, brush your teeth kind of thing. But like, you know, what is being an open source to you? What is it you said you get excited about community, the things that they weren't willing to pay for?
Starting point is 00:31:43 What is it that are the intangibles that are not so much, like, code level, human level? What's that for you? Yeah, I think so. That's something that I kind of talked about this in my talk. It's something I always wanted to do, but I never really did it. Like, I would kind of half do it because I would, it's like you have a picture in your mind of what a maintainer should be. And usually it's like about code. And so even if I wanted to do something else, I knew like, Oh no,
Starting point is 00:32:08 I'm in like this, I have like this box where it's like, Oh, this is what a maintainer should be. So I was just focused on code, but now it's like, okay, I don't really know what it looks like,
Starting point is 00:32:16 but I want to figure out those things. And so that means like, um, right now I'm only just focusing on getting funding and that kind of thing, but I do want to spend my time like, okay, should we make a Babel meetup where instead of giving talks, it's just a meetup where we come and I'm there and I'll teach you about how to contribute to open source or get involved in our project.
Starting point is 00:32:37 Instead of waiting for people to randomly show up in our GitHub making a PR or an issue, obviously everyone's very intimidated. So if I'm there, then people will feel at least willing to show up and see like what's this about you know maybe we make a podcast maybe we do live streaming or we start doing more like video chat instead of just talking on Slack or Twitter like just things that make it so that I seem more accessible and human instead of just like some person and also so that people know that there's like a few people working on this not just like a huge company and like a huge team or something like that it's like a few volunteers but yet you're using it and then people feel like oh I'm not good enough but it's like I got started in the same way not
Starting point is 00:33:23 knowing anything, right, accidentally, and I'm here. And I want to help other people get to that if they want to. So in a lot of ways, you're the on-ramp to Babel slash future open source work. Yeah, I think so. And that on-ramp is defined by either face-to-face meetup workshops, intro to open source, first contribution to Babel, or XYZ project, or here's how you use GitHub, or you name it.
Starting point is 00:33:50 It's hard because then if you're doing the one-on-one or that kind of thing, it doesn't really scale. But I think maybe we get caught up in this idea of just wanting to tell everyone about open source and everyone should do it, everyone should be a maintainer. But I don't think it means that people can't't it's just that i don't know if they know what they're getting into and so we we want to make sure that they it's not like a prepare it's just like in a lot of cases you're warning signs yeah not in a bad way but like it it takes
Starting point is 00:34:18 you have to be committed. It's not that the road's bad or hard. It's just a matter of like, you gotta be committed to do that kind of job. Yeah. And, and I think it's also fine if like people want to do those kind of, I guess we call them like drive-by PRs or whatever. Like people like, maybe you just don't have contributors.
Starting point is 00:34:34 Yeah. Yeah. You don't, maybe you don't have time and you don't have, or you can't do it at work or you have kids or your family, like all these other priorities, you still might want to do open source and there should be a way for you to contribute like maybe once a month or, or once or whatever. And that's fine. It's just
Starting point is 00:34:49 that if you like go about it in a certain way, you might find that you're taking on too much. You don't realize you're like, Oh wow, why did I suddenly, you know, make my own project and the millions of people are using it. How do you think Babel will benefit from now you being essentially, are you, is there anybody else full-time on Babel? Is there a certain thing? Like, is there a full-time on Babel at all for, for the project? No. Um, the other main contributor, Logan, he, he isn't full-time, but he left his job a while ago and he's my other partner in doing all this and everyone else is volunteer. So like, we don't have any full-time people. And I mean, it's weird, like I say
Starting point is 00:35:25 full-time but it's like it's full-time for the project but it's not full-time for the code like right now i'm not basically not coding anything i'm just like trying to raise awareness or get funding which are like sales i guess right and all these other things development yeah business development that's still part of it it's just not what people normally think. And I'm mostly just doing like reviewing and like very like meta level things that I hope are like long term things that we would never consider before. Like before when you're at work or you have limited time, you're always thinking about like taking out fires or like this short term things. But it's like if you're full time, well, then why aren't we doing these longer term goals
Starting point is 00:36:04 like getting contributors, investing in people so that maybe some of them will be committed instead of just asking 100 people to make a good first PR. So you decided to go with the Patreon route. And I know I'm not trying to get into politics on the different routes you could go to seek funding. So I'm not asking you to share the advantages of this platform versus another. But what is it about, not so much Patreon, but how are you incentivizing people and or corporations or companies to support you? Who is the customer that you're seeking? If you're doing business development, you're seeking out who can fund you, who can believe in this this mission who can help sustain you to make it happen
Starting point is 00:36:45 like who is that person and or company i should probably have a better specific group of people but there's a lot of different people that it could be so i'll first i'll say that we we have an open collective which is our other crowdfunding site that's for the project itself so i can take money from there which is good because i'm not making enough from Patreon right now. The reason why I picked Patreon is because it's a way for certain people to be able to invest in a person rather than a project. I think there's going to be people that would just rather donate to a person. Other people would rather donate to a company. Maybe companies would rather donate to the open collective or Babel itself. In terms of incentives, I don't really want to go out of my way to make incentives and i think that's true for anyone that does patreon because it becomes a job in itself
Starting point is 00:37:30 Yeah, of like getting people to, like... And then the Kickstarter where you say, I'll just give you a bunch of free stuff, and, well, just give me the thing. Yeah. I don't want the swag. I just want the thing. Yeah. And you're like, am I gonna ship people stickers and stuff? It's like, that's like a lot of work that you have to do. So all of my incentives, I've kind of tried to make it funny, actually. So like, I have one incentive that's like $11 a month, and it's like the ping pong tier, because like ping pong, you play to 11. And so it's like, if you donate to this, I'll play you in ping pong. It doesn't mean I'm not gonna play ping pong with you. It's just like, that's like a reason for you to pick this. Yeah, pick that. Um, I did one for video games like Mario Kart. Like I had a 50cc tier. So I was just trying to get creative. Um, you don't really get
Starting point is 00:38:15 anything per se it's just like if you want to donate that much you can and i did one for like board games just the things i'm interested in maybe they're interested in that too so those are just individual people and i guess it's going to be difficult to get that because it's more of an awareness thing right like i feel like once people know they might decide to do it if they know me but for the higher tiers you know i might have like a hundred five hundred or a thousand dollars a month and that goes to the basically emulating what we did in open collective, where it's more like, we'll put your,
Starting point is 00:38:47 your logo on our website. It's more of like an advertising thing. And then that is kind of, it's not risky. It's just, I'm going to have to keep pitching because once someone's like, Oh, we're not getting enough out of this logo for a thousand dollars,
Starting point is 00:39:00 we don't need you anymore. And then we're not going to donate. So like what happened was like a few days ago i had like three thousand a month and then like suddenly like two people dropped out they both were giving a thousand uh for this next month so now i'm already down to like a thousand and that's because you know these two companies they're probably not that big they're not like google it's just like two people or something. The goodwill or charity almost. Yeah. And then maybe they want to support me for a month or they have their own money issues. So it's like, I don't expect them to do that, but it does suck when it's like, you're like living off of this number where like, oh, one day it's all gone. And then you're like, okay, now I got to
Starting point is 00:39:38 think about that. The ups and downs of, you know, crowd, i don't want to say crowdfunded i couldn't think of any other way to say it but like donation based living i don't know like you're you have that number you look at your own patreon you see what's coming in or what you think is going to come in or what's expected and you probably plan life around that it should also be said that you live in new york city right it's not exactly cheap to live there right i'm sure you got fairly you know even if you trimmed your expenses you'd be like what kind of what size is your apartment no i don't want to say it's smaller because maybe you're in the smallest one already i don't know yeah it's not
Starting point is 00:40:13 enough to cover the rent and then not have to pay for insurance too so even the base of that is already very high but the good thing is that our open collective has a good amount of money there now, so I can at least use that. And then the other plan is to, like, reach out to companies, which I've done, and hopefully those go through soon, where it's like, okay, one idea is this idea of a support contract, which is something that Webpack's doing, where you pay a certain amount, say it like a thousand dollars a month and then we'll give you two hours of time and to help you with whatever like and then that means i can work a lot less hours and you can still get paid um and usually like if you can convince them to do that that would be really good and you only have to do that for like 10 companies it's just but the problem is like we have none it's like it feels like there's like a huge potential. But right now there's none.
Starting point is 00:41:07 So it's like, where is that? So I would say after hearing that, I would say maybe you could treat the human side of this equation for you. That these companies are also humans, right? Like companies aren't just companies. There's humans behind that. Users of Babel. Potentially even contributors of Babel, right? Potentially, who knows?
Starting point is 00:41:27 And that maybe this is a growth hack that you're using to, one, keep in touch with your constituents, right? And also potentially get some, for lack of better terms, sales. Yeah, I think it's definitely a good way in. And it doesn't have to be two hours. It could be more and they can just pay more. I can have a lot of other ideas around what if they want to be contributors? Well, you can pay me to help you get into open source.
Starting point is 00:41:55 Although I'll probably still have the meetup for free. But then for companies, it's like, well, we can charge them. Come to a workshop. How to be open source. Before, I was like, what kind of workshop would I i do and i realized there's a few i could do one would be teaching people about javascript itself i think it's pretty easy to market like the maintainer babel teaches you javascript and then there's how to get involved in open source and maintaining open source if someone wants to do that for their own open source and then the last thing is like how do you contribute to babel itself and then talking about that some companies would be interested i
Starting point is 00:42:28 mean i think most would just be like oh can you give a talk at our company yeah i mean well you just put it into numbers for me so that's what made me think about that which is like if you can say well i need for lack of better terms have 10 companies committed to doing two hours because it equates to this number, you know, cause then you can start to determine it's going to be there and start to live like it's going to be there and have some certainty that you don't have now. I think, well, how do you reverse that is, is, you know, have some relationships like what I've learned from our business and you probably know this already, but you know,
Starting point is 00:43:01 the biggest part that makes us successful is that we care deeply about not just our listeners, but also our sponsors, which we, we really call them partners, but they, the industry acceptance term is sponsor. We really call them partners. People who work with us that sponsor our content are our partners. We work with them. We form a great relationship and it's only beneficial if we add value to them. Like sponsoring our podcast isn't charity. They get value and it's up to us to help them understand how they get value and help them to get value, not just read an ad, you know? So in your case, maybe it's the relationship side is, is the equation is that, uh, invest
Starting point is 00:43:42 a little in those people to, to get them to invest in you a little. Yeah. That's just one way though. That's just one of the things I'm just thinking about. That's interesting though, man. So the uncertainty, do you sleep well? I mean, given, I'm not saying you're in a bad situation because you're not, but just, you know, there's some uncertainty. Do you fret? Do you get upset? Are you, how's life for you right now? It's interesting. I guess I'm not, I don't know, oddly, I'm not like too worried.
Starting point is 00:44:10 I mean, in the end, I do have, the backup is just I get another job and it's fine. And that's totally. It's experiment. Yeah. And it's just that I guess I'm willing to like take it as far as I can. And I don't really want to think of that as an option at all, really. Do you have what they call runway? Are you like a mini Henry Zhu startup?
Starting point is 00:44:30 You got runway? I have some savings, so I think I'm good. But unfortunately, there's not that much rest in terms of thinking about what's coming up. And yeah, there's definitely uncertainty. I don't know. I guess I'm very optimistic. This is your twist, though, right?
Starting point is 00:44:45 You chose to do this. So clearly you can't be that upset about your circumstance because you chose the circumstances. Right. Someone didn't force me to do this. People would probably tell me not to. And I have to go out of my way to be like, okay, I think I truly believe this is a thing and maybe writing the blog post or giving this talk helps me to like form my thoughts better to think, okay, this is a something I want to pursue and like a goal I want, not just for me, but like for other people too. So. Okay. Other people too. So are you in a position
Starting point is 00:45:17 yet to advocate others to follow in your footsteps? Are you still in an experimental stage only, to the point where you're like, let me try this out for a bit and then I'll let you know? Or what do you expect for, I mean, I know that for us, a buka dj is also doing it. There's several others. We had a small list of people we actually want to do a panel show with. You were one of them, around this topic of like, you know, going full-time, you know, crowdfunded, open source maintainer perspective. So, I know. I actually, I don't think people should do it. Like, okay, uh, you really, well, for me, I spent like a whole year thinking about whether this should be a thing or not. I didn't want to like go rash and just like quit my job because I don't want to do this anymore and, and just do my thing. It's like, I want to really know that this is what I
Starting point is 00:46:01 want before i do it but i didn didn't figure it out before making decisions because I felt like then it would take forever. We had to at some point take a leap. Yeah, and that's what I did. And I don't think my goal with doing it is to convince everyone else to quit their job and do full-time open source. That's definitely there.
Starting point is 00:46:20 I just don't see that as that viable at the moment. It's more like I would rather help people to be more aware of open source in general and how we can support people that are doing open source, whether they're full-time or not. And probably the most important thing is like, how do we get companies to either sponsor projects they use or allow their employees to work on open source at work?
Starting point is 00:46:42 And that's probably the best way because they're usually unwilling to give money, but they're willing to let their employees work on the things that they want if they're going to ask for it. But the problem is that a lot of employees don't. And there's a lot of reasons, whether it's like being intimidated
Starting point is 00:46:59 or not knowing what to contribute to or not knowing how to say it. But I feel like there's a lot of people that would want to do that, especially if they're busy outside of work. They like, not everyone has a privilege or even wants to work on code. Like you're coding all day and then you go home and you're going to code again. Like it makes sense that you wouldn't want to do it. You may have answered some of this there, but I'm not going to ask you for the five-year plan, but I'm going to ask you for some project for me a time range. Like, what do you, I like, this is how I kind of work. What is success, right? A year down the road, a year and a half, what does success look like to
Starting point is 00:47:33 you? Like you've done this, it's successful. You can advocate others to try it. Given some circumstances, for example, maybe have put in some time, have a relationship with the community, not just jump ship, of course, but, you know you know give me maybe what do you think success is to you a year year and a half down the road or just whatever time frame makes sense for you well one measure of success that i might think of is like that the idea that i could just like go away and leave the project and it's everything's good and so i could leave it now and i mean it's like i kind of talked about this in the talk like you have this sense of like pride where it's like like if i leave everything's gonna go bad and no one's gonna take it up but that's not the case i'm not that irreplaceable but obviously i still feel like i'm needed or whatever that means and it's like when do we get to a point where we have
Starting point is 00:48:22 enough contributors where, like, you know, that, I don't want to say the bus factor. My boss, he said there's the lottery factor. That's like a better way of putting it. Yeah. I haven't heard this one. Yeah. It's really cool. It's like, oh, those people won the lottery.
Starting point is 00:48:33 Now they're not going to work on this anymore because they can do whatever they want. Okay. So they didn't die. They just hit it big. Yeah. Okay. So how do we increase that? So like people can leave and I don't actually, it doesn't have to be I can leave the project entirely.
Starting point is 00:48:45 It's just more like we all feel comfortable. Everyone feels comfortable to just take a break. Not just because mentally they were able to overcome their issues with that, but that everyone feels good that, oh, there's a lot of people working on this and everything seems to be going well, that kind of state. Which maybe that doesn't really, like, I don't know what that looks like. So it sounds like, if I understand you correctly, that you think success, or you're projecting that success, is Babel being in a state where you can
Starting point is 00:49:16 leave without any concerns of it imploding, for lack of better terms. Yeah, that's one way to say it. And when I say that, it doesn't mean I don't want to work on it anymore. I still really want to. It's just like, I think it's a good attitude to have when we're doing this kind of thing. And I think that is a good level of success because it's saying that I don't have to be around and that it still moves on. Yeah. And maybe then you won't have anxiety about whatever the issues are. And maybe you can work on other open source, or, I think for me, like, I really like working on Babel, but I think another thing I like is just thinking about open source in general, and it doesn't have to be tied to this project. I just happen to really appreciate what this project does because it's unique. That's interesting. I mean, I've only, I guess my, my perspective so far has been, you full-time on open source means Babel involvement. Sounds like that's not exactly, not that you have plans, but just that your mind is open to one day Babel not needing you, that you may have skills and, and abilities to be put elsewhere, whether it's another project or just a new thing or whatever. I mean, for any open source product, it's going to go away eventually. Yeah.
Starting point is 00:50:21 um in Babel's case it's not really true in the same way as most because the funny thing is that most projects, maybe it's just people stop using it, but Babel, the whole point is that you're implementing syntax that goes with JavaScript. So as long as people still use JavaScript and we have new syntax, then there's always going to be a need
Starting point is 00:50:40 for Babel for the people that want to use it. So in that sense, it's kind of like always going to be there. But people get bored or whatever. I don't think I'm going to get bored with it, but the reason why that want to use it. So in that sense, it's kind of like always going to be there. But people get bored or whatever. I don't think I'm going to get bored with it. But the reason why I want to look into other things is just it will help me think about open source differently. And maybe it's just I wouldn't leave the project.
Starting point is 00:50:56 It's just like I would maybe look into how other people are doing things or get involved, and that would help me do it better in what I'm doing. I always get uh this guilt-free kind of perspective when i think about it like bands right like it you know sometimes a band might tour with another band or this band or the lead singer might go to this band and you know do a crossover it's never like they're leaving their band unless they actually do leave their band but it gives you the freedom to sort of cross-pollinate. And there's a lot of underappreciated opportunities in cross-pollination. Yeah, and I think at least I should be trying to look, like, if I have the time, then I
Starting point is 00:51:31 can look into what are the projects that are related to Babel. It doesn't even have to be, like, a different language or something crazy like that. It's just like, oh, Webpack is used with Babel, so maybe I should learn more about how they do things and we can work together, or Vue or React. I think that's a good way to kind of naturally look into other projects. I had this idea in my mind of what does it look like to be a JavaScript
Starting point is 00:51:53 ecosystem maintainer? So not just one project, but a lot of what that is. I mean, that seems really burdensome, but I think that would help coordinate things or at least talk to people instead of just like kind of like we're all in their own little like isolated bubble or something yeah anything i didn't ask you that you want to share that just i just missed i don't know i guess
Starting point is 00:52:14 my talk today was just about like how we can think differently about open source uh not just like getting things for free but like actually helping and serving people. And I think the values that are there, maybe we don't really emphasize much, and we kind of get influenced by how non-open source works. So whether it's adding transactions or thinking about things in a very robotic way, when open source has its own views on things. And we kind of just take those things
Starting point is 00:52:47 and we copy them because like that's just the way we do things and that's probably why why do we keep seeing like all this behavior and we can change the medium in which we do open source such that it is you know building up people it is bringing like community and and those kinds of things and i think we should be thinking about that more like I don't really have an answer and it's going to be really hard because this is all like non-technical versus like the code itself um but yeah I'd like to think like what are the like the habits or things that we can create such that we can enforce, not enforce, but reinforce the ideas that we want, like in our mind,
Starting point is 00:53:27 the ones we want, but in reality, we don't act those out. This episode is brought to you by our friends at GoCD. GoCD is an open source continuous delivery server built by ThoughtWorks. Check them out at GoCD.org or on GitHub at GitHub.com slash GoCD. GoCD provides continuous delivery out of the box with its built-in pipelines, advanced traceability, and value stream visualization. With GoCD, you can easily model, orchestrate, and visualize complex workflows from end to end with no problem.
Starting point is 00:54:09 They support Kubernetes and modern infrastructure with elastic on-demand agents and cloud deployments. To learn more about GoCD, visit gocd.org slash changelog. It's free to use, and they have professional support and enterprise add-ons available from ThoughtWorks. Once again, gocd.org slash changelog. Sitting down talking to Simon Willison here, getting in the groove talking about data. Like, you are so, like, more than anybody I've ever met, excited about data, bro. Like, I've never, like, earlier, I didn't tell you this because I was just enjoying the moment, but you got really excited about some data. Oh, yeah. So, um, back in 2009, 2010, I was working for the Guardian newspaper in London
Starting point is 00:55:07 where it was basically, I was hanging out with journalists helping solve data problems. So it was in the world of data journalism. The Guardian. At the Guardian, yeah. This is like the mecca of like, you're from the UK, right?
Starting point is 00:55:19 Yeah. It's a very, very high quality UK newspaper. So if you're working for a newspaper there or anything journalistic, the Guardian is a good name to have on your resume. Yeah, that's definitely true. Okay. And so when I joined The Guardian, they had this fascinating onboarding process where they basically set you up on coffee dates with people from all sorts of different departments around the newspaper. So you have coffee with the sub-editor and then you have coffee with somebody who's involved in the print presses and so on and so forth.
Starting point is 00:55:44 And those people will talk to you and they'll introduce you to other people. And after a few of these meetings, uh, a bunch of people said, you know, you really need to talk to this guy, Simon Rogers, uh, one of the journalists. Because Simon Rogers was the, um, the journalist at the Guardian responsible for the infographics, where you publish a graph from the newspaper, somebody has to phone up a bunch of government agencies and gather the data. And he'd been doing this for years and was really good at it. He could get data on anything. And all of this data, I asked him where it was, and he said, oh, I've got it in Excel spreadsheets on the computer under my desk. And I'm like, this is gold. So we got together, we started putting things out. We ended up thinking, okay, the easiest thing we can do is start a blog. We'll start the Guardian
Starting point is 00:56:24 data blog and we'll publish these things as Google Spreadsheets. And we did this, and the Guardian still has a data blog today where they're publishing data behind the stories. Wow. And that was really exciting, except Google Spreadsheets always felt like a bit of a shortcut to me. I mean, it worked, and all credit to it, that was fine. But I wanted to do something better. So last year, I was still thinking about this problem. I've moved on from journalism and had all sorts of other adventures since then. And I realized that the combination of SQLite as a database format and Zeit Now as an immutable hosting platform was actually a really interesting opportunity for
Starting point is 00:57:02 publishing data because if you can take any data at all, you can always wrap it up into a relational database. They're really good at that. If you wrap that up into a SQLite database file and then publish it, you can build APIs on the top, you can build an interface on the top, you can start sharing data that way. So essentially, I've been building the software
Starting point is 00:57:18 that I wish I'd had back in 2010 when I was working at The Guardian. Really good for databases that don't change. Exactly. Meaning like, this is data that's never going to change. In stone, it's done. It's facts about the world. Some of the examples I showed in the talk today,
Starting point is 00:57:33 one of my all-time favorite data sets is the list of trees in San Francisco. It's maintained by the Department of Public Works. They publish it through their open data portal. It's basically a CSV file with 190,000 trees. And each tree, it's got the species and the location and often when it was planted and who looks after it.
Starting point is 00:57:52 And it's just sat there, this gorgeous data. So I took that, I turned it into a SQLite database and I've published that as an API and then started building things on top of it. I've got a website, sf-trees.com, which is a search engine for trees. So you type cherry and click a button, and it'll show you all of the cherry trees in San Francisco.
Starting point is 00:58:10 It turns out that's the website I always wanted to build, and I just never knew until the data was in front of me. The question about the data, I guess, is that, you know, if the data doesn't so much change from the past, but there's new data, how do you deal with new data? So when you're deploying with static hosting like Zeit, like Zeit Now, you just deploy a new copy. So the tree data is actually updated pretty frequently.
Starting point is 00:58:32 So every now and then I'll pull a new copy of the CSV file, I'll turn it into a SQLite database, and I'll publish, I'll just overwrite the old thing with the new thing, and it works. I think it's like about 80 megabytes when you deploy it, which is small enough that it doesn't really matter, and you can just keep on shipping new versions. So for the uninitiated out there who are not very familiar with Zeit Now,
Starting point is 00:58:53 you mentioned it was immutable. That means that you can't write back to a database. So databases are immutable. So Zeit Now, it's hosting where everything is fire and forget. You bundle up your code and other assets. It turns it into a Docker container. You fire it up to them. They deploy it, and they give it a URL which will never change.
Starting point is 00:59:13 So it will always live in that URL. And in 10 years' time, if somebody hits that URL, they'll spin it up, and they'll start serving requests from it again. I think Guillermo said it was called immutable deploys. Is that right? Immutable deploys, exactly. And then you can also set an alias. So you can say, you know what?
Starting point is 00:59:26 Okay, that URL is going to stay the same, but I'm going to point sf-trees.com at whatever the current version is. So you get atomic deploys as well. You deploy a new version, you test it, and then you switch the alias over once you're certain that it's going to work. And it's a really nice way of working.
Starting point is 00:59:42 The downside is that it doesn't give you a regular database that you can post writes to. You can get those, but you have to get those from another vendor like compose.io or Heroku Postgres or something like that. But if your data doesn't change,
Starting point is 00:59:56 if it's say a list of 190,000 trees in San Francisco, you can package that up in a SQLite database and deploy it alongside the rest of your code just as part of that regular deploy process. And that, it turns out, is a really cunning trick for doing all sorts of exciting things with semi-static data. The demo you showed earlier,
Starting point is 01:00:14 we did it before your talk, but the demo you did was around, it was fun, possums. No, that was the... That was a slightly different project. Okay, yeah. Yeah, a slightly different project, but you're trying to essentially take data sets,
Starting point is 01:00:30 deploy them now, and be able to perform searches on those. And in this case, you wanted to auto-deploy a new site based on search parameters, essentially. I've built a few different tools. They're all open source. They're available on GitHub.
Starting point is 01:00:40 The first one is this script called csvs-to-sqlite, which is just a command-line tool that takes CSV files and turns them into a SQLite database. And that's all it does. It's like a one-shot pony. It can do a few extra things. You can tell it to extract this column out into a separate table. You can tell it to make things indexable
Starting point is 01:00:57 using SQLite's full-text search. But essentially, you run a command, and CSV goes in one end, and a .db file comes out the other. Right. The bigger tool is this tool I've built called Datasette. And Datasette is a couple of things. Firstly, it's a little web server that, given one or more SQLite databases, gives you an HTML user interface for browsing that data. So you can click on it and reorder by columns and filter it and so forth.
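The "CSV goes in one end, and a .db file comes out the other" step can be sketched with nothing but the Python standard library. This is a minimal illustration, not how csvs-to-sqlite itself is implemented: the function name csv_to_sqlite and the sample rows are assumptions made up for the example, and every column is stored as plain text.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_text, conn, table):
    """Load CSV text into a new SQLite table, one TEXT column per header."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)
    # Quote identifiers so headers containing spaces still work.
    quoted_table = '"{}"'.format(table.replace('"', '""'))
    columns = ", ".join(
        '"{}" TEXT'.format(h.replace('"', '""')) for h in headers
    )
    conn.execute("CREATE TABLE {} ({})".format(quoted_table, columns))
    placeholders = ", ".join("?" for _ in headers)
    # The remaining rows of the reader are inserted as-is.
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders),
        reader,
    )
    conn.commit()
    return len(headers)


# Hypothetical sample rows, not the real San Francisco tree data.
SAMPLE = "species,address\nCherry,100 Example St\nOak,200 Example Ave\n"

conn = sqlite3.connect(":memory:")  # pass a file path here to get a .db file
csv_to_sqlite(SAMPLE, conn, "trees")
rows = conn.execute("SELECT species FROM trees ORDER BY species").fetchall()
```

Swapping ":memory:" for a path such as "trees.db" would leave a single database file on disk, which is the kind of artifact the conversation describes deploying alongside the code.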
Starting point is 01:01:21 And it also gives you a JSON API. So anything that you can see on screen, you can get back as JSON. And this means that you can take a CSV file, turn it into a database file, and then turn that database into a JSON API that you can run queries against from JavaScript. The final piece of this is the datasette publish command, which is a command-line tool. It'll take that SQLite database, publish it on the internet with Zeit, add the Datasette application itself, and essentially in one go wrap the whole thing up and turn it into data that you can access at a URL with a JSON API and an HTML interface. And then on top of that, I built another tool called Datasette Publish, which is a web app that does all of this for you, so you don't have to install any software on your computer at all. You go to publish.datasettes.com,
Starting point is 01:02:06 you upload some CSV files, click a button, and it will deploy to your own Zeit account that combined database, all of the code, and just get things up and running. So the idea is that if you're somebody who isn't comfortable
Starting point is 01:02:21 installing Python command line tools you can still use this suite to take your data and turn it into something that's browsable and explorable. You mentioned in your talk that you're really passionate, I think either it was in the past or you're still currently passionate about this, but like data journalism. You mentioned the Guardian in your past.
Starting point is 01:02:37 This is something where journalists are not typically programmers. They're not typically familiar with these tools. Some of them are. Data journalism is a very specific sort of subset of journalism where you've got journalists who are all about Python and R and Jupyter and they're very familiar with this stuff. And they have conferences where they all get together and share tips. And these are my people. I'm a huge fan of what they're doing. And that intersection of skills, I think, is really interesting. But most newspapers can't afford to hire people who have that software engineering background.
Starting point is 01:03:11 You know, the really big papers, the New York Times, the Washington Post, they're all doing this stuff. But if you're like a little local newspaper, the chances that you can hire somebody with software engineering and journalism skills are pretty slim. What does it mean then, since you're building these kind of tools? I mean, one, you're having fun, you're here at Zeit Day, you're showing off some cool tools that you've built, obviously showcasing the performance and abilities of Zeit, or Zeit Now, as a matter of fact.
Starting point is 01:03:36 When you start to look at how some of the tooling you're building applies to data journalists or just people who are curious about data, essentially taking insights from data, what are you trying to build for them with what you're doing? So the starting point, I think, is that CSV is actually a pretty awful format for sharing data. But it's what everyone uses because... It's a standard. It's a, well, a semi-standard.
Starting point is 01:03:57 Once you get into quoting rules and stuff. It is ubiquitous. And Excel can produce it and consume it and so forth. So that's fine. But it's not the most useful format. And actually, when you look at CSV files, where they really fall apart is when you are dealing with something a bit more relational. You know, you get CSV files where a bunch of the columns are duplicated hundreds of times
Starting point is 01:04:16 because they didn't have the ability to bundle it with a second CSV file that's just got one set of the data in. SQLite databases, I think, are the perfect format for this kind of stuff. SQLite itself is crazy stable. A SQLite database from 10 years ago can still be read today. It's ubiquitous. It's one of the best tested pieces of software I've ever seen. And it's in everything. My mobile phone runs SQLite. My watch, it turns out, has a SQLite database in it that tracks my step counts, which I can't get access to because I have to jailbreak my phone in order to get the database out of my watch, which is very frustrating.
Starting point is 01:04:54 Come on now. But, you know, it's everywhere. And so if I can help show people that if you're going to share data, sharing it as a SQLite database is actually a much more powerful and efficient way of doing it than just as regular CSV files. That's fantastic, especially if I can provide a toolkit that will take that database and spit CSV back out of it. So you're not losing anything by using SQLite.
Starting point is 01:05:15 You're gaining stuff, and you still get that CSV export as well. So you got CSV to SQLite. What about the other way around? Not yet. That's very high on my to-do list is that feature. Prominent open source. You're doing some cool stuff there. The repo you mentioned, that is
Starting point is 01:05:31 open source, right? Yes. csvs-to-sqlite is open source. Datasette's open source. They're both under the Apache 2 license. Explain Datasette real quick, the spelling at least. It's spelt like the Commodore 64 tape drive. So it's D-A-T-A-S-E-T-T-E, like a cassette, but for data.
Starting point is 01:05:51 Datasette. Cassette. So think about that when you go there. We'll have some show notes, so don't worry about that. But Simon, what is it that gets you so excited about this data? Maybe a better question might be, what is it about datasets like this that gets you excited? The San Francisco trees is probably my favorite data set. One of the other ones I really like is one from the USGS. It's a data set of polar bear ear tags in Alaska.
Starting point is 01:06:16 So in 2009 through 2011, they stuck GPS trackers on a bunch of polar bears and had them wander around Alaska. And they got back latitudes and longitudes and battery levels and temperatures and all of this stuff, and they published it online as CSV files. So I grabbed that CSV file with 40,000 known locations of polar bears. I loaded it into Datasette and then I used it as the demo for the first of my Datasette visualization plugins. So I've been building out this plugin infrastructure so we can add additional features. And the first feature I built was one which looks for latitudes and longitudes
Starting point is 01:06:49 and then sticks them on a giant map with clustered markers so you can map the whole lot at once. And when you put 40,000 polar bear ear tags on a map, and we'll definitely have a link to this in the show notes, one thing that's interesting that shows up is that most of the polar bears are in Alaska, but about 200 of those trackings come from Seattle. And that was an interesting mystery.
Starting point is 01:07:08 I'm wondering what these polar bears are doing in Seattle. So I zoomed right in on the map where these ear tags were, correlated it to Google Maps, and realized that there's a company called Wildlife Computers in an office park right there who sell ear tags for scientists to track polar bears. And so evidently they'd been testing the ear tags at the home office, and that data had ended up in the data that was published as part of this survey. I don't know if the scientists who published the data would even notice, because they might not have that same visualization that shows them where the clusters are. So they could be counting fake data. Possibly. Who knows?
Starting point is 01:07:45 I need to figure out who to get in touch with and see if they've spotted this. So here's something I thought about while you were sharing that story, was that if you reverse the scenario, like as a data person, like you concerned about analyzing polar bears, for example,
Starting point is 01:07:58 if you had started out with your desire to have the data, they gave away or just had available massive amounts of data that was so valuable to you that it was just not really valuable to them? Well, a wonderful thing about the US government is that they release an awful lot of stuff. I think the default within government is often to release the data.
Starting point is 01:08:22 Right, it's open. It's wonderful. It's part of the OpenGov initiative. It's that, but also, I have an interest in zeppelins and airships, and it turns out that the US Navy had some really cool zeppelins in the 1930s, and the photographs
Starting point is 01:08:37 of those are all freely available, because every photograph taken by somebody in the Navy is copyrighted in a way that it can just be released by the government. So if you want photographs of awesome airships in the 1930s, the U.S. Navy Photo Archive has all of this stuff for free. I don't think about data quite like you do, but it sounds like there's just a plethora of opportunity.
Starting point is 01:09:06 Where do you see open data sets, immutable data now, what have you, some of the things you've done here, where can you see interesting projects that are exciting to you, or things that could be done, not so much tomorrow or next year, but over half a decade, a decade? Where do you see some of these ideas you're shaping going, to utilize public data in useful ways to analyze society and give back answers to make a better future? So I feel like we've spent a lot of time as an industry sort of dreaming of the semantic web, trying to say, okay, if we can get all of the data in one sort of standardized format, then we could build wonderful things with it. And we've been trying that for 10 years, and it hasn't really worked that well so far. But I think that's because that's sort of a boil the ocean approach.
Starting point is 01:09:46 Try and come up with the perfect standard for data and then get everyone to do that. What actually does work is publish the data in a format that people can use and then watch people integrate with that and say, okay, well, I'm going to do a custom query against this, something custom against this, and then combine those together and build something myself. So one of the things I want to do with Datasette is not so much establish a standard, but just make it as easy as possible to put data out there in a way that people can automatically
Starting point is 01:10:13 query it, that can pull from a JSON API, and then watch what people do with it once it becomes available to them, even if they have to do a little bit of work to clean it up and reformat it for their purposes. I'm just amazed that you can find such use with such a simple CSV, for example. So much data out there available. We've done shows in the past about open government. We've done shows in the past about just open cities, like the city of Chicago talking about manhole covers and stuff like that.
Starting point is 01:10:42 Just interesting. Do they have a CSV file full of manhole covers? I'm sure they do because they have the register. Those things are expensive. Yeah. Those things are like, I don't know, 500 bucks each or more. The manhole covers are super thick. They're completely metal.
Starting point is 01:10:56 And if they get stolen, they've got to replace them. So think about the cost. So they have to track them. I remember talking about this, it was years ago, I can't recall the exact show, but we'll put it in the show notes if we can. But just all this public data out there, and someone like you that gets excited about it, being able to make it useful to people who are not exactly programmers, you know, and then advocating for ways to share data, in CSV, share whatever you can, just to make sure that you can share it with whomever, so that it can be reused in these cases. I mean, with all of this
Starting point is 01:11:29 stuff, the real delight is when people build things that you weren't expecting. When somebody takes data that you have made available and uses it to solve problems in an interesting way, that's always a delight. You know, if you're building open source software, the best part is when somebody uses your software for something that you'd never imagined it would be used for. That's certainly something that gets me very excited. Maybe some parting advice then for anybody listening to this segment here that is half as excited as you are about some of this stuff. Where are some good starting places? What are some skills you've developed over the years that have made it easier for you to work with and munge data to do some of the things you've described here? So as a Python programmer, one of the things
Starting point is 01:12:09 I love most about Python is it's got an interactive prompt. And this is true of all of the other programming languages that I like working with. And interactive prompts make it so much easier to manipulate data, to suck things down, to reformat them. I've recently been learning my way around Pandas, which is the Python library for dealing with tabular data. And if you combine Pandas with something like Jupyter Notebooks, it's an incredibly productive way of sucking data in, previewing it, formatting, and then you can write it out to a SQLite database
Starting point is 01:12:37 and publish it in that way. So I think the data science community has all of these tools as well, which are constantly being developed, and so keeping in with what's going on over there is super useful for that sort of, for those tools for pulling things down, for cleaning up
Starting point is 01:12:54 and for generally manipulating things. And then the other one is just good old SQL. It's been around since 1979, and it turns out it's still a fantastically powerful tool for doing data analysis. Any help needed on your projects in open source that you can give a shout out to? Yeah, absolutely. So both csvs-to-sqlite and Datasette are active open source projects on GitHub
Starting point is 01:13:14 which are very keen on accepting contributions. I've been trying to label issues with good first contribution and so on. And so I'm very keen on having people dig in there. But more importantly, I want people to use the software and give me feedback. I need to know what works, what doesn't work, what are the features that would help you solve problems that I haven't even imagined yet. On your issues, do you share any sort of, not so much roadmap, but bigger ideas that you don't have time to tackle that,
Starting point is 01:13:44 if someone did have time to tackle, that you would take a contribution that way, or do you only have the kind of issues you described? I have a few issues that are some of those bigger ideas, but in Datasette, the thing I'm most excited about right now is the plugin ecosystem. It's now possible to build plugins for Datasette that add additional functionality,
Starting point is 01:14:01 and that means that I can be completely hands-off, and if you want to invent a fantastic new visualization mechanism that does time charts against columns automatically or whatever, you can build that right now today without even talking to me, and you can start using it and shipping it and sharing it with other people.
Starting point is 01:14:18 So I'd love to see people starting to dig in and build plugins, but also give me feedback on what other hooks are needed to make plugins more productive. Who is Datasette for? Is it for developers? Is it for developer-like people? Who's the user for Datasette?
Starting point is 01:14:33 My three targets for Datasette are data journalists, anyone who's been collecting interesting data and wants to be able to share it. I'm really interested in museums because it turns out museums have huge amounts of metadata around their collections, which often is locked up somewhere. And I'd love to help museums start publishing that. The Metropolitan Museum of Art in New York has a spreadsheet with 400,000 items in it, which I've turned into a Datasette instance. And so that kind of thing I think is really exciting. And then the third one is civic
Starting point is 01:15:04 institutions. There are all of these governments out there that are publishing data in various different formats. If I can help them publish it more effectively and in a way that's more useful to people, I'd find that really exciting. When we were talking earlier, not in this podcast, but earlier today, the interface you were showing off where you can create a database, I think you were even talking about the museum example. Was that Datasette? Yes. Because you showed me a couple of things. That was Datasette?
Starting point is 01:15:26 Yep. You were able to dig into a query and then even augment the SQLite or the SQL queries, I believe. This is one of the most interesting things about doing things read-only, is if you've got a read-only SQL database, I can just open it up to select statements, because it's not like anyone can cause any harm.
Starting point is 01:15:42 They can't modify the file on disk. Because it's on Now. Well, because it's on Now, and also because when I open the database, I use SQLite's immutable option. Okay. So SQLite will disable any writes that could possibly happen. Gotcha. And so on top of that, I have a one-second time limit on SQL queries. So if you try and do something too expensive, it won't work. But other than that, you can just be completely free with it. So it's kind of like JavaScript developers get very excited about GraphQL because it lets them specify exactly what they want to get back from the server. With Datasette, you can do that with SQL.
Starting point is 01:16:16 You can write SQL in your JavaScript, selecting the exact columns you want, joining against different things. And it works, and it's fast, and it gives you back that data as JSON. You were showing me, you got really excited about this, you were showing me the example of Australia and dogs. Yes. There are eight different councils in Australia who publish lists of dog registrations. When people go to register for a dog license, they've got the breed and they've got the name. So I've got one little tool where I combined all eight of those CSV files.
Starting point is 01:16:46 I've got them all in the Datasette instance, and now I can say, okay, for golden retrievers, what's the most common dog name? So you can search by species, sort by the count of each name, and see what the most popular name for different types of dogs is. The answer is always Bella, it turns out. Bella, no matter what species of dog you are, Bella is the most common dog name in Australia. But it's still kind of fun to see the different naming trends.
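The "most common name per breed" question boils down to a GROUP BY with a count. The sketch below shows the shape of such a query using Python's sqlite3 module; the dogs table, its column names, and the handful of rows are hypothetical stand-ins for illustration, not the real registration data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (breed TEXT, name TEXT)")
# Hypothetical rows for illustration only, not the real registration data.
conn.executemany(
    "INSERT INTO dogs VALUES (?, ?)",
    [
        ("Golden Retriever", "Bella"),
        ("Golden Retriever", "Bella"),
        ("Golden Retriever", "Max"),
        ("Pug", "Ruby"),
        ("Pug", "Ruby"),
        ("Pug", "Bella"),
    ],
)


def top_names(conn, breed, limit=3):
    """Return (name, count) pairs for one breed, most common first."""
    return conn.execute(
        "SELECT name, COUNT(*) AS n FROM dogs WHERE breed = ? "
        "GROUP BY name ORDER BY n DESC, name LIMIT ?",
        (breed, limit),
    ).fetchall()


golden = top_names(conn, "Golden Retriever")
pug = top_names(conn, "Pug")
```

The same SELECT could be sent as a query against a published database, which is the kind of ad hoc exploration being described.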
Starting point is 01:17:08 And then for Pugs, I believe it was Ruby, right? Yeah, Ruby beat Bella, actually, for Pugs. Only for Pugs, though. Only for Pugs. And I think Pugsley came fourth. So what I find interesting about that is here's some obscure data set that maybe nobody's paying attention to. And because of what you've done with Datasette,
Starting point is 01:17:24 you're able to query it in these ways and find out this information. Maybe it's just for the curious, but I find that kind of interesting, that you can just play with this data like that, and for the entire, not even country, continent of Australia, right? And the dog thing is kind of amusing but not necessarily useful. A much more useful one is, it turns out members of parliament in the UK have to register their conflicts of interest with Parliament in the Register of Members' Interests, and it's public data. And there's an organization called MySociety who turned that into XML. I took their XML and I loaded that into a SQLite database. And so now I've got a tool that lets you search 1.3 million line items of MPs saying who paid them money, who invited them to speak, who gave them a free watch. It turns out that the Sultan of Brunei hands out Christmas
Starting point is 01:18:16 hampers to MPs every Christmas, and you can see which MP has had the most Christmas hampers from him. It's super fun. And actually, this is newsworthy. If you're wondering why does this certain MP behave in certain ways to different countries, you can dig through all of this stuff and say, oh, well, it turns out they've been giving him a lot of free watches. And it's undeniable data because it's from the Parliament, right?
Starting point is 01:18:37 Yeah, it's official data. They publish it as a rather ugly set of webpages, but with a little bit of work, you can turn that into a queryable database. And how easy is it to refresh that work? Like if it's so hard to, you know... So most of the work's been done for me by this organization, MySociety,
Starting point is 01:18:55 who've been scraping this for 10 years and dumping the data into these XML files. So I just wrote the thing that turns XML into a SQLite database and built on top of that. Simon, I'm sure we could talk for hours. I promised you 20. It was a good 20 for sure. So thank you so much for sharing your story and
Starting point is 01:19:11 where can people find you? Where do people find you on the internet? Mainly on Twitter. I'm twitter.com slash SimonW and I've got a blog at simonwillison.net as well where I write about a lot of these kinds of things. All right, thanks for tuning in to this special episode of The Changelog on location at Zeit Day 2018. If you've never been to this conference, I highly recommend you attend. Head to zeit.co slash day to learn more and see the videos from this past conference.
Starting point is 01:19:39 And of course, thank you to our sponsors, Rollbar, Linode, and GoCD. Also, thanks to Fastly, our bandwidth partner. Head to Fastly.com to learn more. And we move fast and fix things here at ChangeLog because of Rollbar. Head to Rollbar.com to learn more. And we're hosted on Linode cloud servers. Check them out at Linode.com slash ChangeLog. This episode is edited by Tim Smith. The music is by Breakmaster Cylinder.
Starting point is 01:20:02 And you can find more shows just like this at changelog.com or on your favorite podcast app or wherever. Subscribe where you can. Thanks for tuning in. We'll see you next week.
