CoRecursive: Coding Stories - Tech Talk: Open Source Health and Diversity with Heather C Miller

Episode Date: September 15, 2019

Tech Talks are in-depth technical discussions. Heather C Miller is an Assistant Professor at CMU. She is concerned that key open source projects are at risk of failure and no one is paying attention. Adam talks to her about open source, how it grows, the diversity problems it has and much more. Heather also shares some interesting stories about the early days of Scala and her ideas for increasing diversity in tech. Links: Heather's JuliaCon keynote, Digital Infrastructure, Scala Center. https://corecursive.com/038-heather-miller-open-source/

Transcript
Starting point is 00:00:00 And, you know, like we got some teenage girls to show up once and like, turns out that they blew the socks off these JavaScript developers who kept on trying to mutate state. I think these 16-year-old girls were like, why do you keep trying to change things? Just make a new one. Like it was obvious to them. The JavaScript dudes were like, oh, sorry, teenage girls. Hello, this is Adam Gordon Bell. Join me as I learn about building software.
Starting point is 00:00:25 This is CoRecursive. Today's guest is Heather Miller. She's an assistant professor at CMU. She is also writing a book about distributed programming. And I wrote down all these questions about distributed programming. But she's also an expert on building communities in open source. And we end up talking about problems with open source and contributor burnout and increasing diversity and also about how we're all just humans and we should get together and talk sometimes. So I think
Starting point is 00:00:57 you're going to enjoy this interview. And I never got to any of my questions, so I will save those for another time. Oh yeah, if you like the podcast, you likely know someone else who might like the podcast. So yeah, maybe let them know about it. All this, the beginning. So that's the beginning. You mentioned in this talk I watched of yours that you felt that like open source is digital infrastructure. What did you mean by that? So I adopted this term. The term digital infrastructure is actually coined by a woman named Nadia Eghbal. She put together this report called Roads and Bridges,
Starting point is 00:01:36 and she actually made the argument in that report that open source software, in many ways, it's infrastructure that we just expect to be there and that we depend upon. And unlike actual roads and bridges, which we yell at our local government to fix when there's like a pothole, like we can't yell at anybody to fix it. If there's like a problem in some open source thing, it's just kind of, if somebody has time, they might do it or you should do it, which is kind of a different model. Yet it's still similar in that we expect these things to exist and to be working, to do our jobs nowadays. It's really hard to find any piece of software that we use for anything that is not either consisting of many open source pieces or
Starting point is 00:02:16 entirely open source itself. And we just kind of expect a lot of these things to be there and ready for us to use and that people are maintaining that thing and that there are releases of the thing, right? And it's infrastructure in that sense. It's like, to do our jobs, we expect this thing to be there. Yeah. So Nadia, she worked at GitHub. I tried to get her on the podcast. Yeah. If you're listening, Nadia, please get on the podcast. Hurry up. So do you feel like the digital infrastructure isn't being maintained? I would argue that to a large extent, like much of the digital infrastructure that we care about is maintained and there are companies throwing money behind stuff. And in some cases you've got like companies fighting with other companies for control over something, right? Like they're all putting energy into it. But the main issue obviously is that
Starting point is 00:02:58 because this is not something easy to see, it's easy to see if there's a pothole. It's not easy to see if something is not maintained or the person who's maintaining it is like totally stressed out and like quitting all open source, right? You can't see that very easily. So there are obviously cases here and there where there are projects that a lot of people care about or that are important for some reason that are at risk that we don't realize are at risk. And I think that maybe what people should carry away from this observation is that as developers, we should perhaps, A, try to be a little bit more careful and conscious of the fact that a lot of this is still volunteer effort, that people invest in things, they do this in their free time, which doesn't give people the right to show up on an issue tracker and be insulting
Starting point is 00:03:44 and all of that. Right. It's as much on us to kind of observe when there is an open source project that people need or care about, or that is important, that is struggling. Like it's something that we just can't pretend is not a problem. I mean, obviously there's a whole bunch of hilarious examples that we can cite, right? Like the Equifax breach, right? Like this being, they tried to blame it on, was it like Apache Struts or something? Like they tried to blame it. Yeah. Yeah. Like the whole Equifax thing is blamed on an Apache project. Right. And then of course, Equifax got into trouble and like they had to pay up,
Starting point is 00:04:13 but they didn't apply a security fix in time. And then they were like, well, it's not our problem because it's open source. Well, it is your problem because you lost a lot of people's data. But then obviously there was OpenSSL, that whole thing with everybody, just the whole internet using, like depending on it. And then like nobody realizing that it was just like one guy who was super stressed out and doing like contracting and getting paid like minimum wage to like keep the thing alive. He was getting paid minimum wage? The discussion that I have read about what he was getting paid was something shockingly, embarrassingly low. And the guy who was commenting on how much he was getting paid did not even characterize it as getting paid. Like
Starting point is 00:04:51 it's not enough to support one person's full-time work is what he described it as. So like, I don't know what the number is, but it's like embarrassingly low is all I can say. And the Struts guy, whoever's maintaining Struts, like I feel bad for them because that seems like it's a very unsexy maintenance project. Oh yeah, absolutely. Like, but like, let's just think about all of the terribly unsexy things, right? Like, let's think about YAML or something. Like everybody complains about it, right? These are hard things to maintain too, right?
Starting point is 00:05:17 But everybody needs it. But there's things that people like to complain about on the internet a lot that people need and must be maintained. And like, think about how unsexy that is, right? Because then you have all of these people complaining on the internet about how much they hate the thing that you are trying to keep alive, because if it disappeared, people would have nothing, right? I know that the folks who maintain the Scala build tool have dealt with, I mean, first of all, nobody likes build tools. Like, yeah. Like everybody hates every build tool, right? Like they get beat up all the time about SBT and it's hard, right? So like you have everybody hating on the thing that you're building or keeping maintained, but everybody needs it.
Starting point is 00:05:54 Like people aren't using something else. So what do we do? Like, do I stop maintaining it? I've never thought of that. Yeah. Nobody likes SBT. But people maintain it. Yeah. Yeah. You have to, I mean, what happens if, I mean, well, we can stop maintaining SBT. People are going to get even more angry. There's no wins, right? It's hard. Are these people, this specific example is interesting. Are these people volunteers or paid or? PhD student. So his name is Mark Harrah. He was a PhD student at Boston University, I believe. And he was like doing chemistry. And then he just made SBT. It was like 2010 or something. I don't remember. It was like a while ago. Then everybody just started using it. They loved it. And it
Starting point is 00:06:33 passed hands a few times. He eventually got hired by Typesafe, now Lightbend. And then there was a team of people who worked on SBT, but the team was always two, maybe three people max. So at one point it was Josh Suereth, now it's Eugene Yokota and Dale Wijnand. They're both engineers at Lightbend. So this is one example of something that is funded and people, this is their job, is to maintain this tool at this company, but it's still hard because everybody hates on it.
Starting point is 00:07:01 Yeah, yeah, I guess you're in an interesting place to perceive this problem with your relationship with the Scala Center, right? Because I have to assume that there's somebody, well, there's somebody angry at Twitter because some standard library list moving around is unperformant. And then somebody else is, listen, I just made this for a graduate project. I didn't ask you to run Twitter on it. That's not my fault. Yeah. Yeah. Right. It's your fault. Yeah. Like you built
Starting point is 00:07:25 Twitter on it. I didn't. Yeah. Yeah. There's all kinds of stuff like that. Right. I mean, so I think we've gotten past this because I think people have stopped obsessing over like what is in the standard library, because I think people have accepted at this point that the standard library beyond like collections is not really being updated or changed or evolved in any way. And like, you know, we should just build our own libraries because the standard library, it was exactly what you just described. It was like, huh, we're at a university and we're developing a programming language and we're using the programming language to teach our classes with.
Starting point is 00:07:56 And so people just did this stuff because, you know, we needed these things to be able to write simple programs with. And clearly Scala all of a sudden got popular and we couldn't change a bunch of things now that a bunch of people are depending on it. The philosophy at some point, I don't know, probably like three or four years ago became like people are building better things than we have. And they're like maintaining it and publishing it. And we should not try to like compete with them or anything. We should just give them all of the props in the world and like let them go forward with their library for, I don't know,
Starting point is 00:08:29 parsing or this or that. Right. So how do we make this world a better place? Be nicer to contributors? Is that the solution or? No, I don't think we have an answer either. So I'm at Carnegie Mellon University now. I'm a professor here. And there is another professor at Carnegie Mellon that he was like doing research on how do you build like trust between people on a remote team. He's been sort of studying how people interact on developing software for a really, really long time. And I was just having a talk with him today, actually, about some of these things that nobody is like looking at, right? Like there's not like researchers looking at this. Maybe companies are looking at these things internally and developing opinions about collaboration and
Starting point is 00:09:09 how open source works internally. But with the exception of a couple of research papers and Nadia's report, there's just not a lot of people looking at this stuff and not a lot of answers about how to make all of this stuff better. So we have all kinds of observations that seem to have done something or caused something to be different in the context of Scala. But it really seems that interacting in person and just hanging out with people, like having lunch with somebody, things like this, they do amazing wonders for establishing camaraderie. Like, I've watched a whole bunch of situations where there was like an us versus them situation happening, like in pull requests and all kinds of other whatever, like on Twitter, like via all of the usual internet channels. But when people are stuck together in the same room for a couple of hours, and they're having a beer, or they're having lunch or something together, the whole like them us thing flips upside down and it's just us.
Starting point is 00:10:10 And people are a little bit less adversarial and like they work together towards something more readily. And that's really like the opposite thing that I think everybody wants to hear because the Internet was supposed to save everything, right? Like the Internet, we should just do everything remotely and online. This stupid little one-on-one interaction creates trust and empathy and like all kinds of things that you just don't get that with pull requests or even Twitter or Slack, just chatting with people,
Starting point is 00:10:38 talking about your dog and finding out that this person hates onions. Do you think video can work? Like, so we literally did talk about your dog. Yeah, that's true. That's true. So I have no silver bullet example or answer for you. It's just that at least I have found
Starting point is 00:10:53 in the last couple of years, making people sit down and talk to each other, discover that they're one person's a vegetarian or something. Maybe this ends up like getting into diversity and other things. Well, I think especially it's the most extreme and pronounced in open source, but it's a thing in software in general, right? Like there is a culture to being a software developer and, you know, it caters very
Starting point is 00:11:14 much to like introverts and detail oriented people that like math and all kinds of other things. And I think what we really need is some diversity from, I mean, just other cultures of doing things. Like, like I said, just other cultures of doing things. Like I said, it could be different colors and creeds and ethnicities of people, but also just like put some artists in the room with us because they're going to be super confused about why we would rather chat on Slack than just ask a question across the room. You know, they'll be like, just ask her. You know, there's something that like bringing in these other ways of doing things and these other ways of relating to one another, it seems to do wonders for just getting things done. There's a point here, I swear. It seems like if you look at software development in general, we observe all kinds of trends. Let's just look at whoever identifies as male and whoever identifies as
Starting point is 00:11:59 female. In companies, it's like 17 or 20 or circa that percent of the workforce will be female or identifying as female and the other being identifying as male. But if you try to apply that same logic to open source, it's like 5% versus 95%, right? Oh, it's even worse. It's much worse. Yeah, it's much worse. I mean, worse as in like less diverse in terms of at least being able to say something about somebody's suspected gender identity, right? I would have thought the reverse that like, oh, there's no barriers to contributing. So... Right. And the things that we kind of value in open source and in software development, it really caters to like, like I said, the introverts that like math and don't like talking
Starting point is 00:12:41 to other people, right? And we're finding that we're not good at getting things done unless we actually interact with other people or figure out ways to do this better. So this is when I say, bring in people that are like, extremely different from those introverts that like to hide and hack on stuff. Because that shakes everything up. In some cases, it forces people to empathize with one another, makes people try to understand each other better and people build trust and whatnot, right? Like this is just the thing that I was talking about where like people in the same room with each other develop some sort of empathy and trust really quickly, right? Because they realize that this is a different way of communicating that they didn't have before, that this is a person that I'm interacting with. I see that you're a human and having human experiences, right?
Starting point is 00:13:23 We're surely not quite as stunted as you portray us though, are we? No, but I think we are as a community of people markedly different than just like hairdressers and police officers and people who do other jobs, like markedly different. And so if I told a police officer that things got better when we met in person and talked, they would be like, Of course. Yeah. Like, what did you do before? You know, because the world has got, you know, it's silos. And I think software developers, especially live in a silo, where we forget that there's just all these other ways of interacting and like ways of sharing experiences. I think when you step out of that
Starting point is 00:14:01 silo for a second, like the thing with the teachers or the police officers or whatever, my point where I'm like, hey, we have this weird anecdotal evidence of like people hanging out in one place, making it easier to or sort of encouraging people to keep contributing to open source because we built like a little community where we have some sort of trust and empathy for one another. And like people keep coming back because they're part of that community. Like making that statement, this is obvious to other people, but to us, it's like a new, a new piece of wisdom. No, that's true. So you mentioned bringing in people who don't fit this stereotype. How do we do that? That's a really good question. I actually came from like the art world somehow. Like I thought I was going to be an artist for a long time. And like, I went to art school and stuff. And so I realized that I always like, I realized that I always come at problems in a different way than other people do.
Starting point is 00:14:50 And I realized that I think it's because I didn't go through all the same mathematical textbooks that other people went through in the same progression that they did. So I didn't like learn like the standard proof methodology when everybody else learned it, which meant that I did not follow that methodology when trying to solve some problem the first time. So this was super embarrassing for a long time because I would sometimes ask really stupid questions or like really good questions, but like never anything in the middle, right? Like it was like an embarrassing question or it was like a good question. Right. And like, I think just having that different sort of experience, like having a different educational background and not sort of following the same steps that other people learn, because this is sort of like the curriculum that we all follow. Letting people ask these sideways questions sometimes is really helpful. have CS bachelor's degree or like, you know, you lack some years of experience that perhaps other people have because you're not doing it as long, but you have the same sort of coursework that
Starting point is 00:15:51 somebody who has a master's degree in computer science has, like I said, it takes a little bit longer because you end up having to do some of this like foundational stuff. But like, I think that's like one thing that could take people who have completely different sort of ways of approaching problems and interacting with people and empathizing with people and like just injecting them into our software development teams. The fundamental idea is that somebody who has like a completely different background and completely different experiences who decide they want to do software development or get into computer stuff, having a way for them, like a path for them to be taken up into this is actually like extremely useful because it brings like these weird sideways questions that I think are sometimes useful, right?
Starting point is 00:16:30 I think that this is also another thing that we don't realize yet that it's valuable. I think it's another definition of diversity that we don't recognize and carry around and wave on a flag. Yeah, because it's less visible. But yeah, coming at things from a different perspective. You mentioned these sprees. Yeah, right. Yeah. Is that a diversity tool or kind of, I would say in general, it's not about diversity. It's about establishing relationships and sort of empathy and trust between people that have never met each other, but maybe like saw a screen name or something. And then like showing people how easy it is actually to contribute to open source stuff. Once somebody is standing next to you and like letting you know that the process isn't that
Starting point is 00:17:10 scary. Like I know we have like lots of documents that say contributing and like read these 25 pages before making a pull request and all that, like, let's talk about it. Right. And oh yeah, you should group everything into one commit. Like just these like little bits of helpful advice and like a person delivering that to you rather than you spending your weekend on something and then your pull request getting rejected because you didn't squash everything into a single commit and then being depressed about it and not reopening the pull request, right? We have other efforts that are more aimed at diversity. And actually what's interesting about those is that getting participants is really hard. And that's because of our funny like silos that we live in. Right. But ScalaBridge is all about taking underrepresented minorities, just teaching them how to program. And of course, because it's ScalaBridge, we do it in Scala. They have one. They have them for other programming languages, right? Like Ruby and Go and all these other ones. And I've done this before in Switzerland and I've done this in the US. And in all cases, it's been super hard to get people to show up
Starting point is 00:18:06 because the target audience is people who just like are interested in learning how to program. They have half a day to spare and it's free and you just show up and we'll have some mentors like walking you through kind of like an interactive curriculum that teaches you kind of programming concepts, but visually it's fun and like kind of draw pictures and stuff. Yeah. And so when I've done this before, you know, we've gotten like women who were doing business stuff, I don't know. And like, they were good at spreadsheets and they're like, I think I can program, right? Like that seems like, you know, and, but I don't know how to start. Right.
Starting point is 00:18:36 And I like show up and, you know, like we've got some teenage girls to show up once and like, turns out that they blew the socks off these JavaScript developers who kept on trying to mutate state. But like these 16 year old girls were like, why do you keep trying to change things? Just make a new one. Like it was obvious to them. The JavaScript dudes were like, oh, sorry, teenage girls. That's awesome.
Starting point is 00:18:56 The thing about that, which also I think that none of us fully appreciate, like we all because there's a lot of people interested in doing diversity things like, yeah, let's teach diverse people how to do stuff. But where everybody fails again and again and again is figuring out how to reach out to those people. Where do those people hang out? You want to get moms that have taken a career break and they want to pivot into programming after having some kids or something? Go to Facebook. It's a different channel than you're on. I think there's a lot of people that would take this kind of stuff up. It's just that what we do, especially when we want to do like a, I don't know, like a
Starting point is 00:19:32 ScalaBridge or something like this, teach people how to program. Like we'll like go on Twitter to our developer friends and be like, hey, I'm doing a diversity thing. And like, you're reaching out to everybody who already knows how to program. Everybody. Like, and then we'll all like retweet it or something, but like we're retweeting it to each other. It's like, what's more helpful is like if you're like, oh man, my cousin might be interested in this, and you get your phone and you send a text message to your cousin. Like that's way more useful than like retweeting stuff or like uploading something, you know,
Starting point is 00:20:00 like co-locate it with some conference that's on something completely different, like hairdresser conference or something. Right. I mean, it's all these like silly things that I think are like obvious in retrospect. We're not actively realizing, even though we have all these great intentions and we have companies are putting up like they're getting somebody to like do logistics, like a paid admin to like do logistics for these events. And like they get food and stuff and like we're like putting money into it and like time, but we're not like finding people because we're like asking the wrong like places. Yeah. So we talked about like two kind of things. Yeah, we're talking about lots of random stuff. Sorry. No, no, no, it's not random. It's like open source, there's lots of projects where there's just not enough people, I guess, or, and then
Starting point is 00:20:46 this diversity and do these relate or are they disconnected? So, yeah, I think they do relate, but not in the immediate term. Like what I imagine is going to happen like 10 years from now. The 16-year-old girls will maintain OpenSSL. Yeah, exactly. And then like, we all can just like sit back and relax because they're bad, they're bad asses. No, but I think what's going to happen is one way or another, either via some painful means or some sort of like revelatory way, we will realize that developing open source things is required and either contributing paid time while you're at work, like company subsidizing your time to like work on
Starting point is 00:21:25 an open source thing is just kind of like how we have to do things because like, that's like the only way we can like pay the open source tax, right? Like taken together with the fact that like there's this hilarious dearth of software developers, like everybody's trying to hire all the time, right? And the number of people coming out of CS programs is not quite enough to fill the need. And like, that's why all these coding bootcamps and all sorts of other things are interesting and desirable. And that's why there's all of these master's programs for people who come from other degree backgrounds and whatever.
Starting point is 00:21:52 So like, I think these things are eventually going to meet up and that like, we need to suddenly create a lot more people who have the ability to hack on software things. And like the only way that we can get the number of people that we need to work on these things is to actually stop failing at diversity so much. And then these people, like, I think naturally will end up getting involved in like maintaining open source things, because I think like ultimately, like I said, via either a painful way or some more OpenSSL-like disasters, or we have some sort of
Starting point is 00:22:21 enlightening realization, but like, I think more and more companies are going to start dictating that engineering time is spent on maintaining something in the open, something that the company is built on and requires to continue existing. Or, you know, if you need more developers in the first place, because there's not enough, and we have to start solving the diversity problem to fix that, like, ultimately, some of these diverse people are going to end up having to deal with open source things because I think there's nothing that companies can do other than subsidize some of the open source development that they need and depend on. Right. So like, I think these things are ultimately going to meet. And then of course it's a problem that we have such a bad diversity issue in open source right now.
Starting point is 00:22:57 But I think that once we start solving the diversity problem more broadly, it's going to start being more and more important, diversity in open source, because there's hardly any diverse developers in the first place. And currently the experience as an open source developer, especially one who volunteers their time, it's like a hazing ritual, right? You already have a hard time as a minority. Let's be hazed too. So I think that these things eventually will meet and we will do a better job, but it's down the road. I think we have to independently first all develop a better cultural understanding of like what's required. People are writing glue code between things that already exist, right? And like putting an interface on it and stuff. And if following this trend, we've got to somehow pay for those guts, right?
Starting point is 00:23:39 Like we have to somehow invest development time and energy into the guts of the things that we're clicking together. Like this is a problem that needs to be solved somehow. And I think that the most realistic solution is that companies subsidize this somehow. It's the cheapest, fastest way. And, you know, 20% time working on various projects that are important to your product team's thing. I think that's a good point, though.
Starting point is 00:24:00 The solution could be corporations paying basically to maintain open source. But paying by, I mean, ultimately developing, investing developer time, which is not what you want to do typically because everybody's on a tight deadline and everybody is super bad at estimating how long things take and we're always behind, right? But like, I don't know, we have to solve that problem. And then like, I don't know, there's like a hack week or something. And like, everybody works on an open source project for one week, every quarter or something. There's also a problem with like skills, right? Like I'm sure that my company used OpenSSL, but I don't know if I would have been able to help them out with the project.
Starting point is 00:24:31 For sure. Like you have to have like some crypto person, like being super good at those sorts of things. Yeah, that's true. Like not anybody can contribute to any project. Like also the Scala compiler, it's like I have contributed like a few small things to it. I'm like an idiot compared to a lot of the other people that work on the Scala compiler. And like, I have a PhD, right? So I'm sure if I tried really hard, I could get back into it
Starting point is 00:24:51 and like meaningfully contribute again. You can't just have somebody who's like, okay, Joe, your 20% project now is contribute to the Scala compiler. Obvious examples are when there's a project or a piece of software that the company makes heavy use of. And, you know, if you're opening tickets about it, you should invest time in trying to close those tickets. Right. Or like understand the people who are maintaining the things well enough to maybe even maybe.
Starting point is 00:25:17 So this is another thing that's a common misconception about open source. Not all contributions are code contributions. Something like 70 percent of people think, well, if it's not a code contribution, it must be documentation. And that's also not like a super, I mean, documentation is very helpful. I used to be the person in charge of documentation for a while, trust me, write documentation. But my point is just helping people filter and curate and like figure out what is in a ticket. And if it's really important, I can't tell you how useful this is.
Starting point is 00:25:45 If somebody shows up and is helping manage the issue tracker or something and then realizes that there's a fundamental issue in this one that I did not notice, I just want to buy that person flowers and beer and mail it to them. It's like, thank you. I can't express my happiness through the computer, but thank you so much, right? There's anxiety when you look at how many new issues have been opened in the last 24 hours. But if there's somebody there helping you, it's like you're less anxious and less stressed out, right? Yeah. So it's like all of these things that you just don't really see as a person who's thinking, how am I going to contribute to open source?
Starting point is 00:26:27 By helping people figure out what the hell is in the issue tracker and what's important, you're like reducing people's anxiety in ways that you just cannot appreciate. You know, I kind of assume sometimes that the people who created the open source project that I work on should fix my problem. And like that me pointing it out to them is like a gift. You're very helpful. Yes. Yeah. But it is helpful. So on the one hand, it's helpful. But on the other, it's like also anxiety and stress inducing, right? Because it's like, oh, no, more things I have to do. This is why people make jokes in academia, especially in the programming languages community. You know, the Scala people will show up to a conference and we'll be like, yeah, so, you know, we got some user feedback and people said that this was a bad design decision or something. And then somebody will like make some snarky remark in the back of them, like some professor or like grad student or something who's like, you know, they made a, like a toy language or something. And they're like, that's why we don't want users. Right. Because it's stressful. Right. I mean, it's wonderful. It's like, great. It's like, look, Scala is successful. We have lots of users, but also it's stressful. So there's like some people who like the fact that they don't have apparently like a handful of people.
Starting point is 00:27:19 So how did Scala not become just like angry Martin working on a compiler on the weekend? Like how did it become a real, how did it get users and become popular? I have a tangent that's cute, but it's a visual that's fun. And like, you can imagine Twitter doing this and like other companies. So when penguins decide that they want to go hunting, they all like kind of like run to like the edge of the ice and they're all like pushing each other. Right. And nobody wants to jump in because there could be like one of these sea lions or something swimming around that eat penguins, right? And they'll just, it'll just eat all the penguins. So they all like push up to the edge and they're all pushing each other. Like they all want to, but like nobody's doing it. And then eventually the pressure is so hard, it builds up and like one slips and falls in, and then everybody
Starting point is 00:28:00 stops and they watch right and like if like the penguin like swims around and comes back up and like, there's no blood, it seems to be safe to go fishing. And they all jump in and go swimming and they catch fish, right? And that's a great experience. That silly analogy, it's kind of like, I think Twitter was the penguin that fell in, right? And then a whole bunch of other people were like,
Starting point is 00:28:19 look, Twitter's fine. Look, things are great. Oh my God, things are working somehow. Things that some, you know, and like, what's your secret sauce? Oh, wow, you have this programming language. Oh, it's on the JVM. And then a bunch of startups were like, I'm going to do this. And so a bunch of people started using it. I think after they saw that it was working out for Twitter. How did the language survive that?
Starting point is 00:28:35 Because like, if I had an open source library, which obviously different than a language and large companies started using it, that might scare me. No, it's terrible. Right? Because like you're all alone. So fortunately, Martin was not all alone. You know, he's a professor and he has tenure, which means that his paycheck is guaranteed. Like he can do whatever he wants. You know, the university is not going to fire him. So he can take a risk and just work on a compiler for some years.
Starting point is 00:28:57 And that's technically OK. And of course, professors will get grants or get various money to do research projects and whatnot. EPFL provides like base funding to have a couple of PhD students, which is unique. Most universities don't do that. So Martin had like a base team size of himself and at least like, I don't know, probably two other grad students or something. So it was him and these other people. And like worst case scenario, you know, we spend a lot of engineering time, like us small group on like keeping these things going. I don't think Martin ever had a plan for Scala taking off. Martin has
Starting point is 00:29:31 a super cool history. He's basically like a compiler maker. He has been since he was very young. He sold a Pascal compiler to a company before he started his PhD. And he was trying to decide whether he was going to go like work on compilers like in industry or something or go do a PhD with Niklaus Wirth, which is like the Pascal guy, right? He ultimately decided to do a PhD. But like he spent his PhD
Starting point is 00:29:54 writing compilers. He ended up finishing his PhD and then just writing a lot more compilers. Like he's just been writing compilers his whole life. And I mean, he invented a whole bunch of languages.
Starting point is 00:30:03 One's called Pizza. One is called Funnel. Like they're precursors to Scala. And he worked on this thing called GJ, which is generics in Java. He basically wrote that compiler that became the reference compiler for the Java compiler. And they were using this in like the Java compiler was using generics before people had access to generics in Java. Right. But like, he just keeps making compilers. He has been forever. I don't think that Martin, every time he picks up a compiler, except for like the last version of Dotty, I don't think he ever had a plan for like the compiler taking off and like this open source problem and all of
Starting point is 00:30:33 this, right? Like he was going to just do a really nice compiler for a really nice, beautiful, perfect language that he thought was like, right. And he knew that he could continue working on it because he was a professor and he had tenure and like they were going to keep paying him to do this. So worst case scenario, he works on it. And I think that was like his contingency plan, right? Fortunately, there were a bunch of grad students. The grad students were helping with triaging and bug fixing and all of this for some years. Like I joined the research group when, you know, our weekly meetings was really just triaging and assigning bugs to people to fix. It's like the whole research group was like,
Starting point is 00:31:08 our research meeting or group meeting or whatever was like talking about the bugs that we would each fix or trying to determine which bugs are the most important. And we did this for a while. And ultimately, it was just too much. It was just some random research group. And he did a sabbatical, but his sabbatical was creating this company called Scala Solutions at the startup park next to EPFL.
Starting point is 00:31:25 It was like this little startup building, like you can have an office, right? It's like affiliated with the university. So he made a startup over there. It was supposed to be just like a Scala consultancy that did like training and stuff. And apparently Jonas Bonér had a company in Sweden that had almost the same name. It was like Scalable Solutions or something. And like Martin's was Scala Solutions. Like it was like the same name almost. And then they decided that they were going to get together and get some venture capital and found Typesafe,
Starting point is 00:31:51 which became Lightbend. And the idea at the time was that, well, the company will do a lot of the engineering because it doesn't make sense. Like I can't do it all by myself. Like I'm a professor. I have grad students that are transient and like they can do a bunch of engineering,
Starting point is 00:32:03 but they can't do only engineering because they're supposed to do PhDs. So the idea was like, well, the company can deal with it, right? And the company is VC funded. Like it took care of the language for a while. It still pays a number of people to work on the compiler and the build tools. But like the company largely does not do open source Scala, making sure that the language is fine. Like they pay three people who work on the compiler and two people to do the build tool. Aside from that, they don't have a lot of engineers maintaining Scala.
Starting point is 00:32:30 This is when the Scala Center appeared again. The Scala Center was like, okay, well, let's add some more engineers to the pot to try and keep Scala alive because it's still not a solved problem. I think the Scala Center helps, right? And obviously, Lightbend paying engineers to maintain the production compiler also helps, right?
Starting point is 00:32:45 But there's not like a silver bullet great solution or anything. It's like, this is the way it's done. It's perfect, you know? It's still like every year, it's a problem.
Starting point is 00:32:54 Like we're like, okay, I hope this keeps working the way that it's working, you know? Because how do you maintain an open source compiler? Like think about the open source compilers that you know. Who maintains them?
Starting point is 00:33:03 It's usually like a company that stands to benefit somehow from you using their compiler and they can invest engineering resources on it. It's really, really hard to have a compiler because compiler is like multi-year project and you have to get a group of people working on it. And these are like some special people
Starting point is 00:33:21 that are really hard to hire, right? Because there's not a lot of people who are just like A, into it and B, like good at it and love it, right? Yeah. And so it's super, super expensive. And you know, it's a long-term thing. And so even Rust has had, you know, a few scares here and there. I mean, with, you know, maybe Mozilla will cut us. Oh my God. Yeah. I mean, they blew up in community land and everybody loves them and everybody's starting to build stuff in Rust. And like that fear is gone now.
Starting point is 00:33:47 But like Mozilla is not going to really increase its investment in Rust or anything. Like we should all be thankful as a group of humans that use programming languages on the Internet that like Mozilla continues to pay the handful of core team members that it does. You know, it's hoping that other people, other companies pick up some of the core developers. Even bigger companies like Mozilla don't have it perfectly worked out. It's better probably than in the Scala's case because it's maybe a bigger company or something, but it's not worked out anywhere. Open source compilers are hard to maintain because they're fundamentally not profitable. You're not going to get explosive profit from working on it. It's kind of crazy. It's like the most base of this infrastructure that you're talking about, right? It's like...
Starting point is 00:34:30 For everybody, it's uncertain. I mean, like hang out with the Mozilla guys four years ago or something. Like there was some fear that, you know, like maybe this project is taking too long and like it's too expensive, whatever. There was legitimate concern that Rust was not going to exist at several points because Mozilla didn't want to keep paying for it. Yeah. Clearly there's some sort of blockchain solution where your compiler... I'm so happy that... I'm just joking.
Starting point is 00:34:51 Actually, that's right. Like blockchain is really the answer for everything. It's true. I should have thought of this. Every time you compile, like it gives money to authors or something. Yeah, exactly. Yeah. And the more you compile, clearly the more productive you are,
Starting point is 00:35:02 or at least like the more reliant on the type system you are, which means we have to pay the engineers to develop the type system. Yeah. There we go. You know, that incentivizes people to not use type systems then. Like you're shooting yourself in the foot. You're like into type systems. This is not what we want. We have to like roll back, roll back.
Starting point is 00:35:19 Also just disincentivizes compiling. It'd be like, well, let's not even try to compile it until we want to push it to product. Yeah. Just let's interpret everything in prod. Anything. We can just receive any code and just run it. Who cares? Seems totally safe. So how does all this relate to your current role? Yeah, yeah. I'm an assistant professor at CMU. And CMU is an interesting place. So it's a really huge school. And they have have done like hugely foundational things in like programming languages and like robotics and, you know, like systems and like file systems.
Starting point is 00:35:53 And so I came into CMU with the idea that we would try to apply techniques from programming languages to try to make building distributed systems a little bit easier. And when you're a professor, it's really a negotiation. You don't show up and you're like a Führer being like, now you will work on this and you will work on that, right? It's like, it's a negotiation. You know, you're collaborating with a PhD student usually, and you usually have a couple of them. And that means that there are a couple of projects underway and they're usually kind of like matched up with the interest of the PhD students. So I'm working with Chris Meiklejohn right now, and we're doing stuff a little bit more related to like distributed runtimes,
Starting point is 00:36:32 which is part of programming languages world. It's the runtime piece. And it sort of solved a bunch of problems with actor runtimes that people didn't realize were really a problem, which actually artificially like limited how scalable they could be and how fast they could be. And it's one of those things where in retrospect, it seems obvious that the changes that we proposed, basically we observed, again, this sounds really obvious in retrospect, but implementations of the actor model or actor runtimes would just assume full mesh connectivity for distributed actors. So if you had one actor, the runtime assumes that you should have knowledge of all of the other actors that exist
Starting point is 00:37:11 in the system at any given moment. All right. And so like that artificially limits sort of what you can do with actors. And that also sort of gives the hardcore systems people who build everything in just like pure C, you know, they have some credence when they say actors are slow, like you can't do things with actors, because there are all these like artificial bottlenecks that are not really necessary. You can still support the actor model, while having a more scalable way to interact with other actors. So Chris had this wonderful idea, which was basically, hey, why don't we at runtime, like let the people who are going to just run some actor program, say how connected you want all these actors to be.
Starting point is 00:37:47 Do you want, you know, like maybe it should be like a client server situation or maybe it should be peer to peer. And then we have this other project where we're trying to deal with this kind of fundamental question about like programming languages research and like what is the interface to the programming language? Everybody thinks it's the compiler, but for like the last year or so, we're like, but it could also be CI, right? Because we submit things to CI anyway,
Starting point is 00:38:11 and like we want it to pass before we push into production. So it's basically type checking between services. That's cool. So like if a service is a method, it's got a signature, right? And I kind of figure out what the signature is for your service.
Starting point is 00:38:22 And I type check it against the other services that I know are currently in production. So I can at least tell you, don't deploy this because you're going to break these other people's thing because they assumed that that number is always going to be zero or greater. Right. And then you suddenly start passing a negative number now. They can't handle it. So like their stuff is going to break and PagerDuty is going to wake somebody up.
Starting point is 00:38:40 Right. So don't deploy it. Right. Are you familiar with GraphQL at all? Yeah, I'm familiar with it. But am I like an expert? No, but go ahead, tell me and I'll let you know if I get lost. So because it has a schema, like you can validate whether the schema has changed like in the build process, right? Yeah, but this is also kind of the same thing with protobuf and all these other things, right?
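Editor's sketch (not from the episode): the "type checking between services" idea described above can be illustrated as a small CI gate. This is a minimal sketch under stated assumptions, not the project's actual design. The names `Range`, `breaking_consumers`, and the services `billing` and `reporting` are hypothetical. It assumes each consumer has recorded the numeric range it expects, and it blocks a deploy when the producer may now send values outside that range.

```python
from dataclasses import dataclass
from typing import Dict, List

INF = float("inf")


@dataclass(frozen=True)
class Range:
    """Closed numeric interval [lo, hi]; infinite bounds mean unbounded."""
    lo: float
    hi: float

    def contains(self, other: "Range") -> bool:
        # True when every value in `other` also lies inside this range.
        return self.lo <= other.lo and other.hi <= self.hi


def breaking_consumers(produced: Range, consumers: Dict[str, Range]) -> List[str]:
    """Names of consumers whose assumed range does not cover what the
    producer may now send (hypothetical helper for illustration)."""
    return sorted(
        name for name, assumed in consumers.items()
        if not assumed.contains(produced)
    )


# Hypothetical services currently in production and what they assume.
consumers = {
    "billing": Range(0, INF),       # assumes the number is zero or greater
    "reporting": Range(-INF, INF),  # accepts any number
}

old_output = Range(0, INF)      # current producer: never negative
new_output = Range(-INF, INF)   # proposed change: may now be negative

ok_before = breaking_consumers(old_output, consumers)
blocked_by = breaking_consumers(new_output, consumers)
print(ok_before, blocked_by)
```

In this sketch the deploy would be rejected whenever `blocked_by` is non-empty, which is the "don't deploy this" signal mentioned in the conversation.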
Starting point is 00:38:59 Yeah, yeah. So it's already not so bad because you have teams kind of agreeing on schemas. And then schema evolution is something that's pretty well understood in systems land, right? Like we can never remove a field. We can only add fields, for example, or like version them or something, right? So like this is already like very codified. We have standards around this. But like our programming languages and compilers don't know what schema evolution is, right? Yeah.
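Editor's sketch (not from the episode): the schema evolution rule mentioned above, "we can never remove a field, we can only add fields", can be written as a check a build could run. This is a simplified illustration, assuming a schema is just a mapping from field name to type name. The function `schema_violations` and the example schemas are hypothetical and do not reflect the protobuf or GraphQL tooling APIs.

```python
from typing import Dict, List


def schema_violations(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List the ways `new` breaks the add-only rule relative to `old`.

    Adding fields is allowed; removing a field or changing its type is not.
    """
    problems = []
    for field, old_type in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != old_type:
            problems.append(
                f"changed type of {field}: {old_type} -> {new[field]}"
            )
    return problems


# Hypothetical schema versions, for illustration only.
v1 = {"id": "int", "name": "string"}
v2 = {"id": "int", "name": "string", "email": "string"}  # adds a field
v3 = {"id": "string"}                                    # removes and retypes

print(schema_violations(v1, v2))
print(schema_violations(v1, v3))
```

An empty list means the new version is compatible under this rule. Real schema systems have richer rules (field numbers, optionality, deprecation), which this sketch leaves out.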
Starting point is 00:39:23 And it's like all kinds of things that like just reasoning that like maybe we don't have to do and like we can offload to something else to have to think about for us. That's cool. That's very cool. It's like a dumb question, which is basically why does the programming languages stuff
Starting point is 00:39:35 have to be limited to a compiler? Why can't CI be a compiler? I mean, that's sort of a funny way to say it, but in some sense, it's kind of semi-accurate for what we're doing. Yeah, that's a neat project. I thought you were going to say like you were going to build your own language and repeat this whole structure that Martin did. I am doing it, though, in like a tricky way, right?
Starting point is 00:39:54 Like I'm just telling everybody that the CI is a compiler. Oh, yeah. If people start using it, then. Yeah. Well, I hope that there will be like a startup or something where that I get tenure. So I get back on it. Now, we have a better idea nowadays of how to support things that are useful. At least I see companies a little bit more willing to help maintain stuff, right?
Starting point is 00:40:17 So all of my like pessimism about open source in general, like I don't think that that should scare people away from taking risks and starting new, interesting open source things. I think that the current situation that we are in together as like a community of people who develop software is one that is ultimately going to be solved. But I think that we just have to change,
Starting point is 00:40:38 you know, we have to realize some things before we can solve it, right? And I think that, you know, realizing that we all have to be a little bit more responsible than we have been in the last couple of years for the software that we're using is a thing that we will slowly realize in the next four or five years, I promise. Like, otherwise we're going to have more like Equifax or OpenSSL or like all kinds of things. Right. Nice. I think that's a great place to end it. You kind of wrapped it all together.
Starting point is 00:41:02 Thank you so much for your time. Yeah. And thanks for like thinking to like reach out to me. That was the show. I hope you liked it. Thank you for listening to the CoRecursive podcast. I'm Adam Gordon Bell, your host. If you liked the show, yeah, tell a friend, join the Slack channel. Until next time, thank you for listening.
