The Changelog: Software Development, Open Source - Securing GitHub (Interview)

Episode Date: June 19, 2024

Jacob DePriest, VP and Deputy Chief Security Officer at GitHub, joins the show this week to talk about securing GitHub. From Artifact Attestations, profile hardening, preventing XZ-like attacks, GitHub Advanced Security, code scanning, improving Dependabot, and more.

Transcript
Starting point is 00:00:00 What's up, friends? Welcome back. This is the Changelog. I'm Adam Stacoviak, Editor-in-Chief here at Changelog. Today we're talking about the most important developer platform out there. Yeah, it's called GitHub. We're joined by Jacob DePriest, VP and Deputy Chief Security Officer at GitHub. And the topic is, of course, securing GitHub, securing open source, and all the things you have to do to ensure releases, profiles, GitHub at large is secure.
Starting point is 00:00:55 Now, Jacob is one of many in the line of securing GitHub. So we dug deep, we go deep, and we ask questions about what it takes to secure GitHub and to keep it secure. A massive thank you to our friends and our partners at fly.io. That is the home of changelog.com. And it's also the place you can launch your apps. You can launch your databases and your AI. You can launch your AI near your users with no ops. And that's so cool.
Starting point is 00:01:36 Learn more at fly.io. Here we go. What's up friends? I'm here with a good friend of mine, Feross Aboukhadijeh. Feross is the founder and CEO of Socket. Socket helps to protect some of the best engineering teams out there with their developer-first security platform. They protect your code from both vulnerable and malicious dependencies. So we've known each other for a while now, Feross. Well, let's imagine somehow I've landed myself at Vercel. And because I'm a big fan of you, I understand what Socket is. But I don't know how to explain it to anybody else there. I've brought you into a meeting.
Starting point is 00:02:17 We're considering Socket because we want to secure dependencies. We want to ship faster. We want everything that you promise from Socket. How do you explain Socket to my team? Yeah, Socket is a developer-first security platform that stops vulnerable and malicious open source dependencies from infiltrating your most critical apps. So we do that by focusing on real threats and keeping out all the types of risks that are out there in open source dependencies.
Starting point is 00:02:50 Everything from malicious dependencies, typosquat attacks, backdoors, risky dependencies, dependencies with hidden behavior. There's all kinds of risks out there. A lot of reasons why a dependency might be bad news. And Socket can help you as a developer just keep all that out of your app. Keep things nice and clean and pristine amongst your dependencies. I saw recently Dracula. I'm a fan of Dracula. I don't know about you, but I love that theme. Big fan of Zeno Rocha. And I saw there was like a misspelling there.
Starting point is 00:03:19 And so because Dracula is installed on VS Code and lots of different places, I saw there was a typosquat sitting there that had different intentions than obviously Dracula did. Is that an example of what you mean? Absolutely. Yeah. Dracula, that's a perfect example. It's super common these days to see that type of attack, where you have an attacker pretending to be a common dependency, typoing the name of it by one letter and then trying to get unsuspecting developers to install it. Unfortunately, we're seeing more and more of these types of attacks in the community and they're taking advantage of the trust in open source. As developers, we need to be more aware of the dependencies we're using and make sure that we're not pulling in anything that could risk the data of our users or cause a big breach at our companies. And so part of that is obviously
Starting point is 00:04:00 being more careful and asking questions and looking more carefully at the dependencies we use. But also part of that is tooling. It's really a hard problem to solve just on your own as a single developer. And so bringing in a tool like Socket can really help automate a lot of that work for you. It just sort of sits there in the background. It's really, really quiet. It doesn't create a lot of noise. But if you were to pull in something that was backdoored or compromised in some way, we would jump into action right in the PR or right in your editor.
Starting point is 00:04:28 Or even as early as you browse the web, we have a web extension that can actually give you information if you're looking at a package that's dangerous or if you're browsing Stack Overflow and you see somebody saying, hey, just install this dependency to solve your problems. A lot of times even that can be a way to get the attacker's code onto your machine. So Socket jumps in at all those different places and can tell you if something is dangerous and stop you from owning yourself. Yes, don't get yourself owned.
Starting point is 00:04:56 Use Socket. Check them out. Socket.dev. Big fan of you, Feross. Big fan of what you're doing with Socket. Proactive versus reactive, to me, is the ultimate shift left for developers. It is totally developer-first. Check it out, socket.dev.
Starting point is 00:05:13 Install the GitHub app, too easy, or book a demo. Once again, socket.dev. That's S-O-C-K-E-T dot dev. Well, we're here with the VP and Deputy Chief of Security. I should say, actually, Security Officer at GitHub, Jacob DePriest. Jacob, thank you for coming on the show. Definitely fans of GitHub, obviously, and securing GitHub. Can we do that, please? Indeed. It's great to be here. Thanks for having me.
Starting point is 00:06:03 I think it is secure, though, right? Is it anti-secure? Is it fully secure? Our goal, that's our goal on the security team, is to secure the world's developer platform. I think it's a good thing. That's why we want to have you on the show. We obviously had an XZ attack or issue, I guess, a while back. And in a conversation on this show, we talked about, we speculated, at least I did, on the role of GitHub to prevent things like that by hardening the profile of individual developer users on GitHub. But I'm sure we can go deep. Where should we begin when we talk about securing GitHub?
Starting point is 00:06:37 What's a good place to begin? I mean, starting with the developer is kind of where I always like to start. So, I mean, I think that actually sounds like a great place to start. I think when we talk about open source security, we talk about the supply chain, we talk about all these things. You could start anywhere, but at GitHub, we always like to start with the developer. That's kind of our central ethos is how to empower the developers,
Starting point is 00:06:59 how to secure the developer workflow. And so that's kind of our approach there. And so last year, about a year and a half ago, we announced and started implementing what at the time was kind of a fairly controversial initiative to turn on mandatory 2FA for all contributors on github.com. And we've been pretty successfully rolling that out. But there's, I think there's other things we can do in the developer account space as well, but happy to dig into those things. I think when we look at the XZ attack,
Starting point is 00:07:32 that's a lot of social engineering in a way, right? But you also have sort of profiles on GitHub that may be nation-state based. There's a lot of speculation around that particular attack and that scenario. Would you call it an attack, Jared? Was it really an attack? I guess it was. It was a takeover. It was not really an attack. It was more like a social engineering takeover and then infiltration of... Well, you call it a supply chain attack, but it's not via an exploit
Starting point is 00:07:56 or brute force. It was via social engineering and takeover of a project and then the ability to release new code as the owner of said project and then the ability to release new code as the owner of said project without people knowing that the ownership had changed. I guess where I'm curious is how much does GitHub take the responsibility of securing profiles beyond simply they're secure? As in there seems to be nefarious actions in the profile. It doesn't seem to be some of the social constructs and contracts of being a citizen of developer land in the world. How far do you go in terms of securing proactively, and maybe your own ideas,
Starting point is 00:08:39 and how GitHub may react in the future to these kinds of things where you have profiles? And if there's prevention at the profile level, what can be done? Yeah, it's an interesting question. I think that the way I kind of have been thinking about this is, I think it's a bit broader because when you start thinking about social engineering, you start thinking about the techniques and approaches that individuals could use in these cases. Even if the profile is secure, even if there's lots of investigation and telemetry and all those kind of things,
Starting point is 00:09:11 it's still not necessarily obvious when something is nefarious versus when something's a mistake. In some ways, I think what we're talking about, the analogy I've been thinking about actually is most sufficiently sized corporations, businesses, organizations, government entities have an insider threat program. And in some ways, this is sort of like the insider threat scenario for the world, for the larger software ecosystem. How do we think about that? And so certainly I think developer accounts and profiles are an aspect of it.
Starting point is 00:09:45 But I also think there's an element of the broader supply chain and things like attestations and SLSA compliance, the framework to help secure builds, essentially. So what is going into a build? Not just what's the piece of software that was downloaded from the Python PyPI registry or the Go registry, but what was used to build that software? Where was it built? What were the instructions that went into it? And can we know cryptographically that the software that we're installing came from that build process and was created by that developer? And so would that have necessarily solved
Starting point is 00:10:25 this particular challenge? I'm not sure that it would have, but it sure would have given the security researchers looking into it and trying to figure out what happened way more tools and confidence much more quickly than they would have otherwise had.
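The core idea Jacob describes, binding an installed artifact back to its build, can be sketched at its simplest as a digest check: a provenance record names a SHA-256 digest, and the consumer recomputes it before trusting the artifact. The record shape below is hypothetical; real attestations are signed in-toto/Sigstore bundles with much more in them:

```python
import hashlib
import hmac

def artifact_digest(data: bytes) -> str:
    """SHA-256 digest of the artifact bytes, as lowercase hex."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical provenance record, standing in for a signed attestation.
provenance = {
    "builder": "github-actions",
    "repository": "example-org/example-lib",
    "sha256": artifact_digest(b"pretend this is the released tarball"),
}

def matches_provenance(artifact: bytes, record: dict) -> bool:
    """True only if the artifact we downloaded is the one that was built."""
    # compare_digest avoids timing side channels when comparing digests.
    return hmac.compare_digest(artifact_digest(artifact), record["sha256"])
```

In the real workflow this digest comparison happens inside signature verification, so a tampered record cannot simply lie about the digest.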
Starting point is 00:10:38 And so I think this definitely has an element of, as a community, we've all got to come together and figure out what are the standards, what are the ways we want to distribute software, what are the trust signals that we can all get when we're thinking about what software we're going to use in our products. I have a good idea. I think for eight bucks a month, you can just get a verified badge and then you're good to go, right? You just verify your account and we can trust you immediately.
Starting point is 00:11:03 That's right. Nobody ever lies on the internet. To that point, do you think that doing GitHub's best to reach into the real world and confirm people are people, this person is potentially suspicious, this person is okay, this account is being run by somebody who didn't previously own it? Let's forget about the implementation. Cause I understand how hairy that could all be, but do you think the idea is good or do you think it's not a place where we should even be prying into these things?
Starting point is 00:11:34 Cause a lot of it going back to the XZ is like, who in the world is Gia Tan? Is it a person? Is it a nation state? Why were they trusted? Et cetera, et cetera. And it's like,
Starting point is 00:11:43 well, if we could know who Giatan is and then track that or no more information, then we can make wiser choices. But do you think that's a worthwhile effort or not? I certainly think it's something that the broader software industry should continue exploring in all SaaS platforms.
Starting point is 00:12:00 I also think there's a balance here because one of the promises, and it's more than a promise. One of the outcomes we've seen from the open source movement in the last 20 years is people in developing countries, people all over the world from different socioeconomic backgrounds, being able to contribute, be able to get up to speed, be able to make a meaningful difference in a piece of software. And, you know, I think we have to recognize that not everybody who is contributing is coming from a place in the world where they're either have the technology or the identification system or the infrastructure to be able to do this kind of verification, or potentially it puts them in some sort of risk in whatever environment they're in. And so, you know, this is why when we developed the mandatory two-factor authentication, we didn't jump straight to, well, you have to have a YubiKey or FIDO or a passkey.
Starting point is 00:12:51 We continued to leave the door open for a wide range of 2FA options because we have users who are students in schools and don't have mobile phones or can't afford mobile phones, or are in an area where they can't kind of perform that two-factor authentication in a way that, you know, the security industry might say is world-class. But, you know, we have 100 million developers on the platform, many of whom aren't necessarily tech sector, financially affluent folks who are able to do these things. And so I think we have to balance both of those as we think about the open source community.
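For context, the app-based codes in that wide range of 2FA options are typically TOTP (RFC 6238): an HMAC over a 30-second time counter, truncated to six digits. A minimal sketch:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, timestamp=None, step=30, digits=6) -> str:
    """RFC 6238 TOTP: HOTP (RFC 4226) computed over a time-based counter."""
    now = time.time() if timestamp is None else timestamp
    counter = int(now) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation per RFC 4226
    code = (int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF) % (10 ** digits)
    return str(code).zfill(digits)
```

With the RFC test secret `b"12345678901234567890"`, a timestamp of 59 seconds lands in the second 30-second window and yields the published test value. Authenticator apps do exactly this, which is why a code works offline with no network round trip.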
Starting point is 00:13:23 And then I think that's where, to me, it's what can the rest of the community do? Like, what are the things we can work with our partners in industry on, securing builds, securing attestations, cryptographically verifying what's going into things and attesting to those, so that the companies that are building these things have a better sense of what's in them, and the companies with the resources who are using them can contribute back to that and help make these more secure, versus kind of putting all that on the individuals. What about change of ownership? It seems like that is a pretty strong signal of potential problems, and I'm not sure if GitHub has anything built into it and the security tools around that particular thing. Like, this repository is now owned by a new user or org.
Starting point is 00:14:09 I mean, it seems like a lot of times that's still not a big deal, but as a downstream person, I'd still want to know about it and be like, well, I went and checked it out. It seems legit. I'm cool. Or this doesn't seem legit. You know, let's take action. We have some protections in place now in terms of the account ownership,
Starting point is 00:14:28 and particularly if, for instance, somebody changes their username and somebody else grabs it real quick and things like that. But one of the things we released recently, which will be free for kind of public use as well, it's not just a paid feature, is attestations for our builds.
Starting point is 00:14:44 And so what you can do here with GitHub Actions is let's say you're an open source developer, you're working on something. Normally you kind of do the build, however, maybe use GitHub Actions, maybe use something else. And then you push the artifact up to PyPI or you push it up to the Rust registry or wherever. And then when a user goes to download that artifact
Starting point is 00:15:01 and leverage it on their systems, they have no idea which repo it came from, what org. They don't know. I mean, it says the name of the repo on it, but there's not really a way to prove that it was you and it was this build process and it was this repo. And so that's where things can get really wonky if the repo changed ownership or the users
Starting point is 00:15:17 or the lead contributors all rotated out over a weekend for some reason. Then what do you do as an end user? Well, you can't really do much today or in the past. And now with attestations, what you can do is you can actually say, I want to cryptographically verify this build. And you can even do things like, I want to make sure that it came from this repo, this org, this branch, and you can actually attest to that before you deploy it in your environment.
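GitHub's CLI exposes this kind of check as `gh attestation verify <artifact> --owner <org>`. Conceptually, the policy side of it, pinning the expected repo and branch before deploying, looks roughly like the sketch below. The claim field names are illustrative, not the real bundle schema:

```python
# Illustrative claims, standing in for fields extracted from a verified
# provenance statement / signing certificate. Real tooling verifies the
# signature first; this sketch only shows the policy comparison.
def claims_match(claims: dict, *, repo: str, branch: str) -> bool:
    """Accept the artifact only if it was built from the expected repo and branch."""
    return (claims.get("source_repository") == repo
            and claims.get("source_ref") == f"refs/heads/{branch}")

claims = {
    "source_repository": "example-org/example-lib",
    "source_ref": "refs/heads/main",
}
```

The value of pinning the branch as well as the repo is exactly the scenario described here: a repo whose owners or contributors rotated out can still be caught if releases suddenly stop coming from the expected build path.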
Starting point is 00:15:48 And I think that's something that has been possible through partnerships with things like Sigstore and other cryptographic means in the past. But the accessibility of that's been hard. It's not really been built in in a way that the average developer could take advantage of it. And now with attestations, it's literally just add an action that we support and maintain to your workflow. And it produces an attestation that people can check against if they want to have that level of rigor in their deployments and security builds. And I think, I mean, that's step one, right? That's not going to solve everything, but I think that's the path
Starting point is 00:16:21 we have to go on as an industry, particularly with open source, is making these things more transparent, making them not just researchable transparent with somebody, a human's eyes, but machine-readable transparent so that we can start to make risk decisions on them in a programmatic, scalable way. How would this idea of attestation apply to XZ in particular, given that entire scenario where you had social engineering over a long time? This was a very patient attack. How would this apply there? So again, I don't necessarily think this would have been a preventative thing, but let's fast forward to a future where most open source packages on the internet have this built in. I think it's a deterrent at that point. And here's what I mean. If attestations were used in this case, then it would have been a very trivial matter for any researcher to look at these packages to be able to,
Starting point is 00:17:15 within a matter of a few clicks, get to the build workflow that shows the instructions that were happening in the actual build itself. And what went into the build? Was it just the source code? Was there other things that went in? Like, how do we actually backtrace this into the visibility of not just the code, but what went in to take that code into the artifact that ends up getting used by end users? And so I think as we see this adopted more and more,
Starting point is 00:17:42 the recognition from malicious actors that this stuff is really accessible and everybody's expecting this transparency, and it's going to be trivial for a researcher to go look at all the build logs and start to build analytics and scans and detections against not just the code, but the builds.
Starting point is 00:17:59 I think it's going to be an important step forward in deterring this as a space and an attack vector. What's scary about XZ is that it was discovered by accident. It was like somebody who just happened to have just like a millisecond too long on their hands and they found this thing. And like, so how many of these things are happening, given now the zoom out of the patience to do the engineering, the social engineering, to get into place, and the multiple profiles and catfishing that took place to sort of wear on the maintainer, right? Like that person was taken advantage of in terms of what a maintainer goes through to
Starting point is 00:18:39 build, run, communicate, et cetera, in an open source community, a software like XZ, for example. The scary part is that it was discovered by accident. And I think, you know, you want to have this attestation, this build process, look, this sort of reproducible build aspect that's verifiable. But then you have the other side, which is like, okay, if I'm going to become a core contributor or a maintainer or have write access to master or main on a given GitHub repo, at that profile level, I know that you have 100 million developers across the platform, but there's a certain level of developer that begins to become a core contributor to a key piece of software. And that person is different and more unique than everybody else on the platform insofar that they have a level of power and control given the prowess and usage of that software. So they kind of elevate themselves. And you were part of the NSA, so you get security clearances.
Starting point is 00:19:41 Not everybody can get a security clearance. So they're set apart, right? And so I think, I'm curious, I think this is Jared's angle is like, how can we set apart certain profiles to have certain levels of awareness of the personhood so that we can have more trusted software? Yeah, I think it's a great question. I would pivot it slightly, at least the way I think about it, is less about the profile and the human and more about the expectations of these critical pieces of software. And here's what I mean. I think there's kind of two elements to this.
Starting point is 00:20:14 I think one is, what is the responsibility and expectations for the organizations, corporations, companies that are using this critical software? Do they have a responsibility to look into and ensure the security of these core fundamental building blocks that really power a lot of the internet and a lot of these companies, right? And so I think today we've seen this, I mean, this is a few years old now, but we saw this in Log4j. There was this outcry of like, well, how are we going to hold these developers accountable? And it was like a handful of folks over in their spare time building this stuff, right? They weren't resourced to secure, build, look at these things. And so in many ways, I don't view the alleged malicious intent in the XZ case as any different than an accidental or poor programming practice or just not securing. I mean, to a certain degree, the outcome is the same in the sense that there's insecure software that is
Starting point is 00:21:10 being included in core functionality across a lot of platforms. And so I think some of this is, I think we have to, as a community, take more responsibility for the open source software we're using. And I think on the platform side, I think there has to grow an expectation of the security tooling and expectations of the code that we're using. And so this is where things like GitHub Advanced Security, code scanning, secret scanning, and there's plenty of other tools out there too,
Starting point is 00:21:37 but I think we have to elevate the expectation that these core pieces of software are going to have those things turned on. They're going to have security scanning with the results made available or at least something that's consumable as an artifact there so that we're kind of hitting this from multiple angles to really level up the security. The challenge of the defender is that you must secure the entire thing, right?
Starting point is 00:22:03 You've got to fortify the entire house. And the advantage of the attacker is they only have to find one way in. That's right. Doesn't that seem futile? Like, I don't know. I'm just getting a little bit worn down perhaps because it's like, I'm just thinking about
Starting point is 00:22:18 how many lines of code are in, for instance, the Debian distribution. You know, because XZ is low level software and certainly widely deployed, but mostly invisible. And would we have considered it critical software? I mean, maybe some people would have, but for the most part, it's just down there, it's utility software. And how much of that is there millions upon millions upon millions of lines of code, of course, of course.
Starting point is 00:22:45 It just seems like we have to overhaul. You're talking about new-ish best practices around writing and deploying secure code, but it almost requires an entire industry come-to-Jesus moment with regard to
Starting point is 00:23:02 these practices before it's ever going to actually help us. I think if we were just looking at the code bases and assumed that come to Jesus moment with regard to these practices before it's ever going to actually help us. Yeah, I mean, I think if we were just looking at the code bases and assumed that all of it was sitting unprotected on the internet, it would feel and likely be futile, to be honest. And I think this is where the rest of mature security programs come into place and things that we can do as an industry. So I think that that's where zero trust and identity as a perimeter and those kind of concepts secure by design come into play. industry. So, you know, I think that that's where like zero trust and identity as a perimeter and those kinds of concepts secure by design come into play. So like, if we have strong authentication in front of access to a lot of these systems, if we have network isolation between key systems,
Starting point is 00:23:37 if we have, you know, role-based access control. So we kind of assume that parts of our parts of these systems will eventually experience some sort of security issue. How do we firewall those off from other parts of the system? And so I think this is where the rest of that comes into play. And I also think the other element to this is I think the industry does need to level this up. Like so the CISA secure by design pledge that was announced and many companies include GitHub signed at RSA a few weeks ago, talks about this, right? It talks about needing increased commitment from key players in the industry to implement secure by design principles as part of, not just as part of their internal programs, but as part of the products they offer to the world and to users,
Starting point is 00:24:21 so that the settings that make things more secure on by default, even if it causes a little more friction or one more click for a user. And I think that's really an important part of this that, you know, honestly, we do have to progress as an industry here. And I think it's critical that that's the other element of responsibility that companies, corporations and organizations take in this space. Yeah, well said. I think it's tough when you get to the individual org or individual developer, and we're relying on them to also do their due diligence and their best practices. Because A, education is a problem, like a lot of us don't know.
Starting point is 00:24:57 And then B, the constant pressure and stress to be shipping more features and code faster, stronger, cheaper, et cetera, with tools now helping us write code that we may not exactly vet. That just makes the problem even more massive because we need the big players to adopt and to sign pledges and to push out secure best practices and suites of tools and everything.
Starting point is 00:25:25 And then we also need the awareness. And we need to equip the everyday developer with the ability to also do these things, use these things, and really just have their wits about them despite all the pressures pushing them away because of that push and pull between convenience and security and that relationship is just so fraught. Yeah, I totally agree. I'll give you a concrete example.
Starting point is 00:25:49 I'm obviously more familiar with what we're doing at GitHub than other places, but we have had a feature for a while for enterprise customers and public repos that was opt-in. It's called push protection for secrets. So we have this thing called secret scanning, that if we detect a secret in someone's code, so like a structured AWS token or Azure token or something like that, we'll alert, it becomes a security alert, but it had to be turned on. And recently we enabled push protection, which prevents, that stops those secrets from getting to the public repo before the commit happens.
Starting point is 00:26:27 And so you're a developer, you're working on your laptop, you go to push up a change to GitHub. If we detect it, we'll stop it before it gets to the public repo and send an error back and say, hey, we detected a secret. That's push protection.
Starting point is 00:26:39 We turned that on for all public repos recently, all public repos on github.com. And that increases friction to a certain degree. There are going to be developers out there who are just pushing test secrets up to their repos to try things out. And they're going to get frustrated and they're going to have to go, you know, search and figure out what setting to turn off or whatever. But it's a secure by design principle that we believe strongly in is that source code is not the right place to store secrets. And we continue to see issues in the news and industry where things have gone really wrong for companies
Starting point is 00:27:09 where an innocuous, probably well-intentioned secret was put in code. Somehow that code gets leaked and there's a phishing incident. And then all of a sudden that secret is used to pivot way further into the infrastructure and cause a lot more damage. And so we believe this is a core tenant of secure software development. And so we turned it on by default for all public repos. And I think that's an example
Starting point is 00:27:31 of the types of things I think we need every company, every organization who's shipping capabilities to developers, users, to think about, what can I turn on by default? What can I just take away as a choice or an education opportunity for someone? We're just going to do this. And sure, you've got options to turn it off if you need to, but this is the way it's going to ship. How has that received? Well, so far, honestly, I think it was, it's one of those things where thankfully most developers aren't doing this every day. They're not pushing secrets to code. And I think it's very likely that many who are didn't really take the
Starting point is 00:28:06 time to step back and think like, oh yeah, maybe I shouldn't do that. Or maybe there's another way to do this. And you can still override those alerts as they come in. But as far as I'm aware, we've had generally positive reception. Same thing with mandatory 2FA. We've seen a significant drop in support tickets since we've rolled out the requirement to make 2FA. We've seen a significant drop in support tickets since we've rolled out the requirement to make 2FA mandatory. And then we've seen a 95% opt-in rate across co-contributors who've received those requirements. It was a day of great joy when I 2FA'd myself on GitHub, so I was happy about that. Same. What is behind the scenes of this scanning process? Like how did you have to re-architecture Git Push, essentially, to GitHub?
Starting point is 00:28:48 Did it have to be a sandbox of sorts that gets pushed to, then scanned, and then kicked back? What's the process? And even what's the cost center? Is this a cost center for GitHub to have to pay for all of this source code to be scanned? What's the architecture? What's the cost?
Starting point is 00:29:04 What's all the things? So I'm not going to butcher the architecture by trying to explain it in detail. Give us the high level overview. But in general, yes, that's essentially the gist is there is a sandbox space where we do the scanning and it's all encrypted. So it's not like we're punching out of that,
Starting point is 00:29:21 but it hits the GitHub side of it. And before we put it into the Git commit, get pull requests, whatever that is on the actual github.com platform, we scan it in a sandbox first. And if we find it, we kick it, we kick back and alert that says, Hey, we found this, you should, we highly recommend you deal with this, you know, clean your history out, remove it from code, get everything clean, and then push up again. And that's kind of the gist of it there. In terms of how we structured it architecturally, we actually partner with industry partners across kind of every sector here that does structured secrets. I think we have over 300, 350, 400 partners that we have essentially the
Starting point is 00:30:02 ability to scan for their secrets and people can just register for the program. And in a couple of cases, we've actually gone a step further where we can show enterprise customers whether it's still valid or not. So it's not only a secret we found, but it's an active secret in code, slightly different than push protection that's after if we found it in your code. So that's actually a huge benefit as well to developers. From a cost center perspective, supporting the open source community
Starting point is 00:30:28 has and always will be one of our kind of core spaces that we invest in and that we support. And so, you know, we essentially support most of the GitHub advanced security features that enterprise customers pay for, for all public repos that includes the compute behind it that includes the scanning that includes you know all those things the things that you can get on a free account on github.com are incredible code spaces so many
Starting point is 00:30:56 so many minutes a month for code spaces usage which if you don't have a developer laptop is is a game changer and even for me personally if i'm just going to tinker around with something on the weekend the last thing i want to do is spend the first five hours getting my laptop patched up to date, whatever developer tool I need installed. I don't worry about any of that anymore. I fire up a code space, which is just a remote development environment that we host and I get to work and, you know, free actions minutes, stuff like that. And that's, that's part of our mission to accelerate human progress through software development. And so I think it does cost money for us to run those things for the public repos,
Starting point is 00:31:32 and I think that's okay. Is this the first time you're hearing about GitHub's advanced security features, Jared? Is it just me first time? First time for me, yeah. Can you explain that then, Jacob? Because I kind of get what it is. I Googled it, landed on a page, but it seems enterprise-focused. But then, as you just mentioned, some of this is public repo blessed. Can you give us a rundown? Sure. I mean, the gist of it is GitHub Advanced Security is our static analysis capability for software.
Starting point is 00:32:03 And that's based on CodeQL. And it also includes our dependency scanning ability. So I'll give you an example here. If you've got a repository on GitHub.com, if you turn this on, if we see a dependency in your source code that's out of date, we'll just send you an alert. And if you have it turned on,
Starting point is 00:32:21 we'll actually open a pull request on your behalf, on your code and say, hey, we found a dependency that's out of date. And also here's the updated version that we recommend. And if it fits, you can just merge it and move on. I do this all the time on a handful of projects that I do personally on github.com. I just kind of like go in every month or so. And I look at all the pull requests, Dependabot's open for me. I merge them and I'm happy and I move on. And then the last bit is secret scanning. And so this is an enterprise, those three together are GitHub
Starting point is 00:32:50 Advanced Security. And then for our enterprise customers, we also have things like a security overview and trending and charts and things like that, that will help enterprise administrators and security teams to administer this across their environments. So that is an enterprise offering that we do offer to our enterprise customers as a security suite for their source code. But then we offer most of that available for free to public repos that are hosted on github.com. So a lot of the open source community
Starting point is 00:33:18 can take advantage of that, those that are hosted on GitHub. Well, friends, I have something special for you because I made a new friend. Tamar Ben Sharkar, Senior Engineer Manager over at Retool. Okay, so our sponsor for this episode is Neon. And as you know, we use Neon, but we don't use Neon like Retool uses Neon. Retool needed to stand up a service called retool db tamar can explain it better in this conversation but retool db is powered by neon okay they have a service called fleets it is a service that manages enterprise level fleets of Postgres, serverless managed fleets of Postgres.
Starting point is 00:34:28 And RetoolDB by Retool is powered by Neon Fleets. Okay, Tamar, take us into how Retool is using Neon for RetoolDB at large. So one big problem we had with Retool, we wanted users to have value, production value as soon as possible. And connecting to a ProDB in a new tool is not something that people will do lightly. But they're much more likely to then dump a CSV into Retool. And so because of that, we said, OK, well, what if we just host databases on behalf of users? And then they can get spin up really fast.
Starting point is 00:34:59 And we really saw that take off. The problem we had is we didn't have a big team. We couldn't spin up a new team to support this feature. So what do we do? And so we were looking at what are the options out there? And, you know, we found Neon. Neon is a serverless platform that manages Postgres DBs. And so like, okay, that's interesting.
Starting point is 00:35:14 Let's kind of look in further. What's kind of really unique about them is you really only pay for what you use, which is exactly the case that we have, right? Because we want to provide this to everybody. Not everyone uses it. Not everyone uses it all the time. And so like if we had to like, you know, us manage a bunch of, you know, RDS instances, we have, right? Because we want to provide this to everybody. Not everyone uses it. Not everyone uses it all the time. And so if we had to us manage a bunch of RDS instances, for example, right?
Starting point is 00:35:30 Basically, we'd have a whole info team to support, figure out, okay, what are they on? How do we do? Try to have some kind of greedy algorithm to get all the data in the fewest moments as possible. This is now a hard problem. That's not kind of a core value, right? A core value is kind of providing that database.
Starting point is 00:35:48 And we don't want to kind of go in and take, we know we're not like an infertile team. We don't want to kind of get in that game. I think what's really great is that, okay, well, one big kind of risk when you think of going in third party is A, the cost. We're giving this free to all users. We have 300,000 databases right now, right?
Starting point is 00:36:03 Like we can't especially especially as we were um rolling this out to begin with right we like didn't know for sure how it would how people respond right and you know we can't all of a sudden have like a couple million dollars you know the bank for uh for this without kind of seeing the the activation that it has on our users so it's kind of obvious but what was the appeal of Neon? What was really appealing to Neon, it spins out to zero. And so because of that, right, it really kind of reduces the cost. And so really, it's really exactly only what we spend.
Starting point is 00:36:33 And there's really actually not a way to actually spend less money, even if we always need ourselves. So you can be like removing all the people cost, right? Because let's say we use something like an RDS, we have to figure ourselves, right? Basically what Neon is doing. Right. How to bucket all the instances together, how to bucket the usages to have as few instances as possible. Right. To scale up and down depending on what's going on.
Starting point is 00:36:55 And now we sort of don't have to worry about any of that part, but still get kind of the cost benefit. And so really it was out, you know, it's a win-win. OK. Win-win. Always a good thing. I like win-win-wins, but okay, fine. Win-win. If it were not for Neon and their offering of fleets of Postgres and how they're essentially your serverless Postgres platform, where would Retool be at with RetoolDB without Neon? Well, we would have to have at least a fully staffed team. Neon call burden would be a challenge. You know, I think we have to spend a lot of time on, you know, making it sustainable. And that's, you know, a whole, you know, other sets of concerns that are, that we don't ever think
Starting point is 00:37:31 about. First of all, like, you know, it's a team of engineers, right? Which is not free. So it's everyone's salaries, right? So let's say probably a team, let's say, you know, eight to 10 people, you know, easily only focus on this. And then it's like, well, does the revenue of RetroBee offset that, the cost, even if just the engineers? So, you know, that's step one. But I think even before then, right, like, you'd have to set up this team before you even had the product. You know, databases and, you know, having them the way that Nian has them, right, like, suspend to zero, having, you know, warm spares that they're, you know, ready instantaneously when you, like, log on to Retool. Those things aren't free. And even if we tried to do an MVP,
Starting point is 00:38:05 there's a basic functionality that needs to exist that we all have to start from scratch. And that would be a huge commitment to this. And I think we would have completely, it would come out a year later because we'd have to do a lot more validation to know that it would have been worth it right before we started. Here, we were able to quickly try it out, see that it was effective, and then grow
Starting point is 00:38:22 from there because the cost was very low. And that really gave us a lot of flexibility of also testing out different features and different flavors of it. Okay, so RetoolDB is fully powered by, backed by, managed by Neon. Neon Fleets, neon.tech slash enterprise. Learn more. We love Neon here at ChangeLog. We use Neon for our Postgres database. We enjoy every single feature that Tamar mentioned for RetoolDB,
Starting point is 00:38:50 but we use it at a small scale, a single database for our application. They use it at scale. One single engineer propped it up, manages it. That's insane. They would have never been able to do this without Neon. RetoolDB would have cost more and may not exist without Neon. Okay, go to neon.tech, learn more, or neon.tech slash enterprise. Enterprise. Has Dependabot gotten any better about not warning on latent code or being able to detect actually used code versus code that happens to be in a dependency that's never
Starting point is 00:39:46 executed in the run of a program or dev dependencies only. Because it seems like in the past, it's had a lot of false positives for me. And so I just, you know, I jumped ship. I'm just like, well, I'm kind of done with you. Because 90% of these aren't actually my problem, but you're making them my problem. And I'm assuming that that's something that y'all work on because I'm probably not unique in that way. And I wonder if it's unclear if they're used or not. And being able to trace that through the code and figure that out in a scalable way is a difficult challenge. So it's definitely something that teams are tracking
Starting point is 00:40:32 and working on and always trying to improve. Fair. I wouldn't want to work on that problem. I can understand that it's a hard problem, but I also want somebody to work on it. Can we go back to attestations? Because it seems like it could be a good step forward in the right direction. And obviously it's out there and ready to use on it. Can we go back to attestations? Because it seems like it could be a good step forward in the right direction. And obviously it's out there and ready to use and stuff.
Starting point is 00:40:49 You gave the workflow a little bit from the end user perspective. Like you have GitHub actions if you're using them, toggle on a thing, you probably have to decide what happens if an attestation fails, and that's roughly the workflow. But what about from the maintainer perspective?
Starting point is 00:41:04 What do I have to do to have my code attested to as I'm deploying it out to people? So the main thing today, and again, this is a very early capability that we're shipping. So I expect this to continue to just improve as more of our partners adopt it. and this becomes sort of ingrained in the developer ecosystem. But today, it's as simple as adding a specific GitHub action to the workflow. So a lot of open source projects do their builds on GitHub actions. And in the workflow itself,
Starting point is 00:41:37 you can specify different segments of it. And often, you would include an action for checking out the code. In fact, that's one of the main ones pretty much every action on GitHub.com includes. You might include an action for deploying your artifact to AWS or to PyPy or to Azure. And then we've written and released an action that will let you attest to the code. And essentially all it does is, as the build's happening, once that artifact is produced that you're either going to deploy somewhere, upload to PyPy or Rust or wherever,
Starting point is 00:42:12 it will sign it using our TrustSor cryptographic kind of root trust. And then it will store that attestation in the same repository on which the action was run on. And so there's essentially a repo name slash attestations. The attestation in the same repository on which the action was run on. And so there's a essentially repo name slash attestations, the attestations there, and it's available for download and use for anybody to verify against cryptographically through the GitHub command line tool. And on the receiving end, obviously you're still in GitHub actions. So now you're just using whatever you guys built to go ahead and do that process during your own deployment when you're
Starting point is 00:42:44 saying. So you don't have to be in the GitHub ecosystem at all to use this. That's the great part about it. So let's say the artifact is built and it's uploaded to PyPy. And then a developer who's sitting in another company using a completely different tech stack, but still uses that PyPy repo, they can download the PyPy artifact onto their local machine. And then they can use the GitHub command line to go check the attestation to see if it's the same one that they think it should be built on that repo, that org, that flow, that branch, whatever their criteria is. And this is where you can use things like policy enforcement software. So open policy agents, a popular one. So you can write policies and say, before I deploy, I want to make sure that everything I'm deploying came from
Starting point is 00:43:32 one of these three organizations on github.com, the source code. It was created and built there, but where it was downloaded and whether it was from a local, you know, artifactory instance or a public, you know, artifact store like MPM doesn't matter because the attestation is cryptographic and can happen out of band. Does that allow you to actually track the binary in the case of a binary deploy to the source code commit? Like, what do you know about the source code? The exact. You can go all the way down to the commit. You can go to the commit. You can go to the actual workflow that built that binary which is
Starting point is 00:44:08 fantastic and this is kind of why when we were talking about xz i think this could end up being a helpful deterrent again i don't think it's a a one size solves everything situation but you can go from essentially a binary you found laying around on your computer to knowing which repo built it, which workflow, the build instructions that went into it, which commit went into it. And it gives you this ability that is incredibly difficult to do now at scale. And that's really how we see this going from an industry perspective is more and more tools like this that help us do this at scale to give that essentially unfalsifiable paper trail. That's really interesting to like just find a binary and
Starting point is 00:44:52 test it. Like how does that work? Literally, how does it work? Underneath the covers? Yeah, like how does that work? How's the GitHub command line doing it? Yeah. Tell us how it works, Jacob. See, now you're double-clicking past my ability to be incredibly useful here. Well, my understanding is, and we'll follow up if I totally get this wrong, but generally speaking, we are looking at essentially the cryptographic hash of the binary.
Starting point is 00:45:17 Right. And then looking up the attestation. You have to kind of know the attestation that you're going against. So knowing which org you think it came from on the internet or knowing like, hey, I think the GitHub Actions org built this. So you have to know where to go ask for the attestation generally, but hopefully corporations deploying in a high sensitive environment
Starting point is 00:45:40 are going to have that knowledge. But you don't necessarily have to say like, oh, I have to go find the attestation file myself. You just have to roughly know where it's at. You can point the command line there and then it's going to go grab the attestation and compare the cryptographic hashes based on the signature and the attestation signing to be able to tell you if it's the same one or not. And then once you have that, then you can display all the information about the build that went into that binary. That'll make sense. Is that a GitHub? It's obviously GitHub specific in your implementation, but is this something that other platforms could also do attestations and just follow the same? You have a spec or something where we could just get it to be
Starting point is 00:46:18 generally useful? Our approach is based on the SIGstore approach. So it's a commonly, it's essentially a scaled version of what we released last year with NPM and SIGstore. So for public repos, there's still that kind of normal flow. And then enterprise customers have the opportunity to use a private implementation that's inside GitHub so that their attestations don't show up on the internet. They may not want them to. This is like the second time I've heard the phrase double click in the last week.
Starting point is 00:46:47 I haven't heard it too frequently, but now I've heard it twice in one week. So good job, Jacob. You need one more. Everything comes in threes. Yeah, for sure. Well, I was going to ask you what else you're working on. What else is cool in this space that is burgeoning or in development or the next attestation that's going to add another layer to our defense in depth.
Starting point is 00:47:06 Yeah, I think this is a space I'm excited about because I want to see more and more parts of not just the GitHub product, but software development products make use of these things. And I think that's where we're going to be heading as industry. So right now we talk about supply chain and table stakes that are necessary for secure development. I think there's, there's some things that are becoming common in the industry, whether everybody's doing them or
Starting point is 00:47:32 not, everybody at least acknowledges they're necessary. And so that's things like don't put secrets in code, keep your dependencies up to date, you know, things like that. I think that the next step is for us to, as an industry, say that including attestation and a full paper trail of what was, what is going into the software that we're all using. And this is, you know, attestations is a step beyond SBOMs, right? So SBOMs is here are the ingredients that are going into the recipe I made. But attestations gives you that receipt from where those ingredients came from. So you know which grocery store they came from. You know which shelf they came from.
Starting point is 00:48:11 You know which manufacturer made that and shipped it to that shelf. And so it gives you that next level down. And so I think that's going to be where we're headed as an industry is, and where I think we should head as an industry, is not only making those things available, but making it just standard as part of the build workflow. That's just, everybody expects it. It's just, we have tools that show it. It's very easy and built into the artifact repositories, the developer workflows, CICD flows, things like that. I think we had to build the scaffolding and framework first,
Starting point is 00:48:38 but now that that's coming along, I think this is where we're headed. How well received, I suppose, has the SBOM been? The software bill of materials, is it widely used? Is it generally adopted? I remember talking about this and hearing about this, but I'm not in this world to even write one, build one, care about one. But how well received have they been?
Starting point is 00:48:58 I think there's a lot of people interested in it. I think everybody acknowledges that it's part of the solution we need to adopt as an industry. But it's also, I think, acknowledged that it's not going to solve everything. It's just one part of kind of this broader trust flow that I was talking about. And so it's, you know, it's something we support on GitHub. It's something that a lot of companies are making standard as part of their build and deployment practices. But I don't think it by itself is necessarily going to be the solution
Starting point is 00:49:26 to the supply chain challenges we all face. When you talk about broader adoption of these practices and tools, what are you and your teams doing in order to get that done? Obviously, you put the features out there, you make them usable, and then you blog about them, and then you use GitHub's channels. But then is there conference talks? Is there training? Is there tutorials?
Starting point is 00:49:47 Because really a lot of this has to be known before it's going to be adopted. What are you doing there? The short answer is yes. Our teams are going out to conferences and talking about this. We're putting together documentation and training on these things. I think there is an awareness here that is part of it, and we're absolutely doing that. At a broader level, you ask what else we were working on and we made it almost 45
Starting point is 00:50:11 minutes without talking about AI. All right, you're allowed. We waited a while. But I think this is the other part of where I think that we can make things easier for an adoption for the developer ecosystem. So I'll give you an example here. We've had this capability for a while called GitHub code scanning, CodeQL, the one I mentioned earlier about from GitHub Advanced Security. And it's great because it's got a really powerful engine that essentially models a piece of software. And then it will trace the sources and sinks, the inputs and outputs of functions, and it'll trace the data through the source code to find out where there's a potential vulnerability. It's fantastic. In the past, what would happen is this would run on
Starting point is 00:50:56 accounts that had this enabled, and it would show developers like, hey, we think we found a vulnerability here. Here's why. Here's some documentation you can read about it. But then it was sort of up to the developer to go figure out what to do about it. And so they had to pivot out of their workflow. They had to go to a search engine or wherever on the internet and go figure out like, okay, well, that's great that you found this for me, but how do I fix it? And so with AI, what we're adding in is the ability, we're calling it code scanning autofix. And so what will happen is in the pull request where it traditionally shows you what's wrong or what we believe to be a vulnerability, it will now also open a second part of the pull request with a suggested fix in it. And so we're using Copilot AI to be able to
Starting point is 00:51:34 do this. And we'll say like, hey, we found this thing. We think this is going to fix it for you. If you agree, just hit accept and move on. And the ability to kind of get that in front of a developer and make it part of their flow, I think is really, really important. And the ability to kind of get that in front of a developer and make it part of their flow, I think is really, really important. And then also, you know, we're early days here, but we're seeing folks use Copilot chat and the IDE so that the interactive chat capability we have to ask Copilot about these things. Hey, is there anything insecure about my code? Can I make my code more secure? You know, what would make this more resilient to an attack? And, you know, that interaction with Copilot, it's going to look at it and say, well, hey, like if you structured your function input this way, it would be safer.
Starting point is 00:52:15 Do you want to do that? Just click yes. And it copies it over and you're off to the races. And so I think I really think this is where, you know, we've kind of known as an industry that things like static analysis are important, and we've all worked with our teams to enable it, and everybody's on a different level of maturity on that journey. But generally, helping developers keep up with the pace by which vulnerabilities are found and CVE sent out has been, I won't say a losing paddle, but it's been challenging. It doesn't feel like we've been making ground. And I think AI and things like autofix are going to allow our developers and our security teams to make up ground in a way we haven't been able to do before. Well, now that we've broken the seal, I thought this was going
Starting point is 00:53:00 to be your answer to the pentapod getting better. I thought you were going to be like, well, we're throwing AI at it and it's getting better at detecting hot code paths. But I wasn't going to bring it up in the moment, so I was waiting for you to bring it up before I loop back around to that. I think Autofix sounds really cool. Didn't we see that demoed, Adam, recently, I think? And it worked great in the demo.
Starting point is 00:53:18 I'm not sure about real life. Exact same. Yeah, it works exactly like that. Demos always work the same in real life, right? Well, I wonder, is this the real world yet? Is this the promised future world? Are people using autofix today? We're getting great feedback.
Starting point is 00:53:32 I mean, it's showing right now that the suggestions are remediating more than two-thirds of vulnerabilities with little to no editing. And we have bigger plans for this, too, to be able to really, I mean, our goal is to goal is to make it easy for developers to build secure software. And so how do we think about this at scale? What are ways we can reduce that friction for developers finding and fixing vulnerabilities in code? And the interesting thing with something like Autofix and CodeQL is even if it's clean today and everything's fixed today, it may not be tomorrow because security researchers are finding new things every day.
Starting point is 00:54:05 CVs are getting released every day. And so how do we make this an ongoing practice that is low friction and low pain for developers? And I think that's just really part of how we're trying to design and think about integration of AI capabilities across the board is how to accelerate developers and get them focused on the things they want to be focused on. And frankly, pretty much every organization and company wants their developers to be focused on. They don't want them clicking through a bunch of menus and searching how to fix something that they fixed yesterday, but just can't remember what it is. They want them moving on to the value add work. And so do we. What about proactive versus reactive? Because CVEs are very reactive when it comes to security. It's like it's a known thing.
Starting point is 00:54:48 And it's obviously an awareness thing once it's known because then it's still burgeoning and it's still being more people being made aware of it. But how about proactive things? I know you mentioned scanning. Obviously, a testing is part of the, to some degree, proactiveness. What other plans or ideas do you personally have or does your organization have around being proactive, typo squatting, things like that? Where it's like, you know, I didn't mean to type in React spelled incorrectly.
Starting point is 00:55:16 I meant, or with a plural, you know, that kind of thing. Like how, what are the proactive ways you're securing things? I'll kind of work my way into it a bit. So at a high level, like with our SaaS capabilities, CodeQL is why I actually am a huge fan of CodeQL just as a security practitioner, whether I worked at GitHub or not,
Starting point is 00:55:35 because the way it works underneath the hood is about variant analysis and modeling, not about trying to pattern match on a specific CVE. Now, obviously our security team and researchers are informed by the types of bugs and vulnerabilities that are being found by the research community. And we have a great research team inside of GitHub as well. But the way it works is it's modeling and looking for patterns and known insecure patterns versus a specific like, oh, we know that function in this thing is broken or vulnerable. And so I think that's kind of step one.
Starting point is 00:56:08 I think the other part of this is I think this is where we're going to see significant advances, and we're already seeing advances now in editor co-pilots. So we build and deliver GitHub co-pilot, but being able to have that AI assistant in your code looking for things that are typos, that are, hey, we saw you typed it this way, but did you mean this? Like, we actually think it would be much faster if you did it this way.
Starting point is 00:56:35 And like, I think we're going to see a lot of advances in the proactive space as those filters and as the models and as things like fine tuning get better and better in our shipping. I think that's a good place for AI, obviously the pattern matching and being that buddy next to you. So you don't always have to be on pins and needles of, am I making a secure choice or this dependency? Is it, you know, is it really secure? Has there been a maintainer swap recently? Was it the right name? Did I typo squat or typo butcher this and I
Starting point is 00:57:05 actually installed the wrong thing? We've seen that even with chat GPT where people will ask for things and they'll give it back fake. You know this better than I do, Jared, because you run news, but like the chat GPT will give back fake information. It'll hallucinate a package name. Yeah. And then some of you will go register that package precisely thank you jared hallucinating package names and it's not real but now it's sort of in the zeitgeist of hallucinations and then people think it's a real thing and now a squatter will sit on that and do something nefarious which i think that's a great place for ai to pattern match on that because as a human i am generally going to be lazy or potentially even just not as good every single day, all day, every day. And so I'm distracted. I'm going to mess up, you know, flawed. Yeah, I agree. I think
Starting point is 00:57:51 there's a, there's a productivity gain here too, that shouldn't be overlooked, which is like, there are a lot of things that technical people sometimes just don't go do because they're like, well, that's going to take me forever. And like, I'm a good example of this. So I grew up as a developer. I've been a developer most of my career, but I don't develop every day anymore. And so the idea of like getting a dev environment spun up and going and doing something productive is usually like not worth the effort and not really what I'm paid to do. But, you know, a few months ago we were working on a, there was like a Jupiter notebook that helped analyze statistics and
Starting point is 00:58:25 heuristics for the workforce. And it was something I wanted to use as a people manager running an organization to like get insight into my workforce. And it wasn't doing what I needed to do. And I was like, ah, I haven't done a Jupyter notebook in like eight years. This is going to take me forever. And then I was like, I wonder if Copilot can help. And so I literally pulled it up and started asking Copilot some questions and started doing some autocomplete stuff.
Starting point is 00:58:47 And I had it sorted out in like 20 minutes, had it done what I needed to do, got my answer, was able to get back to what I really needed to be doing. Versus that probably would have taken me six, seven hours without an AI assistant being able to do those things. And I just think of that times 10, times a thousand, times a hundred thousand, for corporations and organizations. And I think that's gonna just get more powerful as we go. Well, it's time to monitor your crons. Simple monitoring for every application. That is what my friends over at Cronitor do for you. Performance insights and uptime monitoring for cron jobs, websites, APIs, status pages, heartbeats, analytics checks, and so much more.
Starting point is 00:59:54 And you can start for free today. Cronitor.io. Check them out. Join 50,000 developers worldwide from Square, Cisco, Johnson & Johnson, Monday.com, Reddit, Monzo, and so many more. And guess what? I monitor my cron jobs with Cronitor and you should too. And here's how easy it is to install and use Cronitor to start monitoring your crons. They have a Linux package, a macOS package, a Windows package that you can install. And the first thing you do is you run Cronitor Discover when you have this installed. It discovers all of your crons.
Starting point is 01:00:30 And from there, your crons will be monitored inside of Cronitor's dashboard. You have a jobs tab. You can easily see execution time, all the events, the latest activity, the health status, the success range, all the details, when it should run. Everything is captured in this dashboard. And it's so easy to use. Okay, check them out at cronitor.io. Once again, cronitor.io.
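Coming back to the package-hallucination squatting the hosts described before the break: the "great place for AI to pattern match" idea can start as something very small, like flagging install names that sit one or two typos away from popular packages. A toy sketch only, not any real tool's logic; the package list and distance threshold here are illustrative assumptions:

```python
# Toy check for typosquat / "hallucinated package" near-misses.
# KNOWN_PACKAGES and the threshold are illustrative, not a real allowlist.
from typing import Optional

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

KNOWN_PACKAGES = {"requests", "numpy", "pandas", "django", "flask"}

def suspicious(name: str, max_distance: int = 2) -> Optional[str]:
    """Return the known package `name` nearly collides with, if any."""
    if name in KNOWN_PACKAGES:
        return None  # exact match: nothing suspicious
    for known in KNOWN_PACKAGES:
        if edit_distance(name.lower(), known) <= max_distance:
            return known
    return None

print(suspicious("reqeusts"))  # a near-miss of "requests"
print(suspicious("numpy"))     # exact match, fine
```

A real guard would also consult the registry itself, but even this much catches the "distracted human, all day, every day" failure mode the hosts joke about.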
Starting point is 01:01:03 I think AI Red Teams makes a ton of sense. I mean, there's probably startups doing this. I don't know if you all are thinking about it or doing it, but I've done some penetration testing, especially after I got out of college. And it's very common for an enterprise to hire a security team, an outside consultant, to come in and pen test their system. And they're
Starting point is 01:01:26 very expensive and they're very good at what they do oftentimes. But a lot of that work is just grueling and fuzzing and like running this against that and doing this. And like they have their set of things they do, you know? And then of course there are like the, the expert hackers who the AI is never going to be as good as this guy or this gal because they know whatever, whatever. That's real. It's real rare, but it's real. But for the 80% of orgs that can't afford red teaming or auditing at all, but could probably just send a bunch of computers
Starting point is 01:01:55 to do non-deterministic fuzzing against their systems, that seems like it'd be a win in the security world. Is that going on? Is that something GitHub's thinking about? I mean, it's definitely something that's going on. I'm closely following several startups that are putting some time and energy into this space. I think it's going to be another powerful tool
Starting point is 01:02:14 in the tool bag very soon. I think generally in the security space, there's a lot of things that fall into that category, where the first stage of something that we are paying a very advanced, highly educated user to go do is often repetitive. It's often like, I'm going to go query this database to figure this out, and then I'm going to go do this Splunk query, and then I'm going to take that and export everything into an Excel spreadsheet so I can do a pivot table with these other IP logs that came from over here, or whatever the case may be. And I think that's an area, too, that we're going to see AI-like response and kind
Starting point is 01:02:51 of SOC work, you know, in like the early triage stage, gain a lot of speed. Because, you know, I think what we really want is, we don't want less people on our teams. We want those people doing the things that they're trained to do and the things that really, truly add value on top of that: being able to use their intuition and experience and find the signals that don't look right, that know how to go triage and figure out how to deal with a potential security incident or rule them out. But there's often so much time spent before the experience can kick in. I think this is another area we're going to see some pretty amazing work done. I know there's a lot of companies doing that in the space right now. How do you see GitHub's position in that
Starting point is 01:03:35 world? Where do you decide what GitHub should invest in? I'm sure you're not the only one deciding, but what's the decision-making process of what is worthwhile for GitHub to be doing versus, well, that's something that some startups can do, but we're not going to do that? It's definitely not my decision. I run the day-to-day of our internal security team. But I can share, our focus is developers. Our focus is accelerating human progress
Starting point is 01:04:03 through software development and enabling open source. And so we tend to focus on the things that we can bring a lot of value to in that space. And that's why we're so excited about some of the AI capabilities because I think we all see the news articles every day. There's a million new models and apps coming out every day. I would say it's probably, if not accepted,
Starting point is 01:04:26 at least talked about a lot, that a lot of those are cool technologies looking for a fit, or they're solutions looking for a problem in some cases. I think in software development it's just clear how powerful this is. And that's why we're so excited about incorporating that into the editor, into things like, I mean, how often do developers work on code all day long? They finally get to the time they're ready to do the pull request and they're like, ah, do I really want to spend an hour writing a really great set of documentation on my pull request? Well, what if we had AI be able to scan all the changes that they made and
Starting point is 01:04:59 write 80% of it for them before they did that? And so I think it's been clear to us for a while. I mean, we released Copilot Tech Preview in late 2021, well before the current kind of wave of things was out. It's been clear to us for a while, this is a huge win. And I think it will come in other parts of the industry, by the way. I don't think, I'm not in any way saying it's not going to be helpful in other parts of our lives,
Starting point is 01:05:22 but I think the software development space, given the structure, given the modeling that exists, and given the tooling and the work that's already gone into the industry over the last 20 years, we're just seeing huge wins and huge gains already. So you're in charge of, actually, GitHub.com operational security
Starting point is 01:05:37 as well, or yes? Yes. You got any cool stories you can tell us? Like any long nights? Any rough weekends? You know, DDoS? I mean, everybody in the security world's had long nights and rough weekends. Tell us some horror stories. Come on, man. You've survived them.
Starting point is 01:05:52 You survived it. Surviving, maybe. Yeah, so internally, real quick. So our security team is great. We basically, like the short version is, we protect the company. So we protect, you know, we call ourselves hubbers. We protect our hubbers, the laptops and the data and the access of our internal systems. We protect the product. So, you know, operationally and github.com and our products, but also working closely with our engineering partners to make sure that what we build and ship is secure and safe. And then
Starting point is 01:06:22 we also help secure the community. So we have a research team that's out looking for vulnerabilities in open source software. They're helping educate the open source community and researchers on how to use things like CodeQL and secret scanning and how to incorporate AI into their secure development practices. So that's kind of the three-pillar remit that we have. You know, in terms of war stories, I think, interestingly, what's been on my mind a little bit more lately has been, I mentioned earlier that we offer
Starting point is 01:06:51 a lot of amazing features for free to public repos, so Codespaces and Actions minutes and free repos and most of the capabilities that enterprises have. I know where you're going with this one.
Starting point is 01:07:09 It turns out that threat actors also have figured that out. And when they hear things like free compute, they go right after it for, you know, pick your abuse factor. We see campaigns that will try and escalate the number of stars that a particular repo has to increase its popularity, right? We see people trying to mine crypto using free compute. I mean, we see a lot. Hosting files on github.com that we'll just say don't have anything to do with software development.
Starting point is 01:07:35 And so it's a challenge. And it's something that we have a fantastic team working on. We're employing, we have been employing, machine learning and AI for a while in this space. We'll continue to do that. But it's a really complex, challenging problem. And the balance we have to strike here is, because we serve so much of the world's software development ecosystem, we can't turn the dials too strict, because then we start locking out hundreds, thousands of users
Starting point is 01:08:02 that have legitimate use cases or security research. And certainly those things happen. You can't fine-tune every filter to be perfect, but we really try and strike that balance of how we do that. And so this is where we're also working with our product counterparts to understand what are things we can make changes to in the actual product, or maybe a signup flow, or things like that, that will decrease the likelihood of abuse or impose a bit higher cost on actors who want to do that. And so that's certainly something, you know,
Starting point is 01:08:32 there's new campaigns every day that come out in the space that our teams are firefighting. And they're doing a great job at it. And it's something that is just top of mind because it's something we don't see getting less of it. We're definitely seeing an increase. And AI is a tool that those actors are using as well. So they're using AI to generate fake issues and pull requests or whatever the content is.
Starting point is 01:08:57 They're using it to create fake profile pictures, all sorts of stuff. Wow. Never a boring day, I'm sure. So what happens in your life, in the life of Jacob, when a DDoS hits or something? You on pager duty? Are you above that? Do you rush into the hospital? Or the hospital? Hopefully not the hospital.
Starting point is 01:09:14 No, rush into the office? Or are you working from home? Do you get situations where like, hey, we're getting DDoSed. What are we going to do? And then what do you do in those circumstances? Yeah, I mean, there's always, I think that's always true in security teams.
Starting point is 01:09:26 We're a fully remote company, so there's no rushing into the office. It's usually rushing to my office if needed. But we have a fantastic team. I kind of jokingly have told the team, if you need me to log into production and do anything during an incident, we've probably already gotten to a state where things are pretty bad. Yeah, bigger problems. That's not what I should be doing.
Starting point is 01:09:46 So no, my goal is to support the teams, understand what they need, and figure out do we need to page more people in or the right people in? How can I support the great leaders that we already have? And then there's a comms element to this as well. So depending on what the issue is, we had to rotate one of our public keys last year. And we believe very deeply in transparency in what happens on the platform. We're trusted in the community and that trust is
Starting point is 01:10:12 only maintained through transparency, I think. And so, you know, a lot of it is, you know, how do we want to get these comms out as quickly as possible? How can we be as transparent as possible, you know, sticking to the facts and sharing as much as we can, particularly actionable information. So that's a lot of what we do. Thankfully, our engineering operations team is world class and handles a lot of the DDoS attacks. And they're very, very good at it. So on that particular one, we have a great engineering team on that. No hospital runs for you then, I guess.
Starting point is 01:10:43 Yeah. Let's hope not. Keep you out of the hospital. Oh, that's good. I don't want to go to the hospital, especially for DDoS. That's a different kind of DDoS, you know what I'm saying?
Starting point is 01:10:49 Indeed. You show up and you tell the doctor, I got a DDoS. He's not going to know what to do, you know? Well, get out of here. Don't, go deal with that. When you look at the open source supply chain, I really don't even like to call it
Starting point is 01:11:00 the open source supply chain, but it's the industry-accepted term, so bear with me. When you look at it, and you realize that open source has obviously won, and you realize how important a role it plays in, obviously, software at large, but also innovation, new startups, new side projects, an individual developer's life, the freedoms that a person can have to create software and just share it simply: when you look at that entire ecosystem as a security expert, what do you wish would be there that's not there today to secure it? What is it? Like, if you had a magic wand
Starting point is 01:11:41 and you just somehow wave it and a couple new things appear, what would those things be and what role could you personally play in making them possible? That's a great question. I think it goes back to what we were talking about earlier, to be honest. I think that today there is a lot of variation in freedom, which is a good thing, to be clear. I'm not suggesting we take that away. But there's not necessarily clear paved paths for open source developers, hobbyists, and even more corporation-backed open source efforts to know what the best practices are for building, securing, deploying, attesting, signing. It's complicated, right? And, you know, I've been in this space for a really long time. And so, you know, when I rattle some of these things off,
Starting point is 01:12:31 it may feel like, oh, yeah, like, okay, cool. That's easy, quote unquote. But it's not. You know, we didn't have SLSA frameworks, you know, 10, 15 years ago. Like, the frameworks and the thinking are there, but I don't know that we as an ecosystem, and this is beyond GitHub, GitHub's part of that ecosystem, make it really easy for people to do the right thing. So build the right way, secure it, update it, patch it, deploy it, sign it. Like, that end-to-end flow is still complicated. People use, maybe they store their source code on GitHub, but they build it somewhere else. And then when they build it somewhere else, they're not scanning it or that place isn't secure. And then when they upload it, we don't have any way to see where it came from.
Starting point is 01:13:13 And then when we download it on the other side, we don't really have a way to automatically get a sense of the risk because it's difficult to tie all those things together. And so I think if I could wave a magic wand, it would be essentially to have us partners in industry, you know, I think GitHub's part of this, to make those things easier for developers to just do the right thing out of the box. And then, of course, have the freedoms if they need to do something more complex or different,
Starting point is 01:13:37 that's totally fine. But I think a lot of use cases just want to know, okay, how do I build this thing and deploy it to this cloud provider? How do I build this thing and deploy it to this cloud provider? How do I build this thing and make it show up in PyPy and have it trusted with a little badge on it? And I think to do that well takes a significantly higher amount of work and expertise than would be optimal if we really want to scale this.
Starting point is 01:14:03 It sounds like that world that you just painted is a world where GitHub accepts more and more responsibility as a security center point. You've already accepted the responsibility of hosting the open source code, right? You've already accepted the responsibility of supporting on all the ways open source at large. So now the final layer might be more and more over time if not just now responsibility on the security front typo squatting like you'd mentioned on maven if i can download something from there or pi pi or wherever and i can have that attestation back
Starting point is 01:14:40 that i think this you're already doing some of the proving grounds for this but it sounds like you're you're for gith up accepting more and more responsibility from a security standpoint. I mean, I'll say that we as a company take that responsibility already very seriously, and we talk about it a lot internally. And I think at a broader level, I think each industry player in this space, we all have to take more responsibility for this. I don't think it's just GitHub. I think it's all of the corporations that are not just investing in open source,
Starting point is 01:15:33 because there's many that do that. They pay developers to work on open source projects. I think that's great. But I think it's also the organizations and companies that use open source prolifically that need to take more responsibility in this space too. And I think it's all of us together. I think GitHub's already taking these strides
Starting point is 01:15:50 and will continue to do that. And that's why we've released things like Advanced Security for public repos. That's not a free thing for us to do, necessarily, but it's an important thing for us to do, right? And so I think, as far as I'm aware, that vision and direction is not going to change for us. We're going to continue to invest in those things. Well, let's imagine then, because this is probably pretty close to true.
Starting point is 01:15:50 There's a lot of people listening to this podcast with us, and they're an hourish in, and they're like, man, this is awesome. But they hear you say that, and they say, wow, I would love to find a way into – I'm at one of those organizations that could partner with GitHub to bolster the security model of open source, the open source supply chain. In what ways can they reach into GitHub, talk to you, talk to others to create that bridge, to create that partnership? What are some of those paths and methods? Yeah, that's a good question. I think from a practical perspective, without having to reach out, I think there's some simple steps that a lot of organizations can take, which is go play with the new attestation capability and use it. Start signing artifacts and making it part of your build workflows and then talk about it.
Starting point is 01:16:35 Tell people how you're doing it. Give us feedback on what would make it be better. Because I think those key scaffolding building blocks are so important to the industry right now. Turn on things like secret scanning and push protection, like show through example, lead through example on how to do these practices internally. And I think in terms of the partnership angle, we have a fantastic OSPO, open source program office at GitHub that does some of these partnerships. The security research team that I mentioned earlier is always out talking up to the security community about how to do these things and level this up and make it better. And then there's other kind of external entities. So there's the Alpha Omega project as part of the OpenSSF, the Open Source Security
Starting point is 01:17:16 Software Foundation, if I got those acronyms right, that's looking at ways that some of the bigger corporations like Microsoft and Amazon and others have invested money into on how to level up the entire open source ecosystem security space. And what are the programs and possibilities that they can do to help do that? And so I think there's opportunities there for corporations to invest financially if they so choose to be able to do that. And then, you know, at a very practical level, like go sponsor your favorite open source project. Like I use Homebrew like crazy. Homebrew is awesome. Go sponsor it.
Starting point is 01:17:49 That kind of stuff. Dig it. How about the maintainers themselves? Like give me some nuggets for specifically open source software maintainers who are either burdened, tired, excited. Pick any adjective you want to describe a maintainer. What can they do to personally bolster their GitHub profile?
Starting point is 01:18:10 What things should they do? What are specific things they could do on their repositories, etc.? Even their organization, if they have an org for their repo. What are some things for them? Yeah, I think open source maintainers are amazing, first of all. I'm so thrilled that that's part of the community we're able to support every day. I do think that the adjectives you mentioned probably at some points describe every maintainer, maybe all at once, maybe four times differently a day, maybe over their journey.
Starting point is 01:18:37 Because I think it can be overwhelming, right? Some maintainers don't have a robust set of contributors that are helping. And it's, you know, a one or two person effort. Our hope is to be able to give them the security tools built into GitHub that get their level of security up to something that is significant. And so, you know, this is where things like just, you know, go ahead and turn on code scanning if you haven't done that yet and experiment with it and see if it can help secure the product. And, you know, things like attestation,
Starting point is 01:19:11 even if you don't, as a maintainer, use or care about attestation, make it available to your developers and your users, because it's part of the repo. Turn it on and include it in there once we've kind of gone full GA with that. I think there's things like that that developers can do and maintainers can do. And then I think there are other things that we
Starting point is 01:19:29 are continuing to try and make more accessible and easier for maintainers as well. So things like, we've got some scanning tools we released open source to help make GitHub Actions workflows more secure and detect insecure or overprivileged requests in GitHub Actions. So there's things like that as well to just kind of be aware of, and, you know, always reach out to the OSPO and other places in GitHub and the community for help on those things if folks, you know, need some additional guidance.
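The kind of workflow-hardening checks Jacob alludes to can be approximated with a couple of simple rules: don't grant blanket write permissions, and pin third-party actions to a full commit SHA rather than a movable tag. A toy sketch; these two rules and the sample workflow are illustrative, not GitHub's actual open source tooling:

```python
# Toy audit of a GitHub Actions workflow file for two common risks:
# blanket write-all permissions and unpinned (mutable) action refs.
import re

def audit_workflow(text: str) -> list[str]:
    findings = []
    if re.search(r"^\s*permissions:\s*write-all\s*$", text, re.M):
        findings.append("workflow grants blanket write-all permissions")
    for m in re.finditer(r"uses:\s*([\w./-]+)@([\w.-]+)", text):
        action, ref = m.groups()
        # A 40-hex-char ref is a pinned commit SHA; tags and branches can move.
        if not re.fullmatch(r"[0-9a-f]{40}", ref):
            findings.append(f"{action} pinned to mutable ref '{ref}'")
    return findings

SAMPLE = """\
permissions: write-all
jobs:
  build:
    steps:
      - uses: actions/checkout@v4
      - uses: someorg/sometool@8f4b7f84864484a7bf31766abe9204da3cbe65b3
"""

for finding in audit_workflow(SAMPLE):
    print("-", finding)
```

Real scanners do much more (token scopes, untrusted inputs, script injection), but even this much surfaces the "overprivileged requests" Jacob mentions.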
Starting point is 01:20:03 an idea with you. Maybe it's a, in quotes, feature request. Maybe it's not. Maybe it's already there, and I don't know. But what if there was this idea of consensus for when you add a new maintainer to a repository that there is a toggle that says, okay, every other repo out there, secure by default, is I have powers. I can give power. I don't need consensus.
Starting point is 01:20:25 But if there's one or more maintainers, you have to sort of have somebody give somebody access to become a maintainer. But then the other people who are part of the organization have to do it as well. And maybe some sort of like personal attestation, which is like: I, Adam Stacoviak, agree that Jerod Santo can give XYZ access to maintainership and control of this repository. Something that's like, because you can do that personally, right? But is there a way to bake that in with software?
Starting point is 01:20:56 Just a simple thing like that. Does that add more configuration? Does that add more burden to the process? I kind of feel like consensus is a natural thing to ask for. And why not bake it into the blessing of one more maintainer to the project? Yeah. I mean, whether we should or not, I'm not going to touch that one, because there's been many books, research papers, and blog posts written on that. The Cathedral and the Bazaar is still one of my favorites on the topic of open source maintainership and kind of thinking about
Starting point is 01:21:22 these communities and systems. In terms of the technical side of it, that's actually what we do internally at GitHub for entitlements. So our internal access system is done essentially the way you described it. So if somebody wants access to one of our tools or a third party capability that we have inside, or they are a developer and they're new and they're like, oh, I need production access to do this thing, or I need that kind of access for this engineering system. They actually open a pull request. And depending on the sensitivity of that entitlement, different people get tagged in to be able to approve that.
Starting point is 01:21:55 And if it's a very sensitive one, it's going to go all the way up to a VP. And if it's extremely sensitive, we have to renew it every six months. And so the ability to do that in some ways is just baked into pull requests and the Git workflow already, which I think is really fantastic. So I think, you know, from an open source perspective, I think developers and maintainers could absolutely do something like that, have like a maintainer file, and use a community-driven pull request approach to be able to do it. Whether we need to build something in addition on top of that, I think that's a great question. And I would love folks smarter than me
Starting point is 01:22:28 about open source maintainership and the socio kind of dynamics at play there to weigh in more than I would from a security perspective. The maintainer file is a good start, I think. And you already have pull requests, so there's no software needed to be written, really. It's just sort of like a text file at large in a repository; everybody touches it. What do you think, Jerod? Is that
Starting point is 01:22:49 a good start to something like that? Do you agree with that? There are people smarter than me to answer the question. No, uh, I have only thought about it for a few minutes. It certainly makes sense in the context of people who agree that they want to do that, you know? Yeah.
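For what it's worth, the maintainer-file idea being batted around here needs very little machinery: a MAINTAINERS text file plus a rule that existing maintainers must approve the pull request that adds a name. A toy sketch of that consensus rule; the all-must-approve policy is just one possible choice, invented for illustration:

```python
# Toy consensus check for adding a maintainer via a MAINTAINERS-file PR.
# Policy (an assumption): every existing maintainer must approve the PR.

def check_new_maintainer(current: set[str], proposed: str,
                         approvals: set[str]) -> bool:
    """Return True if adding `proposed` satisfies the consensus rule."""
    if proposed in current:
        return False  # already a maintainer; nothing to add
    return current.issubset(approvals)

maintainers = {"adam", "jerod"}
# Both existing maintainers approved the PR adding "jacob": consensus.
print(check_new_maintainer(maintainers, "jacob", {"adam", "jerod"}))  # True
# Only one approval: consensus not reached, the merge should be blocked.
print(check_new_maintainer(maintainers, "jacob", {"adam"}))           # False
```

Wired into branch protection, this is essentially the entitlements-by-pull-request flow Jacob says GitHub uses internally, scaled down to a single file.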
Starting point is 01:23:20 And so at least then we have culpability in the case of bad vouch, good vouch. You know, at least we know how it went down and it wasn't just like usurped authority. It was actually provided authority. So I could see some positives. It definitely is the way, like Jacob said, they do it internally at GitHub. And so like it can work inside of chains of authority, but open source projects and chains of authority are often at odds with each other. We use pull requests for everything inside GitHub. That's how we do decision documents. That's how we
Starting point is 01:23:51 do all sorts of things through pull requests, which is nice because we have the ability to kind of see the changes and trace the approvals, and it's even how we do security exceptions. Alright, cool. Plus you're GitHub, so pull requests are really cheap. You guys are getting cheap over there.
Starting point is 01:24:09 That's right, employee discount. Employee discount on pull requests. That's right. This one's free. Use them if you got them. Everybody probably thinks my developer green square chart is wild while that person develops all the time. This is just the way we work at GitHub,
Starting point is 01:24:24 so most people's GitHub activity chart looks that way. What's left in terms of securing GitHub or keeping it secure, whichever you want to phrase it, you'd probably say keeping it secure versus securing it. But what else can we talk about that makes sense before we call this show done? What's on your mind? What have we not asked you? That's a good question.
Starting point is 01:24:42 I mean, we've covered a lot of the topics that I think are near and dear to my heart, certainly. I mean, I can probably talk about the work we do on the security team for another eight hours and not run out of things to talk about. But, you know, at a high level, we take that responsibility very seriously that we talked about it being at the center of the developer ecosystem. It's embedded into everything we do inside GitHub. It's really great to see the partnership between the security team and the engineering teams and the product teams. It happens every day, all day. We're, you know, side by side with our engineering teams, helping to build in security across the board. And, you know, I think part of what we're also excited about is the integration of
Starting point is 01:25:26 AI into those capabilities and what it's going to do for not just being able to kind of have that there for the sake of it, but truly being able to make life easier for the developer and remove some of that security toil and just regular toil from their plate so they can focus on things they want to focus on, things that their teams and businesses want them to focus on. And so at a high level, those are the things that we're really focused on as a team, as a business,
Starting point is 01:25:53 and I think make a lot of sense. Appreciate it, Jacob. It's been a lot of interesting conversation. I definitely am with you. I'm bullish on this attestation thing. I'm not bullish on how hard it is to say the word, but I do think it is a feature that should be highly leveraged to much success. It's not easy to spell either, but I agree.
Starting point is 01:26:13 Was there not a synonym? I mean, where's the thesaurus? Can we pick something a little bit easier? A test... attestation. This is not a test. Oh, it is. Oh, it was a test. Yeah, you failed it, then you passed it. Jacob, thank you so much for taking time out of your day to just spit some security knowledge with us, take us through the ropes of what you're doing there at GitHub. We obviously are massive fans of the platform
Starting point is 01:26:36 and all the developers on there doing what they do. We appreciate you sharing your time. Thank you. Thanks so much for having me. This was a great conversation. So as you would expect, it takes a lot to secure GitHub. Of course it does, right? It's the largest developer platform on planet earth. It's probably the largest target for all the things basically on earth. And so Jacob and the many teams that support Jacob
Starting point is 01:27:07 and their cause have their work cut out for them. So give them grace and maybe a vote of confidence and some ideas on securing GitHub. Attestation seems kind of cool to me. We talked to Daniel Stenberg about this, about Curl. That is the upcoming episode on Friday on Changelog & Friends, deep into the world of Curl. And I mentioned attestation and this episode on that show.
Starting point is 01:27:36 So there you go. Stay tuned to that on Friday. Of course, a massive thank you to our friends over at Neon. They power our tiny little, in comparison to the fleets of databases they manage, serverless managed Neon Postgres database. Yeah, they power our database. That's kind of cool. But they also power fleets, RetoolDB, and many, many others.
Starting point is 01:28:03 Check them out at neon.tech slash enterprise or just neon.tech. And of course, to our friends over at Cronitor. I love Cronitor. I monitor all my crons with Cronitor, cronitor.io, and you should too. And of course, to our friends over at socket.dev, proactively securing open source, shifting left in a dev tool. The best. That's the best. Check them out, socket.dev.
Starting point is 01:28:34 And of course, to our friends, our partners, our home, fly.io. That's the home of changelog.com. Launch your apps, your databases, and of course, now your AI near your users, no ops, fly.io. And to the beat freak in residence, Breakmaster Cylinder. Oh my gosh.
Starting point is 01:29:01 The beats are banging. I love them. The beats get me going. That take on me riff. Oh my gosh. That's fire. That's fire. Okay.
Starting point is 01:29:14 I'm done. You're done. This show's done. We'll see you on Friday. Bye.
