The Changelog: Software Development, Open Source - Managing Secrets Using Vault (Interview)
Episode Date: February 17, 2017
Seth Vargo, the Director of Technical Advocacy at HashiCorp, joined the show to talk about managing secrets with their open source product called Vault, which lets you centrally secure, store, and tightly control access to secrets across distributed infrastructure and applications. We talked about Seth's backstory into open source, use cases, what problem it solves, key features like data encryption, why they chose to write it in Go, and how they build tooling around the open core model.
Transcript
Bandwidth for Changelog is provided by Fastly. Learn more at fastly.com.
Welcome back, everyone. This is The Changelog, and I'm your host, Adam Stachowiak.
This is episode 239, and today on the show, we're joined by Seth Vargo.
Seth is the Director of Technical Advocacy and employee number four at HashiCorp.
We talked about his backstory to open source,
their open source product called Vault,
which lets you centrally secure, store,
and tightly control access to secrets
across distributed infrastructures.
We talked about the use cases, the problems it solves,
key features like data encryption,
why they chose to write it in Go,
and how they built tooling around the OpenCore model.
We got two sponsors today, GoCD and Toptal.
First sponsor is GoCD.
Head to gocd.io slash changelog to learn more about this open source continuous delivery server.
GoCD lets you model complex workflows, promote trusted artifacts,
see how your workflow really works, deploy any
version anytime, run and grok your tests, compare builds, take advantage of plugins and more.
Once again, head to gocd.io slash changelog to learn more. And now onto the show.
All right, everybody, we're back. We got Seth Vargo joining us today.
Jared is not here.
He's AFK for the week.
So I'm here solo with Seth talking about security, keeping secrets, developer advocacy, all that fun stuff.
So, Seth, welcome to the show, man.
Hey, Adam.
Hey, thanks for having me.
Super excited to be here.
And from what I understand, you also are from Pittsburgh.
So the audience don't know this because I don't often say it, but I'm originally from
Pittsburgh.
I live in Houston, Texas, but Pittsburgh is my, you know, growing up stomping grounds,
man.
Yeah, I moved here.
I'm from Pennsylvania, but I moved to Pittsburgh in 2009, did my undergrad here at Carnegie
Mellon, kind of fell in love with the city.
And I can't say I haven't left since cause I flew 300,000 miles last year,
but I pay rent here and it's a beautiful city. And you know,
I love the culture, the food, the sports teams, the whole nine yards.
Pittsburgh Steelers fan, I'm sure, right?
For sure.
Penguins, hockey's your thing, or no?
Uh, soccer's actually my thing, which, you know, the Riverhounds are pretty good. Um, you've probably never heard of them, but that's okay.
Since I've left, I haven't been... I don't know, I'm not familiar with that team.
They're pretty good.
But yeah, the Penguins, Steelers, Pirates try to get to as many games as I can.
Love Pittsburgh, man.
It's an awesome city and a beautiful place to be in the fall.
That's for sure.
I mean, the change of seasons, that's for sure. Definitely. Well, uh, this show was actually kicked off by a friend of the show, Frederick Pollen, uh, aka Derf-O on GitHub, in issue six-oh-five. We'll link it up in the show notes. But, uh, he said Vault is a lightweight tool written in Go, which we love. Of course, we have a show called Go Time. If you haven't heard of it, go to gotime.fm.
So he asked us to kick off a conversation with you.
You responded back on GitHub.
And next thing you know, we got an email kicking this thing off.
That was about 20 days ago.
And so here we are talking about Vault, which is an interesting tool from HashiCorp.
You've been there for a couple years now. This is an open source tool, uh, from you all that helps manage secrets, you know, in the DevOps space. So that's a lot of fun there: passwords, API keys, lots of fun stuff around that, uh, encryption. But one thing we'd like to do is kick off the show by kind of getting
into the backstory of the person who comes on the show. So we know you work at HashiCorp. We know that you're the director of technical advocacy there, which I'm
not even sure what that is, but I'd love to hear about it from you. But help us with your backstory,
Seth. What got you into the position you're in now? Not so much where you're at, but what's
your story? How did you get into open source? How did you get to HashiCorp? There you go.
Sure. So like I said, I did my undergrad at CMU, which is what brought me to Pittsburgh.
One of the programs I was in, the information systems program there, I was super into open
source. They encouraged all their students to not only consume open source, but also author
open source. So if you peruse my GitHub, one of the first repos I wrote was this Ruby tool that pulled some data from the
international book database. And that was basically all encouraged. And my professors
helped me get that out there. And since then, I kind of fell in love with open source.
After that, I graduated. I worked for a company called Custom Ink. They make t-shirts. But I
worked on the web operations team there. I was introduced to a configuration management tool called Chef.
Worked with some really great and bright people.
Custom Ink has a really interesting technology platform for online ordering.
You can design your t-shirt and everything online.
And that's supported by some of the smartest tech people I've ever worked with.
Building that and maintaining that.
The team I was on was responsible for making sure it was always up and running.
So this is where I got exposed to the DevOps space a little bit more, cloud technologies, configuration management.
And then I left Custom Ink and went to work for Chef, the company that makes Chef.
So I was at Chef for about two years where I worked on a number of the different ecosystem parts.
I wrote the book Learning Chef, published by O'Reilly.
I worked on the community tools like ChefSpec and Test
Kitchen and Berkshelf. And I worked on the release engineering team and the community
teams at Chef as well. So that's where I started getting more into the advocacy,
public speaking, doing meetups and blog posts and podcasts like this.
And then I joined HashiCorp about two and a half years ago. I started as an engineer.
I was employee number four. So when you're a four-person startup, you know, you kind of do a little bit of everything.
Yeah, you're the accountant and the engineer and sales, you know, all of the VPs at once.
Yeah. But as we grew, you know, people started specializing. And, you know, at HashiCorp we're really dedicated to open source. We're an open source company, everything we do is open source, um, and we're really committed to that mission. And part of that is the evangelization and the technical advocacy part. So my role is largely engaging with the community directly through Twitter and GitHub issues and one-on-one at conferences and meetup groups, and then indirectly through things like podcasts and blog posts and videos and
conference talks and trainings. That's funny that you're employee number four,
but not funny, I guess, but just funny happenstance because we've kind of been
chronicling to some degree Mitchell's work, Mitchell Hashimoto, of course, who started
HashiCorp. It wasn't even HashiCorp at first. It was just simply Vagrant. It was this open source
tool and we blogged about it when we were simply just a blog and also a podcast.
But then we had him on the show in 2012, which I believe is roughly when HashiCorp took off.
Or it was in its infancy basically then. And so you're employee number four. That's cool.
Yeah. It's great to see how we've grown. Mitchell and Armon, the two founders of the company, obviously have been around since 2012.
But it was really, even though the company was called HashiCorp, a lot of people thought of it as, quote, the vagrant company or the company that makes vagrant.
And we came out with a ton of open source tools since then, all popular in their own veins, targeting different segments of the market.
We still have Vagrant, still very popular.
We have Packer, a tool for building automated machine images.
Consul, which is a tool for service discovery.
Terraform, which is a tool for provisioning infrastructure.
Vault, which we'll talk about today.
It's a tool for secret management.
And Nomad, which is our distributed application scheduler.
You know, if you're familiar with like Kubernetes or Mesos, it lives in that same space there. And it's very interesting because Vault obviously targets more of the
security space. Consul targets more of the operations. Vagrant's more of the developer
space. So we span this kind of very horizontal line across most organizations.
Yeah. Big fan of the work you've done. It's kind of interesting, too, to think about
that you were first known as the company that created Vagrant, which was essentially the claim to fame.
But since then, you've obviously chiseled it out, as you mentioned, in this horizontal way, all these tools that really help developers be a lot better at what they do in terms of DevOps and provisioning servers and, you know, managing clusters and all that fun stuff.
And now, obviously, into this more of a security space,
anything else we should mention about your backstory before we kind of dive
into a bit more of the details around vault and the fun stuff that you're
doing there?
No, I think that pretty much covers it. I'm Seth Vargo on the internet, so if you need to find anything more about me, it's all over the place.
I guess the only question I'd like to really know about before we move away from your personal story is, like, you know, what was your aha moment? Like, what was it that made you fall in love with, you know, this piece? You mentioned CMU, the information systems program there, and touching open source. What was it? Because this show has always been about lifting up and shining a light into areas of open source that don't get enough recognition and thought process put around. And just kind of curious what it was that hooked you.
Oh, that's always a tough question.
I think seeing how people interact in open source is always more exciting to me than the code being open.
I think there's two definitions of open source.
There's kind of like the purist, you know, Britannica definition, which is the code is publicly available.
And to me, I don't consider that open source.
I consider that freely available code that has a license.
I think what makes open source open source is the community that surrounds it: the issues on GitHub or GitLab, the engagement with the authors and the people who are writing
these tools and using them on a day-to-day basis and contributing features back. And that's really
what keeps me in the game. Anybody can write code and put it on the internet. But what makes a
community and what makes a project successful, an open source project successful, is the engagement
from the engineers and the people who are working on it all the time. The community is a vital part of any open source project. If
you don't have a community, it's not really an open source project, in my opinion. And I think
that's what keeps me going is all of these people who, you know, they get paid by another company,
or they're doing this in their spare time, because it's interesting to them. And that's awesome to
me is like the inspiration that you can put into other people just by making an open source project and building a community around it.
Yeah, I totally agree with that. I mean, we produce another show called Request for Commits.
You can find that at rfc.fm. It's about 11-ish episodes in, 12 if you count the kind of behind
the scenes look at that. And that show there is definitely a humanized look at the way
open source operates. It's beyond the code. It's not simply about the code.
It's about the people. It's about the businesses that get propped up beside and around it, as HashiCorp has done with their open source, and how you actually create
a sustainable ecosystem of communities and people and software
and the fun things people can build because of it.
I totally agree with you on that being what sticks you in there, what keeps you in the game, as you said.
Because what keeps me in the game is the people.
I love that.
You're self-professed as being automation-obsessed.
That's in your own words, of course, because you say that on your awesome website, which is sethvargo.com. But you work at HashiCorp in the advocacy position, so you get to go out and talk quite a bit about Vault.
Before we go into the first break, give us the elevator pitch to Vault.
Let's kind of get an understanding of what we're talking about here before we dive deep into what it is.
Sure.
So Vault is a tool for managing secrets, which is a really broad definition.
What's a secret, right?
Yeah.
Well, and a notepad app also manages secrets.
Right, but not very well.
I have some sticky notes in front of me too, and I can put them in a folder and they're
also managing secrets. But it's a tool that operates in a server client model. So there's
a centralized server that securely manages not only the storage of secrets, but also the
distribution. So I just like to give a really concrete example
that separates Vault from most other tools on the market.
If I need a database password,
like I need to talk to Postgres or Cassandra
or MySQL or something,
the way that you normally do that is you Google,
how do I make a MySQL user?
You copy and paste some code from Stack Overflow,
and then you put that in a text file
or save it somewhere in a password manager. And there's a lot of problems with that in that,
you know, you're copying random code from the internet and there's a human seeing this password
that's actually for an application. It gets reused over and over again. And what Vault does is it
abstracts away that process. So Vault actually has the ability to make an API call to
those databases that I mentioned before and generate that user programmatically.
Then as a human or a machine, I just ask Vault for a MySQL credential. It handles all of the
authentication authorization for me and just gives me back a username and password to connect
to the database. So it really changes the way we generate credentials. And also it puts a lifetime
on them. So if you think about if you're familiar with like DNS or DHCP, like when you get an IP
address, that IP address has a TTL associated with it. You don't get to keep that forever.
The DHCP server says here you can have this IP for, I don't know, eight days. And then after
eight days, you have to tell the DHCP server, hey, I'm still using this IP,
or else it'll get recycled and assigned to another client. It gets added back to the pool.
And Vault behaves very similarly, but with secrets. So you don't have these credentials
that live on forever. They have a lifetime. And Vault manages that lifetime, which reduces the
surface area for an attack. Yeah. Well, that certainly tees up the conversation that we
had for sure. I love that overarching view, especially the use case examples there. So let's pause here, we'll take a break, and when we come back we'll dive deeper into the details of Vault and we'll have some fun. So be right back.
Our friends at Toptal are longtime supporters of this show. If you've ever had to quickly scale your team, you know how hard it is. You have to go through all this hassle of writing job descriptions, adding them to your website, or maybe you have to hire somebody just to go out there and find the candidates for you. That's a lot of work, a ton of work that you don't have to do if you call my friends at Toptal. They do all the work for you to find the right candidates for your positions. Plus, because they have a very rigorous screening process to identify the best, you know you're only getting qualified candidates for your open positions. Head to toptal.com to learn more. That's t-o-p-t-a-l.com. Tell them Adam from The Changelog sent you. If you'd like a more personal introduction, email me, adam at changelog.com. And now back to the show.
All right, we're back with Seth, and we're talking about Vault, managing secrets. Right, you mentioned that you can keep a secret in a notepad, you can keep a secret in any plain text file. That doesn't mean it's actually secure. So obviously, if it's a secret, you want to keep it secure. You want to make sure that you can manage who has access to it, manage how often they have access to it, whether or not that access is revoked.
And I guess what I'm asking here in this overarching theme here of this tool is what other things out there do this?
You got consumer-based maybe applications like 1Password. I know for us here at ChangeLog, we use LastPass.
There's also KeePass, and several other things are actually baked into tools like Chef and whatnot.
So what was the problem that you all faced at HashiCorp to make you even think, well, we should make Vault?
Yeah, that's a great question.
So if you look at consumer-facing products, things you mentioned, 1Password, KeePass, LastPass, we don't actually view those as competitors to Vault.
They certainly help frame the conversation, though, because people use those every day.
Exactly. Yeah. So I think it's helpful to try to understand, like, where do we draw that line?
Vault is designed for like the systems side of things. So computers, databases, machines,
engineers, you know, your tools like 1Password and LastPass are still very useful at an
organizational level.
We don't expect people in the finance department or the marketing department to have to understand how to authenticate to Vault so that they can get a Photoshop API key or something like
that.
That use case still belongs in password managers, particularly any kind of shared password that's
not related to infrastructure belongs in a password manager.
Things like Wi-Fi passwords and things like that should probably be in the
password manager where they're more easily accessible.
Those tools have, you know, graphical user interfaces and mobile applications where you
can very quickly search and access those secrets.
So that's not really Vault's target.
Then you take kind of the flip side of that, which is you have things like, you know, Chef
and Puppet have these notions of encrypted data. Chef calls them data bags, which are like encrypted JSON files. And they're for storing database passwords or API keys that get dropped onto a system
so that that machine or application can communicate with other machines or applications or third-party services.
And that's really where Vault's target market is.
And the difference between Vault and the existing solutions out there is those solutions are
great, but they just do encryption.
So they rely on a human to go create a credential and put it in a text file and run the encryption
process to generate it and then commit it somewhere or save it somewhere.
And then they rely on the machines to decrypt it.
Whereas Vault tries to eliminate that human aspect as much as possible. So by being an HTTP API
server, so everything's an API, everything's just one curl call away, as I like to say,
applications and machines can request credentials without human intervention. So after the Vault is
configured, we don't have to rely on humans to provision these machines and provision all of these secrets that go across them.
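To make that "one curl call away" idea concrete, here's a minimal sketch of an application requesting a dynamic database credential with the official Go client (github.com/hashicorp/vault/api). The mount path mysql/ and the role name readonly are illustrative assumptions, not anything prescribed in the episode:

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/hashicorp/vault/api"
)

func main() {
	// DefaultConfig picks up VAULT_ADDR from the environment.
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	// The token could come from any auth backend; here it's just an env var.
	client.SetToken(os.Getenv("VAULT_TOKEN"))

	// Ask Vault for a dynamic MySQL credential. Vault creates the database
	// user behind the scenes and hands back a short-lived username/password.
	secret, err := client.Logical().Read("mysql/creds/readonly")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("username:", secret.Data["username"])
	fmt.Println("password:", secret.Data["password"])
	// Like a DHCP lease, the credential has a lease ID and a TTL; the app
	// renews it while in use, or lets it expire and be revoked.
	fmt.Println("lease:", secret.LeaseID, "ttl (s):", secret.LeaseDuration)
}
```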
The additional advantage there is every instance of an application, like if you have, say you have a typical blog that has three front-end instances that are fronted by a load balancer, they can each have their own back-end database password because they programmatically generate that at boot time or runtime.
So if an attacker is able to somehow compromise one of those instances, we can revoke exactly one of those credentials and not affect the other two instances, not cause downtime for the people who are trying to come to the blog. Additionally, because there's this one-to-one relationship, if you're
familiar with like, you know, ERDs and database modeling, we have a one-to-one relationship between
credentials in Vault and the requester, the thing that created it. We have this notion of provenance,
which allows us to say, okay, this instance of a machine is compromised. We're just going to
revoke every credential that that machine had access to. And as a result, nothing else is affected in the system. And that's only enabled by the fact that
these are dynamically generated and there's really minimal human involvement. Machine-to-machine
communication is very important in Vault. And that's ultimately what we're doing. We're reducing
the human interaction by adding automation and technology. But it's the same process, the same way that you would, you know, log into Postgres and
type commands to generate a user.
Vault is doing that at an API layer.
So it's still performing those same operations, but it's doing so, you know, fronting that with an HTTP API server.
So you can automate a lot of these things that would normally be manual processes.
It's written in Go, which is an interesting thing there.
Obviously, I mentioned Go time.
So Go is a phenomenal language.
And I think HashiCorp is kind of centered around it.
I think it began mostly in the Ruby space with Vagrant, but has kind of curved into
Go.
Why is Go a language that makes this kind of tool what it is?
Yeah, definitely a great question.
And HashiCorp, you know, almost all of our tools, with the exception of Vagrant, are written in Go. And the reason Go is a great choice for a tool like Vault is the concurrency model. As I said before, Vault is a client-server know, tens of thousands of requests per second if you're in a large, you know, enterprise setup. So we get amazing performance out of Vault.
We've done our own internal benchmarks. I've done my own benchmarks. You know, I can push about
25,000 requests per second through a single Vault instance, which is insane when you think about it,
considering it was running on a T2 micro on Amazon. So really great performance out of the tool.
What we also get out of it is the ability to statically compile across different architectures without the need for a complex build pipeline.
So I'll explain that a little bit more.
Go has native cross compiling.
So what that means is, you know, you write your code, you write Ruby, you write C, you write Python. But then whenever you want to distribute that to, you know, Windows and Mac and Linux and FreeBSD and Raspberry Pis
and Arduinos and Android phones, you often have to have one of those to build. You know, I'll take
Chef as an example. When I worked at Chef, we have a very complex build pipeline where we had,
you know, a Red Hat box and a Debian box and an Ubuntu box, because that was the only way to build packages for those systems.
When we look at Go, Go has this really great build tool chain
where on my local Mac, I can build static binaries for Windows
down to Android phone if I want to in a single command.
And it's all on my Mac.
And it's not relying on cloud services.
I can do it completely without the internet.
It's actually built into the tool itself.
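As a rough illustration of that "single command" (the actual Vault release process covers many more targets than this), cross compiling with the stock Go toolchain is just a matter of setting two environment variables; the output names here are made up:

```
# From a Mac (or anything else), build binaries for other platforms:
GOOS=linux   GOARCH=amd64 go build -o vault_linux_amd64
GOOS=windows GOARCH=amd64 go build -o vault_windows_amd64.exe
GOOS=linux   GOARCH=arm   go build -o vault_linux_arm   # e.g. Raspberry Pi
```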
So when you go to download Vault,
you can download Vault for everything from like OSX.
We support all of the BSDs, FreeBSD, NetBSD,
OpenBSD, Windows, Linux.
And we don't personally publish the mobile stuff,
but you could build it yourself from source.
So Go gives us this really great
build tool chain where a lot of people still run Windows and they have hybrid environments where
some servers are Linux and some are Red Hat and some are Debian. And things are all over the map.
It's very rare that you walk into an organization today that has like, we are 100% the latest
version of Debian. There's a mixture everywhere. So being able to support that without basically any overhead is really one of the great things from a, uh, like post-production
standpoint, you know, anyone at HashiCorp can cut a release of any tool from
their local laptop, which is great because, you know, we reduce single points of failure.
We're not relying on, you know, this huge centralized build system that has a huge backup
of queues. You know, we, we build stuff and we can push it out into production very, very fast.
Most of our tools build in, you know, under two to three minutes.
And that's across the entire fleet with all of the binaries ready to go.
And they're single static binaries.
So you don't have to worry about putting a whole bunch of files in place.
You download it, you unpack it, unarchive it,
and then you put it in your path and it's ready to go.
Let's talk about how Vault is used.
You mentioned removing the human scenario around that.
You still might generate a password, but it's stored in Vault. Can you talk to me a bit
about some of the processes that happen to secure,
for example, a server or launching a new server on Linode
or something like that? How the team operates around Vault being able to not only generate passwords and things like that,
but also like access them?
Is it something that the developers end up using?
Or is it something that it's simply machine based where it's about automation?
Yeah, so that's a great question.
It's actually horizontal across developers, operations, security, and machines.
So developers need access to things like AWS, Google, cloud credentials, right?
You need your IAM keys to be able to do stuff.
And you might need access to be able to read some data from the database so that you can
see the columns and the query and do some analysis of the production database.
And with Vault, you configure that.
So the Vault administrator would configure that developers log into Vault,
and I'll talk about what that process of logging in means in a minute.
They log into Vault, and then they can request those things
that they've been given permission to do.
The security team is obviously focused at a higher level.
They might be managing things like the TLS communication between services.
So if you're in a microservices-oriented architecture and you have two internal apps that talk to each other, you want them to talk over an SSL connection, an HTTPS or a TLS connection. Vault managing those certificates, and making sure those certificates have really short CRLs and really short lifetimes, but are also always valid so that you can communicate between services, allows the security team to sleep at night, because it means even
if an attacker is able to penetrate your outermost firewall, they can't sniff traffic
between the services because all of the traffic between services is encrypted.
We have the operations team. They might be responsible for doing
things like building images or doing maintenance. So they also need access to API keys and databases and the whole nine yards.
So this process of authentication or quote logging into Vault is really important because Vault is a dynamic secret acquisition engine, but the way you dynamically acquire a secret is based off of
your permissions in Vault. So when you log into Vault, there are a number of ways to authenticate
who you are. And let's just stop talking about Vault for one second. If I were to log into a
website, I type in like my username and password and I click sign in. And from there, I don't know
if most of the viewers are familiar with it, but you're assigned like a token or a session ID from that application, that website. Your browser stores that in a cookie,
might be encrypted, it might not be. And that's how you're identified moving forward in the system.
It'd be pretty annoying if every time you went to, you know, Facebook or GitHub, you had to type
your username and password in just to, you know, like a post or comment on an issue. So that's how
you get identified as you move through the system.
And your browser handles that carrying of that session ID along with you.
And it has an expiration.
Also, you may be familiar with things like OAuth,
where you might be able to sign in with Facebook,
sign in with GitHub, sign in with Twitter,
in which case you supply your Twitter login or your Facebook login
to authenticate to a third-party application.
That's something called OmniAuth or OAuth. Vault has a very similar process. It's not OAuth. It's
not a traditional website login, but you can think of them metaphorically as the same.
So as a human, you can authenticate to Vault with GitHub. You supply a GitHub API token.
An administrator of Vault configures Vault and says, hey, anyone
who is a member of this GitHub team has these permissions in Vault.
So you can say anyone who's a member of, say, the engineer's team on the GitHub organization
HashiCorp has read-only access to all of the production data.
We don't actually have that, but I'm just giving you a hypothetical. You could also say
anyone in the accounting team has the ability to SSH because Vault can actually manage SSH for you,
has the ability to SSH into some service that runs some accounting procedure or something like that.
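As a sketch of how an administrator might wire that team-to-permissions mapping up with the Go client: the policy body, the team name engineers, and the assumption that the GitHub auth backend is already enabled are all illustrative here, not taken from the episode.

```go
// Assumes the GitHub auth backend is already enabled (e.g. `vault auth-enable github`)
// and that client is an authenticated *api.Client (as in the earlier sketch) with
// admin-level permissions.

// A minimal policy granting read access to one path (illustrative name and path).
policy := `
path "secret/production/*" {
  policy = "read"
}
`
if err := client.Sys().PutPolicy("read-production", policy); err != nil {
	log.Fatal(err)
}

// Map the GitHub team "engineers" to that policy, so members inherit it on login.
_, err := client.Logical().Write("auth/github/map/teams/engineers",
	map[string]interface{}{"value": "read-production"})
if err != nil {
	log.Fatal(err)
}
```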
There's also the ability to log in with a generic username and password. You can log in
via Okta. I don't know if you're familiar with Okta.
It's a very popular enterprise single sign-on.
Radius, another very popular single sign-on.
LDAP, including Active Directory.
So Vault doesn't actually manage the authentication for you.
It generally delegates the authentication
to a third-party service
that the Vault administrators have configured.
So that's the authentication, typically called AuthN, as in Nancy.
Vault does manage the AuthZ, which I'll talk about in a second.
On the machine side of things, machines also need to authenticate.
If any machine in the Vault cluster could just request credentials,
that wouldn't be very fruitful.
So there's a number of ways for machines to authenticate.
They can authenticate via a token. Maybe they're supplied that token at boot time.
There are ways for machines to authenticate on the cloud providers. So we have authentication
for EC2 instances. And there are ways for applications to authenticate as well.
Things like App ID and AppRole allow applications like your front-end web app to be able to supply
information it's given at boot time, and Vault will in turn validate that and give it back a token so you can think of it as
like a username and password, but instead of human-friendly things, it's all UUIDs and, you know, special characters all over the place, long strings that are really hard to decrypt.
Yeah, very, very high entropy, um, you know, UUIDs. So those are the different ways you authenticate to Vault.
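For the machine and application side, here's a hedged sketch of what an AppRole login could look like with the Go client; the role_id and secret_id would be supplied at boot time, and this reuses the client setup from the earlier sketch:

```go
// appRoleLogin exchanges a role_id/secret_id pair for a Vault token.
// client is an *api.Client from github.com/hashicorp/vault/api.
func appRoleLogin(client *api.Client, roleID, secretID string) (string, error) {
	secret, err := client.Logical().Write("auth/approle/login", map[string]interface{}{
		"role_id":   roleID,
		"secret_id": secretID,
	})
	if err != nil {
		return "", err
	}
	// The returned token is what the application presents on every later request.
	return secret.Auth.ClientToken, nil
}
```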
But it all ends up the same.
So you can think of it as a big funnel
that's all going into this thing called a token.
So whether you put a username and password in
or you log in with GitHub or LDAP,
the thing that happens is you get back a token.
And that's very, very similar to a session ID on a website.
And that token is how you authenticate
moving forward in
the system. So you don't ever supply your username and password again, or your GitHub login again,
it's all that token moving forward. And that token has permissions assigned to it.
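And the human flavor of the same funnel, again as an illustrative sketch: you hand Vault a GitHub personal access token, and what comes back is a Vault token carrying whatever policies the administrator mapped to your team:

```go
// githubLogin authenticates against Vault's GitHub auth backend with a personal
// access token and returns the resulting Vault token (the "session ID" analog).
func githubLogin(client *api.Client, githubToken string) (string, error) {
	secret, err := client.Logical().Write("auth/github/login", map[string]interface{}{
		"token": githubToken,
	})
	if err != nil {
		return "", err
	}
	client.SetToken(secret.Auth.ClientToken) // use this token for everything afterwards
	return secret.Auth.ClientToken, nil
}
```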
And Vault manages those permissions based on the authorization. So auth Z as in zebra,
and it manages all of that internally through a really verbose policy system.
So everything in Vault is based on a policy.
So just because you can authenticate to Vault doesn't mean you're authorized to do anything.
It would be like logging into a website and the very first page you get is access denied.
You were successful in logging in, but you don't actually have permission to see anything in the system.
Right.
What Vault administrators, in collaboration with the security teams, do is they generate policies that map authentications to permissions in the system. So I gave some examples before, like anyone in the engineers team can do certain things in Vault. And that's really where Vault's power is, because as new people join the company, they just get added to the team on GitHub and they automatically inherit permissions for all of these things. As a new person, you know, joins, say you're a large enterprise or a really big company that
has, you know, a massive Active Directory installation, and you have an employee who
moves from, you know, one team to another team, all you do is change their OU in LDAP from team
one to team two, and automatically their permissions are updated throughout the system.
They might lose permission to things they had before and gain permission to new things in the system. And that's all handled out of band because
you're already managing an Active Directory server that's already part of your company culture and
company technology. So we're just integrating with those technologies to give you authorization
to different resources in the system. You mentioned that Vault doesn't do the authentication,
it's handled externally. Is that right? So like LDAP or in this case using a GitHub team or a group or something like that.
Is that correct?
You're using inclusion in a certain external interface or an external application to say you can have access to Vault if that's part of the Vault cluster, as you mentioned.
Yeah, you can think of it as like delegating.
Vault has its own internal mechanisms like username and password is internal to Vault.
But for the ones that most organizations will use, the authentication is delegated to something like LDAP or GitHub.
So, you know, Vault makes a request to LDAP and says, you know, is this username and password valid?
And LDAP says yes or no.
And then as a result, the data that comes back is used to map back onto the policy
and permission. It certainly makes maintaining your GitHub teams a lot more of a security issue,
like using it in this mechanism. Like most, I can't say most, not many people out there are
probably using their GitHub teams or the groups that they have on GitHub in their organizations.
They use them for groups, but it's mostly like external to say who's on our open source
teams or here's who's involved in projects or whatever, or having access to certain repos,
of course.
But I guess that's sort of, like, it's obviously authentication, because you're authenticating to... you're allowing someone to access a repo, or push or commit to a repo, or have read/write access to a repo, for example.
But taking that one step further and using that same feature set to access a server certainly
puts a lot more pressure on making sure that you manage your teams well.
Definitely.
And I'll just plug real quick.
If you're looking for a way to manage all of your GitHub teams and the memberships of
those teams and the repository permissions, one of our other open source tools, Terraform,
actually has the ability to do that. So in a single text file, you can describe what you want
your entire GitHub organization layout to be, who has permission to what repositories, what are your
teams, what are your memberships, and it'll actually just go out to GitHub when you run
Terraform and it'll figure all of that out for you and apply that in a very declarative syntax.
Terraform's fairly new, isn't it?
It's maybe a year old, roughly.
It's hard for me to say because we used Terraform internally before we made it open source.
So the lines between like when it was public and when it wasn't are always a little bit
blurred for me.
I want to say it's over a year old at this point.
You know, we're quickly approaching a 1.0.
It's on 0.8 right now.
And I think 0.9 is slated to come out in the next couple of weeks.
So I think it's a little bit older than that.
And I know we did run a version of it internally before we made it public.
Same thing we do with Consul too, which is another open source tool.
We tend to run these things internally a bit before we even release, like, the v0.1 to the open source community.
This feature with GitHub though is just one of many things it does.
Correct.
Okay.
And it's an installable application.
So you got access to like a Mac version,
FreeBSD, as you mentioned, Linux,
OpenBSD, Solaris, Windows.
You can use it on all those.
Yep.
All of the things.
And there are others; we don't compile them because, like, as a company, we don't officially support them,
but there are people who are running Vault on things like Raspberry Pis and Arduino boards.
Certainly, you got to go back and listen to that episode we just recently published.
I think I mentioned in the pre-show was episode number 237 for secure software, because that would certainly apply there.
Because if you're using it on a Raspberry Pi, you might want to make sure that it's a reproducible build from the original Terraform source code. But, uh, enough, a little tangent there. Certainly, we'll link up Terraform in the show notes. But getting back to Vault, I'm kind of curious, as you mentioned, that this tool is for developers, it's for machines, it's for the security personnel that are, you know, securing microservices or access between certain applications. Curious about the interface. And from what I can tell so far, it's mainly a command-line interface, or sort of focused on a developer point of view.
What's the current state of like an interface in general?
I know you have an API for machines, obviously, but what's the common way for people to work
with Vault and manage, you know, machine clusters, manage credentials, manage and revoke
access? How does that work primarily? What's the GUI for that? What's the interface for that?
Yeah. So the primary interface for the open source tool is the command line and the API.
And the command line itself is actually just a very thin wrapper around the API. It's just
going to do some basic text formatting and some parsing of the output and error handling
for you. You know, at HashiCorp we're an open source company, so we have to make money somehow, and one of the ways we make money is offering these enterprise product versions and professional product versions. So Vault Pro, or Vault Professional, and Vault Enterprise include a kind of web-based user interface where you can, you know, list secrets, insert secrets, generate credentials, interact with the various secret backends, you know, I mentioned MySQL and Postgres, and kind of visualize the cluster. It all uses the open source API, so there's nothing that really stops you from building that internally, but why would you?
Right. Yeah. Well, that's one of the things that we kind of offer as a paid product, you know, to help us obviously fund the work we do in open source.
That's cool.
I mean, it's like Vault++.
So if you've used Vault in its open source state
where you're primarily focused on, like, a command-line interface
for interfacing with Vault,
accessing and revoking machines and passwords
and key phrases and all that stuff,
but you want to go one step further to a web GUI
or even further to something that supports better enterprise scenarios,
that's HashiCorp's business model, right?
That's what Vagrant was built on.
It was built on this open source tool that then,
I think, Mitchell got a call from VMware
or wanted to do something with VMware,
and they're like, well, that's something I can actually release
as a paid plugin or something like that and actually make money from it so I can sustain this and build
this. And if he hadn't made that choice or learned that early on in the early days of HashiCorp, then
we wouldn't be talking right now about Vault. Exactly. And I think if you look at the market,
there's two open source business models that are popular. The first is what we do at HashiCorp, which is open core. So we build as much as we can in the open source community.
And then the features that we think appeal to like enterprise markets are our enterprise features.
The other is like the paid support model where everything's open source,
but you have to pay for support anytime you want help.
And I've seen this pattern emerge a number of times.
And, you know, we had conversations internally where it's like, okay, that kind of works,
but it actually encourages you to make software that's difficult to use because your entire
baseline is based off of the fact that people need help. Right. And, you know, one of the things we
get praise from a lot in the community is how good our documentation is and how thorough it is.
And I feel like if we had taken the support model, you know, you're almost de-incentivized to make good documentation
because you need people to reach out to you to buy support contracts to fund your work.
Yeah. Because if you're doing good docs, then why would they call you or need your help?
Exactly. So like what we do for Vault Enterprise is, like, if you need a UI, if you want like a foolproof backup and restore strategy, if you want integration, like you want the 24/7 SLA, you want integration with the hardware security module, all of that is what we consider, you know, quote-unquote enterprise features. But everything else is open source. Everything I've talked about before is open source, and there's a ton of other features that are, you know, available straight up in the open source offering. Um, you don't even have to, like, give us your email to download it.
The code's on GitHub.
You can download the compiled binary straight from the website.
I love the idea of open core.
I think I've heard that term before, but for some reason it's like, like blaring at me
right now, just this idea of open core.
And now having this conversation with you and having the history we have with what is
now HashiCorp, like we've kind of chronicled to some degree your company and your open source offerings over the years.
And knowing that there's a base part of each product you deliver that has an open core availability that doesn't require you to give them an email address or then to give you an email address to access it.
It's open source.
They can contribute.
They can list PRs and star it and
watch it on GitHub and participate and give feedback. That's an interesting concept. I mean,
has that been, how much, I guess, how much have you all talked about this idea of open core? Is
this something new I've just finally heard about, or is this something you guys have coined?
I mean, we've thrown the term around a lot. I'm not sure who coined it or where it came from,
but it's really ingrained in our
principles. We have this thing called the Tao of HashiCorp, which lists the seven pillars of
the company. Open source is one of the things that we built. It's why we're here. The community,
the contributors, it's the reason we're here. So it is a core pillar of HashiCorp as a company.
And we wouldn't be here
without it. And I think more and more, we're seeing traditional software vendors, people like
Microsoft even starting to embrace this more open core model, because that's what organizations are
looking for. Like I said before, open source isn't just the code, it's the people. And I think
larger organizations are slowly starting
to realize that by betting on open source, not only do you have free access to the code,
but you're also betting on, you know, highly available engineers, you're betting on,
you know, people who like to engage with you, as opposed to people who are forced to work in a
support queue, and they want nothing to do with you. And you get, you know, faster bug fixes,
higher prioritization, you get to see the roadmap, It's very public. All of the work is being done out in the open.
And I think that's what's most exciting to me.
Yeah, there's certainly something about the open source way, so to speak. As you mentioned, it is very much focused on the people behind the code, not so much just the code itself. And applying those principles to lots of different things, doing things in the open, doing things with, uh, transparency, inclusiveness, you know, including people in the process even if they're not part of the inside team, or a co-founder of the company, or, you know, the fourth person to join the company to make it what it is today. They still have, maybe not the same level of a voice, but they still have a voice in the future and where things go. And that, to me, that's a fantastic recipe that obviously just gets results,
which is why open source is eating the world.
But let's curveball back into the subject of Vault.
I want to go over some of the key features.
So you've got secure secret storage.
That's one of the features.
So I want to go through kind of each of these features
and kind of break some of them down, if that makes sense.
So stop me wherever we need to.
But secure secret storage, dynamic secrets, data encryption. We talked a bit about some of these. Leasing, renewal, obviously, and revocation, where you can actually revoke a secret or access to something, as we talked about
before. What I'm really interested in is one, it stores some secrets, but then it also can
actually encrypt data and store it elsewhere. Can we talk about that one a bit, or should we kind of go in order,
do you think, to these features?
Yeah, we can definitely chat about that first.
That one seems to stand out most to me, because it seems like,
you know, you mentioned earlier the human element,
which is why I thought that feature would be first to talk about,
even though it's out of order, is because, you know,
I can take this key and throw it through an encryption
and then put it on the server and then Vault is managing it.
But in this case, it's actually handling encryption for you too.
So it's probably making some wise choices for you on which algorithm to use,
which are less likely to have been hacked or decrypted recently.
So maybe that's a good place to start.
Definitely.
So in terms of storage, all of the data in Vault is encrypted
with 256-bit AES-CBC encryption in transit and at rest. So we rely on TLS/SSL on the front,
and then all of the data is encrypted when it's written to the file system or to wherever you're
persisting the data, the durable storage. That's where all the static data is written.
Vault also has this backend, which you can think of a backend as like a secret plugin. It's a plugin that either
generates secrets or stores them. One of those backends is called Transit, T-R-A-N-S-I-T.
And the reason it's called Transit is that it provides encryption in transit. So you can think
of the piece of data as kind of moving
on a vehicle through an encryption pipeline, and it comes out encrypted on the other end.
It's still the same vehicle, it's just in a different format. And that's why we call it
transit. The difference between, say, the generic secret backend (where you just store data in Vault, you say, hey Vault, here's password 12345, please save this for me) and the transit backend is where the data lives.
When you give Vault a generic secret, it encrypts it and stores it in its own backend.
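For the generic backend half of that comparison, here's a minimal sketch; the path secret/myapp and the field name are made up for illustration, and client is the *api.Client from the earlier sketches:

```go
// Store a static secret in the generic backend mounted at secret/.
// Vault encrypts it before writing it to its own storage.
_, err := client.Logical().Write("secret/myapp", map[string]interface{}{
	"password": "12345",
})
if err != nil {
	log.Fatal(err)
}

// Read it back later; authorized callers get the decrypted value.
secret, err := client.Logical().Read("secret/myapp")
if err != nil {
	log.Fatal(err)
}
fmt.Println(secret.Data["password"])
```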
When you use the transit encryption service, what you're really getting is encryption as a service.
You're giving Vault plain text values, and Vault gives you back encrypted data, never storing any of that data.
So this is really great if you have large things like PDFs or large volumes of things like multiple rows in a database where you're encrypting social security numbers or passports
or credit cards, and you want to encrypt that data, but you don't want to store it anywhere.
You actually want to encrypt it and store it right back in the database.
Then whenever you want to decrypt that data, you give it back to Vault.
Vault makes sure that you're authorized and authenticated to be able to decrypt that data
and then decrypts it and gives you back the plain text value.
And all of this happens in transit.
So none of that data is ever stored on the Vault server.
If you're familiar with any of the major cloud providers like Amazon, Google, Azure, they all have this notion of a key management service where they will provide encryption as a service for you.
And Vault does that with a ton of additional features, not to mention all of the other secret backends, like the dynamic secret acquisition from Postgres and MySQL that we talked about earlier. The advantage of using the transit backend
is that it has built-in support for key rotation,
automatic key upgrading,
and the ability to specify a list of cipher suites
that you want.
So by default, when you post something
to a particular endpoint in the transit backend in Vault,
it'll generate an encryption key,
very high entropy.
Vault has been audited a number of times by different agencies, so we know it's cryptographically
secure. It'll generate a high entropy key. It'll manage that key for you, never disclosing it to
the application. So even if an attacker is able to compromise your database, for example,
they don't have the encryption key. They have to compromise Vault in order to get that encryption
key. And it's actually possible to tell Vault to never divulge that encryption key. So it just doesn't ever give
it away. But if you're like FIPS compliant or HIPAA compliant or any of the PCI compliances
out there, you have to rotate encryption keys on a regular basis. I think it's like every three
months is like the minimum. You have to be able to prove that you've rotated them. You have to
be able to prove that you've upgraded and audited the data upgrade.
And the transit backend has built-in support for all of this, and it just does it automatically with one API call.
So if you're ready to rotate, you just rotate the key, and Vault does an automatic, completely online upgrade.
So as new data comes in, it uses the new key.
Any old data that's decrypted, it decrypts it with the old key and re-encrypts it with the new key when you get it back.
And then there's a built-in process for re-encrypting all of the data. We call it re-wrapping. So if you want to force upgrade all of the keys to the newest version,
you can run one command and give it all of the data you want to re-encrypt, and it'll give you
back the new encrypted data. So it really supports this really broad set of features
that allow you to do what we call encryption as a service.
So instead of rolling your own encryption
and getting the cipher suite and everything right,
we've done all of that heavy lifting
and we give you an API.
All you do is give data to the API
and it gives you a response back.
Very similar to how you'd interact with like the GitHub API
or the Facebook Graph API. It's just a JSON request and a JSON response.
So in terms of the integrations, you know, there are client libraries that exist for
Ruby and Python and Node and Go. All of the major programming languages have the ability
to interact with Vault. And if they don't, it's just an HTTP request. So you can bind to curl or do whatever you need to do to make that HTTP request.
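Here's a hedged Go sketch of that request/response flow against the transit backend. The key name orders is an assumption, it presumes transit is mounted at its default path, and it reuses the client and imports from the earlier sketches (plus encoding/base64):

```go
// Encryption as a service: Vault returns ciphertext but never stores the data.
plaintext := base64.StdEncoding.EncodeToString([]byte("4111-1111-1111-1111"))
enc, err := client.Logical().Write("transit/encrypt/orders", map[string]interface{}{
	"plaintext": plaintext,
})
if err != nil {
	log.Fatal(err)
}
ciphertext := enc.Data["ciphertext"].(string) // e.g. "vault:v1:..."

// Later: decrypt. Vault checks that this token is authorized to use the key.
dec, err := client.Logical().Write("transit/decrypt/orders", map[string]interface{}{
	"ciphertext": ciphertext,
})
if err != nil {
	log.Fatal(err)
}
raw, _ := base64.StdEncoding.DecodeString(dec.Data["plaintext"].(string))
fmt.Println(string(raw))

// Rotate the key: new writes use the new version, old ciphertext stays readable.
if _, err := client.Logical().Write("transit/keys/orders/rotate", nil); err != nil {
	log.Fatal(err)
}

// Re-wrap existing ciphertext to the newest key version without exposing plaintext.
rewrapped, err := client.Logical().Write("transit/rewrap/orders", map[string]interface{}{
	"ciphertext": ciphertext,
})
if err != nil {
	log.Fatal(err)
}
fmt.Println(rewrapped.Data["ciphertext"])
```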
It seems like this was really like unified, you know, like no matter what angle you need to store secrets, right?
Whether you're the developer, whether you're the machine, whether you're on the security team, you need to do things like revocation and stuff like that. It seems like, to me, you've built this perfect tool for literally managing secrets and being able to do it in a way.
And obviously, scale, too.
So if you need to move beyond the open source tool, then you can.
You have other options for the pro version, the enterprise version, obviously, that you can upgrade to.
But it seems like this is the best unified way to encrypt, obviously, and then actually authenticate or get access to certain
servers or certain databases or encrypt data, things like that. I don't know.
Is there anything else out there like this? I know it's built into other tools, but
this one way, unified way, as you mentioned, this
tool has been audited by external agencies to confirm its security.
Is there anything else out there like this?
So we haven't been able to find one.
I guess one of the things that's important to mention is like at HashiCorp, we build solutions.
We're builders, we're engineers at heart.
So when we identify a problem, we have the problem too. One of the reasons we built Vault is because one of our enterprise
offerings involves you giving us cloud credentials like Amazon keys and Linode keys.
So if you give us your cloud credentials, we have to have a really good way and a really good
promise that they're secure. And that was kind of how Vault started coming up. And we did a
huge exploration of the space. And you're right, there are tools that do some of this. But Vault's goal and its
tagline is a tool for managing secrets. And it's intentionally very generic because Vault's
surface area is effectively infinite. If it has an API to generate secrets, Vault can generate
secrets for it. And that's really what we're shooting for is it should be the one-stop shop
for secrets in your organization. It doesn't matter how big or small you are.
If you're a two- to three-person startup, Vault can help you.
If you're a massive enterprise, Vault can help you too.
You might be using different pieces or different components, different backends, different plugins.
But ultimately, the goal is the same.
You want this centralized tool to manage credentials and secrets for you.
And by putting everything in this centralized system, you reduce secret sprawl, you reduce, you know, the attack surface. If you know, you're in a
situation where you have an intruder on your network, or someone's downloading your database,
you have separation of concerns for all of this data. And you can quickly revoke things in the
event that a system is compromised. And that's what's great. The corollary to that is,
I'm sure there are a lot of listeners who do work at, you know, a startup or medium-sized companies.
And I just like to throw out this hypothetical question, which is,
how many of you have a production credential on your local laptop right now? You know,
I know some of you might be driving in a car. My hand's up.
Yeah, you might be driving.
It's actually an old server, but it's still... I've done it before.
Exactly. I've done it too, I'm not pointing the finger by any means.
And it's in a plain text file. It's actually a Markdown file, because it was instructions back to me on how to get access. So it's like I just told whomever could, you know, circumvent my computer and get access to it, how to get access to my stuff.
Exactly. And that's super common. Definitely, like, not pointing the finger, I've done it. But what Vault does is, when you adopt Vault early on, and again, it's open source, so there's basically zero risk in doing so, you eliminate that, because the secrets have a lifetime. So if you're not using it, it's going to expire, at which point, so what? It's on your laptop, but it's not valid anymore. But also you have the centralized source of an audit log. So you can see, you know, which credentials are being used for what and where and how, and you can try to understand
what the rollout effect is. So like, if I revoke this credential, what happens, what breaks in the
system, what services are depending on this? And that's, that's really important. You know,
the other thing I ask is every once in a while, I have people who are like, nope, no production
credentials on my laptop. And then I say, well, can you clone the production application from GitHub or GitLab?
Can you download production code and can you commit to production code?
Like, yeah, I have an SSH key.
I'm like, isn't that a production credential then?
Well, they're like, oh, it's just for GitHub.
But you have access to code that runs in production.
So you have a production credential.
And Vault can manage that too.
Vault manages SSH keys.
It can manage the whole lifecycle there.
And that's what's really awesome is it is truly like the one-stop shop for secrets.
And it doesn't matter whether you're a massive enterprise or a tiny little startup.
That's definitely a good place to pause, because when we come back from this break, I want to kind of get into those first steps of getting started, you know, maybe for the smaller teams that are just rolling out, like what you might prescribe as a good initial rollout as a test.
So let's take this break.
When we come back, we'll kind of dive into that.
I'll go from there.
We'll be right back.
I've got good news for you this Friday, February 24th.
We're launching a new show called JS Party.
It's a live celebration of JavaScript and the web every Friday at 3 p.m. Eastern.
The show is hosted by Michael Rogers, Alex Sexton, and Rachel White. Head to thechangelog.com slash jsparty to subscribe.
And here's a quick teaser of what's to come. Take a listen.
Let's talk a bit about progressive enhancement.
It's crazy that this has been, I mean, how long, we've been talking about progressive
enhancement for more than 10 years, right?
And it's just been this general good thing that everybody should be doing that everybody
talks about at conferences and then people go away and some of them do it and some of
them don't.
Progressive enhancement now is the same concept, but fundamentally different than what people
were talking about.
Or maybe not fundamentally, but in practice, very different than what people were talking
about 10 years ago.
Yeah.
I think the thing that people have been getting mad about in the past month is more accessibility
focused than any other kind of progressive enhancement stuff
from what I've seen.
Yeah, it's not even about speed right now. It's just like, can everyone who is using the internet use your site? Well, I guess that would be speed, depending on where your internet is.
Perhaps we're on slightly different Twitters, then.
Yeah. My Twitter is, if you have Nolan Lawson and Alex Russell in your feed, it's entirely speed related.
Like JavaScript first applications that require JavaScript to run before you can see things.
Really?
Versus server rendered with very fast interactivity.
Sam Saccone is another person really on there.
The stuff that I've been seeing lately is making tools for the, you know, "you might not need JavaScript" stuff, but then not making it accessible for everyone to view, but using JavaScript to display the page somehow. I don't know.
Let's rewind just a little bit and unpack this.
So that if anybody's not on our exact Twitter feeds, they can figure out what we're talking
about.
This goes to show how much people confirm their own biases.
All right. That was Michael Rogers, Alex Sexton, and Rachel White in our upcoming show JS Party. February 24th is the first live show. Hit thechangelog.com slash jsparty, and now back to the show.
All right, we're back with Seth, and we're talking about Vault and a more important subject here, which is, like, they've seen the light: they understand that they've got production credentials, they've got insecure passwords on their local laptops.
I'm talking to the developers out there.
They've got these things and they're realizing, man, I am in an unsecure situation with my startup, my business, my employer.
We are at risk, so to speak.
And here comes Vault.
We've talked about it.
We talked about all the different features of it that truly make sense.
And it can scale with you if you need it to scale from pro to enterprise.
But maybe the first question is getting started.
What are some of the first steps a small team, or a team that's brand new to this, can take to begin using Vault?
Where are the best first steps to take?
Yeah, definitely a great question.
So let's say you're skeptical, right?
I'm on the show.
I'm trying to get you to use this open source project,
but I'm basically a salesperson.
You're pretty sketchy.
Yeah, yeah, you know, I do some sketch diagrams.
I have some hanging on the wall.
But, you know, let's say you don't buy what I'm saying.
You want to try it out for yourself.
So zero risk.
You can launch an in-the-cloud Vault environment
with an interactive tutorial right from Vault's website.
Vaultproject.io, there's a big old button, really hard to miss, says launch interactive tutorial.
If you're not driving, you can actually pull out your phone and do it now.
It works on mobile, but it'll walk you through some really common Vault commands.
And it actually launches your own Vault instance in the background.
You know, HashiCorp funds that so that people can try out Vault. Zero risk, no installation. All you need is a modern browser with some
JavaScript enabled. So once you've bought into the idea, because you will fall in love with it,
trust me, the next thing you want to do is actually download it. And before you put it
in production, you're going to want to download it locally. So I run on a Mac locally. I know
that's very common for software engineers. So you can head on over to that same website,
vaultproject.io, download the Mac binary. You're going to want the 64-bit one
because if you have a 32-bit Mac, I feel bad for you. Download the 64-bit one. That's going to
give you a single static binary that you can run. It's just vault. That single static binary is
the client and the server. It can run as the client or the server or both. If you want to
ever try something out locally, and I still do this myself, there are times where I forget an API or I want to see what a response
is. Most of our tools have this thing called a dash dev flag, DEV, which is short for development.
So locally, you can just spin up a Vault server by running vault server -dev. That'll give you a fully ready-to-go, ready-to-accept-requests Vault server that you can run some sample commands against.
The same thing that you were running in the cloud,
but it's all local on your laptop.
And you can really abuse that.
You can do a lot more to that
than you can do to the cloud instance.
You can mess around with different configuration.
You can see all the different log output.
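If you'd rather poke at that dev server from code instead of the CLI, here's a minimal sketch with the same official Go client. The address is the dev server's default, and the token is a placeholder for whatever root token vault server -dev prints when it starts.

```go
// Talking to a local `vault server -dev` instance with the official Go client.
// The address is the dev server default; the token below is a placeholder for
// the root token the dev server prints on startup.
package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	config := vault.DefaultConfig()
	config.Address = "http://127.0.0.1:8200"

	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatal(err)
	}
	client.SetToken("root-token-from-dev-output")

	// Write a secret into the generic key/value backend mounted at secret/.
	if _, err := client.Logical().Write("secret/hello", map[string]interface{}{
		"value": "world",
	}); err != nil {
		log.Fatal(err)
	}

	// Read it back.
	secret, err := client.Logical().Read("secret/hello")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(secret.Data["value"]) // => world
}
```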
Then once you're satisfied with that,
you're ready to put it in production.
And from there, there's actually a number of techniques.
You have to make a few decisions though. If you're a tiny startup, it might be
best just to start with a single vault instance. So don't worry about high availability, especially
if this is a beta test. Just get it in production. Make sure you're comfortable using it. It's
senseless to invest a ton of time and energy into a tool that you're not 100% sold on yet.
So if you're in a cloud environment or bare metal,
spin up a server, use your automation tooling,
Chef, Puppet, Bash, typing things in the terminal,
and install it.
The tools are really easy to install.
We have a service called the HashiCorp Releases service.
If you head on over to releases.hashicorp.com,
you can browse any version of any product
we've ever published.
You can download it right to a machine,
and it's a single static binary,
the same that you downloaded for your Mac earlier.
So you put it on the server and then you run it.
Maybe you use systemd or init or upstart or whatever your init system is.
You have it running on the system.
And from there, you can hit it publicly, right?
You might give it a public IP
or put it within your private subnet
and you can start addressing it from other applications
and really see how it behaves.
Give it a trial, but you're not ready to invest yet.
You're just kind of really feeling the waters out
in a production or a staging environment.
Once you're sold on it,
you're going to want to automate the process
of standing up a vault cluster,
and you're going to want to move
to a highly available environment.
And again, this is all completely open source,
so you don't need to pay for Pro to have high availability.
High availability is built right into Vault.
What you do is you spin up multiple Vault servers.
And Terraform is a great tool for doing this.
You spin up these Vault servers.
You connect them to each other.
They all talk to each other.
They do a leader election algorithm.
One of them becomes the leader.
The others go into standby mode.
And from there, you can make requests against Vault.
If one of those servers goes down,
you know, there's an outage or someone pulls the plug or it just dies, the other two will pick up
the work. So you run in that high availability mode, which is something that's really important
as this becomes a crucial piece of, you know, the secret management and the credential management
in the organization.
So that's the best way to get started. That's kind of my recommended path: you should try it out on the internet before you download it. You should download it locally before you put it on a staging server. And then once you're ready to go into production, we have a couple of guides that we've published that are kind of like best architecture practices.
Once it's running in the staging environment, so that step before production, I think it's important that the organization kind of takes a step back and thinks about how we're going to manage this thing.
Who's going to be responsible for it? Who's going to be setting up a policy? What's the backup
strategy? Because that's ultimately going to be important. Systems fail. It happens all the time.
So how do we plan for that? And what's our recovery strategy? Vault internals are also
going to become important at that point. I've talked a lot about them here on the podcast, but
we have a whole section of the documentation online that's devoted to vault internals, the architecture, the high availability
model, the security model, the threat model, how you do telemetry, like are you pushing things into
graphite and monitoring, the process of key rotation, online upgrades, the whole nine yards.
Those are things that whoever is responsible for managing the vault cluster, they're going to want
to familiarize themselves with those things before you move into a production scenario.
Those are details that if you're just getting started,
you don't need.
You can read them if you want to,
but they're kind of dry.
They're very academic in nature,
but they are crucial if you're going to run this in production.
You should really understand the internals
and the architecture and how it's working
so that if something does go wrong,
you have a better picture of how information
is flowing through the system.
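For whoever ends up owning the cluster, a lot of that operational picture is also exposed over the API. Here's a rough sketch of pulling seal status and leadership into your monitoring with the Go client; it's one possible approach, not a prescription from the episode.

```go
// A rough operational check: is this Vault sealed, how far along is an
// unseal that's in progress, and which node is the active one in HA?
package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	status, err := client.Sys().SealStatus()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("sealed=%v progress=%d/%d shares=%d\n",
		status.Sealed, status.Progress, status.T, status.N)

	leader, err := client.Sys().Leader()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("ha=%v is_self=%v leader=%s\n",
		leader.HAEnabled, leader.IsSelf, leader.LeaderAddress)
}
```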
This is certainly standing on the shoulders of the security giants.
I mean, to me, it seems like hearing you share the getting started story and best practices
and paths to getting it into a staging environment and testing it and then putting the right
person in place to do these things.
And obviously all the documentation available to someone for Vault is like,
you've done all the thinking
about how best to secure infrastructure
and how best to revoke
and provide access to that infrastructure.
And to me, that's like,
you can't put value on that, right?
And that's immensely valuable.
But I do have one question for you
that I thought was kind of a ha-ha thing.
So to take it like that,
when I was playing with the interactive tool at vaultproject.io, as you mentioned, it told me that I couldn't access my vault unless I had key one, which I believe is that initial root token. So, like, if I lost key one, I could not unseal, as the terminology is. And I want to talk a little bit about the lexicon, so remind me about that if we forget. But if I lost key one, where do I store key one to get access to Vault? Do I store that somewhere else, like 1Password? Where do I keep these keys so that I'm able to unseal my vault?
Okay, so that brings up a bit of a broader topic. So we'll have to take a step
back for a second we talked about these things called tokens, and those tokens are not what you're talking about. What you're talking about right
now is this thing called an unseal key. U-N-S-E-A-L, unseal key. So when Vault is started,
aside from that development mode that I mentioned earlier, when a Vault server is started,
it comes up in what is called an uninitialized and sealed state. Initialization is easy to describe. That's just the process of
preparing the storage backend. So if I'm storing my data in the file system, initialization is the
process by which we do, like, a mkdir -p to set up the file system to actually receive the encrypted
data. Sealing and unsealing is a little bit more challenging to explain.
One of the pillars, one of the principles, is that no one person has complete access to the system.
The reason that's challenging is if you think about a physical bank vault, like you walk
into your bank around the corner, whoever has the key to the vault can go inside and
take money out. If you go to like a larger bank, sometimes the manager and another employee both have to insert
their key and turn them at the same time in order to open the vault. So we took that same principle
a step further, did a lot of research, and we discovered this thing called Shamir's secret
sharing algorithm, which is actually borrowed from mathematics. And we can link to this in
the show notes for those who are more interested. But Shamir's secret sharing algorithm allows you to generate a key,
a string, let's say for easy argument, and split that string into a number of parts.
So if I took a string and I cut it into five pieces, I could distribute that string to five
people such that those five people can come back together and regenerate the string.
So that's, that's easy. We can think about that. We can kind of model that in our head:
if I take a string and I give a piece to five people, you know, they can come together in the same order and
generate that string. What Shamir's secret sharing algorithm does is it allows us to take that a
step further. And it says, okay, let's cut the string into five pieces. But mathematically,
any three of them can come together in any order
to regenerate that same holistic string. So it's kind of hard to think about in a physical world.
You know, I can't take a string, cut it into five pieces, and then regenerate that same string.
But mathematically, it's possible, where we can take a string, we can split it into substrings,
give it out to a certain number of people, configurable. We'll call that N, which is the number of people, the number of shares.
And then we have this thing called a threshold, which is the minimum required to come back together.
We'll call that T, which is the minimum that have to regenerate that original key.
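To make the N-shares, T-threshold idea concrete, here's a toy sketch of Shamir's scheme over a small prime field. This is emphatically not Vault's implementation (Vault uses a hardened library and different field arithmetic); it only illustrates that any T of N shares are enough to recover the secret.

```go
// A toy illustration of Shamir's secret sharing over a small prime field.
// This is NOT Vault's implementation; it only shows the idea that any T of
// N shares reconstruct the secret.
package main

import (
	"fmt"
	"math/rand"
)

const p = 7919 // small prime modulus; real implementations use proper fields

// split encodes secret s into n shares, any t of which can reconstruct it.
// The secret is the constant term of a random degree t-1 polynomial; each
// share is a point (x, f(x)) on that polynomial.
func split(s, n, t int) [][2]int {
	coeffs := make([]int, t)
	coeffs[0] = s
	for i := 1; i < t; i++ {
		coeffs[i] = rand.Intn(p)
	}
	shares := make([][2]int, 0, n)
	for x := 1; x <= n; x++ {
		y := 0
		for i := t - 1; i >= 0; i-- { // evaluate f(x) with Horner's rule
			y = (y*x + coeffs[i]) % p
		}
		shares = append(shares, [2]int{x, y})
	}
	return shares
}

// combine recovers the secret f(0) from t shares by Lagrange interpolation.
func combine(shares [][2]int) int {
	s := 0
	for j, sj := range shares {
		num, den := 1, 1
		for m, sm := range shares {
			if m == j {
				continue
			}
			num = num * (-sm[0]) % p
			den = den * (sj[0] - sm[0]) % p
		}
		term := sj[1] * ((num%p + p) % p) % p
		term = term * modpow((den%p+p)%p, p-2) % p // divide via modular inverse
		s = (s + term) % p
	}
	return s
}

func modpow(b, e int) int {
	r := 1
	b %= p
	for e > 0 {
		if e&1 == 1 {
			r = r * b % p
		}
		b = b * b % p
		e >>= 1
	}
	return r
}

func main() {
	shares := split(1234, 5, 3)       // N=5 shares, threshold T=3
	fmt.Println(combine(shares[:3]))  // any three shares => 1234
	fmt.Println(combine(shares[2:5])) // a different three also => 1234
}
```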
So all of that is a bunch of background into explaining when Vault starts up, there's two keys. There's the encryption key,
which is what the data is actually encrypted with, and that supports rotation and the whole nine
yards. Then there's this thing called the master key. The master key is what encrypts the encryption
key because the encryption key lives on disk, but it can't be encrypted with itself. That would be
senseless. So instead, it's encrypted with the master key. That master key never actually exists beyond memory. So when the vault is initialized,
the master key is created and it's split out into a configurable number of shares.
Those shares are then distributed to people in the organization, and they have to come together
to generate that master key so that they can decrypt the encryption key in order to unseal
the vault.
So it's a little bit complicated
and I can share a graphic
that we might be able to link to in the show notes
that makes this a little bit clearer.
But we have the separation of concerns
where no one person has complete access of the system.
So in order to unseal the vault,
which is just a one-time operation
whenever the vault first boots,
it remains unsealed unless it dies or someone comes along and seals it.
We need more than one person to come along and put their unseal key in.
Kind of like we need more than one person at the bank to insert their key to unseal the vault.
We need more than one person to supply their key to unseal the vault.
And that prevents one employee from going totally rogue.
So if you have a bad employee who just wants to disclose all of your information, you can seal the vault, which is what we call the break glass procedure.
It's like, in case of emergency, break the glass.
But if that one person could just unseal the vault, that would be senseless.
So we rely on the system of checks and balances where multiple people have to interact with the system to perform these really sensitive operations.
And that's based on that Shamir secret sharing algorithm, which I used five and three as an example, because those are the default numbers, but they're totally customizable. We have some
companies that do upwards of 20 key shares. So both the number of key shares and the threshold
of people who have to come together are configurable. So that step there of unsealing
it, I believe is step three in this interactive
demo, where there was just kind of scary language, like, Vault does not store the master key, as you mentioned there, without at least one key. They give you key one, one single key, because that's what you said to do in the init process, which asked how many key shares; you said equals one in this demo. You could have said five or three, as you mentioned in your example there.
But it kind of scared me when it said your Vault will remain permanently sealed.
And I was just thinking, where do I keep these keys?
So sure, I can break them up into shares and give you one, give Jared one, give other people one, you know, people that are maybe part of my organization, to have access to Vault to seal or unseal it.
But I was thinking, so this primary key, since in this case, in the init process, there's one single share, there's only one key.
So where do I keep that?
Yeah, that's a great question.
Because you don't want me to put it in that MD file I told you about earlier with instructions
on how to unseal and seal vault, right?
You want me to put that somewhere more safe.
So that was really meant to be a ha-ha thing, but it was a great, complex, deep example, which I think was very valuable. I was really meaning it to be a slight joke, but you didn't laugh that much.
No, most definitely a sticky note. Definitely in the place where you want it, you know, for sure.
Right.
I mean, I do think that's a good question. So, you know, it really depends on the size of your tinfoil hat. For those of you that aren't familiar, people who are really security conscious are often
referred to as tinfoil hats because the government can read their thoughts otherwise.
Right, right.
So there are some people who are very security conscious who will actually
print out that key on a physical piece of paper and place it in a physical safe.
The justification there is, like, if an EMP came by, it would destroy a digitally stored key, and then they wouldn't be able to help contribute to an unsealing process.
Other people are like,
I'm going to put it on a thumb drive.
So it's on a completely separate medium
or something like a YubiKey.
You know, that's personally what two people at HashiCorp do.
They mount an encrypted DMG volume on a thumb drive
that they bring with
them to places, but they only plug it into their laptop whenever they need to get their
unseal key.
Me personally, I have GPG encrypted my unseal key and I store the GPG encrypted version
in 1Password.
So we use 1Password internally at HashiCorp.
Again, because as I said earlier at the beginning of the show, they're really different targets.
So I store my unseal key, the GPG-encrypted version, in 1Password.
And my GPG key is not stored in 1Password.
So an attacker would have to compromise my entire laptop, 1Password's servers, and my GPG key in order to get my unseal key.
At which point, it's one unseal key.
Even if an attacker gets an unseal key, they have to have a quorum of them, they have to meet that threshold, in order to do any damage in the system. And that threshold internally at HashiCorp is rather high, so the risk value is low. It's still there, obviously. There's still risk value with the physical safe too, right? Someone could rip it out of the wall. But it's really, truly...
You're never truly secure. It's just measures of security, right?
Layers of security, so to speak, even.
It's all onions.
And that's where Vault is different,
is that in Vault, we treat security
as the onion layer, the whole nine yards,
but also there's a lifetime associated with it.
Because if you give someone 10 years
to get to the core of an onion,
they're going to make it there.
But if every 30 days or 15 hours or 20 minutes,
the core of that onion is changing,
they're never going to make it on time with modern computing.
And as computing evolves,
we can just make our things faster as well.
It makes a lot of sense.
I love it.
So the seal and unseal process is simply, like you said, a break-glass thing. You're going to do it in the init state, unseal it so that you can put things in it, right? And then you're only going to seal it again if it's that break-glass, in-case-of-emergency kind of situation, where you want to completely shut down your vault, basically, or stop it from interacting with the various services, to essentially revoke all access to Vault, not simply just a secret you have inside of it.
Right.
Cool.
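To tie the init and unseal flow back to the API, here's a minimal sketch using the official Go client: initialize a brand-new Vault with five key shares and a threshold of three, then unseal it with any three of them. It's an illustration only; in real life each share goes to a different person rather than sitting together in one program.

```go
// A minimal sketch of the init + unseal flow through the Go client: create a
// brand-new Vault with five key shares and a threshold of three, then unseal
// it with any three of those shares.
package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Sys().Init(&vault.InitRequest{
		SecretShares:    5, // N: how many unseal key shares to generate
		SecretThreshold: 3, // T: how many must be provided to unseal
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("root token:", resp.RootToken) // distribute resp.Keys, guard this

	// Any three of the five shares bring the vault out of its sealed state.
	for _, share := range resp.Keys[:3] {
		status, err := client.Sys().Unseal(share)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("progress %d/%d sealed=%v\n", status.Progress, status.T, status.Sealed)
	}
}
```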
So getting started,
we'll have that linked up in the show notes.
You mentioned a graphic.
We'll get that from you after the show.
We might even try to actually put that into the show notes,
not just linked to it,
but if we can actually embed it into the show notes,
like an image,
you know,
the internet,
you can put the image tag on,
you know, a piece of document on the web and boom,
it shows up.
We're going to do that.
We're going to try our best.
Hashtag HTML, man.
Hashtag HTML, that's right, Seth.
Seth, this is closing out.
What else should we share with the audience
about either yourself, HashiCorp,
the future where you guys are going,
Vault, that we haven't shared well
enough in this show?
I think the only other thing I would like to mention is
if you're already a HashiCorp fan or you want to be one, we have a number of events around the world.
We call them HUGS, H-U-G's, HashiCorp User Groups.
They're kind of our own meetups.
They're sponsored by us, but they're run by the community.
There's probably one in a town near you.
And if you happen to be near New York City, London, or Austin: in New York City and London this year,
we're hosting these things called HashiDays.
They're one day events
where you can come hang out
with the HashiCorp engineers
and the people who work on these tools
and the customers who are using them
at large scale and ask them questions
and get answers to your things.
And if you want to come to the big shebang,
HashiConf, that's going to be in Austin this year
and we're going to be announcing the dates shortly. But, you know, that's where you're going to hear from, you know, all the
major players in the industry talking about how HashiCorp is, you know, changing the way that
they're working. And we might have a new product announcement or two.
I see the dates are announced, man. September 18th to 20th this year.
Yeah, nice. The dates are announced.
HashiCorp.com. Yeah, I don't know
if it's actually been announced,
but it's at least on your website.
So I'm trusting that.
Yeah, I think
I think the dates
for HashiConf have been announced,
but we haven't announced
the dates for the
the Hashi days yet.
Oh, gotcha.
Gotcha.
Those are going to be
in New York City and London
later this year.
Well, if listeners,
if you're listening to this,
we like to make announcements
in this email we do
called Changelog Weekly.
So if you don't subscribe to that, you should check it out.
I'm going to encourage Seth and his team to share things like this with us so that we can share with you when Hashi Days is coming around or when HashiConf has got their dates up, which they do.
So if you're listening, HashiConf.com, check that out.
And is it HashiDays.com or is it some other URL that you're
aware of?
It's gonna live on hashicorp.com whenever we get it up.
That's a super cool name. I love that. I love the way that you all have extended the last name of Mitchell to great extents. It's a phenomenal brand name. I love it. I love hugs, I mean literally love hugs, and I also love HUGs, HashiCorp User Groups. That's super cool, man. I love it.
So Seth, thank you so much for joining us on the show today, man.
It's been a blast having you.
And thanks, man.
It's been great.
Awesome.
Thank you for having me.
All right.
That wraps up this episode
of The Change Log.
Thanks to our sponsors,
GoCD and Toptal.
Thanks to Jonathan Youngblood
for his editing
skills on the show, to Breakmaster Cylinder for the awesome beats, and if you're excited about our upcoming show, JS Party, it goes live on Friday, February 24th, at thechangelog.com slash jsparty.
Check that out, and thanks for listening. Bye.