Software at Scale - Software at Scale 46 - Authorization with Or Weis
Episode Date: May 10, 2022

Or Weis is the CEO and founder of Permit.io, a Permission as a Service platform. Previously, he founded Rookout, a cloud-debugging tool.

Many of us have struggled (or are struggling) with permission management in the various applications we’ve built. The complexity of these systems always tends to increase through business requirements - for example, some content should only be accessed by paid users or users in a certain geography. Certain architectures like filesystems have hierarchical permissions that require efficient evaluation, and there’s technical complexity that’s often unique to the specific application.

We talk about all the complexity around permission management, and techniques to solve it in this episode. We also explore how Permit tries to solve this as a product and abstract this problem out for everyone.

Highlights
[0:00] - Why work on access control?
[02:00] - Sources of complexity in permission management
[08:00] - Which cloud system manages permissions well?
[11:00] - Product-izing a solution to this problem
[17:00] - What kind of companies approach you for solutions to this problem?
[22:00] - Why are there research papers written about permission management?
[38:00] - Permission management across the technology stack (inter-service communication)
[42:00] - What are you excited about building next?

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.softwareatscale.dev
 Transcript
    
                                         Welcome to Software at Scale, a podcast where we discuss the technical stories behind large software applications.
                                         
                                         I'm your host, Utsav Shah, and thank you for listening.
                                         
                                         Hey, welcome to another episode of the Software at Scale podcast.
                                         
                                         Joining me today is Or Weis, the founder and CEO of Permit.io, which is a permissions as a service platform.
                                         
                                         Thank you for joining me.
                                         
                                         It's a pleasure to be here, Utsav. I'm really excited for our conversation.
                                         
                                         It's pretty early in the morning there, right, in Israel.
                                         
                                         Let me start with just asking about your background.
                                         
    
                                         My background starts in the Intelligence Corps in the IDF.
                                         
                                         I had a long career in the IDF and then as a VP of R&D and I worked in several startups
                                         
                                         and I founded another company before this one, another DevTools company called Rookout.
                                         
                                         And throughout our careers, both myself and my co-founder, we've built access control
                                         
                                         for products that we've been building probably thousands of times.
                                         
                                         But the most annoying part is that it was more than once per product.

                                         So for example, in my previous company, Rookout,
                                         
                                         I ended up rebuilding access control five times for a product that wasn't even three years old.
                                         
    
                                         It literally drove me insane. At each point, I thought, okay, I've built this, it's perfect,
                                         
                                         I'm done. And every time it surprised me anew with more challenges coming either from the customers,
                                         
                                         from security, compliance, from the infrastructure, or also from weird angles.
                                         
                                         So for example, we were working with Cisco as a biz dev partner.
                                         
                                         They were selling Rookout directly to market.
                                         
                                         And at some point, they came in and said, we want our own back office,
                                         
                                         we want to manage users on our own, we want to assign permissions on our own, we want our
                                         
                                         salespeople to be able to work with this. And I looked at what we'd built and said,
                                         
    
                                         there's no freaking way that I can make this solution support two back offices, I have to
                                         
                                         once again throw it out the window and start from scratch. And I just thought, this is so silly. I don't want to do this. I want to focus on actually
                                         
                                         building my product. And I remembered feeling that sensation, that mindset across my career.
                                         
                                         And I just thought there must be a better way. And that's what brought me to create a
                                         
                                         permission service so developers can focus on building
                                         
                                         their products and not rebuilding this over and over.
                                         
                                         Maybe can you walk us through where does the complexity in permission management come in?
                                         
                                         As I think about this as a layperson who hasn't thought too much, you have different user
                                         
    
                                         types, maybe they have different attributes or different roles.
                                         
                                         When does this get complicated?
                                         
                                         So I think actually the question that you're posing
                                         
                                         is the crux of the problem. At each point that we're building it, it's hard to see what we'll

                                         actually need down the road. I myself have also fallen for the same fallacy.

                                         At each point that I was building this, I thought, oh, I have the entire picture. I know what I need.
                                         
                                         I'm going to build this and I'm done.
                                         
                                         But things are constantly changing.
                                         
    
                                         And so if we zoom back and look at broader strokes, we can see that almost every company
                                         
                                         starts with or every product starts with having admin and not admin.
                                         
                                         And then you move to admin, not admin and super admin.
                                         
                                         Then you move to access control lists.
                                         
                                         So the people on that list can do A,
                                         
                                         people on that list can do B.
                                         
                                         Then usually as you start working with customers
                                         
                                         and you need to have more structure in your permissions,
                                         
    
                                         you move to role-based access control.
                                         
                                         Then it also often comes with compliance
                                         
                                         because compliance like SOC 2
                                         
                                         specifically talks about these kinds of controls.
                                         
                                         And then you realize, oh, actually, roles are not enough because I need more granular things. Like
                                         
                                         I need roles plus ownership. It's not enough that an editor can edit files. The editor needs to be able to

                                         edit only his or her own files. Or you need other attributes. I want to enable this only if a
                                         
                                         customer is paying or only if they're in a specific geolocation.
                                         
    
                                         So that's RBAC plus ownership plus some attributes.
                                         
                                         And as you start to add more attributes, you start to slide toward attribute based access
                                         
                                         control.
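
To make the progression concrete, here is a minimal Python sketch of "RBAC plus ownership plus some attributes." Everything in it is an assumption made for illustration (the role names, the `paid` and `country` attributes, the `can` function); it is not taken from the episode or from any particular product.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-action mapping; role and action names are illustrative.
ROLE_ACTIONS = {
    "viewer": {"read"},
    "editor": {"read", "edit"},
    "admin": {"read", "edit", "delete"},
}


@dataclass
class User:
    id: str
    role: str
    # Arbitrary properties of the identity, e.g. {"paid": True, "country": "DE"}.
    attributes: dict = field(default_factory=dict)


@dataclass
class Document:
    id: str
    owner_id: str


def rbac_allows(user, action):
    """Plain RBAC: the role alone decides."""
    return action in ROLE_ACTIONS.get(user.role, set())


def can(user, action, doc, allowed_countries=None):
    """RBAC plus ownership plus attributes (assumed rules, for illustration)."""
    if not rbac_allows(user, action):
        return False
    # Ownership: non-admins may edit only their own documents.
    if action == "edit" and user.role != "admin" and doc.owner_id != user.id:
        return False
    # Attribute: editing requires a paying customer.
    if action == "edit" and not user.attributes.get("paid", False):
        return False
    # Attribute: optional geolocation restriction on any action.
    if allowed_countries is not None:
        if user.attributes.get("country") not in allowed_countries:
            return False
    return True
```

Each extra `if` is one more rule layered on top of the role check, which is roughly how a role-based model starts to slide toward an attribute-based one.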
                                         
                                         So everything that is an arbitrary element or property of either the identity, or the

                                         resources in the application itself, or the actions you perform

                                         on those resources is generally referred to as an attribute. And as you start to add those,
                                         
                                         you either find yourself at attribute-based access control or policy-based access control.
                                         
                                         And as those gain complexity, you often try to simplify it with relationship-based access control, which can

                                         also work with graph-based access control. And nowadays, most of these would translate

                                         into some kind of policy as code. The challenge is not necessarily understanding each of those
                                         
                                         models. By the way, for each of those models that I described, we can probably have a discussion
                                         
                                         for five hours straight on the different structures
                                         
                                         and layouts and objects that you can have for that. And also how you create DB schemas for it.
                                         
                                         And how you create an update mechanism for it and how you create an edit mechanism for it and a
                                         
                                         versioning mechanism for it and an auditing mechanism for it. And for each of them, it will

                                         be slightly different, but also none of them is, like, the correct answer. Every application is a snowflake. Each application is unique; otherwise it wouldn't need to exist

                                         because there'd be another application like it. So you need to be able to adapt these mindsets

                                         or concepts to the concrete requirements of your product. And the most challenging part
                                         
                                         there is that your product is evolving, just like your company is
                                         
                                         evolving and you as a development team are moving forward and gaining more features, more capabilities,
                                         
                                         more infrastructure. And with that evolution, your permission model will change. It will also,
                                         
                                         as we said before, be affected by what the customers want and what the product managers want and what
                                         
                                         security, compliance, infrastructure, all of that will change your product constantly. On average,
                                         
                                         every company refactors or rebuilds their permissioning system every three to five years. And
                                         
    
                                         the change, depending on how they're shifting or what they're shifting to, can take between one and eight months

                                         of intensive labor from, on average, a three-person team. So the cost is also very high every time you

                                         readdress this, and the organizational friction is also very high, because these requirements often

                                         float in. So you'd probably see something like a product manager talking to a customer and the customer
                                         
                                         saying, oh, we need another role because we have this guy who's working on that department
                                         
                                         and we need them to have a slightly different set of permissions.
                                         
                                         And the product manager would go, oh, yeah, sure.
                                         
                                         Let's just go back and open up a ticket in Jira.
                                         
    
                                         And some poor schlep of a developer receives that
                                         
                                         ticket. And the people that opened that ticket don't actually realize that there's a world of
                                         
                                         pain behind that simple requirement of adding another role or making roles dynamic, or making
                                         
                                         roles more auditable or whatever it is. So it ends up just out of the blue becoming a huge project. And as it gets delayed and has more friction, more people start to clamor around it.
                                         
                                         And that puts more pressure on R&D.
                                         
                                         And then it asks.
                                         
                                         And there's this gap between the requirements and the people floating them in (product, security,

                                         compliance) and the understanding that's actually needed to build this, which only resides with developers.
                                         
    
                                         So a lot of the tension is there.
                                         
                                         So it's both about the organization understanding and the developers understanding where this
                                         
                                         is going and also aligning the organization around the journey because otherwise people
                                         
                                         are constantly being surprised about simple things that are actually interesting.
                                         
                                         And you mentioned so many different kinds of

                                         access control, like role-based, attribute-based, graph-based. I've certainly dealt with

                                         policy-based when it comes to cloud systems like AWS. So I'm just curious, off the top

                                         of your head, are there any particular cloud systems that you think do permissions well?

                                         Because permissioning in IAM in AWS has always been the biggest source of

                                         confusion for me. What's your opinion as someone who thinks about permissions day in and day out?

                                         Yeah, so I think, for example, the AWS IAM is an amazing system. It's super powerful,
                                         
                                         but I see a lot of developers that are impressed by the power of that solution and consider that
                                         
                                         breadth of scope ideal. It's actually, I think, not the right way to think about it,
                                         
                                         because what's right for a big cloud infrastructure solution like AWS is not the same as a SaaS
                                         
                                         application or even a PaaS application. So there are definitely
                                         
                                         concepts that you can take from it. But you probably shouldn't take it one-to-one, and

                                         definitely not fall in love with it (sometimes I actually see that too) and expect it to work

                                         perfectly for your solution. At the end of the day, all of these models, RBAC, ABAC, the IAM

                                         system, policy-based systems, are concepts and tools. They're not a final
                                         
                                         product for a specific application. So you should look at them more as options and suggestions
                                         
                                         and pick and choose what's right for you throughout a journey where you'll constantly

                                         be updating this. The reason that IAM is so powerful is because it needs to be. It's catering to a highly technical audience
                                         
                                         that requires super flexibility. It can allow itself to be less interfaceable or addressable
                                         
                                         by the average Joe, and it needs to cover a lot of different services. That's not your run of the
                                         
    
                                         mill application. So definitely, it's a very good example of a well-designed solution,

                                         but it's not the perfect example for every application.
                                         
                                         Yeah, no, I think it's definitely really flexible,
                                         
                                         but I do think the complexity is just inscrutable at times,
                                         
                                         like service accounts, like linked roles and stuff.
                                         
                                         It's just like the concepts keep piling on,
                                         
                                         but you definitely see how it allows you
                                         
    
                                         to do a lot of interesting things. There's, like,

                                         solutions you can build on top of that, like access manager, getting all those access logs, ensuring

                                         you have auditability, then combining roles and stuff together. There's, like, a security audit role

                                         that combines a bunch of policies. So that's interesting. Then my question for you is,
                                         
                                         how do you productize this problem, right?
                                         
                                         Like, as you mentioned,
                                         
                                         there's a lot of complexity here
                                         
                                         and every solution,
                                         
    
                                         every company probably needs
                                         
                                         a different set of answers to this problem
                                         
                                         based on how technical their audience is,
                                         
                                         what kind of flexibility they need.
                                         
                                         How do you build a product
                                         
                                         that encapsulates all of that?
                                         
                                         Great question.
                                         
                                         I'll start by saying that with the IAM and with AWS, there's a choice there to keep it
                                         
    
                                         more complex and not addressable for the common Joe.
                                         
                                         And that's on purpose.
                                         
                                         That's not really what's relevant for most applications.
                                         
                                         But you did
                                         
                                         mention some other things there, like the auditing and logging of it and connecting this to higher
                                         
                                         level concepts like roles. That's something that you'd probably find in almost every application.
                                         
                                         So that's something that we can definitely look at in a positive light there. So every application
                                         
                                         should probably have audit logs at some point and should have versioning on its policies and should have the ability to combine roles and to combine attributes.
                                         
    
                                         For those, it's a matter of time until they chime into the conversation.
                                         
                                         And the way to think about it, I think, as in general with software is to think about it in a kind of modular stack. So you don't have to have everything at day one, but you want to
                                         
                                         have the right stack and components built in so you can add more capabilities as you go. So you
                                         
                                         want to start simple and you want to start with something that answers your needs now, but can
                                         
                                         grow and can add more interfaces to the other people, the other stakeholders that are involved.
                                         
                                         The way I like to think about it is in kind of three ways. And those are also the things that
                                         
                                         we offer people when they work with us. I like to think about it in best practices,
                                         
                                         infrastructure, which can ideally be open source, and then experiences and interfaces on top.
                                         
    
                                         So with best practices, you have things like decoupling the policy and code.
                                         
                                         Once you understand that things are going to change, that your application is going to

                                         be different and your authorization layer and policy are going to change as well,

                                         you understand that if you couple them together, every time you want to change one of them, you'll have to change everything. And that's going to be very painful

                                         and add a lot of friction, or basically slow you down and often force you to redo everything for
                                         
                                         every little thing. So by decoupling policy and code, which essentially in a modern application
                                         
                                         means creating a separate microservice for authorization.
                                         
                                         You can keep your application more simple and your authorization more simple, and both
                                         
    
                                         can evolve separately, but side by side.
                                         
                                         So that's one of the key best practices.
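
As a rough illustration of decoupling policy and code, the Python sketch below keeps the policy as data behind a small `check` interface, so application code only asks a question and never inspects roles itself. The names (`Authorizer`, `PolicyAuthorizer`, `delete_report`) and the policy shape are assumptions for this example. In the setup described above, the authorizer would sit behind a separate microservice rather than an in-process class.

```python
from typing import Protocol


class Authorizer(Protocol):
    """The only thing application code knows about authorization."""

    def check(self, subject: dict, action: str, resource: dict) -> bool: ...


class PolicyAuthorizer:
    """Holds policy as data so it can change without touching application code.

    Stands in for a separate authorization service in this sketch.
    """

    def __init__(self, policy: dict):
        # Assumed shape: {role: {resource_type: {allowed actions}}}
        self._policy = policy

    def check(self, subject: dict, action: str, resource: dict) -> bool:
        by_type = self._policy.get(subject.get("role"), {})
        return action in by_type.get(resource.get("type"), set())

    def replace_policy(self, policy: dict) -> None:
        """Swap the policy at runtime; callers are unaffected."""
        self._policy = policy


def delete_report(authz: Authorizer, user: dict, report: dict) -> str:
    """Application code: asks for a decision, contains no role logic."""
    if not authz.check(user, "delete", report):
        raise PermissionError("not allowed")
    return f"deleted {report['id']}"
```

With this split, granting members the right to delete reports is a policy change passed to `replace_policy`, and `delete_report` stays untouched.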
                                         
                                         Another one would be keeping things event driven.
                                         
                                         Permissions and access control are a critical experience.

                                         You want it to be quick, you want it to be performant, you want it to be consistent.

                                         So if you have something that updates with delays, you're going to have a bad time. For example,

                                         say you want a policy where only users that have paid for a feature can use it. The information there

                                         on who paid doesn't exist in your database today. That's in a third-party service like Stripe or
                                         
    
                                         Chargebee or PayPal. So you need a way to synchronize with

                                         what that service is changing. And the best way to do that is to listen in to events. So you have events

                                         propagating in from different services, and you allow your authorization layer to be updated by

                                         those events in a real-time manner. There are more best practices, but we can circle back to them in

                                         a minute. The second part is building the right infrastructure.
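
The event-driven synchronization described a moment ago could be sketched in Python roughly as follows. This is a simplified in-process illustration. The event names (`payment.succeeded`, `subscription.cancelled`), the payload fields, and the classes are all hypothetical and do not correspond to Stripe's, Chargebee's, or PayPal's actual webhook formats. In a real system, a webhook endpoint or message queue consumer would call `publish`.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe bus, for illustration only."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


class AuthorizationData:
    """The authorization layer's own view of who has paid for what.

    It is updated by events as they arrive, instead of querying the
    billing provider at decision time.
    """

    def __init__(self):
        self._paid_features = defaultdict(set)

    def on_payment_succeeded(self, event):
        self._paid_features[event["user_id"]].add(event["feature"])

    def on_subscription_cancelled(self, event):
        self._paid_features[event["user_id"]].discard(event["feature"])

    def can_use(self, user_id, feature):
        """Policy: only users that have paid for a feature can use it."""
        return feature in self._paid_features.get(user_id, set())


def wire(bus, data):
    """Connect billing events (hypothetical names) to the authorization data."""
    bus.subscribe("payment.succeeded", data.on_payment_succeeded)
    bus.subscribe("subscription.cancelled", data.on_subscription_cancelled)
```

The permission check itself stays a fast local lookup, while changes from the billing side flow in as they happen.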
                                         
                                         That means having a pluggable infrastructure that is extensible once you want to add more

                                         interfaces on top.

                                         So everyone starts, as we said, with just having basic permissions, basic enforcement
                                         
    
                                         and a really simple model.
                                         
                                         But on top of the model changing, you want more capabilities on top.
                                         
                                         So you probably want to add user management with the ability to assign roles.
                                         
                                         And you want to add API key management because you also provide some automation.
                                         
                                         You want secrets management, you want audit logs, you want to be able to see who did what
                                         
                                         in your system.
                                         
                                         You want multi-tenancy, you'd want impersonation for the ability to see who did what within
                                         
                                         the system logging in as that
                                         
    
                                         user. You'd want approval flows, asking permissions from another user. And this list is, first of all,
                                         
                                         things you've seen a billion times, and also never ending. There's always another item to add to that
                                         
                                         list. So if we design the authorization layer so that those kinds of interfaces and experiences can

                                         plug in on top, we can grow gradually with the evolution of the application.
                                         
                                         So we don't need to have impersonation, for example, at day one.
                                         
                                         But we want to be able to easily add it
                                         
                                         without refactoring everything when we get to that point.
                                         
                                         And if we use the right best practices
                                         
    
                                         and the right infrastructure, we'll be able to.
                                         
                                         And it's just a matter of either adopting the right tools or learning yourself how
                                         
                                         to work with those tools and best practices. And lastly, there are the experiences themselves. I

                                         think it's the recognition that you're not just delivering a feature here, you're delivering

                                         an organizational pattern. So it's not just the developers being involved with this. It's

                                         all the other stakeholders: product managers, security, compliance. They'll need a modicum of

                                         self-determination, an ability to manage this on their own and at least chime in on the conversation.
                                         
                                         So we want to be able to provide them with interfaces early on, not necessarily at day one,
                                         
    
                                         but we want to plug in those interfaces.
                                         
                                         Once we recognize that,
                                         
                                         we are ready for most patterns of that evolution.
                                         
                                         And once we have interfaces for ourselves,
                                         
                                         we can also offer interfaces for our customers,
                                         
                                         which is also something that arrives pretty early.
                                         
                                         The customers themselves want this democratized.
                                         
                                         They want to be able to control
                                         
    
                                         who they're adding to their organization within your application.
                                         
                                         They want a subset of permissions that they can mutate on their own, maybe create a few roles on
                                         
                                         their own or attributes on their own, et cetera, et cetera.
                                         
                                         So generally, at what stage of a company does someone approach you?

                                         Have they generally built an auth system or two and realized

                                         they should be outsourcing this? Is it different for B2B companies versus B2C

                                         companies? Because I can totally understand that kind of complexity: if you want to do

                                         geolocation-based permission checks, I don't want to build that on my own. Yeah, so we're seeing

                                         companies of all sizes. All of them arrive

                                         to us at the point where they are actively working on this, actively thinking about

                                         this, because some requirement has come in and changed the way that they need to

                                         build this. We are seeing companies starting at square one, just saying, I have so much else to

                                         build, I don't want to deal with this at all. I think in
                                         
                                         general, that's the common thread. Developers often don't care about this. They want this to
                                         
                                         work well, but it's not a unique part of their product. Just like they don't want to build billing

                                         or authentication. No one really wants to build this, and definitely not build this and make errors

                                         while building it. The other two types are either companies that have already built something in place
                                         
                                         and realized that they need to change it because of those incoming requirements, or companies even
                                         
                                         going through a more significant change. So we see big companies, for example, as they're going
                                         
                                         through an IPO process or an M&A process, there are a lot of demands coming in, also pushing

                                         a critical timeline on the changes that they need to apply, or when they're

                                         doing a significant infrastructure change. So for example, we had several companies moving from

                                         monoliths to microservices. So when you're working with a monolith, you can often rely on the built-in

                                         access control mechanism. So, for example, in Django in Python, or in the Spring Framework, there are some
                                         
    
                                         basic RBAC admin panels baked in. The moment you move to
                                         
                                         microservices, that just stops working at all, especially if you're polyglot, if you have multiple
                                         
                                         languages. So that often brings players to the table. And the painful part is if you arrive at

                                         this later rather than earlier, the amount of refactoring you have to do is where most of the pain is.

                                         And I think the most painful cases are people who have already learned,

                                         they've glimpsed that there's a different way to go about this.
                                         
                                         They've decided, we don't want to put the effort of changing this,
                                         
                                         we'll just tweak what we have.
                                         
    
                                         And then they come back a year later and say,
                                         
                                         okay, we realized that didn't solve it.
                                         
                                         And now we have to completely revamp it. And we actually added more friction on the way. So bottom line, we're seeing companies
                                         
                                         of all sizes, but they come in with different requirements and different needs. And the idea,
                                         
                                         like I said before, is to enable them to find a quick solution for what they need now,
                                         
                                         and gradually evolve with it. Okay, that's interesting to know, and it doesn't match my intuition. I

                                         guess I assumed that as companies get bigger they would run into this, but I guess it makes

                                         sense that sometimes people just know that this is going to be a problem from their

                                         previous job, and they're like, I'm just going to outsource this from day one so I just don't have

                                         to think about this at all. I think the difference there is that people are learning
                                         
                                         that this is an option,
                                         
                                         just like with authentication. If you go five, seven years back, most companies would say,
                                         
                                         why do I need to use an authentication vendor? I can just store passwords. What's the big deal?
                                         
                                         And now most, I think most developers would react to that and say, okay, that's insane.
                                         
                                         Storing passwords is really hard. It's the security and cryptographic aspects of it,
                                         
                                         like hashing and salting and just tracking everything and doing SSO around that. That's
                                         
    
                                         a huge pain point. And there's no unique value in implementing this again. And as people learn that
                                         
                                         authentication solutions are an option, and that they are readily available, the mindset shifted.
                                         
                                         I think the same thing is happening
                                         
                                         now with authorization, a lot of developers are learning that they don't have to build this.
                                         
                                         And most of them don't want to build this anyway. So if there's an alternative, they'll often

                                         stick to it. Some people are still struggling, and it's

                                         actually with the bigger companies, saying: we've built this huge complex thing that we're really proud of. So what if it doesn't meet our
                                         
                                         requirements anymore? So what if it doesn't meet the modern standards anymore? I think I can make
                                         
    
                                         this work. And they're right. But every time they make that statement, they're just postponing
                                         
                                         another point where they'll have to reconsider and actually adopt
                                         
                                         the modern patterns. Because, again, it's not about having the right solution now,

                                         it's about having something that can evolve quickly. Okay, then one question that I have

                                         for you: you mentioned the Google Zanzibar paper in one of your documents.

                                         Maybe you can walk us through why, even behind the scenes,
                                         
                                         permission management is not easy to run
                                         
                                         in a nice and fast and consistent and scalable way.
                                         
    
                                         Why have people written research papers about this?
                                         
                                         Isn't there just an access control list
                                         
                                         and you need to check whether a person's in the list
                                         
                                         or not in the list?
                                         
                                         Where does the performance challenge come in?
                                         
                                         First of all, you need to realize that the average microservice sends three authorization
                                         
                                         queries for every request it gets.
                                         
                                         So if your authorization layer is inefficient, you're going to have a bad time, because

                                         if it adds, let's say, 50 milliseconds, you're quickly getting to several hundreds of milliseconds before your application has done anything.
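The arithmetic behind that claim can be worked through in a few lines. The three queries per request and the 50 milliseconds come from the conversation; the assumptions that the queries run sequentially and that a request passes through three services are added here purely for illustration.

```python
def added_latency_ms(services_in_path: int, queries_per_request: int,
                     latency_per_query_ms: float) -> float:
    """Authorization overhead for one request, assuming sequential queries."""
    return services_in_path * queries_per_request * latency_per_query_ms


# One microservice making three 50 ms authorization queries.
print(added_latency_ms(1, 3, 50))  # 150

# A request that passes through three such services (assumed path length).
print(added_latency_ms(3, 3, 50))  # 450
```

Under those assumptions a single service already adds 150 ms, and a short chain of services reaches several hundred milliseconds before any application work has happened.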
                                         
                                         On top of that, there are other hidden complexities in how you store the data that you need for authorization and how you fetch it.
                                         
                                         The data that you have for the application, first of all, is not all the data that you need for authorization.
                                         
                                         We already covered the third-party services and distributed data plane and data sinks
                                         
                                         you're working with.
                                         
                                         But even just the data for the application itself, the way you structure the schema of
                                         
                                         your database for the application is not the ideal way to structure it for the authorization
                                         
                                         layer, because they're actually querying different things and they need to do different joins and different aggregates. And you often see that pain point start

                                         when people are moving from RBAC to attribute-based access control. So they're piling in attributes, just
                                         
                                         adding more queries to the database, essentially. And initially, it's fine. But then at some point,
                                         
                                         the database chokes, because there are too many queries, they're too slow,
                                         
                                         and while the authorization layer might be still quick, the underlying data layer can't
                                         
                                         really support it, and everything screeches to a halt.
                                         
                                         And so there's complexities in how you store your data, how you propagate it, and how you
                                         
                                         manage its schemas.
                                         
                                         And lastly, and that's something that is actually unique
                                         
    
                                         specifically to Zanzibar, is how you apply consistency. So one of the key challenges
                                         
                                         when you have a large complex system is that things can change while the system is handling a request. For example,

                                         you're sending a request to the service, and it starts at microservice one. And as that microservice is querying another microservice, during that transaction, the
                                         
                                         world picture, the data for authorization has changed.
                                         
                                         That's often referred to as the new enemy problem or a subset of the new enemy problem.
                                         
                                         So now you have, as you're running queries for your systems, you're handling requests,
                                         
                                         they're inconsistent.
                                         
                                         So now you can have a case where at one moment you're giving someone permissions, and at the next they

                                         don't have them, or they have a different set of permissions. And you end up either failing the
                                         
                                         request or providing the wrong result, or worse leaking data or access that you weren't supposed
                                         
                                         to. And that's something that's really hard to track, especially if you have a high-scale system.
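One way to picture the inconsistency is a store that keeps versioned snapshots of the permission data. The sketch below is an assumption-laden toy: `VersionedPermissions` and its methods are hypothetical names, it does not reproduce Zanzibar's actual protocol, and it deliberately ignores the separate question of how fresh a pinned snapshot has to be. It only shows that checks pinned to one version agree with each other, while unpinned checks can disagree within the same request.

```python
class VersionedPermissions:
    """Keeps every version of the grant set so checks can pin a snapshot."""

    def __init__(self):
        self._snapshots = [frozenset()]  # version 0: no grants

    def write(self, grants) -> int:
        """Store a new full set of (user, resource) grants, return its version."""
        self._snapshots.append(frozenset(grants))
        return self.latest

    @property
    def latest(self) -> int:
        return len(self._snapshots) - 1

    def check(self, user: str, resource: str, at_version=None) -> bool:
        version = self.latest if at_version is None else at_version
        return (user, resource) in self._snapshots[version]


perms = VersionedPermissions()
perms.write({("alice", "doc1")})

pinned = perms.latest  # the request enters microservice one
first = perms.check("alice", "doc1", at_version=pinned)

perms.write(set())  # permissions change while the request is in flight

unpinned = perms.check("alice", "doc1")  # microservice two, latest data
second = perms.check("alice", "doc1", at_version=pinned)  # same snapshot

print(first, unpinned, second)  # True False True
```

The unpinned check disagrees with the first one mid-request, which is the kind of inconsistency described above; the pinned checks stay consistent with each other.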
                                         
                                         So in general, taking a step back,
                                         
                                         there are two camps today.
                                         
                                         What's interesting about the authorization landscape
                                         
                                         is it's still nascent.
                                         
                                         It's still evolving.
                                         
    
                                         As, I don't know, humanity, society,
                                         
                                         I don't know what you want to call it,
                                         
                                         we haven't decided on what are the best practices
                                         
                                         and standards.
                                         
                                         We have
                                         
                                         some of them, but it's not finalized yet. We're still writing that book. So unlike with authentication
                                         
                                         and with JSON Web Tokens and with SAML and OpenID Connect on the IAM side, things are still evolving
                                         
                                         in the authorization space. And currently there are two camps. There's the code-based camp and
                                         
    
                                         the graph-based camp for implementing access
                                         
                                         control. In the code-based camp, you'd find things like Open Policy Agent, which essentially says,

                                         you should write policies and load them into an engine, load data in the form of JSON documents into that

                                         engine, and you can have that engine run as a sidecar or as a cluster next to your services, and then

                                         they can query it. It's really the equivalent of

                                         the policy decision point in the XACML methodology, for those who are familiar with it.
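The shape of the code-based approach can be sketched in plain Python. This is not Open Policy Agent's API and not its policy language; the `PolicyDecisionPoint` class, the `allow` function, and the layout of the JSON document are hypothetical stand-ins chosen only to show the three pieces named above: a policy, a data document, and an engine that services query.

```python
import json

# Data document loaded into the engine as JSON (illustrative layout).
DATA = json.loads("""
{
  "roles": {"alice": ["editor"], "bob": ["viewer"]},
  "grants": {"editor": ["read", "write"], "viewer": ["read"]}
}
""")


def allow(input_doc: dict, data: dict) -> bool:
    """Policy as code: a user may act if any of their roles grants the action."""
    roles = data["roles"].get(input_doc["user"], [])
    return any(input_doc["action"] in data["grants"].get(role, [])
               for role in roles)


class PolicyDecisionPoint:
    """Holds one policy and one data document and answers queries."""

    def __init__(self, policy, data):
        self.policy = policy
        self.data = data

    def query(self, input_doc: dict) -> bool:
        return self.policy(input_doc, self.data)


pdp = PolicyDecisionPoint(allow, DATA)
print(pdp.query({"user": "alice", "action": "write"}))  # True
print(pdp.query({"user": "bob", "action": "write"}))    # False
```

In a deployment along the lines described, the engine would run next to each service, and the service would send it a small input document per decision rather than embedding the rules in its own code.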
                                         
                                         And the graph-based camp says something different. There's a lot of data here, a lot of complexity,
                                         
                                         a lot of users, we need to manage it in a consistent picture and consistent graph,
                                         
    
                                         and be able to query it all the time in an efficient manner. And these camps have pros and
                                         
                                         cons, and I'll try to run through some of them quickly. So with code, first of all, code is

                                         Turing complete. So you can describe any policy that you want. With a graph, you can't really have

                                         Turing completeness, because then navigation on the graph won't be efficient. The moment you

                                         make it cyclical, the moment it's

                                         not a DAG, not a directed acyclic graph, especially if the graph is large, you're

                                         going to have a really bad time navigating through it. And it will most likely fail. So

                                         with Zanzibar and most graph-based solutions, you can only have simpler policies,

                                         mostly around relationship-based access control. It's really

                                         great for describing hierarchies like nested files or folders or organizational structures, but it fails

                                         when you start to do multiple attributes, for example, when you try to do more ABAC. Another

                                         thing is the ability to do reverse indices. So you often ask the question in authorization: can Utsav access this thing? But a lot

                                         of times you want the reverse of that. You want to ask who can access that thing. So

                                         with code, if you have code answering the question "can X do Y", it's basically impossible

                                         to get the reverse; code only runs one way. You can try and maybe brute force it and enumerate all the options, but that's
                                         
                                         really a bad way to do that. With a graph, you have the advantage of navigating the other way
                                         
    
                                         around. So you basically get reverse indices out of the box. That's what some people

                                         call the spice of Google Zanzibar. With the graph, because you're managing a big graph in the cloud,

                                         you get consistency. You control the whole picture, so you can make sure that picture is consistent.

                                         When you work with a distributed layout, that's harder to do. But if you work with a graph,

                                         and it's a big graph that is remote from the services themselves, you're paying for latency
                                         
                                         when you're querying it as opposed to a small, efficient agent at the edge that
                                         
                                         you can query.
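The graph-based side, including the reverse question, can be sketched in the same spirit. The `RelationshipGraph` class and its method names are hypothetical, and this toy holds everything in memory; it is not how Zanzibar or any of its implementations store or traverse relationships. It only shows why a hierarchy makes "who can access this?" a natural traversal.

```python
from collections import defaultdict


class RelationshipGraph:
    """Tiny relationship graph: resources nest in parents, users get direct grants."""

    def __init__(self):
        self.parents = defaultdict(set)  # resource -> resources that contain it
        self.viewers = defaultdict(set)  # resource -> users granted directly

    def add_child(self, parent: str, child: str) -> None:
        self.parents[child].add(parent)

    def grant(self, user: str, resource: str) -> None:
        self.viewers[resource].add(user)

    def who_can_access(self, resource: str) -> set:
        """Reverse question: walk up the hierarchy, collecting granted users."""
        users, stack, seen = set(), [resource], set()
        while stack:
            node = stack.pop()
            if node in seen:  # guard so an accidental cycle cannot loop forever
                continue
            seen.add(node)
            users |= self.viewers.get(node, set())
            stack.extend(self.parents.get(node, ()))
        return users

    def can_access(self, user: str, resource: str) -> bool:
        """Forward question, answered from the same graph."""
        return user in self.who_can_access(resource)


graph = RelationshipGraph()
graph.add_child("root", "docs")
graph.add_child("docs", "report.txt")
graph.grant("alice", "root")
graph.grant("bob", "report.txt")

print(sorted(graph.who_can_access("report.txt")))  # ['alice', 'bob']
print(graph.can_access("bob", "docs"))             # False
```

Here alice inherits access to the file through the folders above it, while bob's grant on the file does not flow upward to the folder, and both questions are answered by walking the same structure.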
                                         
                                         So you can see that there are more pros and cons, but there are a lot of them that we've
                                         
    
                                         already touched on.
                                         
                                         And another thing that I think is interesting to see is that they're complementary.
                                         
                                         So what the policy, what code is good for is the complementary or opposite image, mirror
                                         
                                         image of what the graph is good for. So what I'm
                                         
                                         actually advocating for is using both: using the graph-based solution to manage a bigger

                                         picture in the cloud, and using the code-based solution to have efficient answers at the edge. And if you
                                         
                                         have a component in between that syncs the two, you can actually enjoy both options. And I think that's
                                         
                                         probably the ideal way to think about it. But it's still evolving. We'll still have to see
                                         
    
                                         where things go. So the ideal graph-based use case would be like a Google Drive or something,

                                         where you might mark that this person has access to this folder, and therefore they have access to every file and

                                         recursive subdirectory. And that gets complicated really quickly, because you could have tons of

                                         subdirectories and they all need to inherit that access, so you need to traverse, and doing that in a code-based

                                         solution is tricky. And you're advocating for keeping both of them because they have

                                         these different use cases, and then you have to figure

                                         out how to keep them consistent, which is tricky. Yeah. And yeah, the more I think about it,

                                         it's not just a Google Drive that needs it. It's anybody who maintains things like, here's

                                         a collection of documents that maybe don't have a lot of subdirectories, but you can add permissions

                                         to the collection, you can add permission to the document itself.

                                         So a lot of people building something with a use case like Figma's, or even the company

                                         that I work at, might have to think about this kind of stuff.
                                         
                                         And yeah, I guess I just didn't appreciate how complicated all of this could be.
                                         
                                         And to be sure, just to clarify, we're just
                                         
                                         scratching the surface here. Just on Google Zanzibar, we can talk easily for 10 hours and
                                         
                                         not get to all the concepts there. We didn't even touch on the main reason that Google Zanzibar was
                                         
    
                                         created, which is great scale. So if you just have a few users and a few objects that you're
                                         
                                         interacting with, it doesn't really matter how you manage
                                         
                                         this.
                                         
                                         You can just shove it into a database, make most of the available data in cache, and it
                                         
                                         would just work.
                                         
                                         But as you start to move from hundreds of thousands to millions and above that, both
                                         
                                         managing all of that data and the continuous scaling up of that data, that's what's going
                                         
                                         to get you.
                                         
    
                                         And so Google Zanzibar was built for those scales.
                                         
                                         It was built to maintain that constant huge picture for things like Google Drive and YouTube,
                                         
                                         which are running within Google on Google Zanzibar.
                                         
                                         I should probably mention also that there are open source implementations of Google
                                         
                                         Zanzibar.
                                         
                                         So Google hasn't released Zanzibar as open source.
                                         
                                         They just threw a white paper at us. But some cool folks at companies like Authzed and Auth0 have taken up the mantle of implementing
                                         
                                         it.
                                         
    
                                         They actually haven't implemented it fully, but it's getting there.
                                         
                                         But I think it's important to understand that for most companies, at least at the beginning, you don't need Zanzibar, you're not going to run things at Google
                                         
                                         scale, you might need to be able to grow into that scale down the road. And that's an important
                                         
                                         difference. So you want to create a modular solution with the interfaces that will later
                                         
                                         on enable you to change your data layer into something like Zanzibar, for example, you can
                                         
                                         definitely start with Zanzibar if you want, but you need to understand that
                                         
                                         there are trade-offs.
                                         
                                         So, for example, you'll have more latency and general performance
                                         
    
                                         degradation, but you'll get a better picture, more consistent, and you'll have an easier
                                         
                                         time scaling.
                                         
                                         But I think if anyone takes anything out of this, it's that you should stick to the best practices.
                                         
                                         Decouple your policy
                                         
                                         and code, create a separate authorization layer, have an event-driven fashion to update it and
                                         
                                         have it modular enough so you can layer interfaces on top. And then it doesn't matter. You can start
                                         
                                         with the stupidest thing. You can have a microservice that always returns true for any
                                         
                                         authorization query. That would be a good place to start because you can build on top of that,
                                         
    
                                         as opposed to having something baked into some if statement in your code that later on, if you want to
                                         
                                         refactor, you have to do a full code review and change everything in the application itself.
                                         
                                         So start simple, start modular, grow gradually, you don't have to cover all of this in day one.
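The "start with a service that always returns true" idea can be sketched as an interface with swappable implementations. This is only an illustration under assumptions: `Authorizer`, `AllowAllAuthorizer`, `RoleAuthorizer`, and `delete_document` are hypothetical names, and a real deployment would put the check behind a network call to a separate authorization service rather than an in-process object.

```python
from typing import Dict, Protocol, Set


class Authorizer(Protocol):
    """The only interface application code is allowed to depend on."""

    def is_allowed(self, user: str, action: str, resource: str) -> bool: ...


class AllowAllAuthorizer:
    """Day-one placeholder: every authorization query returns True."""

    def is_allowed(self, user: str, action: str, resource: str) -> bool:
        return True


class RoleAuthorizer:
    """A later drop-in replacement backed by a simple role table."""

    def __init__(self, user_roles: Dict[str, str], role_actions: Dict[str, Set[str]]):
        self.user_roles = user_roles
        self.role_actions = role_actions

    def is_allowed(self, user: str, action: str, resource: str) -> bool:
        role = self.user_roles.get(user)
        return action in self.role_actions.get(role, set())


def delete_document(authz: Authorizer, user: str, doc_id: str) -> str:
    # The application asks a question; it never encodes the policy itself.
    if not authz.is_allowed(user, "delete", "document:" + doc_id):
        raise PermissionError(user + " may not delete " + doc_id)
    return "deleted " + doc_id
```

Because `delete_document` only sees the interface, moving from the placeholder to role checks (or to something graph-based later) changes the object passed in, not the application code.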
                                         
                                         It's also so hard to code review or like check for correctness with authentication
                                         
                                         checks or like authorization checks. Like very few people write sufficient integration tests when
                                         
                                         they add things like permissions logic or like they evolve it from admin, non-admin to something
                                         
                                         more involved. So refactoring that code is often like another whole project.
                                         
                                         That's also why, in the systems themselves, the way you manage the code, you rarely see in the modern
                                         
    
                                         solutions, just functional code. You don't see Python or Java as the recommended language to
                                         
                                         write policies in, because it's hard to make sure that you cover all your bases when you're running:
                                         
                                         if you have a rule but you don't invoke that rule, you're basically screwed. But, for
                                         
                                         example, OPA and Oso are using logic programming languages. They're both derivatives
                                         
                                         of Prolog. Oso is a derivative of Prolog, and Rego, the language for OPA,
                                         
                                         is a derivative of Datalog, which is a derivative of Prolog. And the idea there is that you have
                                         
                                         a recursive engine that runs through all of the rules that are defined in a performant way. And
                                         
                                         that way it ensures that you cover all your bases. Same thing is true of the graph, you have an
                                         
    
                                         engine that does the graph navigation for you. So as long as you structured the graph correctly, it's going to do what you're
                                         
                                         planning for. So it translates the problem from making sure that you cover all the bases within
                                         
                                         the logical layer of the policy to structuring the policy correctly and auditing the policy itself,
                                         
                                         takes it on another level higher and enables
                                         
                                         you to focus on what you actually want as opposed to how it should work.
                                         
                                         With Prolog, it really
                                         
                                         takes me back to college, like thinking about data flow languages. I haven't thought about that in a
                                         
                                         while. But we've been talking about OPA, like Open Policy Agent, right? So there are two separate
                                         
                                         permission conversations
                                         
    
                                         that we're having.
                                         
                                         One is for like the end user
                                         
                                         when you want to build like a system
                                         
                                         that lets a certain user access
                                         
                                         a certain part of your application
                                         
                                         or a certain document or whatever.
                                         
                                         There's also the microservice,
                                         
                                         can this service call this other service type of logic,
                                         
    
                                         which I think OPA helps with
                                         
                                         because like OPA,
                                         
                                         you can put that into like your Kubernetes, you can put that in as like a sidecar, as you mentioned. But the more I think
                                         
                                         about it, you're basically trying to solve the same problem within your product and as like an
                                         
                                         infrastructure component. Like does that sound right to you? Like, what do you think?
                                         
                                         Yeah, yeah. So both OPA and Oso are general-purpose decision engines. You can use them to make whichever decisions are relevant to you. They're focused on policy, but they're general-purpose decision engines. OPA got its real kick at the infrastructure level. You need access control across the stack. You need physical access control.
                                         
                                         You need, like, locks on doors.
                                         
                                         And then you need network level access control,
                                         
    
                                         like firewalls and zero trust networks.
                                         
                                         Then you have infrastructure level access control
                                         
                                         with admission control and service to service access control.
                                         
                                         And then you have application level access control.
                                         
                                         And then it evolves more and more in complexity
                                         
                                         within the application layer into more logical.
                                         
                                         And OPA really got its go in the infrastructure authorization layer.
                                         
                                         And it's actually quite difficult on its own to take it to the application layer.
                                         
    
                                         The big problem there is how do you keep it in sync with the changing application?
                                         
                                         Like, a new user has paid for the service.
                                         
                                         How do I make sure that OPA knows about that user? Or we change the policy,
                                         
                                         we added a new role and we did it from the UI. How do we make OPA know that there's a new role now?
                                         
                                         And that's actually solved by another open source project. I'm actually wearing the t-shirt for it
                                         
                                         now. So we created OPAL, Open Policy Administration Layer, that essentially takes that event-driven best practice and applies it to policy.
                                         
                                         You are able to subscribe to topics for both policy and data.
                                         
                                         And as events come in, they propagate into each of the instances at the edge, keeping them constantly up to date with both the policy and data that they
                                         
    
                                         need, and only those that they need. And so you have a distributed administration layer for OPA,
                                         
                                         and you can have your different third party services that are changing with your applications,
                                         
                                         webhook and notify OPAL on what has changed. And you can have your Git repository webhook on policy changes to OPAL,
                                         
                                         and it will pick those elements and trickle them down like rain to the various
                                         
                                         OPA agents through what we call the OPAL client. OPAL does two things through that. One, it
                                         
                                         solves that challenge of bringing OPA to the application layer. And two, it really helps you
                                         
                                         tackle the inconsistency problem
                                         
                                         because it really focuses on propagating events quickly.
                                         
    
                                         So the agents at the edge, even if they don't have the data,
                                         
                                         they know that they're missing data, that the picture has changed.
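The event-driven propagation described here can be sketched, very loosely, as topic-based publish/subscribe in which agents are first told that something changed and only then pull the new value. Everything below (`UpdateBroker`, `EdgeAgent`, `notify`, `refresh`, the topic names) is a hypothetical in-memory illustration, not the OPAL API.

```python
from collections import defaultdict


class EdgeAgent:
    """Hypothetical edge decision point that keeps a local copy of data."""

    def __init__(self, name, fetch):
        self.name = name
        self.fetch = fetch   # callable(topic, key) -> latest value
        self.data = {}       # local copy used to answer queries
        self.stale = set()   # keys known to have changed upstream

    def notify(self, topic, key):
        # The event arrives before the data: the agent now knows
        # that its picture of (topic, key) is out of date.
        self.stale.add((topic, key))

    def refresh(self):
        # Pull fresh values for everything marked stale.
        for topic, key in sorted(self.stale):
            self.data[(topic, key)] = self.fetch(topic, key)
        self.stale.clear()


class UpdateBroker:
    """Hypothetical central layer that fans change events out by topic."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.source = {}

    def subscribe(self, topic, agent):
        self.subscribers[topic].append(agent)

    def publish(self, topic, key, value):
        self.source[(topic, key)] = value
        # Only agents subscribed to this topic hear about the change.
        for agent in self.subscribers[topic]:
            agent.notify(topic, key)

    def fetch(self, topic, key):
        return self.source[(topic, key)]


broker = UpdateBroker()
billing_agent = EdgeAgent("billing", broker.fetch)
docs_agent = EdgeAgent("docs", broker.fetch)
broker.subscribe("users", billing_agent)
broker.subscribe("documents", docs_agent)
broker.publish("users", "alice", {"plan": "paid"})
```

After the `publish` call, `billing_agent` has the key marked stale until it calls `refresh()`, while `docs_agent`, which did not subscribe to the `users` topic, is untouched.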
                                         
                                         And you can already start seeing this working with something like Zanzibar.
                                         
                                         So if you have a big graph in the cloud managing the bigger aspects, you can take subsets of it through OPAL as the graph changes and
                                         
                                         propagate them in real time into each of the edge nodes. So each edge node has what it needs
                                         
                                         being supported by the bigger picture managed for everyone in the cloud. So that kind of also
                                         
                                         touches on the hybrid solution that we're seeing here,
                                         
                                         and also how we are literally moving towards the hybrid solution and implementing it.
                                         
    
                                         So your company is not just working on like end user like application security,
                                         
                                         but it's also working on tooling for basically permissions across the stack.
                                         
                                         Yeah, so we just try to solve this. So our notion is, developers don't want to build this,
                                         
                                         it's really hard to build, there's a lot of complexities, it's really hard to be aware of
                                         
                                         all those complexities, and we want to abstract those away. We want to always enable developers to have
                                         
                                         access to the code to manage this with GitOps to manage this with infrastructure that they control.
                                         
                                         But unless they want to do something, they shouldn't
                                         
                                         be forced to. They should have the option, but not the responsibility all the time. I don't think
                                         
    
                                         most people care about the difference between RBAC and ABAC. And I don't think they should.
                                         
                                         I think a solution should abstract that and enable you to dive into that only when it's relevant.
                                         
                                         You should be able to start simple,
                                         
                                         build this, have it work and grow with you as you go. And the way to achieve this is by creating
                                         
                                         standards. It's by creating solutions that are inherently built to address the problem and are
                                         
                                         flexible enough to be extensible by the different snowflake solutions that need to use them. And that's really the mindset that we had with OPAL.
                                         
                                         And also, though it's a really young project, only a year old,
                                         
                                         I think that's why it's seeing so much success.
                                         
    
                                         It's already in use in companies like Tesla in production, in Zapier, Accenture, and
                                         
                                         dozens of others.
                                         
                                         And it has a significant community in Slack of people asking questions on a daily basis.
                                         
                                         I think we were able to do that because we built something that is both powerful enough and flexible enough for developers to adjust it for what they're building.
                                         
                                         Yeah, I'm noticing this
                                         
                                         consolidation across the industry around standards and going up the stack. It's very similar to what
                                         
                                         AWS is doing, but more in, like, the open source way.
                                         
                                         Like now you have, like, OpenTelemetry.
                                         
                                         I was talking to the LightStep people a few years ago
                                         
    
                                         and it really seemed like it's matured.
                                         
                                         And now I'm guessing there's like more and more standards
                                         
                                         coming out on authorization,
                                         
                                         like how you should be doing this.
                                         
                                         People are converging on to OPA
                                         
                                         and saying this is the way it should be done.
                                         
                                         It's interesting to see.
                                         
                                         Yeah. As the industry matures, you think less about the infrastructure that's running your systems and more about your end
                                         
    
                                         use cases. It's basically the story of humankind, right? At the beginning,
                                         
                                         you'd just pick a stone and use that to hunt or to cut your meat or whatever. And then one day someone
                                         
                                         came in and said, "Oh, you should take stone from that guy, he makes good stone." And then everyone
                                         
                                         said, "You should take a wooden spear shaft from that guy, he makes good shafts." And then one
                                         
                                         day someone came in and offered you a shaft with a stone already tied to it, and you said, "Oh, this is much better than getting it and assembling it on my own."
                                         
                                         And we constantly spread out, create new solutions, then we consolidate and then we build more
                                         
                                         layers on top.
                                         
                                         And every time we add a layer on top, we have to specialize.
                                         
    
                                         We have to create people, or solutions, that are specialized in building that.
                                         
                                         So other people don't have to understand all of those complexities.
                                         
                                         And the same thing is happening here.
                                         
                                         The only difference is that
                                         
                                         we don't have the right answer yet.
                                         
                                         It's still evolving.
                                         
                                         So what we're trying to do as a vendor
                                         
                                         is to give you that promise of
                                         
    
                                         no matter what spear or sling
                                         
                                         will come into existence,
                                         
                                         we'll wrap it for you
                                         
                                         and make it available for you.
                                         
                                         So you don't have to care about it.
                                         
                                         As you go, you can focus on building your product.
                                         
                                         And I also think it's our responsibility
                                         
                                         to chime in on the conversation
                                         
    
                                         and make sure that together
                                         
                                         through the open source we're offering
                                         
                                         and through integrations that are being built,
                                         
                                         we create the right standards.
                                         
                                         That's why we took this open source.
                                         
                                         So we can have a public conversation
                                         
                                         on how we can
                                         
                                         all together build the correct thing for, again, us as a society, humanity, whatever you want to.
                                         
    
                                         So then let me wrap up with: what are you most excited about in what you're building? What's the
                                         
                                         next big thing that you're excited about? What's, like, the next feature or the next project?
                                         
                                         That's a good question. I'd say I'm
                                         
                                         most excited about the human interfaces, which is funny to say for a developer tools product, but I
                                         
                                         think that's really key. So we started with our own pain,
                                         
                                         but we wanted to see how it looks across the space. So we looked into the bigger
                                         
                                         organizations like Facebook and Google as a glimpse into the future. And what we realized
                                         
                                         there is that they've invested a lot of time to build this. So for example, in Facebook,
                                         
                                         they invested a team of 30 people for half a decade to just build the infrastructure components
                                         
    
                                         for their X. And what they did is two things.
                                         
                                         One, at some point, they had to move from just static rules, just policy you create,
                                         
                                         to an intelligent component, to a machine learning component that can react to the gray points
                                         
                                         between the policies.
                                         
                                         And two, that AI ends up translating the interactions back into organizational behaviors
                                         
                                         and flows. So for example, when an employee tries to access the Facebook database, or the metadata
                                         
                                         database, I should say, and they're querying more data than they probably should, or they do on
                                         
                                         average, the AI can detect that as an anomaly. But because they
                                         
    
                                         want business to continue, they don't just shut it down. Because you have thousands of employees
                                         
                                         doing thousands of things. If you just shut down everything that passes the anomaly, things will
                                         
                                         just screech to a halt. So what they do instead is they translate that into human interactions.
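As a rough illustration of "escalate to a human instead of blocking", the sketch below compares a request against a simple historical average and routes outliers to review. The function name, the `factor` threshold, and the use of a plain mean are assumptions for the example; the system described in the conversation uses a machine learning component, which this does not attempt to reproduce.

```python
from statistics import mean


def review_access(history, requested_rows, factor=10.0):
    """Return "allow" for typical requests and "escalate" for outliers.

    history: row counts from this user's previous queries.
    factor: hypothetical multiplier over the average that counts as unusual.
    """
    if not history:
        # No baseline yet, so ask a human rather than guess.
        return "escalate"
    baseline = mean(history)
    if requested_rows > factor * baseline:
        # Unusual, but not denied: hand the decision to the team lead.
        return "escalate"
    return "allow"
```

Note that the function never returns a hard denial: an unusual request becomes a question for a person, which is the behavior the speaker describes.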
                                         
                                         So for example, they ask the team lead for that person,
                                         
                                         is what they're doing okay? Is there an assignment around this? Should we throttle this? Should we limit this? Maybe you should talk to them. And by going back to conversations and having
                                         
                                         the people align back with the machine, they're able to both keep it secure and keep it fast
                                         
                                         enough for the business to run. And I think that's something
                                         
                                         that's coming up for all of us. When we're building applications now,
                                         
    
                                         we're mostly thinking about human users using our applications, but more and more, it's
                                         
                                         applications on behalf of applications on behalf of applications on behalf of applications,
                                         
                                         using our application. It's like with algo trading: in the past, it was just humans yelling at each other,
                                         
                                         "Buy, sell, buy." Nowadays, it's all automated at a speed that humans can't really work with. So we
                                         
                                         need a very quick layer that can react to those things, interpret them, and provide back interfaces
                                         
                                         for us as humans to manage it and have it work the way
                                         
                                         we want. So what we're building today, we already covered a significant part of the basic
                                         
                                         infrastructure. And we're starting to look at more automation around it. But mostly, and more
                                         
    
                                         importantly, building interfaces: low-code interfaces, no-code interfaces, human conversation
                                         
                                         interfaces, so that all the stakeholders can come in and
                                         
                                         build this together in a way that can move quickly.
                                         
                                         Yeah, like, it seems very similar to IAM right sizing, right?
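As a rough illustration of the right-sizing idea raised here, the sketch below compares the permissions granted to a role with the ones it actually exercised and the ones it was denied. The function and class names are hypothetical and the inputs are plain sets of action strings; this is not how AWS implements its recommendations, only a minimal model of the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightSizingReport:
    unused: frozenset  # granted but never exercised: candidates to revoke
    denied: frozenset  # attempted but not granted: candidates to review


def right_size(granted, used, denied_attempts):
    """Compare a role's granted permissions against observed usage.

    All three arguments are iterables of action names (assumed inputs,
    for example collected from access logs).
    """
    granted = frozenset(granted)
    return RightSizingReport(
        unused=granted - frozenset(used),
        denied=frozenset(denied_attempts) - granted,
    )


report = right_size(
    granted={"s3:GetObject", "s3:PutObject", "s3:DeleteObject"},
    used={"s3:GetObject"},
    denied_attempts={"s3:ListBucket"},
)
print(sorted(report.unused))
print(sorted(report.denied))
```

The `denied` set is the same signal that could feed an upsell prompt: a user who repeatedly hits a permission they lack.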
                                         
                                         Like this kind of stuff seems like super chaotic, but it makes sense that if you notice a certain
                                         
                                         role is not using all the permissions that are assigned, AWS can tell you, you should
                                         
                                         reduce the set of permissions, or increase the set of permissions because you see like an access denied,
                                         
    
                                         but what if you do that in a more naturalistic way? I can also imagine you can actually use
                                         
                                         your permission system to understand whether somebody is worth upselling. Oh, this person
                                         
                                         keeps going to a feature that they don't have access to, maybe show them an ad saying buy
                                         
                                         the product. It's interesting to think about. One thing I remember from Dropbox is, like, the
                                         
                                         most popular way for them to make money was when a user was
                                         
                                         over quota and they got, like, an error message, because once they got that error message there
                                         
                                         was a click: do you want to buy more space? That's what made them the most cash. So it would be cool if we had an inbuilt, like, feature
                                         
                                         flagging, permission-based system. I know you all have, I remember looking at OPToggles, which
                                         
    
                                         does something like that, right? Yeah, so that's one of our other open source projects. So you want
                                         
                                         to be able to have one core place where you manage your policy and have all of
                                         
                                         your applications feed from that. So with OPAL and OPA, we already talked about how that
                                         
                                         propagates to the backend. But what about the front end?
                                         
                                         You want the front end experience to also adjust. So for example, if someone's going to get an error,
                                         
                                         like a 403 error when they query the API, you don't want that to just be thrown in the
                                         
                                         UI. You want to give them a different experience. If they can't click that button,
                                         
                                         don't show them that button. And the way to do that today in general is with feature flag solutions.
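To make the "don't show them that button" point concrete, here is a minimal sketch that turns a user's permissions into show/hide flags for the UI, so the front end never offers an action that would come back as a 403. The button names, permission strings, and the `ui_flags` helper are all made up for illustration.

```python
def ui_flags(user_permissions, button_requirements):
    """Map each button to True (show) or False (hide) for one user.

    user_permissions: set of permission strings the user holds.
    button_requirements: dict of button name -> permission it needs.
    """
    return {
        button: required in user_permissions
        for button, required in button_requirements.items()
    }


# Hypothetical buttons and the permission each one requires.
BUTTONS = {
    "export_button": "report:export",
    "delete_button": "report:delete",
}

flags = ui_flags({"report:export"}, BUTTONS)
visible = [name for name, shown in flags.items() if shown]
print(visible)
```

The backend still has to enforce the permission on every request; the flags only shape what the user sees.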
                                         
    
                                         That's the way front-end applications adjust their experience. So with OPToggles, you can sync your
                                         
                                         feature flag solution to your OPA policy. So you change your OPA policy and through OPAL,
                                         
                                         OPToggles listens in and then updates your LaunchDarkly, Split.io, etc. So you can have everything
                                         
                                         chime in the right way. But more importantly, and kind of like touching on what I said before,
                                         
                                         everyone gets the right
                                         
                                         interface.
                                         
                                         So the backend engineers can work with the policy engine and the GitOps solutions.
                                         
                                         And the frontend engineers can work with what they're accustomed to, which is a feature
                                         
    
                                         flag solution.
                                         
                                         So everyone chimes in on the same conversation, but with the right interface for them.
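The policy-to-feature-flag sync described above can be sketched roughly as follows. This is not the OPToggles implementation or its API: the policy is an ordinary Python function, `InMemoryFlagProvider` is a hypothetical stand-in for a feature flag service, and `sync_flags` is an assumed helper that recomputes flag targeting whenever the policy changes.

```python
class InMemoryFlagProvider:
    """Hypothetical stand-in for a feature flag service."""

    def __init__(self):
        self.targets = {}

    def set_targets(self, flag, user_ids):
        # Record which users should see the flag as enabled.
        self.targets[flag] = sorted(user_ids)


def sync_flags(policy, users, flag_to_action, provider):
    """Re-evaluate the policy for every user and push results to the provider."""
    for flag, action in flag_to_action.items():
        allowed = [u["id"] for u in users if policy(u, action)]
        provider.set_targets(flag, allowed)


def policy_v1(user, action):
    # Assumed rule: only paid users may export.
    return action != "export" or user["plan"] == "paid"


def policy_v2(user, action):
    # Assumed policy change: export is opened up to everyone.
    return True


users = [{"id": "alice", "plan": "paid"}, {"id": "bob", "plan": "free"}]
flag_map = {"export_enabled": "export"}
provider = InMemoryFlagProvider()

sync_flags(policy_v1, users, flag_map, provider)
before = dict(provider.targets)

# When the policy changes, the same sync brings the flags back in line.
sync_flags(policy_v2, users, flag_map, provider)
after = dict(provider.targets)
print(before, after)
```

The design point is that the policy stays the single source of truth: back-end code queries it directly, while front-end code keeps reading flags that are derived from it.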
                                         
                                         Yeah, ideally, to me, you should just have the same thing. Like, feature flagging, permissioning,
                                         
                                         et cetera should just be, like, this one big product that manages all of that for you and helps you,
                                         
                                         like, maybe upsell and block unless necessary. But anyways, thank you so much for joining. This was a
                                         
                                         lot of fun, and I hope you had a great time.
                                         
                                         I had a great conversation. Thank you so much. It was great talking to you, and I look forward to next time.
                                         
                                         Yeah, thank you. I will take you up on it.
                                         
