No Priors: Artificial Intelligence | Technology | Startups - How Palantir’s AI Bet is Revolutionizing Defense and Beyond, with CTO Shyam Sankar

Episode Date: July 27, 2023

Can frontiers as high-stakes as next-generation, AI-enabled defense depend on something as mundane as data integration? Can large language models work in such mission-critical applications? In this episode of No Priors, hosts Sarah Guo and Elad Gil are joined by Shyam Sankar, the Chief Technical Officer of Palantir Technologies and inventor of their famous Forward Deployed Engineering force. Early employee and longtime leader Shyam explains the evolution of technology at Palantir, from ontology and data integration to process visualization and now AI. He describes how a company of Palantir's scale has adopted foundation models and shares customer stories. They discuss the case for open-source AI models fine-tuned on private, domain-specific data, and the challenges of anchoring AI models in reality.

Show Links:
Shyam Sankar - Chief Technical Officer - Palantir Technologies | LinkedIn
Palantir

Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @ssankar

Show Notes:
[0:00:00] - Palantir's CTO Discusses Company's Background
[0:10:17] - Apollo and AIP
[0:20:25] - Future of UI and Application Integration
[0:28:29] - Investment in Co-Pilot Models and Education
[0:31:22] - Exploring AI Implementation in Various Industries
[0:38:19] - Operational and Analytical Workflows in Context

Transcript
Starting point is 00:00:00 Palantir has built a data-driven application platform over the past two decades that is used by governments, militaries, and some of the world's largest companies for analytics and operational decisions. Palantir recently announced a new platform, AIP, as part of a push to invest in AI. This week on the podcast, Sarah and I talked to Palantir's CTO, Shyam Sankar, who was the company's first business hire and has been a leader at the company for nearly two decades, previously as its COO. Shyam, welcome to No Priors. Thanks for having me. Great to be here, Sarah and Elad.
Starting point is 00:00:34 So I think you have a very unique background. I believe you grew up in Nigeria and then moved to the United States. You got interested in computers reasonably early. It'd be great to just hear your personal story and your background. Yeah, I spent the first three years of my life in Nigeria. My father had built the first pharmaceutical manufacturing facility on the continent; until then, all the drugs were really imported. And we fled Nigeria during some violence.
Starting point is 00:00:58 and really resettled in the U.S. as kind of like refugees. So a great deal of gratitude, understanding the counterfactual reality of, like, you know, how the world could have ended up there. And so I grew up in Florida. I think relevant to the current age, I grew up in a time where when the space shuttles would launch, we would all file out into the recess courtyard to actually just watch it, you know, and that seemed quite normal. And it also seemed really normal that on Saturday morning at like 6 a.m., you'd be woken up by double sonic booms every now and then.
Starting point is 00:01:26 And so I'm eagerly awaiting the return of the space age that commercial space and new space have been bringing back to us here. But I made my way out to Silicon Valley in 2003 and started getting involved with startups. The first company I worked at was Xoom, with an X, which was founded by Kevin Hartz and was an international money transfer company. And then after three years at Xoom, I started at Palantir as the 13th employee, really the first person on the business side, and have had the most fantastic ride ever since, but I've never been more excited about what we're doing than what we're doing right now with AIP and the opportunities that
Starting point is 00:02:06 are in front of us. How did you originally find Palantir? Because, you know, it was a very secretive company very early on. It was sort of a very small community in technology in Silicon Valley that had actually heard of it. So I was just sort of curious, like, how you connected with the company and got involved. So Peter was also a seed investor in Xoom, so I got some exposure there. And one of my friends was actually a freshman-year roommate at Stanford with Joe Lonsdale. And so I had heard about this company that was very small at the time. And it really tugged at my heartstrings. My uncle was a victim of terrorism. I was in New York during 9-11. And it was just like, I would rather fail working on a problem
Starting point is 00:02:47 this important than doing anything else. And then you joined, as you mentioned, as employee number 13, the first business hire, etc. And you've had a variety of roles over time. Could you explain a little bit how your role has changed over the lifetime of the company? I think the kind of common thread through it all is just doing what you need to do, which I know sounds like a banality, but really, when it first started, I wrote our first kind of candidate management system. What are you doing as a business hire before you really have a product here? And I was an aggressive QA tester, you could say. But the real initial contribution was what we call forward-deployed engineering. It comes from an insight that Alex kind of had around like, well,
Starting point is 00:03:25 you know, he muses, why are French restaurants so good? Well, maybe one theory is that the wait staff is actually part of the kitchen staff there. You know, it's not, they have like deep context and understanding of the food. And so the forward-deployed engineering idea was that the people who were going to be interacting with customers in the field were going to be computer scientists. You could actually understand: what does the product do today, what does it need to do today? Under what conditions is it going to work? And how do you kind of create this hybrid role that's product management, customer success, and engineering all in one? And that's really the team that I first built up. And then as we went from Gotham to Foundry and now AIP, there's a lot
Starting point is 00:04:06 to do there. Whether it's interacting more often with the customers, thinking about the technology, or really thinking about how are you going to get to value? And I think a lot of what forward deployed engineering really gets you to think about is how do you do these things backwards? Instead of really going forward from the technology, how do you work backwards from the problems that your customers actually experience? And use that to create an accountability function. Like, does what I'm doing matter? Did I make the life of this customer better today? Can I do better tomorrow? And using that absolute standard to judge yourself rather than saying, you know, did my software work according to spec? Like, who cares about the spec? Who cares about
Starting point is 00:04:42 what my ambition was yesterday? What's my ambition today? And why shouldn't it be bigger tomorrow? That framework is super practical. Was it the customers that you were working with early that drove the need for this? Were you trying to solve a particular problem? One, working with the customers within government is just motivational, right? Like, you kind of view it and you're thinking about it and like, how do I walk as many miles in their shoes as possible? Like, how could I possibly just think my job is done because I check the box here? It's like the job is defined by how are they doing and what more could I be doing for them?
Starting point is 00:05:12 Then I think there's a more kind of cynical component of this, which is like, okay, well, are you as a company going to succeed in this sort of environment, particularly with early government customers, if you don't have that sort of mentality? Because if you think about the vertical stack you need to deliver your outcome, you're dependent on so many things going right. So if you just build the software at this one component in the stack, you fail if anything lower down isn't working. And certainly at the time we were doing this, there really wasn't AWS yet,
Starting point is 00:05:39 but you couldn't depend on any of that stuff working. And so your vision of what you would need to own so that you succeeded needed to be quite ambitious. So before AIP, the company had three key platforms, Gotham, Foundry, and Apollo. Could you tell us about what these different things do for the sake of our audience, since they're less familiar necessarily with the company? And then how does AIP fit into this? Yeah, Gotham is our flagship government product. It's really focused on intelligence and defense customers. And it helps them integrate and model their data to really drive decisions in the context of their enterprise. So in the defense community,
Starting point is 00:06:16 you'd be thinking about that in terms of the kill chain: how do I go from targets to effects on those targets? In intelligence, it's often kind of a different sort of structure, but how do I track and gain context and understanding of the things in the world that I have a responsibility to understand? You can think of Gotham as kind of conceptually at the highest part of the stack. Much of Gotham then depends on Foundry. Foundry is our general-purpose data integration platform. It allows you to deal with structured and unstructured data, to transform that data, to really treat data like code, and then drive that through to an ontology, a semantic layer that models not only the nouns of your enterprise, like the concepts that you think about, but the verbs, the actions.
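As a rough illustration of that nouns-and-verbs idea, here is a minimal sketch in Python. The class names, the DistributionCenter object type, and the allocate_inventory action are all hypothetical, invented for illustration rather than taken from Palantir's actual ontology format.

    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class ObjectType:             # a "noun": a concept the enterprise thinks about
        name: str
        properties: Dict[str, type]

    @dataclass
    class Action:                 # a "verb": an action that writes back to a source system
        name: str
        object_type: str
        execute: Callable[..., dict]

    def allocate_inventory(center_id: str, sku: str, quantity: int) -> dict:
        # Stand-in for a write-back to SAP or another transactional system of record.
        return {"center": center_id, "sku": sku, "allocated": quantity}

    ontology = {
        "objects": [ObjectType("DistributionCenter", {"id": str, "region": str, "capacity": int})],
        "actions": [Action("allocate_inventory", "DistributionCenter", allocate_inventory)],
    }

The point of pairing nouns with verbs is that the same model of the enterprise that gives you visibility can also carry the write-back decisions he describes next.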
Starting point is 00:06:56 And so I think, you know, the buzzword for this is often digital twin, and that can mean a lot of things to a lot of people. But, you know, how do I have some sort of conceptual understanding and model of what we do as a business and use that to actually affect decisions, and then drive that all the way through to the application and decision-making layer? So not dashboards that give me visibility, but really pixels where I can make changes. So if I want to allocate inventory, I need a platform that's going to allow me to write back to SAP or read and write from my transactional systems and orchestrate my enterprise. And Foundry, I think, really gives its customers the ability to kind of squint and model their enterprise as a chained series of decisions, or kind of like a
Starting point is 00:07:35 decision web, and then giving them the modeling and simulation ability to understand and ask counterfactual questions. What happens if I do this? And this is the same platform that was used to build the COVID vaccine distribution response in the U.S. and the U.K., the same platform that commercial companies were using to manage the supply chain crises, when suddenly a steady-state kind of equilibrium wasn't really there, and being able to model the counterfactuals became really, really crucial. It's kind of interesting that you mentioned those various customers, for example, on the COVID vaccine distribution side or things like that, because, you know,
Starting point is 00:08:08 the perception of the company very early on was that a lot of the earliest customers were intelligence and defense, and then it kind of broadened out from there. Is that a correct assessment? And was that intentional at the time? Or was it just that you found there was a pocket of customers that really cared about your product and, you know, were a good fit for what you were doing initially? Well, yeah, we founded the company to work with intelligence and defense organizations. And really, I think, we expanded almost reluctantly. You know, I think it was like 2010 or 2011 where we started working with our first commercial customer. But really what we realized was that it took something as sexy as James Bond to motivate engineers to work on a problem as boring as data integration. But we had our own
Starting point is 00:08:47 ideas of what would be valuable in these spaces, and we built software for it. But all of those ideas kind of presupposed that the data was integrated. And I think the kind of popular view is that this is a boring and solved problem. But I think it might be a boring and highly unsolved problem, one that people are kind of duct-taping together everywhere they go. And so by productizing a solution to that, we kind of expanded our market and the sorts of problems in the world that we could go after. Apollo is quite an interesting platform as well. So, like, we really originally built Apollo for ourselves.
Starting point is 00:09:18 If you think about our customers, we're deploying in air-gapped environments. So how do you deploy modern software when, you know, you can't CI/CD to the target? We had to build this entire infrastructure that allowed us to, because, you know, our software, it's modern software. We have 550 microservices. We're releasing multiple times a day for each one of these services. They need to be able to upgrade independently. But also, our environments are complicated,
Starting point is 00:09:41 like, hey, the submarine only wants to upgrade in these windows, or these environments are not connected to the internet. So how is that going to happen? We had to build what we think of as kind of a successor to CI/CD, which is autonomous software delivery and deployment. So Apollo allows you to think about your software and the environments you're deploying into separately, model the dependencies, and kind of hand that to Apollo to manage and orchestrate the upgrade. It will understand what's your blue-green upgrade pattern. How do you think about health checks? How do I roll forward? How do I roll backward? How is that integrated with my understanding of vulnerabilities and CVEs? When do I need to recall software or block software? And that software has started to get a lot more traction as people are dealing more and more with complicated environments. Not only air-gapped customer environments in defense and intelligence, but if you go to Europe, where people have a strong push towards sovereign SaaS, or a lot of people want you to deploy inside of their VPC as a SaaS company, how are you going to manage now having a thousand
Starting point is 00:10:40 customer environments to manage? And Apollo just makes that really easy. And then how does AIP tie into this? And can you tell us more about AIP and that initiative? Yeah, AIP is really a set of technologies that allows you to bring LLM-powered experiences to your private network, on your private data, to drive decision-making: everything from how do I integrate this data to, I think much more interestingly, how do I build these AI-enabled applications? You know, it's like an application forge.
Starting point is 00:11:07 And a core part of the proposition here is that these LLMs, they need tools. Like, certainly there's something quite magical about this kind of non-algorithmic compute. You know, it's neither human thought nor kind of traditional computer science. And they're very good at what they're good at. And they're also quite bad at what they're bad at. And so, like, getting this right is really about providing what sometimes people call plugins, but I think tool might be a more appropriate word.
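A minimal sketch of that tool idea, assuming a hypothetical forecast_inventory function and a model that emits its tool calls as JSON; the function, the dispatch wrapper, and the call format are illustrative, not AIP's actual interface.

    import json

    def forecast_inventory(sku: str, horizon_days: int) -> dict:
        # Stand-in for a real forecasting model or simulation service
        # that the LLM cannot compute on its own.
        return {"sku": sku, "horizon_days": horizon_days, "expected_units": 1200}

    TOOLS = {"forecast_inventory": forecast_inventory}

    def dispatch(tool_call_json: str) -> dict:
        # Validate and execute a model-emitted tool call of the form
        # {"name": "...", "arguments": {...}}.
        call = json.loads(tool_call_json)
        if call["name"] not in TOOLS:
            raise ValueError(f"Unknown tool: {call['name']}")
        return TOOLS[call["name"]](**call["arguments"])

    # e.g. the model decides it needs the tool and emits:
    print(dispatch('{"name": "forecast_inventory", "arguments": {"sku": "A-42", "horizon_days": 30}}'))

The model supplies the judgment about when to call the tool; the tool supplies the domain-specific computation it cannot do reliably itself.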
Starting point is 00:11:32 But how do I give my customers not just a tool bench, but really a tool factory to go make the tools they need to get the most out of LLMs. Like an LLM is not going to know anything about orbital simulation or weaponeering or predicting forward inventory 30 days from now. It's certainly not going to do that well. But with the right tool, it's going to do that quite excellently, and it's going to give you a lot of leverage on the workflows
Starting point is 00:11:55 that you already have. And I think in some sense, much of the foundational work that we've done with Foundry has enabled people to run really quickly with AIP, build co-pilots that deploy into their existing decision-making surface area? Like, how do I allocate inventory? How do I adjudicate auto claims and then get that efficiency in weeks? At what point did you decide you wanted to make a big
Starting point is 00:12:21 investment in LLMs? And sort of what did the company do first? I think it was really around Q4, you know, the last part of last year there, where it just felt like, obviously the LLMs are exciting, but what was more exciting to us is it felt like the LLMs were just waiting for something like ontology. You know, it's like to really get the value out of the LLM, the way that we had modeled the world, it's almost like, accidentally, we had spent the last 20 years really thinking hard about dynamic ontologies, how you model them, why they're valuable to humans.
Starting point is 00:12:51 And you can kind of think about the ontology as this semantic layer that gives you an incredible amount of compression that you're putting into the context window and allows you to build LLM-backed functions in very reliable ways. And I think part of this is just recognizing that with LLMs, it's more like statistics than calculus. And I think this is one of the impedance mismatches for a lot of engineers who are working on them. They kind of model them a little bit like calculus, or really a lot like calculus.
Starting point is 00:13:20 And then, you know, when it works, it works magically. When it doesn't work, it falls off a cliff. So how are you actually going to get this to work when you kind of have this stochastic genie now? I think you're going to need kind of a whole tool chain around that that kind of presupposes it's a stochastic genie. And I think the ontology is one of these things that massively grounds your LLM in your reality, in your business context, and allows you to manage that without having to change the model itself. What are some of those components that you think
Starting point is 00:13:49 are the tool chain that you need to sort of bottle the stochastic genie? And I love that phrase, by the way. I think it's a really good way to put it. So you're probably going to need everything you kind of need with the dev tool chain, but you're going to have to adjust it for the fact that it's stochastic. So you even see it, like, people call them evals and not unit tests. But you're going to need, like, how many unit tests do you need? If you're going to write an LLM-backed function, and it's a stochastic genie, how many times does it need to execute before you have confidence that it's going to do what you want? And then, so then you can think about that, that's like day zero. Okay, so I build this thing, how do I think about it? But what sort of telemetry
Starting point is 00:14:21 and production log data do I need? And how often am I going to be looking at those traces? And it's like, I might even be writing unit tests against my traces. I guess you could call that like a health check, right? And like there's going to be a lot more emphasis that you're going to need there as an engineer as you think about using this. And then there's going to have to be some calibration on the use case. The best use cases are going to be ones where, when the LLM gets it right, there's massive upside, and when it doesn't, it's a no-op. Right. And so picking those ones, I think, is going to be quite important as you build and tune the specific applications of these. Going back to this idea of an ontology, I feel like I suddenly understand this much better
Starting point is 00:15:02 in that there are a lot of companies right now trying to figure out how to take all of their messy, less-than-perfectly-integrated, and largely unstructured data and create some sort of intermediate representation that the models can handle well. And if you have something like an existing ontology of your business, then leveraging it with LLMs does feel like a really natural, magical fit. That's exciting. Maybe to make it a little bit more real, you could walk us through an example of an AI tool using this tool chain and ontology that you're excited about, that one of your customers is building with you? Yeah, sure.
Starting point is 00:15:40 I'll just pick an example of something we worked on recently in Hawaii, which is how do I do automated COA generation, courses of action generation, from an operational plan. So the Department of Defense has these OPLANs, they call them. And that's the other thing: you have these industries where there's just a tremendous amount of doctrine, whether that's pharmaceuticals or defense, where there's so much knowledge about how we want to do things that's been written down. So you have this OPLAN that describes the phases of a potential conflict or the key risks and assumptions. And so you might want to do something like a non-combatant
Starting point is 00:16:16 evacuation operation. So if conflict happens here, how will we get all the civilians out of a city? Okay, well, we've thought about that and we've written that down. It's in this document. And so how do I then just say, like, build me a course of action to drive this evacuation? Well, the plan is going to specify the resources that you're probably going to need, the types of resources, the phases, the timing of it, the risks and assumptions you need to worry about. So then how do I take those words and then hydrate the application state that people use to manage the common operating picture? And that's a big part of what we're really thinking about, which is, you know, I kind of think of chat as a massively limiting interface. You know, at the limit, prompts are for developers.
Starting point is 00:16:57 Now, I think that's really hard. Prompts leak over to users. And users sometimes want to chat, and lots of people start with that because of the popularity of ChatGPT. But really, what I want to do in this context is, they're entering a question, like, generate a course of action that is this evacuation operation. And what I'm getting back is not words. I'm actually getting a map with a resource matrix and the requisition of the necessary resources. And, you know, I can hit a button and say yes.
Starting point is 00:17:24 And that comes to life. And so those are the sorts of experiences that we're starting to build with customers now. And on the commercial end, it's really co-pilots to help people. I was just looking at a demo this morning from my team on helping a major auto manufacturer adjudicate quality and claims. So how do I manage the cost of quality, cost of warranty on the production line and post-production when these cars are in service? Well, I need to be able to cluster these claims and more efficiently understand what supplier, what subcomponents, where in the supply chain, and how do I remediate these issues, how do I drive down my cost of warranty and recall? So building co-pilots that are kind of looking at the text of the claims, understanding the components, helping them identify early indicators and signs of the kind of conditions under which these parts need to be recalled
Starting point is 00:18:09 and managed. And there it's really about human-agent teaming. I was listening to the, I think you guys hosted like an AIP day, like a set of demos and presentations recently. And one of the points of view you take is that the models themselves are increasingly commoditized and certainly more broadly available, and available in open source. How do you think about the value that Palantir's building for its customers? Well, I think it's: how do you actually use the models to drive these experiences? So there's kind of two ends of that. One is the existing experiences that people have today.
Starting point is 00:18:44 So how do I build a co-pilot that's going to help me adjudicate auto claims or help me understand my production process? On the other end of that, it's like, how do I develop trust in the underlying models? You know, if we go back to the stochastic genie here, maybe we should actually think about these as slightly deranged mad geniuses. And then, you know, are you going to only ask one of these experts to help you solve your problem? Like, how do you think about the configuration of mad geniuses that you actually really want to have? And I think there's-
Starting point is 00:19:14 I want an ensemble of mad geniuses. I'll feel better about that. Yeah, I think that's probably the correct direction. And I think one that model companies will have a hard time with, right? Because I think they need to have kind of one model to rule them all, or directionally that's where it needs to go. But if you're an enterprise and you're thinking about, I have high-consequence decisions that I'm trying to drive here, how do I do that? And I think certainly if you're living in a chat world where the output is
Starting point is 00:19:36 chat, that's valuable. And you want to think about, like, where do people agree and disagree? But if you start thinking about the output of the model as a DSL or JSON, that becomes even easier. Now you can actually parse these things in very structured ways to understand not only the consensus view, but maybe divergent views. And then you're kind of more authentically treating this as statistics and not calculus. And then that forces you to flow that through to the UI and how you're designing this for actual human users to interact with it in a way that's thoughtfully statistical. So you've said that you don't think that, you know, chat is the be-all, end-all, and it's
Starting point is 00:20:14 quite limiting, especially when you think about these complex workflows and the outputs that are most efficient for your end users. What are you imagining that turns into in the near term? Is it assistants overlaid on the existing software that your customers use, whether it's things they've built with you or software they already have? Is it just automatic workflows and outputs in that software? Help us imagine it a bit. Yeah, I think the ideal visualization of it is something like,
Starting point is 00:20:45 I have an application state and I have an intent as a user. Combine those two things to give me a new application state. Now, that can be very hard. I think depending on the workflow, that might be super obvious. I think if you're thinking about this like a co-pilot, for GitHub Copilot, it's more obvious. I have an application state and intent, and you generate something for me.
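A rough sketch of that "application state plus intent in, new application state out" idea: here the model's proposal is applied as a structured patch to a copy of the state, then staged for a human to approve or reject. The field names, the editable-field whitelist, and the patch format are invented for illustration.

    import copy

    EDITABLE_FIELDS = {"resources", "phases", "routes"}   # what the model may touch

    def stage_new_state(app_state: dict, llm_patch: dict) -> dict:
        # Apply a model-proposed patch to a copy of the application state,
        # touching only whitelisted fields; the result is staged for a human.
        staged = copy.deepcopy(app_state)
        for field, value in llm_patch.items():
            if field not in EDITABLE_FIELDS:
                raise ValueError(f"Model tried to edit a protected field: {field}")
            staged[field] = value
        return staged

    current = {"mission": "non-combatant evacuation", "resources": [], "phases": []}
    patch = {"resources": ["airlift x4", "medical team x2"], "phases": ["stage", "move", "verify"]}
    proposed = stage_new_state(current, patch)   # shown to the user to approve or reject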
Starting point is 00:21:06 Now, it becomes less obvious when some of these things get a little more complex, and you may need a little bit of hinting and a little bit of user-prompting. That's where I think the art is. So that would be one piece of it. Then another piece of it is like, okay, let's just say that's too hard. Or it doesn't fit the use case properly,
Starting point is 00:21:20 not just about it being hard. You have the prompt, and then the return is JSON or a DSL that is manipulating the application state. The whole point is not to give me answers, but to change my app. And then that starts changing how you think about interacting with these things. It becomes a new UI layer. You know, kind of the most extreme version of this is like, why have any UI at all? If you have really beautifully done APIs and you have, let's call it, data APIs and an ontology, the data that's actually going between these APIs is incredibly well modeled. I think you can actually use the
Starting point is 00:21:55 LLM to generate a lot of the experiences that you want as an end user. Some people have been talking about this in the context of, you know, everything becomes this sort of programmatic, agent-driven world. Where five years from now, or N years from now, you just have agents that represent you as a user with a specific task interacting with other agents or APIs. To your point, you really
Starting point is 00:22:14 minimize the UI dramatically. Do you think that's the most likely future, or how do you sort of think about where all this stuff is heading from a UI perspective? It's hard to see that far into the future on this, but what I think it definitely does, when I think about the integration layer: like, when you look at the Gorilla paper, can you teach it,
Starting point is 00:22:30 can you fine-tune an LLM to basically tell you what API to call with what parameters? Like, yes, it turns out. And so, okay, if that's true, what does system integration look like in the future? That's going to be quite different. So then I think it allows you to create more single panes of glass that are actually truly integrated, which is incredibly hard right now. I think there's some subtle and interesting benefits. One of the consequences of a hackathon we had a number of months ago was that I had an engineer
Starting point is 00:22:53 who could build a feature in a couple of hours that we had previously scoped. It was on the roadmap. It was a feature that was going to take, like, two months and two people. And it's simply because the amount of UI that was involved was so intensive. You just replace the UI with language, and the whole thing changes. So that's one way of thinking about, okay, well, what sort of UI are you not building today that you actually don't even have to build today? And that you probably have the tools, the primitives, in the back end or the application that you can now surface. So I think that's an interesting place to go there.
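A rough sketch of the Gorilla-style idea mentioned above: fine-tuning pairs that teach a model which internal API to call, and with what parameters, from a natural-language intent. The API names, parameters, and JSON shape here are invented for illustration.

    # Each example maps an intent to a structured API call; at inference time the
    # model emits the same JSON shape, which a thin router validates against the
    # API catalog before executing. All names here are hypothetical.
    training_examples = [
        {
            "instruction": "Move 500 units of SKU A-42 to the Dallas warehouse next week",
            "api_call": {
                "api": "erp.transfer_inventory",
                "params": {"sku": "A-42", "quantity": 500, "destination": "DAL-01", "window": "next_week"},
            },
        },
        {
            "instruction": "Show open warranty claims involving the rear camera module",
            "api_call": {
                "api": "claims.search",
                "params": {"component": "rear_camera", "status": "open"},
            },
        },
    ]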
Starting point is 00:23:22 Like I don't know about the extreme view of like, look, there's going to be no UI, just build every UI custom every time. But I think the ability for a user to get the last mile to be what they need is going to be really powerful. It's a big unlock. And I certainly think in the enterprise context we live in, I see that all the time where there's kind of, there's no perfect solution for all the different kind of tugs you're getting from the customer to generalize that. Or the cost of generalizing it is so high that you can't actually meet the need.
Starting point is 00:23:47 But now they can make it specific, actually, to their needs. Yeah, the systems integrator point that you made, I think, is super interesting, because I think there's a lot of companies where part of their defensibility is the fact that you basically had to munge specialized data integrations. You know, that would be the SAPs of the world or different ERP systems, Workday, et cetera. Like, a lot of these things have moats in part because it takes six months to implement them and to integrate against and customize them. And to your point, this can really simplify a lot of those things down in a pretty
Starting point is 00:24:16 performant way. So it's this pretty interesting macro shift that you're seeing firsthand in terms of where your customers are going. I think so. I think it's going to change a lot of things. You think about, like, how much control it gives the customer in terms of, like, how do you manage your API surface area? Like, people have tried to do this with, like, API buses or bridges.
Starting point is 00:24:35 And it's like, none of that stuff's really kind of working, because just having all your APIs in a big list, it turns out, doesn't help you. But having something that allows you to think about how you string them together is pretty transformative. And you think about some of these big systems, you know, yesterday I was seeing a note, and I'm sure it's somewhat dramatized, but Boeing and the Navy are fighting about data rights for, you know, the F-18 Hornet. But that's kind of like, this whole problem is just a consequence
Starting point is 00:24:59 of the fact that it's hard to do this with the data rights. And that actually, the more that we get to a world where you can fine-tune an LLM, that is the F-18 production design LLM, that's a very different world. What do you imagine doing with that? Well, how you manage, so the ability for a third party, let's say the government doesn't want to be locked in. It's like, how can I interrogate the design specs or how do I understand how I'm going to do maintenance here? So how much of that is just kind of locked up because the expertise is so difficult
Starting point is 00:25:29 to acquire. So as part of the, when I'm turning over my first plane, am I also turning over the LLM? Is that part of the value proposition that helps you manage this? And the kind of hardened pipelines behind that and the training behind that. So how do I give you more leverage over something that's insanely complicated? Yeah. And you think about this system, there are complex subsystems behind it that have all been integrated together. Like, how were they integrated? And what if I want to swap out a component in the future? How do I reintegrate that? And so that's the part that's really hard. And it maps to an enterprise. If you think about it, it's like, okay, I need to swap out Workday for something else. Like, what is it going to break? What does it all touch? How do I do
Starting point is 00:26:03 do that? And I think that's going to get a lot easier. Shyam, can you talk about some of the technical challenges that you guys feel like you need to solve, or that you're working on now for customers, to make this work? I think the key one is more conceptual. It's realizing that it is a stochastic genie. So, you know, where I want to invest the most in the tooling is really around kind of a robust eval IDE environment that enables people to unlock the value of this. And like, how do I develop trust and put these kinds of copilots through probation? How do my users think about that? Another big part of the investment right now is in making the sort of co-pilot models accessible to everyday users.
Starting point is 00:26:47 I think there's a fair amount of companies I see going after kind of, let's call it, the canonical data scientist as an archetype of, like, I want to fine-tune a model, and I'm going to go do that. I see a smaller number trying to go after devs as an archetype. But I want to go after the head of maintenance at an institution who's like, look, I know all these things. And I don't need all the knobs exposed to me, but I need you to have an opinion on all those knobs underneath that. And I can get my whole team to help generate the Q&A pairs. And there's a lot that I'm willing to put into this that revolves around my expertise.
Starting point is 00:27:20 Help me get a co-pilot into production that's affecting the lives of my users. And I think that's just a lot of hard systems integration. Like, how do I integrate? All those building blocks exist. How do I make that a very smooth workflow that you can trust and actually use? And I don't want to feel like we've solved all the kind of statistical differences here. I think the more you pull on that thread, the more you realize, okay, this way of approaching it differs from what we would have done with traditional code, and you actually have to account for that. You need more defensibility in your thought process here because it's not traditional code.
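A minimal sketch of the "evals, not unit tests" point from earlier in the conversation: because the function is stochastic, you run it many times and gate on a pass rate rather than a single assertion. The classifier, the threshold, and the run count below are all illustrative stand-ins.

    import random

    def llm_backed_classifier(claim_text: str) -> str:
        # Stand-in for an LLM-backed function; in reality this would call a model.
        return random.choice(["supplier_defect", "supplier_defect", "wear_and_tear"])

    def run_eval(fn, case, expected, n_runs=50, min_pass_rate=0.9):
        # Execute the stochastic function repeatedly and report whether it clears the bar.
        passes = sum(1 for _ in range(n_runs) if fn(case) == expected)
        rate = passes / n_runs
        return {"pass_rate": rate, "passed": rate >= min_pass_rate}

    print(run_eval(llm_backed_classifier, "coolant leak traced to a fitting", "supplier_defect"))

The same harness can be pointed at production traces later, which is the "unit tests against my traces" idea he mentions.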
Starting point is 00:27:54 Do you feel like you are doing a lot of customer education on how to, like, absorb this? Or is it more like you need to do that work in the product so that they don't have to understand, like, the stochastic output? We have to do both. And even internally, like, you know, it's like as engineers kind of ramp with it, the more they play with it, the more they kind of, okay, I'm starting to get my hands around the genie. So I think part of it is like a lot of customers aren't seeing past chat right now. Everything that's interesting is kind of like a chatbot. And I think for some of the most sophisticated customers, that actually scares them, because they're like, well, you know, how could I trust the output, the textual output, of this thing to make a decision of this sort of consequence? And like getting them to realize that actually, you know, you should be thinking about how this manipulates your application layer and how you can be using that. And then how do you validate the outputs of that? Now, that's kind of fitting into the context of your state machine, which is probably my biggest comment on agents: I almost can't use that word, because the connotation of it has come to be that the agent
Starting point is 00:28:53 has to come up with its own arbitrary kind of plan, like the planning is the exoticism of the agent, as opposed to really the practicalities of an enterprise. There is either an implicit or explicit, and often it's kind of 50-50 or some combination of implicit and explicit, state machine that represents that enterprise. So the idea that you're just going to have an agent that kind of comes up with a plan and does things is not going to meet reality. But the idea that you're going to have an agent that has context of a part of the state machine, understands its authorities, the guardrails, the left and right limits of what states it's allowed to manipulate? And you probably want to start pretty small, like one state. You have authority over one state transition. That's it.
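A rough sketch of that guardrail: an agent granted authority over exactly one state transition, with everything else rejected and every proposal still routed to the human who owns that transition. The states, fields, and approval flow are hypothetical.

    # The single transition this agent is authorized to propose.
    ALLOWED_TRANSITIONS = {"claim_triaged": "claim_routed"}

    def propose_transition(current_state: str, proposed_state: str, rationale: str) -> dict:
        if ALLOWED_TRANSITIONS.get(current_state) != proposed_state:
            return {"status": "rejected", "reason": "outside this agent's authority"}
        # Staged for the human owner of the transition; logged to build trust over time.
        return {"status": "pending_approval", "from": current_state,
                "to": proposed_state, "rationale": rationale}

    print(propose_transition("claim_triaged", "claim_routed", "matches supplier defect cluster 17"))
    print(propose_transition("claim_routed", "claim_paid", "high confidence"))  # rejected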
Starting point is 00:29:32 And then how do I build that up so that I'm linking these together to drive the real automation? And that's going to map pretty cleanly to the humans you have in the enterprise. Like, there's a human who probably owns that one state transition. And so now you're naturally building these human-agent teams. And you're kind of upgrading or promoting your individual contributor to being a manager of agents. And that's a pretty safe way from a change management perspective. It generates the log data I need to have trust that this is actually valuable and helpful as an assisting agent. And it's probably more akin to how Tesla has tackled self-driving as opposed to, you know, Cruise, like a big long shot, we're just going to go all or nothing. It's like, actually, we're going
Starting point is 00:30:09 to get a little more self-driving every single day. Yeah. Versus sort of end-to-end magic planning, you can figure out one state transition at a time, with an existing ontology the business understands, and get feedback along the way. That feels really promising, especially with, as you mentioned, things like the Gorilla paper; if you're helping people fine-tune on their own data, even a small amount of that seems to dramatically increase quality of tool choice and such.
Starting point is 00:30:35 So sounds really exciting. One of the things that you've mentioned a few times is really the different ways that engineers or people on your team have gotten familiar with the technology and the capabilities of it. And I feel like LLMs are very non-intuitive relative to both traditional engineering, but also traditional ML. You know, I feel like a lot of organizations have kind of had to adapt to thinking differently a little bit in terms of what are the actual capabilities and how does it work
Starting point is 00:30:59 and where does it work and where doesn't it. Were there specific things that Palantir did early on to onboard people to sort of this new way of thinking, or have people play around with the models in specific ways? I mean, you mentioned the hackathon. I'm just sort of curious how this all got started and, you know, how you've now incorporated it into how people think about these problems. Yeah, we've made it a huge organizational focus to really experiment and play with these things. So how could you bring this to your own area of the product? But as importantly, like, how do we build this into our tool chain?
Starting point is 00:31:29 So, hey, we are doing incident response on our stacks. Let's build a co-pilot for ourselves to go manage that more efficiently. And so by trying to solve your own problems with it, you get much stronger intuition of where it's amazing and where it falls off a cliff and how you have to think about that as you build it. So aggressively adopting it to drive our own productivity has been one dimension of it. And who came up with that mandate? Like who was the person who said, okay, let's go do this?
Starting point is 00:31:57 And that was really my push. You know, it's like, I think that's part of the FDE mindset, though. You know, it's like, it's no credit to me. It's almost an obvious consequence of our culture of, like, what's the ambition? How do we aggressively dogfood everything? And like, if it doesn't work for us, why would it work for anyone else? And so then I think the other part of this is we live in a world where we can't count on GPT-4 everywhere. We don't have that in classified environments, right? So it is a beautiful, kind of easy button for lots of problems to go after.
Starting point is 00:32:19 But then when you start having to use open-source models, you're like, oh, this one works in this context for these sorts of problems. So then how do I start exposing engineers to that? Because I think maybe one of the easiest ways to understand the models is to use multiple models concurrently and understand the outputs of each. And so our internal version of ChatGPT isn't one model. It's actually multiple models, and you're able to evaluate the output of each of these, how long it took, the amount of tokens it's putting out. You're able to control and tune it and kind of almost inductively explore the surface area of these models. One claim that Alex made that reflects a wonderful level of ambition is that Palantir as a company is aiming for the entire market share of AI.
Starting point is 00:33:08 What does that mean? What does that vision look like? I mean, I think it's exactly what he said. It's like, we certainly think that we have done the necessary pre-work, essentially, like the foundational technologies that we've built that allow enterprises to securely protect their data, to bring these LLMs to their private networks, and then to deploy them operationally, to get beyond kind of the dabbling and the innovation. I can harken back to a period when data was quite early. Everyone had something like a data innovation lab.
Starting point is 00:33:38 They're not calling it a generative AI innovation lab, but it's kind of structurally similar right now, where people are really working hard to think about what the use cases are and how it will be valuable, if you think about this from the customer end. And actually, it's like, I'll tell you what use cases: the same use cases you were working on a year ago.
Starting point is 00:33:54 Like the problems haven't changed. You need to be applying these technologies to the problems that are the most important problems in your business. And where we've already made those investments and how do I manage and model your digital twin? And how do I already connect up the different decisions you're making,
Starting point is 00:34:10 together. So like if you think about this kind of like connected company to that decision web idea, no decision is truly independent. It's, you know, like the decisions you made upstream from it affect it. That decision you're about to make is about to have severe consequences for the decisions that can be made downstream. So if I can bring that visibility to you, you're actually in some sense simplifying the problem in terms of how much, how hard of a problem am I asking the LLM to solve and how incrementally can I deploy these things to go faster? And I think that's the compounding loop. So we feel like the value is really going to accrete to folks who own the application layer and the enterprises, and we're going to go after that very hard. So one thing we've talked a
Starting point is 00:34:50 little bit about are some of the customer use cases on the DoD side or, you know, more general sort of defense and related areas. And one area that I know Palantir added quite early on as a vertical is health care. And you mentioned some of the work you did there during COVID. I know that last year Palantir announced, I believe, a 10-year partnership with Cleveland Clinic to improve patient care.
Starting point is 00:35:24 and what you think this coming wave of AI will do. What are the areas that will be most impacted by that? Yeah, healthcare is roughly a third of our business. It's certainly, I mean, it's probably one of the fastest growing parts of our business as well. And we do that in a number of countries. So the NHS in the UK and multiple hospital systems in the U.S. And across both kind of dimensions of clinical care and operational care, the hospital operations.
Starting point is 00:35:51 And I think that's relevant because the pace of adoption for these will vary. And kind of the challenges you solve for the use cases with LLMs is different between them. I think the operational context is very obvious in the sense that it's just like operating any institution, really. You have kind of supply, demand, you have labor inputs to that. You're trying to manage that so that you can deliver the product, the care that you actually have. And there, it fits very cleanly to how will we help, you know, auto companies get better at what they're doing, or how will we help manufacturers or energy companies? And there, I think, probably the archetypal pattern that I see across all industries is something like you today have something. If you squint at it,
Starting point is 00:36:29 it looks like an alert inbox, where your state machine is essentially saying, here's an exception or something that I need someone to think about. And then you get so many exceptions, I need some help prioritizing all these alerts, and then you prioritize them and you deal with them. What the promise of LLMs is, and what we're focused on with AIP, is turning that from a place where I'm surfacing alerts to a human to one where I'm surfacing solutions. So instead of saying, here's an alert, what should we do about it? It's: here's a recommendation, here's a staged scenario of what you could do about this alert that's happened.
Starting point is 00:36:55 Do you approve or reject it? And that's a concrete manifestation of the, like, kind of co-pilot. And what I really like about that is, to the point of having done the foundational work, for that to really work, you need a primitive that is this scenario, that is this staged edit. You know, like a branch in Git, right? That's a very powerful primitive.
Starting point is 00:37:22 Without it, you lose a lot of capabilities, because you have to build that at every customer over and over again. So having something where one could say, look, here's a branch, and here's a staged set of edits, and then I can have a human evaluate that in the operational context of how they view hospital capacity. That's one set of workflows. And then on the clinical side, I think it's really about reducing human toil. Like, I don't think you're trying to get the LLM to decide what to do for the patient here. That's probably exactly the domain of the doctor. But what's in the clinical records, in the clinical histories, how do I drive the workflows? What is it that the doctor can't get to or the nurse staff can't get to today, because it's just too much toil, and that we can turn into something that takes, you know, maybe 400 milliseconds?
Starting point is 00:37:51 But what's in the clinical records, in the clinical histories, how do I drive the workflows? What is it that the doctor can't get to or the nurse staff can't get to today? Because it's just too much toil and that we can turn that into something it takes, you know, maybe 400 milliseconds. That's going to improve what's happening at the point of care. So driving completeness in that picture. And I kind of see that as a natural dichotomy between operational and analytical workflows. And the other thing I was looking at today is how do you optimize the throughput of a state machine?
Starting point is 00:38:20 And then I was looking at this, like, claims optimization, where what's wrong with my state machine? is almost the question. And this one, the second one, is for like a manager who's looking down at this and they're saying, like, oh, there's a cycle here. And so the sorts of manipulations you're trying to do with the LLM are structurally more analytical. You're not asking it to change the state machine. You're not asking it to, you know, there's no magic button there to press. In the operational context, you can get closer to something that's more like, give me a recommended action that I can evaluate as a human.
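A rough sketch of the shift from surfacing alerts to surfacing solutions described here: each alert gets a model-proposed, staged action, and the queue is sorted by the consequence of the proposed fix rather than by raw alert severity. All field names and numbers are invented for illustration.

    def propose_action(alert: dict) -> dict:
        # Stand-in for an LLM plus tools producing a staged recommendation for this alert.
        return {"alert_id": alert["id"],
                "recommendation": f"Shift two beds from unit B to cover {alert['issue']}",
                "estimated_impact": alert["severity"] * 1.5}

    alerts = [{"id": 1, "issue": "ICU overflow risk", "severity": 8},
              {"id": 2, "issue": "discharge backlog", "severity": 5}]

    queue = sorted((propose_action(a) for a in alerts),
                   key=lambda p: p["estimated_impact"], reverse=True)
    for proposal in queue:
        print(proposal["recommendation"], "- approve or reject?")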
Starting point is 00:38:49 And then there's kind of thresholding and learning over where that might be most valuable. And I certainly think one of the things that's promising about that is, today we're so constrained by, is it worth solving this alert? Because what are my human costs to go after solving this alert? In a world where the LLM can process all the alerts and give you a staged set of actions, now you're prioritizing not on the severity of the alert, but on the possible consequences of the solution. So that's already an improvement in the sort function. And then you're much more likely to be able to get through all of them. That's a really useful framing.
Starting point is 00:39:21 Yeah, I think that covered all the things we wanted to talk about. I mean, it's a really great overview of what Palantir is doing and some of the really exciting initiatives and customers that you work with. Thank you so much for joining us today on No Priors. It was really a lot of fun and a lot that we learned. So thank you so much. Sarah, Elad, thanks for having me. It was great. Thank you.
