The a16z Show - a16z Podcast: The Taxonomy of Collective Knowledge

Starting point is 00:00:00 Hi and welcome to the a16z Podcast. In this episode, we talk about collective intelligence, human computation, and really the mapping out of knowledge with data ontologies in particular, and how data ontologies enable scalable knowledge creation, both in philosophical terms and also in a very real, practical way in terms of, for example, a doctor making a diagnosis for a patient. Joining us on this episode are Luis von Ahn, founder of reCAPTCHA and Duolingo, known for his work on human computation. He's the first voice you'll hear. Jay Komarneni, founder of Human Dx,

Starting point is 00:00:36 the Human Diagnosis Project, Vijay Pande, a16z's general partner from the bio team, who also founded Folding@home, a distributed computing project for disease research, jumping in, and moderated by a16z's Malinka Walaliyadde. So what exactly is a data ontology anyway?

Starting point is 00:00:55 An ontology is basically a way to add order to huge amounts of data. You can think of Wikipedia as a huge ontology. You can think of, you know, even Google search as some sort of very large ontology. A lot of the things that are, quote unquote, called AI or artificial intelligence are a lot of times just fancy ontologies. They're used in medicine. They're used in almost every aspect of our lives. The idea of an ontology actually originally comes from philosophy. And really it's the philosophical study of the nature of being and kind of what is real and what isn't real. The famous philosopher Wittgenstein really talks about how the limits of language are the limits of our world. And the way that we actually encode, structure, or organize representations of the world is essentially what an ontology allows us to do in a way that we can mutually agree upon. Why is ontology so important to us in the modern world? It's really just a way of ultimately describing how a set of entities relate to each other and how they can be classified relative to each other. So one classic example that we kind of think about is a taxonomy, right? A

Starting point is 00:01:59 taxonomy being one type of ontology, like the folder structure on your Google Drive or whatever it happens to be. That is one example. But then Siri is an example in terms of the way that it organizes knowledge about potential queries, or the way that Waze organizes traffic data into different events. All of these are examples of ontologies. Ontologies are such a critical way to structure the world for computers because computers haven't lived in our world. They don't know that a cat is an animal and an animal is a thing and so on. But what I think is the biggest surprise is that, as human beings, we implicitly have our own ontologies, but that doesn't mean we actually agree on them. And so actually, another interesting thing is not just that humans can interact with computers

Starting point is 00:02:38 in a structured way, but that humans can interact with humans in a structured way. And that's really important to scale. One of my favorite kids' books is this book, Fish is Fish, about a frog that leaves his buddy the fish to see the real world

Starting point is 00:02:51 and comes back and describes the world to the fish. And Fish imagines everything in the context of the fish's world and the fish's ontology. Having a common ontology is the way to avoid chaos between people as well as to connect to computers. So the ability for humans to coordinate flexibly at scale is really what the internet enables,

Starting point is 00:03:10 something that's instantaneous and always on that we can use to actually share information in a way that we can all benefit. If you don't have ontologies to actually organize that information and structure it in a way that can be usable by different types of human beings, so potentially people who speak different languages or people who have more or less technical expertise, and potentially by different types of agents altogether, so machines versus humans, you have an inability to actually unlock the power of that information to actually transform society. How important is human involvement in coming up with the ontologies? Because one of the things we're very excited about in machine learning is the ability to automatically group sets of data and create a machine-oriented ontology. This is actually an interesting opportunity to talk about one of the challenges in machine intelligence today. As you guys probably know with respect to deep learning and a lot of newer techniques, the challenge is actually the human interpretability of the output of such systems and ultimately what the output of such systems means. Having common ontologies and common ways in which unstructured data is represented that are intelligible or interpretable to humans is actually what allows us to build human interpretability into such systems.

Starting point is 00:04:18 So the ability, as we talked about, for human and machine agents who are diverse to ultimately be able to coordinate and understand one another is a function of being able to have human representations in such systems. And the only way you're going to get the representations that are most valuable to humans is from human beings themselves, right? The way that different types of entities are going to look at the world is dramatically different. The perception of our reality drives what our reality is, and what we think is important in terms of our day-to-day perceptual experience is a function of what's actually important to humans. The other way in which humans are needed here is to create the ground truth. All of these deep learning or deep AI algorithms need a ton of ground truth in order to get very accurate.

Starting point is 00:05:05 It has to be entered by humans. With that said, it's always interesting to me to compare the gold standard ground truth of the human ontology to what the computer comes up with, because it kind of gives you some sense of what's in the data and what's not. What are the things that we just take for granted because we have them in our experience, but that the data doesn't support? And I think that's going to be critical for understanding how all these pieces come together. This is a major challenge for a lot of very specialized fields.

Starting point is 00:05:33 How do we actually come up with truth in a scalable way? And that really depends on the field or the specific industry that you're in. Luis, can you tell us about how you utilized humans on the internet for your reCAPTCHA project and then how you built an ontology there? The standard CAPTCHA is these distorted, squiggly letters that you see all over the internet that you have to type, for example, whenever you're buying tickets on Ticketmaster. And the reason that's there is to make sure that you're a human, basically doing something

Starting point is 00:05:57 that computers could not yet do, which was reading these distorted characters. There were literally hundreds of millions of people every day typing these CAPTCHAs on the internet. And the question is, can we use all of these people typing CAPTCHAs on the internet to get something useful, like ground truth for something? So the idea came about that we could use CAPTCHAs to help digitize books. At the time, Google was trying to digitize all the world's books. The computer needed to decipher all of the words in these pictures of the pages, but computers are not perfect. They're not as accurate as humans,

Starting point is 00:06:33 so we started getting people on the internet to recognize the words for us in the form of a CAPTCHA. So when people were typing CAPTCHAs, these were actually words from the books, and we were getting people to recognize them for us while they were trying to buy tickets on Ticketmaster. And the way we got to ground truth here is simply by having 10 different people agree with each other. So if you gave the same word to 10 different people and they all agreed, then we considered that ground truth. To a very large extent, that worked really well. When you ask 10 people the same thing and you can compare their answers and see if they're all equal, that is a really powerful method of getting to ground truth. So with reCAPTCHA, anyone who

Starting point is 00:07:14 could read English and was on the internet could participate and contribute. Whereas with Duolingo, you needed people who are bilingual, which is a smaller set of people. So what problems did you run into with Duolingo that you didn't run into with reCAPTCHA? Translation is a significantly harder problem than deciphering a picture of a word, because there are multiple ways of translating the same text. So here we could not use agreement. If you give the same sentence to ten different people to translate, you're probably going to get eight or nine different translations. So then you have to start doing other, more sophisticated things, like having people vote on each other's translations. And this is what we ended up doing. Basically, one person was translating

Starting point is 00:07:50 and another person was voting on whether that translation was correct, or if it wasn't correct, they would fix it a little bit. So the more complex the data, the harder the problem of getting the ground truth becomes. That's really fascinating in terms of the specific mechanisms you have to come up with to deal with these different problems. How does this play out in other domains? Here's a theme I'm curious to get your take on. If you think about the common healthcare system, it's a structure. You have community workers, nurses, PCPs, specialists,

Starting point is 00:08:18 and, like, you have this whole stack. And as you go higher up the stack, it gets more and more expensive. And today, more healthcare issues flow through that entire stack than need to. And so, like, a common goal that we're seeing generally is augmenting the N-1 layer to be able to do a bunch of what the next layer is able to do. And the way you augment them is with, you know, machine augmentation, and in order to get to that, you need to be able to use data ontologies

Starting point is 00:08:42 to train machines to help them. Is that a... Yeah, absolutely. So taking a step back, I think the power of being able to create ontologies is ultimately to do scalable knowledge projects like Wikipedia, like open source software, like the type of stuff that reCAPTCHA has been able to accomplish. You really need a way of ultimately enabling diverse stakeholders or agents to be able to speak to one another. So humans and machines need to be able to coordinate and interact in a way that both can understand with respect to making healthcare decisions. The way Human Dx essentially works is as an open system, similar to Linux or Wikipedia, to

Starting point is 00:09:24 help make any clinical decision. People post cases on the system, and then multiple physicians independently attempt to solve those cases without necessarily knowing who created the case and without necessarily knowing each other. Where ontologies really play in is that they are essentially using the system to look up diagnoses that are classified by the International Classification of Diseases, which is created by the World Health Organization. By putting in those diagnoses in that structure and format, we can compare one to one how different people solve the same cases. And then we can coalesce or combine that knowledge into what we call a singular collective answer that actually

Starting point is 00:10:00 shows the results of the group of people coming to that answer versus individuals. And then what's really powerful from there is that we can compare against a reference set of true cases, which we know the outcomes for, and we can see how the collective solves those cases versus how individual physicians solve those cases. We're now seeing that a collective of multiple physicians can outperform 90-plus percent of individual physicians. How do you measure that? So by creating a standardized set of reference cases that are true based on actual clinical outcomes, we can actually look at how individual physicians solve those cases, and then we can look at how groups or collectives of physicians solve them in Human Dx, and then compare those apples

Starting point is 00:10:44 to apples. Does it really make sense that the majority result is the right result? You know, could it be that, as in the sci-fi Minority Report, the minority report is the one that's really interesting? And how do you tease that signal out? That's a great question. By using reference cases, you can actually understand who knows what about what. So you can understand that different physicians may have differential knowledge on differential topics. So a radiologist might understand certain cases better than an oncologist or an endocrinologist. And so by understanding what we call clinical quotient, which is essentially our understanding of which physicians know what about which topics, we can actually weight their opinions, preferences, or understanding of the problem appropriately to get to some better answer. And then we can actually combine the result of that collective answer with clinical guidelines,

Starting point is 00:11:39 with research data, with potentially claims or electronic medical record data. Ultimately, what is truth? Truth is an approximation that asymptotes towards reality, right? So ultimately, you never get to 100% truth. You can only approximate truth by having, you know, ultimately more sources of information as well as more agents ultimately interacting around those sources of information. Yeah, in that sense, you're not just learning about the problem, you're learning about the agents. Exactly.
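
The weighting idea described in this exchange can be sketched in a few lines of Python. This is only an illustrative sketch, not Human Dx's actual method: the function name `collective_answer`, the per-topic weight table, the default weight of 0.1, and all of the example numbers are hypothetical assumptions made for the illustration.

```python
from collections import defaultdict

def collective_answer(opinions, weights, topic, default_weight=0.1):
    """Combine independent diagnoses into one weighted, ranked answer.

    opinions: list of (physician, diagnosis) pairs for a single case.
    weights:  {physician: {topic: weight}}, for example each physician's
              accuracy on reference cases for that topic (hypothetical).
    topic:    the clinical topic of the case being solved.
    """
    scores = defaultdict(float)
    for physician, diagnosis in opinions:
        # Physicians with no track record on this topic get a small default weight.
        scores[diagnosis] += weights.get(physician, {}).get(topic, default_weight)
    total = sum(scores.values())
    if total == 0:
        return []
    # Rank diagnoses by their share of the total weight.
    return sorted(((d, s / total) for d, s in scores.items()),
                  key=lambda pair: pair[1], reverse=True)

# Hypothetical per-topic weights; none of these numbers come from the episode.
weights = {
    "radiologist": {"imaging": 0.9, "endocrine": 0.4},
    "oncologist": {"imaging": 0.5, "endocrine": 0.3},
    "endocrinologist": {"imaging": 0.3, "endocrine": 0.9},
}
opinions = [
    ("radiologist", "pneumonia"),
    ("oncologist", "lung mass"),
    ("endocrinologist", "lung mass"),
]
ranking = collective_answer(opinions, weights, "imaging")
```

In this made-up imaging case, a simple majority would pick "lung mass" two votes to one, but the weighted ranking puts the radiologist's minority answer first, which is the kind of signal the question about the minority report is getting at.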

Starting point is 00:12:08 And that gives something. It would be interesting to see if we could tweak democracy with such approaches. But that's a whole other discussion. So actually, there's an entire area of kind of structural thinking in political science around the idea of epistocracies. How do you weight different people's perspectives by the knowledge that they have on different topics? And I think a lot of people would potentially be excited about a world where the people who are making decisions about the things that are relevant to society are doing so in a way that they actually

Starting point is 00:12:38 know about those things. And that we could potentially delegate our preferences on those subsets of issues accordingly in such a fashion. Yeah. Yeah, a different type of, sort of, information representation of democracy. There's a lot of places where this could be applied in government. In particular, you know, having something where a lot of people look at maybe laws to find inconsistencies in them. I mean, most of our legal system is extremely inconsistent. And, you know, it would be great if we could start finding inconsistencies in law, finding patents that don't make any sense. Also, finding corruption in Mexico.

Starting point is 00:13:14 There was so much crime. And some people were doing projects about, you know, just reporting areas where crime is happening right now so you can avoid them. You can apply it in all kinds of places. Interestingly, one of our advisors slash investors, Albert Wenger, actually talks a lot about the idea of using bots to represent you around certain issues. So actually using machine agents to represent you on topical issues with respect to a given set of votes or preferences that you may have on a variety of things. How would you weight opinions from different constituents? Yeah, yeah.

Starting point is 00:13:47 So with respect to Human Dx as an example, the way that we do that is as a function of their knowledge on a set of topics, right? So if we find that we have someone go through a set of training cases and they can solve those differentially better or worse than someone else, we can also look at the micro features within those cases, right? So do these cases have more features that are related to cardiovascular issues versus this other set of cases, which is related to endocrine issues? And then we can actually tease that out using machine intelligence, not actually hand-engineering those features. We can just have the system optimize around which agents know more or less than other agents. Luis, how did you do that with Duolingo?

Starting point is 00:14:28 Yeah, it's very similar techniques that have been applied throughout time. We even did that for reCAPTCHA. I mean, there were some people who were just much better than others at reading the distorted text. And we just gave them more weight. The basic idea is that every now and then you give the user, you know, the person that you're trying to crowdsource from, things for which you already know the answer.

Starting point is 00:14:50 And then you're using that to measure, you know, what they're good at. And then you can start getting different weights. I mean, some people are just pretty crappy at a lot of things, and you just basically don't give them much weight. And then there are others that are very good at certain things, and you give them a lot of weight for things like that. That basically increased accuracy, but it also allowed us to improve efficiency. When you're doing things at a very large scale and you don't have very many humans, you may have 1,000 humans or 10,000 humans. And if you need to label millions of things, you can't afford to start giving the same thing to all 10,000 humans. You can only give it to, you know, three or four.
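
As a rough illustration of the known-answer idea just described, here is a minimal Python sketch. It is not the actual reCAPTCHA or Duolingo implementation: the function names `estimate_accuracy` and `pick_labelers`, the smoothing prior, and the sample data are all hypothetical assumptions.

```python
def estimate_accuracy(gold_answers, responses, prior_correct=1, prior_total=2):
    """Estimate each contributor's accuracy from items with known answers.

    gold_answers: {item_id: correct_answer} for the seeded, already-known items.
    responses:    {contributor: {item_id: answer}}.
    The prior (1 correct out of 2) is an assumed smoothing choice so that
    people who have seen few known-answer items are not over-trusted.
    """
    accuracy = {}
    for contributor, answers in responses.items():
        graded = [item for item in answers if item in gold_answers]
        correct = sum(answers[item] == gold_answers[item] for item in graded)
        accuracy[contributor] = (correct + prior_correct) / (len(graded) + prior_total)
    return accuracy

def pick_labelers(accuracy, k=3):
    """Choose the k contributors with the highest estimated accuracy."""
    return sorted(accuracy, key=accuracy.get, reverse=True)[:k]

# Hypothetical known-answer words and contributor responses.
gold = {"w1": "upon", "w2": "morning", "w3": "overlooks"}
responses = {
    "ana": {"w1": "upon", "w2": "morning", "w3": "overlooks"},
    "ben": {"w1": "upon", "w2": "mourning", "w3": "overlooks"},
    "cy": {"w1": "open", "w2": "mourning", "w3": "overlook"},
    "dee": {"w1": "upon", "w2": "morning"},
}
accuracy = estimate_accuracy(gold, responses)
best = pick_labelers(accuracy, k=2)
```

The estimated accuracies can then serve as the weights, and routing each new item to only the few most reliable contributors is one way to get the efficiency gain mentioned here when the same thing can only be shown to three or four people.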

Starting point is 00:15:26 And if you can start figuring out who's good at what, you can become a lot more accurate and more efficient. As you guys have probably seen with those CAPTCHAs, they're now basically impossible to read because the systems have gotten so good at actually classifying them on their own. Yeah. With reCAPTCHA, we had to become really efficient. I mean, the problem we had with reCAPTCHA is that there are 100 million books that needed to be digitized. The total number of books that had ever been written before the digital era was 100 million.

Starting point is 00:15:56 At the pace that we were going, we were able to digitize about two to three million books a year. So we started basically taking the input of both humans and computers to become extremely efficient. The computer would first try to recognize all of the words, and then it would have a whole statistical model about how well it recognized each word. And then for some words, it was certain what the word was, so we wouldn't even give it to a human. Going into the future, you're going to get humans doing some stuff, but also the computer a lot of times will have an opinion about what it is after you've used some sort of machine learning or deep AI algorithm. The question is how you combine both of these, because in some sense, you know, the human

Starting point is 00:16:37 is kind of the limiting factor here. You just don't have very many, and you don't have that much time with each human. Yeah, and that's a great opportunity to kind of define the difference between some of these key terms, right? So if you think about crowdsourcing, crowdsourcing is this broad idea of using multiple people to come to some answer. Human-based computation, or human computation, which is one of Luis's areas of expertise, is really about using that to solve problems that machines find difficult. So outsourcing the work that machines find challenging, and the machine agent itself can do that. They can ultimately outsource that work. And then collective intelligence is really coalescing the intelligence of humans, machines, and organizations to solve complex problems

Starting point is 00:17:20 and using that to come to better results than maybe one or multiple of those could on their own. How do you incentivize people to contribute their time and their effort into any of these various projects? I think it really depends on the project. There are all kinds of different incentives. One is actually money. You can just start paying them. I mean, that's the whole idea with Amazon Mechanical Turk. I have found that paying people is not so good. Then you really have to spend a lot of effort trying to stop people who are just there to, you know, get your money. But that's another valid incentive. In the case of reCAPTCHA, we just put a little puzzle in front of them saying, please read these words.

Starting point is 00:17:57 And, you know, their incentive was, I need to get my tickets for my concert, or I need to, you know, get my account for Facebook. In the case of Duolingo, you know, the way we started is we would say, help us translate stuff, and in exchange we essentially help you practice the language that you're learning. Another one is really the social meaning of a project. So Wikipedia is a classic example, where there really wasn't that much incentive to contribute other than creating a shared knowledge resource for the world. And really, as it's called in game design, the idea of the epic meaning. With respect to Linux or other open source software projects, it's really that people want to create a tool that they themselves can use. And so they actually gain utility from actually

Starting point is 00:18:39 contributing to such a project. And then, of course, there are things like Waze, where you have kind of this desire to contribute your knowledge in terms of what accidents are happening or what else is happening in a given location as a function of your desire to have kind of reciprocity with the community of people. If I'm getting benefit from this system, then I should give something back. In terms of Folding@home, there were the social reasons too, that people want to make an impact in Alzheimer's and cancer and so on. But also gamification is a very, very natural way. It's funny how putting a score and some badges and so on, even I'm a sucker for it. I mean, we all love games and we easily get sort of caught up even in sort of the light

Starting point is 00:19:19 competition of it, especially if it's competition to see who can help the most. That's a win-win. Absolutely. I mean, whether it's credit card, you know, loyalty points, or it's staying at the same hotels or whatever it happens to be, I mean, all of these are systems that are designed to capture our behavior. In Human Dx, we actually like to use what we call impact in the system, and it allows us to differentially reward contributors who make more valuable contributions to the system in

Starting point is 00:19:45 terms of what contributions the system most needs. So perhaps there's a case that we don't have enough diagnoses for, or we have nothing in terms of our knowledge base about this set of treatments. We can actually differentially give, through kind of a market-based mechanism, physicians who contribute that knowledge more impact than other physicians. Luis's work on Duolingo, and really kind of this idea of creating these micro interactions that almost have this gamified structure, was really a major inspiration also for the way that we built Human Dx to be these brief interactions where you create and solve clinical cases from your phone. One area that's really interesting is the idea of evolving ontologies, right?

Starting point is 00:20:29 So instead of having some fixed framework or structure that is assigned by some group. So a great example is the International Classification of Diseases by the World Health Organization, or the Unified Medical Language System, which includes SNOMED and RxNorm and other things in healthcare. The issue with those is that they have to be updated manually every several years. It's like publishing Encyclopedia Britannica instead of having Wikipedia, right? So if you can have data ontologies which can evolve with human input and ultimately machine suggestions, as Luis pointed out, where machines can actually make suggestions about, hey, these two things look different. We thought this type of pneumonia was actually one type of pneumonia. It turns out it's actually

Starting point is 00:21:10 two types of pneumonia. That's the power of ontologies, because you can then close the loop between humans actually looking at those and saying, actually, those are two different things. Or actually, these things that we thought were different, they are synonyms for each other. They are actually the same thing. There's actually a Twitter handle of the funniest ICD-10 codes, like bitten by duck, second incident, or sucked into jet engine. These are, like, actual clinical diagnoses that are put into... It does happen. They do happen. And now we can code for it and get paid for it, critically. Exactly. Another one that's, I think, really interesting is kind of what's happening with respect to

Starting point is 00:21:49 distributed, decentralized systems like the blockchain. You know, the ability to ultimately compensate people with application-specific tokens is a really interesting incentive to use ontologies and distributed knowledge creation and collective intelligence to come to better answers around given issues or given problems. You're probably seeing that there are prediction markets like Augur and other things, which are decentralized, with application-specific tokens that are ultimately issued as a function of your participation in those networks.

Starting point is 00:22:20 It's going to be interesting to see what is left for humans to do in all of these things. I mean, for some things, computers are better. They're better at playing Go than humans are, but humans are still better at recognizing whether a picture has a cat or not. They're still better at it. Actually, is that true? That one might not be true either. I think they're still better at it. I think it's getting to almost the point where they're not.

Starting point is 00:22:44 But there's just really simple things that a lot of times humans are still better at. And to me it's just very interesting to see that. I think one natural place where humans are really good is when there's a high number of scales of different types of information or data, right? So, for example, in healthcare, everything from how your mitochondria, or the electron transport chain in your mitochondria, function at the subatomic level matters, all the way to what type of society you lived in as a child, right? And so because of the complexity and kind of the difference in the scales of potential information that are involved, that almost becomes like a general intelligence problem,

Starting point is 00:23:25 because you have to be able to assign, understand, and ultimately manage and synthesize information from many different scales and representations into some understanding of a given problem. So we've talked about applications of ontologies in various fields. Jay, can you talk about some of the unique challenges for ontologies in healthcare and also how you actually see some of this crowdsourced knowledge being used in practice? Yeah, so one is essentially the scale, kind of the complexity of the number of scales that information comes across, right? So if you're dealing with genomics or epigenomics or proteomics data, you don't necessarily even need the same types of ontologies that are human interpretable.

Starting point is 00:24:11 Then if you're dealing with information that might be like symptoms or physical exam results or social history or medical history, these are things where having human interpretability is really useful. So it's essentially combining this very abstruse kind of unstructured information, whether it be images or genomic data sets or whatever, and then coalescing that with things that humans actually think of as important to our health. So that is a very big challenge in healthcare compared to other sectors. And how do you think some of the crowdsourced data sets that you're building will be used? So ultimately the goal of our project is to really understand how physicians make decisions, and not only what decisions they ultimately make, but how they are thinking about those decisions. And by encoding that information in a way that can scale at zero marginal cost, we can actually extend physician access to underserved patients around the country who can't currently afford it. And in doing so, hopefully create a system that can ultimately use scalable knowledge creation to help patients who currently have no way of being served.

Starting point is 00:25:17 Not talking about replacing doctors, but really augmenting what doctors can do and giving them tools they just don't have. Exactly. Extending the capacity of the same number of physicians to serve more patients at a lower cost. I had always been fascinated by this question of, when you or someone you love isn't well, what should be done? We really think of that as the essential question of human health. And that's really what every person who touches your health is answering. The clinician who touches your health, your loved ones, perhaps the insurance company that's helping mediate your health, the pharma company, whatever it happens to be, every single stakeholder who's touching your health is answering the same question. And really the only

Starting point is 00:25:54 way to actually build a scalable knowledge solution which can ultimately help serve the unserved, a true project to invert the pyramid and help the billions of people in the world who don't have access to health care, is essentially knowledge creation that can help such people get access to this information themselves. We've talked about ontologies in many, many different industries, all the way from translating basic words, basic text, all the way to sophisticated medical diagnoses. Thank you for joining us on the a16z Podcast. Great, thank you.

Starting point is 00:26:28 Thank you so much for having us. Thank you. Thank you for having us.
