a16z Podcast: AI, from 'Toy' Problems to Practical Application

Episode Date: December 2, 2017

When you have “a really hot, frothy space” like AI, even the most basic questions — like what is it good for, how do you make sure your data is in shape, and so on — aren’t answered. This is... just as true for the companies eager to adopt the technology and get into the space, as it is for those building companies around that space, observes Joe Spisak, Head of Partnerships at Amazon Web Services. “People treat it like magic,” adds a16z general partner Martin Casado. This magical realism is especially true of AI, because by definition — i.e., machines learning — there is a bit of a “black box” between what you put in and what you get out of it. Which may be fine… Except when you have to completely change the data being fed into that black box, or you’re shooting for a completely different target to come out of it. That’s why, observes Scott Clark, CEO and co-founder of SigOpt, “an untuned, sophisticated system will underperform a tuned simple system” almost every time. So what does this mean for organizations going from so-called “toy” problems in R&D to real business results tied to KPIs and ROI? In this episode of the a16z Podcast, Casado, Clark, and Spisak (in conversation with Sonal Chokshi) share their thoughts on what’s happening and what’s needed for AI in practice, given their vantage points working with both large companies and AI startups. What does it mean for data scientists and domain experts? For differentiation and advantage? Because even though we finally have widely available building blocks for AI, we need the scaffolding too… and only then can we build something powerful on top of it.

Transcript
Starting point is 00:00:00 Hi, everyone. Welcome to the a16z Podcast. I'm Sonal. Given all the ongoing excitement around artificial intelligence, deep learning, and machine learning, especially with the NIPS conference this coming week, today we're talking about what happens when we go from so-called toy problems to practical AI in production. The conversation is also part of our ongoing series on AI in practice. You can find other past and upcoming episodes on our website under that tag. Joining us for this episode, we have Joe Spisak, who leads strategic and programmatic partnerships for Amazon Web Services, so he has a front-row seat on what's happening with a bunch of companies interested in AI and machine learning. We have Scott Clark, who's the CEO and co-founder of SigOpt, which provides optimization as a service. And then we have general partner Martin Casado. The discussion covers everything from taxonomies of startups and methods for AI to a brief debate about whether AI means the end of theory or not. We also discuss the problems of data and optimization, as well as the pros and cons of machine learning as a service, and touch on the theme of the API economy. But we begin by quickly reflecting on where we are right now: what are we seeing with companies adopting AI
Starting point is 00:01:03 beyond R&D? The first voice you'll hear is Scott, followed by Joe. Why now? So I think AI is kind of in this unique position that it hasn't been in historically. All the pieces are coming together. People have the data sets now. They have the tooling, and the open-source community has been huge in that, with tools like MXNet and TensorFlow being widely adopted and productionized. And now they have the infrastructure readily available with things like AWS and all these new Nvidia chips. In addition to a whole bunch of APIs to make a lot of the hiccups and, like, difficult parts
Starting point is 00:01:34 of the system easier and easier. And so the combination of all these things together means that instead of spending a decade in the R&D lab to try to come up with something, now a couple of data scientists can make real business impact almost immediately with the AI go-to-market. I mean, as AWS, I think we have more than 2 million customers now on our platform.
Starting point is 00:01:53 You can imagine all the inbound that we get for all these customers want to get into AI, It's today's mobile first, right? I think Sundar Pichai, actually, even in one of his big talks of the state of the union at Google, called Google an AI first company, which was quite a big shift. Oh, totally. And Microsoft has switched from a mobile first over to an AI first now as well. So I think everyone sees there's actual business value. The funny thing that I've seen is a lot of this, what we're calling AI today was really just statistical predictions or like, you know, using basic regression techniques.
Starting point is 00:02:24 Yeah, sometimes AI is a little bit of a buzzword in that context. to Buzzword, like financial firms, they use really basic techniques and they call it AI. So we see financial services getting disrupted. We see health care, life sciences. Preventative maintenance is looking at all the sensor data on big machinery, on airplanes, on all kinds of equipment and trying to predict failures in the future. This is actually one of the top use cases that I'm seeing recently. Why does AI uniquely help in that context or in this specific ML deep learning? Because all these sensors now sprinkled all over these big machines, your airplanes, your vehicle now is full of sensors. You can take all that data. You can actually predict a lot further into
Starting point is 00:03:00 time or learn a lot more from the data over longer time series, series of time. It's one of those industries that is catching up. They're kind of sitting on a gold mine of data, but data doesn't equal AI. I have a lot of those kind of larger customers. They come to me and they say, I have all this data, what should I do now? How do I do this AI thing? I get that so much. And, you know, we have to even step back and say, okay, let's go build a data late because you have disparate data sources. You know, let's talk about the problem you're trying to solve. Like, actually start with the question.
Starting point is 00:03:29 Start with the question. Oh, my God, we got all this data. What the hell are we doing with, you know, what is it we're trying to do here? You can talk about hyper-premer tuning where you want. You can talk about, you know, cleansing data and prepping data and annotating data and deploying it at scale and on IOT devices. But if you're not actually understanding the problem you really want to solve and you see business value.
Starting point is 00:03:45 And a derivative problem with that is really the ROI. Like, how do you define the ROI on which problems to solve? Because there's infinite problems to be solved. That can be discovered. I'm tracking. something like five or 600 use cases internally that our sales folks are coming to us and saying, hey, I got this problem, I get this problem, I get this problem, I get this problem, which are the ones that are salient enough to drive an ROI?
Starting point is 00:04:04 And this is something that comes up all the time with us. But for you to even apply an optimization algorithm, you need to know what you're aiming for. Yeah. Right. So it needs to be tied to business value. You need to be able to articulate, like maybe if I'm building a fraud detection system, maybe naively accuracy is the most important thing you could think of. But if you catch all the $1 fraudulent transactions and miss the million dollar ones, that's actually bad for the business.
Starting point is 00:04:26 So it requires a lot of domain expertise. And I think this goes into needing specialized data sets for every individual application, but also unique targets and goals to shoot for. And then once you have this complicated system and you have a target that you're shooting for, then it becomes an optimization problem. But if you don't have that data and you don't have that target, you need to figure out what it is you're even trying to achieve. Well, what's interesting hearing you talk about is like, every time you have like a really hot, frothy space, even the most basic questions aren't answered, like something as simple as like, what is AI good for? What can you apply to, et cetera? And I think every time you've got these kind of new buzzy things, like people treat it like magic. They're like, you know, I have like standard treacheral thing. I add magic and then I get something amazing. Add AI in a box and boom, you're up and running. So I actually categorize startups that come in in one of four buckets in the AI spectrum from. like kind of the most basic to the most science fiction. And here are the following.
Starting point is 00:05:22 So there are companies that come in that have been doing, you know, hardcore ML stuff for a long time, but they haven't called it AI. They're probably older techniques, probably not the kind of latest DNN stuff or whatever. And then they start calling it AI because they know that that AI has profit. The second one, and it's the one that I tend to focus the most on,
Starting point is 00:05:39 they actually understand what you can apply AI to. So they're like, you know, it's good for these things to solve these problems. They're taking that and they're applying it to an existing problem. They're doing a new startup. So those I spend the most time, just because they understand the technology, they normally have the core team, they understand the problem.
Starting point is 00:05:52 The third ones are really interesting to me, and I'm getting more and more interested in them, but like it takes a little bit of a range. And I call them the end of theory. Oh, interesting. So what they do is they basically believe that you can apply AI to problems where you don't have to have a theory beforehand.
Starting point is 00:06:06 You don't need to know what you're looking for. So for example, let's say you've got a bunch of security data, you don't know what to look for it in there, we'll tell you what to look at. Or maybe you've got a bunch of marketing data. You don't really know if there's something there, but we'll tell you what to look at. So you don't have to have a theory of what we're looking for, but we'll apply data and give you a theory.
Starting point is 00:06:22 And then the fourth, the most science fiction, these are the ones I don't give a lot of credibility to. They basically want A.A. to solve their product market fit problem. So they basically say, I don't really know what company to build, you know? So what I'm going to do is I'm going to enter a space. I'm going to add AI. And then, like, that'll basically tell me what company or product to build. And those, I think that's mostly just kind of wishful thinking. I don't think it's going to actually solve, like, what company to build.
Starting point is 00:06:43 I love that taxonomy. And it's funny because what you described as the end of theory, which, by the way, was a cover, Chris Anderson wrote for Wired, making this argument that in the age of big data, you don't need theory because you have so much data. You can essentially mine it to learn what you don't know. And yet you have this chicken egg problem that you're describing where the ideal case for the companies you might work with is that they have a goal or something they're trying to do. So how do you see people actually navigating this? So I think a lot of times in machine learning and artificial intelligence, you can kind of break it into two camps.
Starting point is 00:07:12 There's the completely like supervised learning algorithms where you know what you're going for and you just want an algorithm that can do. that better than anything else in the law. And those generally have big data sets. They're discovering themselves versus an unsupervised, which is a contrast. Exactly. So the idea is I have a bunch of fraud data and I just want to minimize fraud. And so I can come up with some sort of metric that I care about that's correlated with business value and I just want to maximize that metric. Then you have unsupervised learning algorithms. And this can fall into things around like anomaly detection or it's just like, here's a big soup or lake of data. Show me interesting things. Well, the other contexts that thinking of when I think of unsupervised, I also think of cases like the recent news about alpha
Starting point is 00:07:51 zero and the algorithm kind of learning on its own from no data. That was a reinforcement learning case. Is that technically unsupervised? Well, so then I guess you could say there's a third class. This is a third one then. Reenforcement learning. And I mean, it may have started with no data, but it's generating it as it goes along. And learning and kind of going. Right. So that's a third category. And with one shot, because remember there's a big move for a while with like one shot learning type of algorithms, would that also go in that category? What I'm trying to get at is a difference between small data and big data, basically, and where those fit in your taxonomy. Yeah, so the nice thing about something like unsupervised learning, and the nice thing in general with all of these,
Starting point is 00:08:28 is you can kind of composite them together, depending on what you're doing. So for something like a natural language processing task, if you maybe have some end goal, like you want to have some sort of conversational AI system, or you want to be able to do question and answer over some like large corpus of data. You're very carefully avoiding the word bot, which I love. That's another hypey one when we did a podcast on that last year. But basically what you can do there is take an unsupervised learning approach to kind of learn the features of the language itself and then apply a supervised learning algorithm on top of that feature representation that you've learned.
Starting point is 00:09:03 And so this is the nice thing about, like, you don't have to have the machine necessarily no English to start because you feed it in all of these different examples and it learns ways to represent that, which then can be aimed at a specific goal like answering these questions. Traditionally, those systems are thought of as independent, though, and they fall into these kind of three categories of machine learning. But what's really interesting when we're seeing a lot of customers do is start to treat that like an entire pipeline. This does make it a lot more complicated. It makes it more computationally intensive.
Starting point is 00:09:32 So leveraging some of the new technologies is incredibly important for doing that. But it also makes it a harder optimization problem as well because you've taken a system with a lot of knobs and levers that's normally been optimized via trial and error, and now you're making it twice as large. You're making it competentially more complex. Yeah, the complexity grows exponentially. And so some of the standard techniques that people do, like trying to solve this tuning problem in their head or via brute force, just completely fall flat. Scott, do you think AI is the end of theory? Like, do we have to, like, stop knowing what I want to know that too. The way we think about it is one of these aspects of machine learning is unsupervised learning.
Starting point is 00:10:08 You just throw a bunch of data at the problem and try to have the data learn on its own. You're not aiming for a specific goal. That's the end of theory. view. Yeah. This goes in with that end of theory in the sense that you're not asking a question. You're not shooting for a specific objective. You're just trying to create these patterns and formulate some sort of, maybe it's anomaly detection or you're just trying to look for something interesting. Yeah. Find a signal on the noise. Exactly. And it's clustering algorithms. It's all these sorts of things that are incredibly useful. And now we have large enough
Starting point is 00:10:39 data sets that is becoming an incredibly necessary tool because you can't necessarily ask every single question and hope to get an answer or you don't want to do like what some of the like psychology research is suffering with right now is this like key value packing of like if you ask enough questions then just like statistically you might get a spurious answer it's actually the false positive problem in that context exactly and so the idea is if I can just give it a bunch of data and then it comes up with the questions for me or I can go back and say oh this is really interesting this actually I know how I'm going to leverage this to then ask this question that I would have never asked before, I think that's becoming extremely powerful. You painted a three-level taxonomy
Starting point is 00:11:19 of supervised, unsupervised, and reinforcement learning. So where are you then on the end of theory? I think there's going to be need for all of it, to be honest. When it comes down to solving a very specific business problem, like fraud detection, you don't want the algorithm to learn on its own. Just let a lot of fraud through as you slowly come up with an idea of what the world looks like. You want to solve this very specific supervised learning algorithm. Or when you're training a car or something like that, how to drive a car, you don't necessarily want it just to like go get into a million accidents as it slowly learns, like, what does steering even mean? It needs to be somewhat more directed.
Starting point is 00:11:52 That being said, in the security space, if it's more like anomaly detection, that might be something different because you don't necessarily know what all different reaches could look like. And so you need to kind of do this more unsupervised, like, clustering-based approach. I do want to quickly ask you to define what is optimization, because when I hear that word, I think of, like, the McKinsey word, like, optimization of the workforce. I know you mean it in the context of algorithmic optimization, but could you break it down for us? Yeah, we think of optimization for the more mathematical perspective. So there's inputs to some system and one or more outputs that you want to maximize.
Starting point is 00:12:26 So you have an AI fraud detection system, and there's lots of configuration parameters that make that actually work. Yeah. And so we think of optimization in the sense of how do you set all those configuration parameters, those hyperparameters and architecture parameters of a defective. learning system in order to get the most accurate fraud detection system. And a lot of that by definition has always been trial and error. Yeah. And so typically that's how people solve this problem. This is why Google will pay like a million dollars for someone with 10 years of deep learning experience is that intuition that's built up. But just like how unique data sets are helpful for solving unique problems, unique algorithms and kind of unique configurations can get
Starting point is 00:13:04 you quite a bit better than the one size fits all approach, a lot of times that hard one intuition doesn't actually transfer to a completely new type of problem. You could know everything in the world about a DNN and then you apply it to a recurrent neural network or whatever it may be and it becomes much more difficult to kind of, you have to start from scratch again. And I was going to say, practically speaking, very few startups can afford that type of a $10 million,
Starting point is 00:13:28 10 years of experience type of hire, frankly. So this goes a little bit also to what you were saying earlier, Joe, about how, because when you talk about optimization and the inputs that you were putting in and then you're tuning, the hyperparameters and getting something out, that's when you were talking earlier about this fact that all these companies have these datasets. And the biggest question is how to actually clean and pre-process their data set, because obviously it's garbage and garbage out of the way. You really have to get the data in shape.
Starting point is 00:13:53 Yeah, that is a foundational problem. Data is a foundational problem. Today, it's so easy to go grab a Jupyter notebook, a web front end that you can execute code in cells. I can basically have a Jupyter notebook full of Python and heavily annotate it. Most of the classes these days in deep learning are using Jupyter notebooks. And today, there's just so many Jupyter are notebooks out there that people have built, you know, different solutions or tutorials on. There's this explosion of tutorials and code that's out there, but these largely are all toy examples. And if you want to get serious, if you actually want to optimize these for actual real-world usage, there's probably two really big pieces that someone just can't automate or
Starting point is 00:14:29 really bring kind of a precanned one-size-fits-all solution. We're kind of entering that golden age of applied machine learning, applied AI. You know, five years ago when I was hanging out at Berkeley and, you know, seeing the talks from Detender Malix, Peter Beale, and the folks there. And I would see some of these breakthroughs in computer vision. And Jitendra has that image that he always shows every year about I think there's a beggar on the street, right? And he's got a cup hanging out. And there's someone walking by. And the quotation is always, is this guy going to put money in his cup or not? And the algorithm is supposed to predict that. So there's some really cool things that were happening five or six years ago. Those are being operationalized now at scale. So you're basically saying that
Starting point is 00:15:04 we're at a moment, because I've actually heard the opposite. I think we're both right, though, which is that a lot of the work and the buzz is all algorithms and academia and actually translating it into practice is, and it hasn't been operationalized into use. I think it's been operationalized by big companies like Amazon and Facebook and obviously the larger deep tech companies. I think we're at the precipice here of having all of the kind of foundational pieces automated to the point where I think having the problem that you want to solve in mind is obviously important. having the data in a place where it can be trained. It's clean. It's annotated, especially supervised. When I say we're on the precipice,
Starting point is 00:15:40 I believe it's the supervised, you know, machine learning world is just exploding with applications. Obviously unsupervised, deep reinforcement learning. It's still in the bleeding edge. When you talk about having RRL applied in, say, like, autonomous driving, having a car driving around and learn how to drive, you know, by crashing a million times isn't tractable as a, you know, as an algorithm.
Starting point is 00:15:58 So it's, you know, being able to do all that simulation in the cloud and then transfer domains. It's still not a solved problem. You said there were two areas at startups. So the parameter problem and data? I think the data engineering aspect of things is understated for machine learning and AI. I see like a lot of our partners
Starting point is 00:16:14 that have been helping our customers over the last number of years. There are a lot of big data SIs and they've been using Hadoop and Spark. And they've been dabbling in machine learning and advanced analytics. Over that time, they've actually built up
Starting point is 00:16:25 a good amount of data engineering skills. So I think those are still really valuable. And you think those will transfer to this type of work? thing is they're not experts in the algorithms and in optimization, for example. But I also think that deployment is getting to the point now where it's almost push button with a lot of these APIs we're talking about. You can deploy to an endpoint and do new predictions on any mobile device. It can kind of just bolt on top and be this value add as opposed to something
Starting point is 00:16:50 this rip and replace. We're completely agnostic to the framework you're using, the infrastructure and the objective that you're shooting for. When it comes to AI being practical, I think the two bookends are the following. And the most commonly thought of one bookend is it's still academic. It's not useful. It's not applicable. And then the other bookend is AI as magic. And like the AI is magic bookend basically says, you know, it's about data and these magical algorithms. So whoever has the data and the data and the whatever has the data wins. And then like basically everything's automated. And the reality of practicality is in the middle, which is like it's not you can just like throw data at the problem and throw one algorithm's problem. And like QED are done. You've got a company. You have to have the right data. It's very very domain specific. You have to have the right algorithm or optimizations like every use case of AI requires specific tweaking in a massive, massive problem space to get a useful solution. And so it's not magic. It is absolutely practical. So it's not, you know, academic. But it requires, you know, at some level, I feel like the complexity has moved, not necessarily disappeared, right? It's like you've moved complexity from I'm writing code to now it's basically an optimization
Starting point is 00:17:57 problem and a data problem. How much is like the tweaking and the data. unique per problem? Is it per vertical? Is it per set? Like, how do you even think about that? there's so much work involved with actually getting the data ready, pointing at the right direction and things like that. And over the last two years, we've actually seen this huge transformation where it used to be this toy problem where it was like, we want to do deep learning and whatever that means, we just want to do something. And optimization actually isn't super important there because it's just like, can I stand something up? It was a coding problem before. But once you actually start to apply it to something, then it's how do I extract as much value as possible out of this? How do I
Starting point is 00:18:34 scale this as quickly as possible? How do I deploy this and have it be a reliable system? And so some of these bottlenecks that were historically in making practical AI have started to shift into these more deployment optimization type. That's actually really good to hear because it means it's not just all hype and it's not all like quite here yet. We're actually at a very exciting middle point. So the Fortune 100. Two years ago had fun playing to kind of accelerate the R&D phase of certain things. But now those same systems are in production. And every every little piece of optimization matters. And for every single problem, it's a completely different type of optimization.
Starting point is 00:19:11 You can take a problem that's really good at classifying the Google Street View data set where it's like pictures of houses and you want to be able to read the address off of it. And that's a kind of classic computer science computer vision problem. And then you want to do a different type of classification. You might need to have a completely different architecture for your neural net. So you're saying, just to pause on that for a minute, that while some of the skills may transfer and the mindsets may transfer, and even some of the way you might think of the models may transfer, the optimization tricks you use are custom and special
Starting point is 00:19:44 to each of these cases? Well, yeah, the intuition for how to configure these systems does not transfer, which is why you need to retune, re-optimize, and reconfigure these systems to make sure they're maximizing that business value. Did you tell me a little bit more about why it doesn't transfer? Yeah, so, I mean, back to Martin's point, like, to a certain aspect, some of the system, of these deep learning systems are kind of magical in the way that they work. They're very difficult to explain what's actually happening under the hood.
Starting point is 00:20:11 Right. It's a black box problem. Exactly. But the problem with a black box is if you have a black box with 20 different knobs and levers in it and you're trying to get some result out of the end, when you completely change what's being fed into that black box, you have a completely different data set or you're shooting for a completely different target, like all of that intuition on how you set those knobs and levers is now completely worthless. And so you need a new way to very efficiently and automatically set that for that new problem.
Starting point is 00:20:37 And that's true every time the data set changes, every time you add a new data set, maybe you have an unsupervised learning algorithm, learning a feature representation, every time you add a new feature, pointed out a new problem. And one thing that we found is that like an untuned, sophisticated system will underperform a tuned simple system. You can take a simple machine learning algorithm like a random forest, and it's always going to give you like a B-minus answer. You can take a sophisticated deep learning algorithm, and if you don't tune it properly, it's going to give you a terrible random answer. But if you tune it properly and train it properly, it's going to beat a human.
Starting point is 00:21:14 In a practical podcast, can I indulge a philosophical question? I love it if you did. I've always wondered this, and I'm actually pretty lay about the technology behind this. So I'm going to start with what seems to be probably an entirely different metaphor. But imagine, like, you're cleaning your house, and you're trying to get rid of dyes. So to get rid of the dust, one thing you can do is you can open the door and you can sweep the dust out of the house and the dust gone. Another thing you can do is you can basically just move dust around and you're like, oh, it looks better under the bed. I sometimes think about complexity
Starting point is 00:21:38 this way, which is like, are we just moving complexity from like basically writing code and algorithms to manipulating data and optimization, but there's still the same amount of complexity or have we reduced complexity with AI? Do you see what I'm saying? Are we actually getting dust out of the house? Are we just moving it around into a separate problem? domain. I love that question. I'd say it's actually a combination of the two because it practically when you are removing the dust from your house, you do get it into these little piles and then you move it out of the house. Some of these more sophisticated algorithms are making it better because the distributed dust problem is much harder than cleaning up a pile of dust. So basically you're creating like the
Starting point is 00:22:20 mounds of dust and then you can kind of focus on getting the dust out of the house. It's a two-step process. There is a reduction in complexity to get to the goal. It's not like you're just moving around. the dust in the house. What you're going to see is a lot of the complexity we talked about, get automated. I mean, you'll see us to try and drive those piles out of the door as well. Everybody wants to automate dust. I want to automate dust collection and, you know, in your back, yeah. One of my all-time favorite books is Philip Pullman's His Dark Materials trilogy. And I always think of dust in that context. So I can never not think about dust as like
Starting point is 00:22:48 only dust. Beautiful. Okay. So I have a question for you guys then, especially Joe, given that you work at AWS. And Scott, from your vantage point, is this going to come about because one thing I always hear about is debates about ML as a service and AI as a service and whether that's going to be the way that these services are going to be delivered. And there seems to be a lot of hype around that in and of itself. I'd love to hear your guys' thoughts on that. I mean, I guess, you know, the diplomatic answer is we have to support them all. I mean, we see, you know, layers of abstraction that are valuable to different types of users.
Starting point is 00:23:20 Researchers aren't going to use an API, obviously, because it's, you know, they can't do anything with. There's no knobs. There's nothing to break and do anything with. data scientists are not experts, so they need some level of automation or or helping hand on things like, you know, optimization. Give me the undiplomatic answer. The undiplomatic answer is then there's a whole bunch of guys who don't know what the F they're doing and they need some type of API, but it needs to be flexible
Starting point is 00:23:42 enough where they could start to bring their own data in. Because even today, you know, just being the self-deprecating Amazon guy, we have services that aren't, they're not very flexible. Like our image recognition service is really cool and it does a lot of great things, but I can't actually bring my own data to it. And I can't actually optimize and customize for my problem. Well, in your defense, I will say that when you are delivering a service to people, there are expectations and consistencies and things you have to do to scale.
Starting point is 00:24:07 But clearly, all the experiments that people have to make, you can't actually have this one-size-fits-all. When you build a service like that, you build it for the lowest common denominator, and it's being used by C-SPAN, it's being used by, you know, travel sites. But when it comes down to if I wanted to apply this for biometric security in my corporation, and I want to trade it on a whole bunch of... Fingerprint data. Fingerprint data, you know, pictures of Scott.
Starting point is 00:24:29 And be able to admit entry into a building. Never let this person into the building. Never let this build, you know, or Martin. I can't do that today. It's not good enough to probably be visualizing your data and looking out all the pretty pictures anymore. So business intelligence is kind of moving into data science. You need insights.
Starting point is 00:24:44 That's like a big theme. I don't want to say BI is dead, but it's on its last leg. I think everyone wants to move to more predictions, actionable, prescriptive analytics. and you can't do that when you're just kind of looking at pretty pictures. So I think we're seeing just this mass transition from BI, you know, analysts over to data scientists. Yeah. And they don't know a whole lot about machine learning.
Starting point is 00:25:03 I think they're going to get to the point where they can push a button and they can use an XG boost algorithm and then optimize me like, boom, I'm getting a great result. I know the domain that I'm in. I know the problem I'm trying to solve. Whether it's an XG boost or a deep neural net, I really don't care as long as my end predictions are accurate. You're getting the answers you need. You're getting the answers you need. What's your view on the AI as a certain? debate? I actually think that there's room for both. So there's kind of machine learning as a
Starting point is 00:25:27 service or this kind of generic one size fits all. And then there's these more specialized tools. And the way that we like to think about it is like in the early 90s, when the web was coming online and everything like that, if you didn't have a website, like a one size fits all, just like get me an e-commerce shop online or something like that, that zero to one was very transformative for people. By the way, I would still argue it's still true for small and medium-sized businesses that are like on Shopify. And yeah. So we think Shopify is more of the like machine learning as a service option where it's just like I need
Starting point is 00:25:57 something and I don't really know what I'm doing but I just need something. And that zero to one can be incredibly powerful. That being says as businesses start to differentiate themselves on their AI strategy as they start to hire data scientists and bring this bespoke knowledge and these custom data sets and things like
Starting point is 00:26:13 that, now you don't want the one size fits all solution. You want to build a company like Amazon where you can really optimize every aspect of that website to really make the most out of it. takes them from that one to two, but I still think there's this need, especially in the immediate term,
Starting point is 00:26:27 to help people go from zero to one. One thing that which is different than IT industry in the past is it seems pretty clear because the value is in data and because the value is in optimization, that the infrastructure layer will be free. Or maybe not free, but lowest common denominator. What I mean by that is, in the past, if I was going to give you a computing infrastructure,
Starting point is 00:26:49 I would charge you for that computing infrastructure and the tooling and everything else around that. But it seems like the tooling is something that people are happy to build and offer for free at any layer of abstraction, whether it's a service search or not. And so I think from an industry perspective, that's a very different horizontalization
Starting point is 00:27:06 than we've seen in the past. So what does it mean when you have this horizontal versus vertical AI layer? So in the past, like a lot of times tooling was something you could monetize. Like Purify was a billion-dollar company that basically sold the debugger, right? All the tooling and all the infrastructure
Starting point is 00:27:19 in order to build application was very much monetizable. For AI, because there's so much value in optimization, there's so much value in data, it's almost like this tooling infrastructure layer is something that is being offered or given or, you know, many players are just offering for free and they're out there as libraries
Starting point is 00:27:35 and things on top of services. And then the actual value is the vertical application of those to whatever. So, for example, if I look across AI startups, the ones that tend to be getting the most traction have taken AI and applied it to a vertical problem. They have access to a proprietary data set or they've done a specific sort of optimization,
Starting point is 00:27:52 and now there's a vertical focus towards something as opposed to I've got this very horizontal kind of generic AI layer. I think that's the game of the big players like the Amazon's or the Googles. And so I think from an industry-wide and a startup perspective, I really think vertical focus is how we're going to see the gains of AI as the enterprise as opposed to what we've seen in the past in computer science, which is the more horizontal layers. I completely agree.
Starting point is 00:28:14 You know, over the last three or four years, I've seen a number of startups come to me, whether I was at Intel and working with Intel Capital or not with Amazon, and they would say, hey, I can scale up training of deep neural nets so much better than everyone else. And I provided your algorithms, you know, come work with us or come by us or whatever they were looking for. And I think no longer, your point about verticalization is absolutely valid. And I see the ones that are successful are the ones that have a really nice mixture of kind
Starting point is 00:28:40 of AI research, the ones that have kind of one foot in research on the algorithms, have really deep expertise, but they also are mixed with true domain experts in the vertical that they're trying to work in. So, for example, if it's a medical imaging startup, if you don't have a hospital you're working with to provide you data, if you don't have doctors, if you don't have clinicians that you're working with or have on staff, frankly, I don't see a whole lot of legitimacy to what you're doing. I've seen startups that overfit to a public data set when I was in Intel, and they say,
Starting point is 00:29:09 this is fantastic. Look at us. We're getting 99% accuracy on this medical imaging data set. we're worth, you know, $100 million. But it was one particular data set. Come, buy us or invest in us. That, to me, wasn't a value prop. So I completely see the mixture of domain expertise in a vertical along with the
Starting point is 00:29:25 AI expertise. And I think it's really illustrative that compared back to how it used to be. So I used to have a friend I went to college with. He built applications. And he'd go after different verticals. But he was like a domain in specific application developers. He built like booking systems for like kennels and then veterinarians and then lodging system. He didn't have to know. He'd go and talk to them to see what they wanted.
Starting point is 00:29:45 Then he'd build an application. So like, you know, like he could build basically a horizontal company that would sell into these different verticals. But if you look at AI, like in order to add something of value to these domains, you have to understand the data. You have to understand the use. And you have to understand the optimization. It's almost like these companies are becoming these very vertically focused companies are the ones that are successful. Whereas like, you know, kind of IT folks, we're normally used to thinking these as horizontal layers. We've made the argument on our own podcast on where the machine learning edge for startups will be and how they can compete with the Googles
Starting point is 00:30:14 and the other folks who have these huge in-house teams. And it is essentially along these lines of this vertical part. But I love it also in the big picture of this idea of augmentation and having all these tools give you the superpowers, but you still have this human skill in-house and that's the domain expert you're talking about. Verticalization is going to be a huge part of that
Starting point is 00:30:30 and going after very specific problems with very specific data sets. But specialization along that horizontal layer, I think is going to be key. I don't think there's going to be a one-size-fits-all, everything. But if there's specific parts of that journey to getting to a practical productionized AI system that humans are bad at or that can be automated if you are laser focused on that specific specialization. Like optimization. Data collection would be another. Data
Starting point is 00:31:00 version would be another. Like general things that are scaffolding to do the AI problem. Scaffolding is a perfect way to put it. But then every building that the scaffolding wrapped around is unique. Exactly. Do people know the difference between if it's a data problem or an AI problem and how do they know? Because if you have people who have legacy skills who are coming up to speed, is it obvious? I think the data problem is the first, like, layer in Maslov's hierarchy of AI. Like, you need to actually have the data.
Starting point is 00:31:29 Then you need to be able to understand the business context of what you're aiming for and do a lot of the data engineering to make sure that you can actually leverage that data in some way that is not just in filing cabinets somewhere. Yeah. Then there's the tooling and the infrastructure for actually like training some of these more sophisticated algorithms. And then actually optimization just sits at the very top. That's for it to be an optimization factor. Because when I think of the Mazel's hierarchy of needs in the psychological sense, it's really about like the basics and the survival things and then the aspirational stuff at the top.
Starting point is 00:31:57 But optimization is a means to an end. It's not the end and end of itself. But the way I think about Mazel's hierarchy is like once you get to the top, then you kind of have everything in order and you're like doing it right. And then you can go to getting the answer to need. And the optimization is about like that last mile. of now I'm applying it to a business problem. How do I make as much money as possible as efficiently as possible? I mean, I have a number of partners that provide foundational APIs, for example, for NLP, for
Starting point is 00:32:21 natural language processing. You can take your text, throw it into their API, get sentiment, get named entities, you know, get parts of speech out, basic things for NLP, and integrate that into a larger workflow and not have to go and find a corpus of data, annotate it, you know, clean it, figure out what algorithm to use, train it, figure out how to deploy it. all that is basically a restful API away from most API call from all these. Well, that's a case where you're using the API to like pull in different data streams. But I also think of companies like ship out and they give shipping as a service essentially through an API.
Starting point is 00:32:53 And I think it's really interesting because what I love about this is about democratization of all these things across ecosystem. Because you essentially have all these superpowers you're pulling on. These APIs are giving you this superpower and that superpower and you're consuming it that way in order to do whatever the hell you want as a company. And what you're buying with that API is really data, is the data that they've collected, prepped annotated, trained on, which is, you know, a really great thing for most companies because they're not going to go build a system like that. And the way that we think about this is if you're in business, there's something that you're good at. There's some way that you differentiate yourselves from your competitors.
Starting point is 00:33:28 And you should outsource everything else. Focus on what you're great at and then bolt APIs around that to supercharge it. If I'm in a medical imaging, you know, focus startup, I want to hire guys that are focused on that vertical. You might want to hire the radiologist. Exactly. In the early days of Web, I remember there was this like move of talking about tech innovation as like combinatorial innovation. And I used to think it was kind of a buzzy word.
Starting point is 00:33:52 But I actually think it makes sense in this context. One of my favorite writers is Brian Arthur who wrote The Nature of Technology and How It Evolves. And he did a lot of fundamental work in complexity economics. Anyway, it's really interesting. But this idea that you can take all these different pieces and kind of recombine them in ways that are creating entirely new things. that's always how innovation has happened and now it's happening on a grander scale
Starting point is 00:34:11 with these things which I think is amazing and beautiful. A single person can do now what would have taken a team of researchers a decade ago. Well, thank you guys
Starting point is 00:34:19 for joining the A6 and Z podcast. Thank you. Great. Thank you so much.
