The a16z Show - a16z Podcast: AI, from 'Toy' Problems to Practical Application

Starting point is 00:00:00 Hi everyone, welcome to the a16z Podcast. I'm Sonal. Given all the ongoing excitement around artificial intelligence, deep learning and machine learning, especially with the NIPS conference this coming week, today we're talking about what happens when we go from so-called toy problems to practical AI in production. The conversation is also part of our ongoing series on AI in practice. You can find other past and upcoming episodes on our website under that tag. But joining us for this episode, we have Joe Spisak, who leads strategic and programmatic partnerships for Amazon Web Services, so has a front row seat on what's happening with a bunch of companies interested in AI and machine learning. We have Scott Clark, who's a CEO and co-founder of SigOpt, which provides optimization as a service. And then we have general partner Martin Casado. The discussion covers everything from taxonomies of startups and methods for AI to a brief debate about whether AI means the end of theory or not. And we also discuss the problems of data and optimization, as well as the pros and cons of machine learning as a service and touch on the theme of the API economy. But we begin by quickly reflecting on where we are right now. What are we seeing with companies adopting AI beyond R&D?

Starting point is 00:01:04 The first voice you'll hear is Scott followed by Joe. Why now? So I think AI is kind of in this unique position that it hasn't been in historically before. All the pieces are coming together. People have the data sets now. They have the tooling and the open source community. It's been huge in that with tools like MXNet and TensorFlow being widely adopted and productionalized.

Starting point is 00:01:24 And now they have the infrastructure readily available with things like AWS and all these new Nvidia chips. In addition to a whole bunch of APIs to make a lot of the hiccup and difficult parts of the system easier and easier. And so the combination of all these things together means that instead of spending a decade in the R&D lab to try to come up with something,

Starting point is 00:01:43 now a couple of data scientists can make real business impact almost immediately with the AI go to market. I mean, at AWS, I think we have more than 2 million customers now on our platform. You can imagine all the inbound that we get for all these customers who want to get into AI. It's today's mobile first, right?

Starting point is 00:01:59 I think Sundar Pichai, actually, even in one of his big talks of the state of the union at Google, called Google an AI first company. Yeah. Which was quite a big shift. Oh, totally. Microsoft has switched from a mobile first over to an AI first now as well. So I think everyone sees there's actual business value. The funny thing that I've seen is a lot of this, what we're calling AI today was really just, you know, statistical predictions or like, you know, using basic regression techniques. Yeah, sometimes AI is a little bit of a buzzword in that context.

Starting point is 00:02:27 Total buzzword. Like financial firms, they use really basic techniques and they call it AI. So we see financial services getting disrupted. We see health care, life sciences. Preventative maintenance is looking at all the sensor data on big machinery, on airplanes, on all kinds of equipment, and trying to predict failures in the future. This is actually one of the top use cases that I'm seeing recently. Why does AI uniquely help in that context or in this specific ML deep learning? Because all these sensors are now sprinkled all over these big machines, your airplanes, your vehicle now is full of sensors. You could take all that data. You can actually predict a lot further into time or learn a lot more from the data over longer time series. It's one of those industries

Starting point is 00:03:06 that is catching up. They're kind of sitting on a gold mine of data, but data doesn't equal AI. I have a lot of those kind of larger customers. They come to me and they say, I have all this data, what should I do now? How do I do this AI thing? I get that so much. And, you know, we have to even step back and say, okay, let's go build a data lake because you have disparate data sources. You know, let's talk about the problem you're trying to solve. Like actually start with the question. Start with the question. Start with the question of, oh, my God, we got all this data.

Starting point is 00:03:30 What the hell are we doing with, you know, what is it we're trying to do here? You can talk about hyperparameter tuning all you want. You can talk about, you know, cleansing data and prepping data and annotating data and deploying it at scale and on IoT devices. But if you're not actually understanding the problem you really want to solve and you see business value. And a derivative problem with that is really the ROI. Like, how do you define the ROI on which problems to solve? Because there's infinite problems to be solved. That can be discovered.

Starting point is 00:03:52 I'm tracking something like five or 600 use cases internally that our sales folks are coming to us and saying, hey, I got this problem, I get this problem, I get this problem, I get this problem, which are the ones that are salient enough to drive an ROI? And this is something that comes up all the time with us. But for you to even apply an optimization algorithm, you need to know what you're aiming for. Yeah. Right. So it needs to be tied to business value. You need to be able to articulate like maybe if I'm building a fraud detection system, maybe naively accuracy is the most important thing you could think of. But if you

Starting point is 00:04:21 catch all the $1 fraudulent transactions and miss the million dollar ones, that's actually bad for the business. So it requires a lot of domain expertise. And I think this goes into needing specialized data sets for every individual application, but also unique targets and goals to shoot for. And then once you have this complicated system and you have a target that you're shooting for, then it becomes an optimization problem. But if you don't have that data and you don't have that target, you need to figure out what it is you're even trying to achieve. Well, what's interesting hearing you talk about it is, like, every time you have like a really hot, frothy space, even the most basic questions aren't answered, like something as simple as like, what is AI good

Starting point is 00:04:58 for? What can you apply to, et cetera? And I think every time you've got these kind of new buzzy things, like people treat it like magic. They're like, you know, I have like a standard traditional thing. I add magic and then I get something amazing. Add AI in a box and boom, you're up and running. So I actually categorize startups that come in in one of four buckets in the AI spectrum from like kind of the most basic to the most science fiction. And here are the following. So there are companies that come in that have been doing, you know, hardcore ML stuff for a long time, but they haven't called it AI. They're probably older techniques, probably not the kind of latest DNN stuff or whatever. And then

Starting point is 00:05:34 they start calling it AI because they know that AI is popular. The second one, and it's the one that I tend to focus the most on, they actually understand what you can apply AI to. So they're like, you know, it's good for these things to solve these problems. They're taking that and they're applying it to an existing problem. They're doing a new startup. So those I spend the most time with just because they understand the technology. They normally have the core team. They understand the problem. The third ones are really interesting to me.

Starting point is 00:05:54 And I'm getting more and more interested in them. But it takes a little bit of a range. And I call them the end of theory. Oh, interesting. So what they do is they basically believe that you can apply AI to problems where you don't have to have a theory beforehand. You don't need to know what you're looking for. So for example, let's say you've got a bunch of security data.

Starting point is 00:06:11 You don't know what to look for in there. We'll tell you what to look at. Or maybe you've got a bunch of marketing data. You don't really know if there's something there, but we'll tell you what to look at. So you don't have to have a theory of what we're looking for, but we'll apply data and give you a theory. And then the fourth, the most science fiction, these are the ones I don't give a lot of credibility to. They basically want AI to solve their product market fit problem. So they basically say, I don't really know what company to build, you know?

Starting point is 00:06:33 So what I'm going to do is I'm going to enter a space. I'm going to add AI. And then like that'll like basically tell me what company or product to build. And those, I think that's mostly just kind of wishful thinking. I don't think it's going to actually solve like what company to build. I love that taxonomy. And it's funny because what you described as the end of theory, which, by the way, was a cover story Chris Anderson wrote for Wired, making this argument that in the age of big data,

Starting point is 00:06:53 you don't need theory because you have so much data. You can essentially mine it to learn what you don't know. And yet you have this chicken egg problem that you're describing where the ideal case for the companies you might work with is that they have a goal or something they're trying to do. So how do you see people actually navigating this? So I think a lot of times in machine learning and artificial intelligence, you can kind of break it into two camps. There's the completely like supervised learning algorithms where you know what you're going for and you just want an algorithm that can do that better than anything else in the world. And those generally have big data sets. They're discovering themselves versus an unsupervised, which is a contrast. Exactly. So the idea is I have a bunch of fraud data and I just want to minimize fraud. And so I can come up with some sort of metric that I care

Starting point is 00:07:33 about that's correlated with business value and I just want to maximize that metric. Then you have unsupervised learning algorithms. And this can fall into things around like anomaly detection or it's just like, here's a big soup or lake of data. Show me interesting things. Well, the other contexts that I was thinking of when I think of unsupervised, I also think of cases like the recent news about AlphaZero and the algorithm kind of learning on its own from no data. That was a reinforcement learning case. Is that technically unsupervised?
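To make the "here's a big soup or lake of data, show me interesting things" idea concrete, here is a minimal sketch of unsupervised anomaly detection. It is not a method named in the conversation: the sensor readings are made up, the scoring rule (distance from the median, scaled by the median absolute deviation) is one common choice among many, and the threshold of 5.0 is an arbitrary assumption. No labels are supplied; the data alone decides what looks unusual.

```python
import statistics


def find_anomalies(readings, threshold=5.0):
    """Return (index, value) pairs that sit unusually far from the median.

    The score is |x - median| / MAD, where MAD is the median absolute
    deviation. The threshold is an illustrative assumption, not a standard.
    """
    median = statistics.median(readings)
    mad = statistics.median(abs(x - median) for x in readings)
    if mad == 0:
        # All readings are (nearly) identical; nothing stands out.
        return []
    return [
        (i, x)
        for i, x in enumerate(readings)
        if abs(x - median) / mad > threshold
    ]


# Hypothetical temperature readings from one sensor on a machine.
sensor_readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.7, 20.2, 20.1]
print(find_anomalies(sensor_readings))
```

The reading of 35.7 is the only one flagged. Nobody told the function what a failure looks like, which is the contrast with the supervised fraud case, where a metric is chosen up front.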

Starting point is 00:07:59 Well, so then I guess you could say there's a third class. This is a third one then. Reinforcement learning. And I mean, it may have started with no data, but it's generating it as it goes along. And learning and kind of cumulative metric. So that's a third category. And with one shot, because remember there's this big move for a while

Starting point is 00:08:16 with like one shot learning type of algorithms, would that also go in that category? What I'm trying to get at is a difference between small data and big data, basically, and where those fit in your taxonomy. Yeah. So the nice thing about something like unsupervised learning, and the nice thing in general with all of these is you can kind of composite them together, depending on what you're doing. So for something like a natural language processing task, if you maybe have some end goal, like you want to have some sort of conversational AI system, or you want to be able to do question and answer over some like large corpus of data. You're very carefully avoiding the word bot, which I love.

Starting point is 00:08:45 That's another hypey one. We did a podcast on that last year. But basically what you can do there is take an unsupervised learning approach to kind of learn the features of the language itself and then apply a supervised learning algorithm on top of that feature representation that you've learned. And so this is the nice thing about

Starting point is 00:09:04 you don't have to have the machine necessarily know English to start because you feed it in all of these different examples and it learns ways to represent that which then can be aimed at a specific goal like answering these questions. Traditionally, those systems are thought of as independent, though, and they fall into these kind of three categories of machine learning. But what's really interesting, what we're seeing a lot of customers do, is start to treat that like an entire pipeline.
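The two-stage idea described above (learn a representation from unlabeled text, then train a supervised model on top of it) can be sketched in a few lines. This is a toy under stated assumptions, not what any guest's customers run: inverse document frequency weights stand in for the learned "features of the language", a nearest-centroid rule stands in for the supervised algorithm, and every sentence and label is invented for illustration.

```python
import math
from collections import Counter


def learn_idf(unlabeled_docs):
    """Unsupervised stage: learn word weights from text with no labels."""
    n = len(unlabeled_docs)
    doc_freq = Counter()
    for doc in unlabeled_docs:
        doc_freq.update(set(doc.lower().split()))
    # Words that appear in fewer documents get a higher weight.
    return {word: math.log(n / df) + 1.0 for word, df in doc_freq.items()}


def featurize(doc, idf):
    """Represent a document as a sparse vector of weighted word counts."""
    counts = Counter(w for w in doc.lower().split() if w in idf)
    return {w: c * idf[w] for w, c in counts.items()}


def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_docs, idf):
    """Supervised stage: sum the feature vectors of each labeled class."""
    centroids = {}
    for doc, label in labeled_docs:
        centroid = centroids.setdefault(label, Counter())
        for w, v in featurize(doc, idf).items():
            centroid[w] += v
    return centroids


def predict(doc, idf, centroids):
    vec = featurize(doc, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical data: plenty of unlabeled text, only two labeled examples.
unlabeled = [
    "the flight was delayed by two hours",
    "my card was charged twice for the ticket",
    "the refund for the ticket has not arrived",
    "the flight was cancelled because of weather",
    "the gate for the flight changed",
    "the charge on my card is wrong",
]
labeled = [
    ("my flight was delayed", "travel"),
    ("my card was charged twice", "billing"),
]

idf = learn_idf(unlabeled)
centroids = train_centroids(labeled, idf)
print(predict("the flight was cancelled", idf, centroids))
print(predict("the charge on my card is wrong", idf, centroids))
```

Treating the two stages as one pipeline means the choices in the first stage (here, the weighting formula) become extra knobs that affect the final accuracy, which is why the tuning problem grows when the stages are joined.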

Starting point is 00:09:28 This does make it a lot more complicated. It makes it more computationally intensive. So leveraging some of the new technologies is incredibly important for doing that. But it also makes it a harder optimization problem as well, because you've taken a system with a lot of knobs and levers that's normally been optimized via trial and error, and now you're making it twice as large. You're making it exponentially more complex. Yeah, the complexity grows exponentially. And so some of the standard techniques that people do,

Starting point is 00:09:52 like trying to solve this tuning problem in their head or via brute force, just completely fall flat. Scott, do you think AI is the end of theory? Like, do we have to, like, stop knowing what I want to know that too? The way we think about it is one of these aspects of machine learning is unsupervised learning. You just throw a bunch of data at the problem and try to have the data learn on its own. You're not aiming for a specific goal. That's the end of theory view. Yeah. This goes in with that end of theory in the sense that you're not asking a question. You're

Starting point is 00:10:19 not shooting for a specific objective. You're just trying to create these patterns and formulate some sort of, maybe it's anomaly detection or you're just trying to look for something interesting. Yeah. Find a signal in the noise. Exactly. And it's clustering algorithms. It's all these sorts of things that are incredibly useful. And now we have large enough data sets that it's becoming an incredibly necessary tool because you can't necessarily ask every single question and hope to get an answer. Or you don't want to do what some of the psychology research is suffering with right now, this p-value hacking, of like if you ask enough questions, then just like statistically you might get a spurious answer. It's actually the false positive problem in that context. Exactly. And so the idea

Starting point is 00:11:02 is if I can just give it a bunch of data and then it comes up with the questions for me or I can go back and say, oh, this is really interesting. This actually, I know how I'm going to leverage this to then ask this question that I would have never asked before. I think that's becoming extremely powerful. You painted a three-level taxonomy of supervised, unsupervised and reinforcement learning. So where are you then on the end of theory? I think there's going to be need for all of it, to be honest. When it comes down to solving a very specific business problem, like fraud detection, you don't want the algorithm to learn on its own. Just let a lot of fraud through as you slowly come up with the idea of what the world looks like,

Starting point is 00:11:38 you want to solve this very specific supervised learning algorithm. Or when you're training a car or something like that, how to drive a car, you don't necessarily want it just to like go get into a million accidents as it slowly learns like what does steering even mean. It needs to be somewhat more directed. That being said, in the security space, if it's more like anomaly detection,

Starting point is 00:11:57 that might be something different because you don't necessarily know what all different breaches could look like. And so you need to kind of do this more unsupervised like clustering-based approach. I do want to quickly ask you to define what is optimization because when I hear that word, I think of like the McKinsey word, like optimization of the workforce. I know you mean it in the context of algorithmic optimization, but could you break it down for us? Yeah. We think of optimization from the more mathematical perspective. So there's inputs to some

Starting point is 00:12:23 system and one or more outputs that you want to maximize. So you have an AI fraud detection system and there's lots of configuration parameters that make that actually work. Yeah. And so we think of optimization in the sense of how do you set all those configuration parameters, those hyperparameters and architecture parameters of a deep learning system in order to get the most accurate fraud detection system. And a lot of that by definition has always been trial and error. Yeah. And so typically that's how people solve this problem. This is why Google will pay like a million dollars for someone with 10 years of deep learning experience is that intuition that's built up. But just like how unique data sets are helpful for solving unique problems, unique algorithms

Starting point is 00:13:02 and kind of unique configurations can get you quite a bit better than the one-size-fits-all approach, a lot of times that hard-won intuition doesn't actually transfer to a completely new type of problem. You could know everything in the world about a DNN, and then you apply it to a recurrent neural network or whatever it may be, and it becomes much more difficult to kind of, you have to start from scratch again. Right. And I was going to say, practically speaking, very few startups can afford that type of a $10 million, 10 years of experience type of hire, frankly. Exactly. So this goes a little bit also to what you were saying earlier, Joe, about how, because when you talk about optimization and the inputs that you're putting in and then you're tuning the hyperparameters and getting something out, that's when you were talking earlier about this fact that all these companies have these data sets. And the biggest question is how to actually clean and pre-process their data set because obviously it's garbage in, garbage out.
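The framing of optimization given above (inputs to a system, one or more outputs to maximize) and the earlier fraud point (accuracy can look fine while the million dollar transaction slips through) can be combined in one small sketch. Everything here is hypothetical: the scored transactions, the review cost, and the single knob being tuned (a decision threshold). Plain random search is used only because it is short; it is not presented as the technique any company in the conversation uses.

```python
import random

# Hypothetical scored transactions: (model_score, amount_usd, is_fraud).
TRANSACTIONS = [
    (0.95, 1.0, True),
    (0.90, 1.0, True),
    (0.85, 1.0, True),
    (0.55, 1_000_000.0, True),
    (0.60, 50.0, False),
    (0.58, 20.0, False),
    (0.30, 75.0, False),
    (0.10, 10.0, False),
]
REVIEW_COST = 5.0  # Assumed cost of reviewing one flagged transaction.


def accuracy(threshold):
    """Fraction of transactions classified correctly at this threshold."""
    correct = sum(
        (score >= threshold) == is_fraud
        for score, _, is_fraud in TRANSACTIONS
    )
    return correct / len(TRANSACTIONS)


def dollars_saved(threshold):
    """Fraud dollars caught minus review costs for everything flagged."""
    saved = 0.0
    for score, amount, is_fraud in TRANSACTIONS:
        if score >= threshold:
            saved -= REVIEW_COST
            if is_fraud:
                saved += amount
    return saved


def random_search(objective, trials=200, seed=0):
    """Try random thresholds in [0, 1] and keep the best one seen."""
    rng = random.Random(seed)
    best_threshold, best_value = None, float("-inf")
    for _ in range(trials):
        threshold = rng.uniform(0.0, 1.0)
        value = objective(threshold)
        if value > best_value:
            best_threshold, best_value = threshold, value
    return best_threshold, best_value


acc_threshold, best_acc = random_search(accuracy)
usd_threshold, best_usd = random_search(dollars_saved)
print("tuned for accuracy:", acc_threshold, best_acc, dollars_saved(acc_threshold))
print("tuned for dollars: ", usd_threshold, best_usd, accuracy(usd_threshold))
```

On this made-up data the accuracy-tuned threshold catches the three $1 frauds, misses the million dollar one, and loses money on reviews, while the dollar-tuned threshold accepts lower accuracy to catch the large transaction. The search procedure is identical in both runs; only the target changed.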

Starting point is 00:13:51 That's right. You really have to get the data in shape. Yeah, that is a foundational problem. Data is a foundational problem. Today, it's so easy to go grab a Jupyter notebook, a web front end, that you can execute code in cells. I can basically have a Jupyter notebook full of Python and heavily annotate it. Most of the classes these days in deep learning are using Jupyter notebooks. And today there's just so many Jupyter notebooks out there that people have built, you know, different solutions or tutorials on. There's this explosion of tutorials and code that's out there, but these largely are all toy examples.

Starting point is 00:14:19 And if you want to get serious, if you actually want to optimize these for actual real world usage, there's probably two really big pieces that someone just can't automate or really bring kind of a pre-canned, one-size-fits-all solution. We're kind of entering that golden age of applied, applied machine learning, applied AI. You know, five years ago when I was hanging out at Berkeley and, you know, seeing the talks from Jitendra Malik, Pieter Abbeel, and the folks there, and I would see some of these breakthroughs in computer vision. And, you know, Jitendra has that, that image that he always shows every year about, I think there's a beggar on the street, right? And he's got a cup hanging out. And there's someone walking by, and it's, the quotation is always, is this

Starting point is 00:14:55 guy going to put money in his cup or not? And the algorithm is supposed to predict that. So there's some really cool things that were happening five or six years ago, those are being operationalized now at scale. So you're basically saying that we're at a moment because I've actually heard the opposite. I think we're both right, though, which is that a lot of the work and the buzz is all algorithms and academia and actually translating it into practice. It hasn't been operationalized into use. I think it's been operationalized by big companies like Amazon and Facebook and obviously the larger deep tech companies. I think we're at the precipice here of having all of the kind of foundational pieces automated to the point where I think having the problem that you want to solve in mind is obviously important.

Starting point is 00:15:33 Having the data in a place where it can be trained, it's clean, it's annotated, especially supervised. When I say we're on the precipice, I believe it's the supervised, you know, machine learning world is just exploding with applications. Obviously unsupervised, deep reinforcement learning, it's still in the bleeding edge. When you talk about having RL applied in, say, like, autonomous driving, having a car driving around and learn how to drive, you know, by crashing a million times isn't tractable as an algorithm. So it's, you know, being able to do all that simulation in the cloud and then transfer

Starting point is 00:16:03 domains is still not a solved problem. You said there were two areas at startups. So the parameter problem and data? I think the data engineering aspect of things is understated for machine learning and AI. I see like a lot of our partners that have been helping our customers over the last number of years. There are a lot of big data SIs and they've been using Hadoop and Spark. And they've been dabbling in machine learning and advanced analytics.

Starting point is 00:16:24 Over that time, they've actually built up a good amount of data engineering skills. So I think those are still really valuable. And you think those will transfer to this type of work? I think so. The thing is, they're not experts in the algorithms and in optimization, for example. But I also think that deployment is getting to the point now where it's almost push button with a lot of these APIs we're talking about. You can deploy to an endpoint and do new predictions on any mobile device. It can kind of just bolt on top and be this value add as opposed to something that's rip and replace. We're completely agnostic to the framework you're using, the infrastructure, and the objective that you're shooting for.

Starting point is 00:16:59 When it comes to AI being practical, I think the two bookends are the following. And the most commonly thought of, one bookend is it's still academic. It's not useful. It's not applicable. And then the other bookend is AI as magic. And like the AI is magic bookend basically says, you know, it's about data and these magical algorithms. So whoever has the data and the algorithm wins.

Starting point is 00:17:18 And then like basically everything's automated. And the reality of practicality is in the middle, which is like, it's not that you can just, like, throw data at the problem and throw one algorithm at the problem and, like, QED, you're done, you've got a company. You have to have the right data. It's very domain specific. You have to have the right algorithm or optimizations. Like, every use case of AI requires specific tweaking in a massive, massive problem space to get a useful solution. And so it's not magic.

Starting point is 00:17:45 It is absolutely practical, so it's not, you know, academic. But it requires, you know, at some level, I feel like the complexity has moved, not necessarily disappeared, right? It's like you've moved complexity from writing code to now it's basically an optimization problem and the data problem. How much is like the tweaking and the data unique per problem? Is it per vertical? Is it per set? Like how do you even think about that? There's so much work involved with actually getting the data ready, pointing it in the right direction and things like that.

Starting point is 00:18:11 And over the last two years, we've actually seen this huge transformation where it used to be this toy problem where it was like, we want to do deep learning and whatever that means, we just want to do something. And optimization actually isn't super important there because it's just like, can I stand something up? It was a coding problem before. But once you actually start to apply it to something, then it's how do I extract as much value as possible out of this? How do I scale this as quickly as possible? How do I deploy this and have it be a reliable system?

Starting point is 00:18:38 And so some of these bottlenecks that were historically in making practical AI have started to shift into these more deployment optimization type. That's actually really good to hear because it means it's not just all hype and it's not all quite here yet. We're actually at a very exciting middle point. So the Fortune 100, two years ago, had fun playing to kind of accelerate the R&D phase of certain things. But now those same systems are in production.

Starting point is 00:19:03 And every little piece of optimization matters. And for every single problem, it's a completely different type of optimization. You can take a problem that's really good at classifying the Google Street View data set where it's like pictures of houses and you want to be able to read the address off of it. And that's a kind of classic computer science computer vision problem. And then you want to do a different type of classification. You might need to have a completely different architecture for your neural net. So you're saying, just to pause on that for a minute, that while some of the skills may transfer and the mindsets may transfer, and even some of the way you might think of the models may

Starting point is 00:19:40 transfer, the optimization tricks you use are custom and special to each of these cases? Well, yeah, the intuition for how to configure these systems does not transfer, which is why you need to retune, re-optimize, and reconfigure these systems to make sure they're maximizing that business value. Can you tell me a little bit more about why it doesn't transfer? Yeah. So, I mean, back to Martin's point, like, to a certain aspect, some of these deep learning systems are kind of magical in the way that they work.

Starting point is 00:20:07 They're very difficult to explain what's actually happening under the hood. Right. It's a black box problem. Exactly. But the problem with a black box is if you have a black box with 20 different knobs and levers in it and you're trying to get some result out of the end, when you completely change what's being fed into that black box. You have a completely different data set or you're shooting for a completely different target.
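A quick piece of arithmetic shows why a black box with 20 knobs defeats brute force. The sketch below assumes each knob is tried at only three settings and that one training run takes 60 seconds; both numbers are illustrative assumptions, not figures from the conversation.

```python
def grid_size(values_per_knob, num_knobs):
    """Number of configurations in an exhaustive grid over all knobs."""
    return values_per_knob ** num_knobs


def years_to_try_all(values_per_knob, num_knobs, seconds_per_trial):
    """Wall-clock years needed to evaluate every configuration in sequence."""
    seconds_per_year = 365 * 24 * 60 * 60
    total_seconds = grid_size(values_per_knob, num_knobs) * seconds_per_trial
    return total_seconds / seconds_per_year


for knobs in (5, 10, 20):
    print(knobs, "knobs:", grid_size(3, knobs), "configurations")

print(round(years_to_try_all(3, 20, 60)), "years at one minute per trial")
```

Five knobs give 243 configurations, ten give 59,049, and twenty give 3,486,784,401, which at one minute each is several thousand years of sequential trials. Doubling the number of knobs squares the size of the grid, which is the sense in which joining two systems into one pipeline makes the tuning problem exponentially harder.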

Starting point is 00:20:27 Like all of that intuition on how you set those knobs and levers is now completely worthless. And so you need a new way to very efficiently and automatically set that for that new problem. And that's true every time the data set changes. Every time you add a new data set, maybe you have an unsupervised learning algorithm, learning a feature representation. Every time you add a new feature, point it at a new problem. And one thing that we found is that an untuned, sophisticated system will underperform a tuned simple system. You can take a simple machine learning algorithm like a random forest, and it's always going to give you like a B-minus answer.

Starting point is 00:21:03 You can take a sophisticated deep learning algorithm, and if you don't tune it properly, it's going to give you a terrible random answer. But if you tune it properly and train it properly, it's going to beat a human. In a practical podcast, can I indulge a philosophical question? I'd love it if you did. I've always wondered this, and I'm actually pretty lay about the technology behind this. So I'm going to start with what seems to be probably an entirely different metaphor. But imagine like you're cleaning your house and you're trying to get rid of dust, right? So to get rid of the dust, one thing you can do is you can open the door and you can sweep the dust out of the house and the dust is gone.

Starting point is 00:21:33 Another thing you can do is you can basically just move dust around and you're like, oh, it looks better under the bed. I sometimes think about complexity this way, which is like, are we just moving complexity from like basically writing code and algorithms to manipulating data and optimization? But there's still the same amount of complexity? Or have we reduced complexity with AI? Do you see what I'm saying? Are we getting dust out of the house? Are we just moving it around into a separate problem domain? I love that question.

Starting point is 00:22:00 I'd say it's actually a combination of the two. Because practically when you are removing the dust from your house, you do get it into these little piles and then you move it out of the house. Some of these more sophisticated algorithms are making it better because the distributed dust problem is much harder than cleaning up a pile of dust. So basically you're creating like the mounds of dust, and then you can kind of focus on getting the dust out of the house. Exactly.

Starting point is 00:22:23 So it's a two-step process. There is a reduction in complexity to get to the goal. It's not like you're just moving around the dust in the house. What you're going to see is a lot of the complexity we talked about get automated. I mean, you'll see us try and drive those piles out of the door as well. Everybody wants to automate dust. We want to automate dust collection and, you know, in your back, yeah. One of my all-time favorite books is Philip Pullman's His Dark Materials trilogy.

Starting point is 00:22:44 And I always think of dust in that context. So I can never not think about dust as like only dust. Beautiful. Okay. So I have a question for you guys then, especially Joe, given that you work at AWS. And Scott, from your vantage point, is this going to come about, because one thing I always hear about is debates about ML as a service and AI as a service and whether that's going to be the way that these services are going to be delivered. And there seems to be a lot of hype around that in and of itself. I'd love to hear you guys' thoughts on that. I mean, I guess, you know, the diplomatic answer is we have to support them all. I mean, we see, you know, layers of abstraction that are valuable to different types of users. Researchers aren't going to use an API, obviously, because it's, you know, they can't do anything with it. There's no knobs. There's nothing to break and do anything with. Data scientists are not experts, so they need some level of automation or helping hand on things like, you know, optimization.

Starting point is 00:23:35 Give me the undiplomatic answer. The undiplomatic answer is then there's a whole bunch of guys who don't know what the F they're doing. and they need some type of API, but it needs to be flexible enough where they could start to bring their own data in. Because even today, you know, just being the self-deprecating Amazon guy,

Starting point is 00:23:49 we have services that aren't, they're not very flexible. Like our image recognition service is really cool and it does a lot of great things, but I can't actually bring my own data to it. And I can't actually optimize and customize for my problem. Well, in your defense,

Starting point is 00:24:01 I will say that when you are delivering a service to people, there are expectations and consistencies and things you have to do to scale. But clearly all the experiments that people have to make, you can't actually have this one-size-fits-all. When you build a service like that, you build it for the lowest common denominator, and it's being used by C-SPAN, it's being used by, you know, travel sites.

Starting point is 00:24:19 But when it comes down to if I wanted to apply this for biometric security in my corporation, I want to train it on a whole bunch of fingerprint data. Fingerprint data, you know, pictures of Scott and be able to admit entry into a building for Scott. Never let this person into the building. Never let this build, you know, or Martin. I can't do that today. It's not good enough to probably be visualizing your data,

Starting point is 00:24:39 looking at all the pretty pictures anymore. So business intelligence is kind of moving into data science. You need insights. That's like a big theme. I don't want to say BI is dead, but it's on its last legs. I think everyone wants to move to more predictions, actionable, prescriptive analytics. And you can't do that when you're just kind of looking at pretty pictures. So I think we're seeing just this mass transition from BI, you know, analysts over to data scientists.

Starting point is 00:25:01 Yeah. And they don't know a whole lot about machine learning. I think they're going to get to the point where they can push a button and they can use an XGBoost algorithm and then optimize me like, boom, I'm getting a great result. I know the domain that I'm in. I know the data. I know the problem I'm trying to solve. Whether it's an XGBoost or a deep neural net,

Starting point is 00:25:16 I really don't care as long as my end predictions are accurate. You're getting the answers you need. You're getting the answers you need. What's your view on the AI as a service debate? I actually think that there's room for both. So there's kind of machine learning as a service or this kind of generic one-size-fits-all. And then there's these more specialized tools.

Starting point is 00:25:32 And the way that we like to think about it is like in the early 90s, when the web was coming online and everything like that, if you didn't have a website, like a one-size-fits-all, just like get me an e-commerce shop online or something like that, that zero-to-one was very transformative for people. By the way, I would still argue it's still true for small and medium-sized businesses that are like on Shopify. Yeah, so we think Shopify is more of the like machine learning as a service option where it's just like, I need something and I don't really know what I'm doing, but I just need something. And that zero to one can be incredibly powerful. Yep. That being said, as businesses start to differentiate themselves on their AI strategy, as they

Starting point is 00:26:08 start to hire data scientists and bring this bespoke knowledge and these custom data sets and things like that, now you don't want the one-size-fits-all solution. You want to build a company like Amazon where you can really optimize every aspect of that website to really make the most out of it. It takes them from that one to two, but I still think there's this need, especially in the immediate term, to help people go from zero to one. One thing which is different than the IT industry in the past is it seems pretty clear, because the value is in data and because the value is in optimization, that the infrastructure layer will be free.

Starting point is 00:26:42 Or maybe not free, but lowest common denominator. What I mean by that is, in the past, if I was going to give you a computing infrastructure, I would charge you for that computing infrastructure and, like, the tooling and everything else around that. But it seems like the tooling is something that people are happy to build and offer for free at any layer of abstraction, whether it's a service or not.

Starting point is 00:27:03 And so I think from an industry perspective, that's a very different horizontalization than we've seen in the past. So what does it mean when you have this horizontal versus vertical AI layer? So in the past, like a lot of times tooling was something you could monetize. Like Purify was a billion-dollar company that basically sold a debugger, right? All the tooling and all the infrastructure in order to build applications was very much monetizable. For AI, because there's so much value in optimization, there's so much value in data, it's almost like this tooling infrastructure layer is something that, you know, is being offered or given,

Starting point is 00:27:31 or, you know, many players are just offering for free, and they're out there as libraries and things on top of services, and then the actual value is the vertical application of those to whatever. So, for example, if I look across AI startups, the ones that tend to be getting the most traction have taken AI and applied it to a vertical problem. They have access to a proprietary data set or they've done a specific sort of optimization, and now there's a vertical focus towards something

Starting point is 00:27:54 as opposed to I've got this very horizontal kind of generic AI layer. I think that's the game of the big players like the Amazons or the Googles. And so I think from an industry-wide and a startup perspective, I really think vertical focus is how we're going to see the gains of AI in the enterprise as opposed to what we've seen in the past in computer science, which is the more horizontal layers. I completely agree. You know, over the last three or four years,

Starting point is 00:28:15 I've seen a number of startups come to me, whether I was at Intel and working with Intel Capital or now with Amazon, and they would say, hey, I can scale up training of deep neural nets so much better than everyone else. And I've provided our algorithms, you know, come work with us or come buy us or whatever they were looking for. And I think, no longer. Your point about verticalization is absolutely valid. And I see the ones that are successful are the ones that have a really nice mixture of kind of AI research, the ones that have kind of one foot in research on the algorithms, have really deep expertise.

Starting point is 00:28:46 But they also are mixed with true domain experts in the vertical that they're trying to work in. So, for example, if it's a medical imaging startup, if you don't have a hospital you're working with to provide you data, if you don't have doctors, if you don't have clinicians that you're working with or have on staff, frankly, I don't see a whole lot of legitimacy to what you're doing. I've seen startups that overfit to a public data set when I was at Intel and they say, this is fantastic. Look at us. We're getting 99% accuracy on this medical imaging data set. We're worth, you know, $100 million.

Starting point is 00:29:15 But it was one particular data set. Exactly. Come buy us or invest in us. That to me wasn't a value prop. So I completely see the mixture of domain expertise in a vertical along with the AI expertise. And I think it's really illustrative to compare that back to how it used to be. So I, listen, I used to have a friend I went to college with.

Starting point is 00:29:32 He built applications. And he'd go after different verticals, but he was like a domain-nonspecific application developer. He built like booking systems for like kennels, and then veterinarians, and then lodging systems. He didn't have to know. He'd go and talk to them to see what they wanted. Yeah.

Starting point is 00:29:45 Then he'd build an application. So like, you know, like he could build basically a horizontal company that would sell into these different verticals. But if you look at AI, like in order to add something of value to these domains, you have to understand the data. That's right. You have to understand the use.

Starting point is 00:29:57 And you have to understand the optimization. It's almost like these very vertically focused companies are the ones that are successful. Whereas, you know, as IT folks we're normally used to thinking of these as horizontal layers. We've made the argument on our own podcast on where the machine learning edge for startups will be and how they can compete with the Googles and the other folks who have these huge in-house teams. And it is essentially along these lines of this vertical part.

Starting point is 00:30:18 But I love it also in the big picture of this idea of augmentation and having all these tools give you the superpowers, but you still have this human skill in-house and that's the domain expert you're talking about. Verticalization is going to be a huge part of that and going after, yeah, very specific problems with very specific data sets, but specialization along that horizontal layer, I think is going to be key. Like, I don't think there's going to be a one-size-fits-all, everything. But if there's specific parts of that journey to getting to a practical,

Starting point is 00:30:48 productionized AI system that humans are bad at or that can be automated if you are laser-focused on that specific specialization. Like optimization. Data collection would be another. Data versioning would be another. Like general things that are scaffolding. Exactly. To do the AI problem. Scaffolding is a perfect way to put it, but then every building that the scaffolding is wrapped around is unique.

Starting point is 00:31:12 Exactly. Do people know the difference between if it's a data problem or an AI problem and how do they know? Because if you have people who have legacy skills who are coming up to speed, is it obvious? I think the data problem is the first like layer in Maslow's hierarchy of AI. Like you need to actually have the data. Then you need to be able to understand the business context of what you're aiming for and do a lot of the data engineering to make sure that you can actually leverage that data in some way

Starting point is 00:31:38 that is not just in filing cabinets somewhere. Then there's the tooling and the infrastructure for actually training some of these more sophisticated algorithms. And then actually optimization just sits at the very top. That's for it to be an optimization process. When I think of Maslow's hierarchy of needs in the psychological sense, it's really about the basics and the survival things and then the aspirational stuff at the top.

Starting point is 00:31:57 But optimization is a means to an end. It's not the end in and of itself. But the way I think about Maslow's hierarchy is like once you get to the top, then you kind of have everything in order and you're like doing it right. And then you can go to getting the answer you need. And the optimization is about like that last mile of now I'm applying it to a business problem. How do I make as much money as possible as efficiently as possible?

Starting point is 00:32:16 I mean, I have a number of partners that provide foundational APIs, for example, for NLP, for natural language processing. You can take your text, throw it into their API, get sentiment, get named entities, you know, get parts of speech out, basic things for NLP, and integrate that into a larger workflow and not have to go and find a corpus of data, annotate it, clean it, figure out what algorithm to use, train it, figure out how to deploy it. All that is basically a RESTful API away

Starting point is 00:32:41 for most API call from all these. Well, that's a case where you're using the API to, like, pull in different data streams. But I also think of companies like Shippo, and they give shipping as a service, essentially, through an API. And I think it's really interesting because what I love about this

Starting point is 00:32:56 is about democratization of all these things across the ecosystem, because you essentially have all these superpowers you're pulling on, these APIs are giving you this superpower and that superpower. And you're consuming it that way in order to do whatever the hell you want as a company. And what you're buying with that API is really data, is the data that they've collected,

Starting point is 00:33:13 prepped, annotated, trained on, which is a really great thing for most companies because they're not going to go build a system like that. And the way that we think about this is if you're in business, there's something that you're good at. There's some way that you differentiate yourselves from your competitors. And you should outsource everything else. Focus on what you're great at and then bolt APIs around that to supercharge it. If I'm in a medical imaging, vertical, you know, focused startup, I want to

Starting point is 00:33:38 hire guys that are focused on that vertical. You might want to hire the radiologist. Exactly. In the early days of the web, I remember there was this like move of talking about tech innovation as like combinatorial innovation. And I used to think it was kind of a buzzy word, but I actually think it makes sense in this context. One of my favorite writers is Brian Arthur who wrote The Nature of Technology and How It Evolves. And he did a lot of fundamental work in complexity economics. Anyway, it's really interesting. But this idea that you can take all these different pieces and kind of recombine them in ways that are creating entirely new things. That's always how innovation has happened and now it's happening on a grander scale with these things,

Starting point is 00:34:12 which I think is amazing and beautiful. A single person can do now what would have taken a team of researchers a decade ago. Well, thank you guys for joining the a16z Podcast. Thank you. Great. Thank you. Thank you so much.
