a16z Podcast: The Self-Flying Camera

Episode Date: February 22, 2018

with Adam Bry (@adampbry), Chris Dixon (@cdixon), and Hanne Tidnam (@omnivorousread) Now that we've finally reached the age of the truly autonomous commercial small drone -- and in this case, a self-flying camera -- what happens when you take the pilot out of the loop? And what becomes possible that wasn't possible before? That's what this episode of the a16z Podcast covers, with Adam Bry, co-founder and CEO of Skydio, and a16z general partner Chris Dixon, in conversation with Hanne Tidnam. Beginning with the evolution of the technology that got us here and then going deep under the hood into the tech that makes this possible, from propellers to perception, the conversation also covers what it's like to use a drone that follows you around seamlessly; how autonomous drones are different from autonomous cars; and finally, how our relationship and interactions with computers of all kinds will change as they become increasingly powered by AI.

––– The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.

Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures. Hi, and welcome to the a16z Podcast. I'm Hanne, and we're talking today with Adam Bry, co-founder and CEO of Skydio, and a16z general partner Chris Dixon, about what happens now that we're reaching truly autonomous drones, in this case, self-flying cameras, what it means when you take the pilot out of the loop, and what becomes possible that wasn't possible before. We take a deep
dive under the hood into the tech that makes this work, from propellers to perception, then talk about what autonomy will enable and what we can build on top of it, to finally how our relationship and interactions with computers of all kinds will change as they become increasingly powered by AI. Let's start by maybe talking a little bit about the evolution of drone technology. Where are we now? What did it take for us to get here? In a lot of ways, the products that we have today essentially grew out of RC airplanes and RC helicopters. This is stuff that's been around for a long time, 30 or 40 years. Electric power was a huge transition point, and then the combination of cell phone sensors with an imaging device just turned out to be very powerful,
and that's what's caused kind of the recent spike in attention in the space. So the paradigm with existing drones that are available today is essentially they're manually flown devices. But it still required a pilot, above all. There needed to be a human learning how to use it and directing it. Yeah, exactly. You buy the thing, you hold joysticks, you're essentially the pilot, and you're responsible for flying and controlling it. And if you're an expert, if you're proficient at that,
you can do some pretty cool things, but it's a very difficult thing to master. And I would say a typical experience for a lot of people is they take it out of the box and they crash it into a tree. So it feels like an industry that is still very, very early compared to what's physically possible. There's all these great notions of what you might want to do with a drone, you know, on the consumer side, having this camera that
makes it easy to capture amazing video, and for commercial applications like inspection, mapping, monitoring, security. But most of them, we think, at their core, in order to really work and in order to really scale, need an autonomous foundation. And so our focus is on giving the drone the ability to fly itself intelligently, which is a very simple idea, but it's a very technically challenging thing to deliver on. Can you explain why the autonomy is so important, what the human limitation is that changes everything? I mean, I could use an analogy here. Imagine if on your iPhone, when you wanted to take a picture, pushing the wrong button at the wrong time would irreparably destroy it. I think we'd all agree
that that would be a pretty big barrier. Not really workable. A big barrier to using iPhones, and there probably wouldn't be nearly as many of them out there, and people probably wouldn't like them as much as they do. And that's basically where we are in the drone industry now. We think autonomy is a huge enabler. Just at that basic level of being able to trust these devices to do reasonable things in every situation, it removes the need for the operator to be paying attention to it or flying it at all times, which, depending on what you're doing, can be pretty huge. I also think, going back to your iPhone analogy, one thing the iPhone enabled, because it abstracted away, you know, things like GPS and accelerometers and made it really easy for software developers to access those things, it allowed, for example, Uber and Lyft
and Instagram and all sorts of things to focus just on the application development side of things. So it allowed app developers to be app developers, and the phone operating system provided all the functionality out of the box. And so longer term, in my view, autonomy, you know, knowing the drone will avoid obstacles, being able to do high-level functionality like identifying people and objects, will eventually allow software developers to build applications in the same way they did on the phone. Yeah, exactly. The iPhone, in a lot of ways, is basically a big touchscreen that you can put whatever you want on, and it abstracts away the hardware and makes it possible to just use that screen to interface with the user for a bunch of different applications. With drones, autonomy is likely to be sort of that key layer that separates the underlying physics, worrying about collisions, worrying about a lot of the nitty-gritty robotics navigation problems,
with the application, so that if you wanted to do roof inspection, you could be an expert in how roof inspection works and the workflows there, and write for that, rather than having to worry about what the drone's doing. For commercial applications, you don't actually want a pilot flying every drone. You just want to push a button, have the thing do its mission, or have many of them do their missions, and then get the data back that you care about. And that's what autonomy will enable, sort of taking that core layer and making it possible to do a lot of different kinds of things on top of it.
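To make the platform idea concrete, here is a minimal sketch of what an application sitting on top of an autonomy layer might look like. The drone interface and every method on it are hypothetical, invented here for illustration; this is the shape of the idea, not Skydio's actual SDK.

```python
# Hypothetical sketch of an app built on an autonomy layer. Every name here
# (the drone object, scan_area, etc.) is invented for illustration -- it is
# not a real Skydio API, just the division of labor described above.

class RoofInspectionApp:
    def __init__(self, drone):
        # 'drone' is the autonomy layer: it flies itself, avoids obstacles,
        # and handles wind, battery, and navigation. The app never touches
        # joysticks or motor commands.
        self.drone = drone

    def inspect(self, roof_outline):
        # The app says *what* data it wants; the autonomy layer decides
        # *how* to fly to get it.
        self.drone.takeoff()
        images = self.drone.scan_area(roof_outline, altitude_m=5.0)
        self.drone.return_and_land()
        # Domain expertise (what roof damage looks like) lives in the app,
        # just as app logic lived above the iPhone's operating system.
        return [img for img in images if self.looks_damaged(img)]

    def looks_damaged(self, image):
        raise NotImplementedError("roof-domain logic goes here")
```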
Well, let's talk about the autonomy. Is it the same kind of autonomy that goes into, say, self-driving cars? How is it different from other modes of autonomous robotics across the board? So in order to make a fully autonomous system, we've really designed it for autonomy from the ground up. It's not sort of an afterthought where we strapped on some extra sensors. There are a lot of similarities to self-driving cars, and we call our first product a self-flying camera for that reason, because it's taking control itself. But the way the algorithms are designed is specifically tailored for flight. So whereas a car relies on the structure of the road and staying in its lane. Right. There are certain givens. Yeah, there are a lot of givens on the road that you can't take for granted in the air. With a car, your goal is basically just to follow the rules.
The perfect self-driving car is just going to follow the rules and be predictable. Whereas with drones, we do a lot of just sort of general 3D reasoning, understanding the scene in every direction at any time. And it's kind of a different set of challenges. In some ways it's harder, in some ways it's easier. But it's also a regime that allows for a lot more inherent freedom, and there's just sort of more of a creative aspect to it. It's interesting when you talk about the givens on the road, you know,
that you think about, like a static road that you're on. I mean, do you guys have to factor in things like wind and breezes? Or is that just much older, more obvious technology, because it's basically like a helicopter? You know, how do you think about moving through space like that? Yeah, you're right. I mean, there are a lot of variables in play. And there are a lot of things that are sort of taken for granted in manually flown drones, that, oh, if that comes up, the pilot's just going to be able to deal with it. Like what? Like a giant gust of wind, for example. Oh, you feel it. You can see it. Yeah. If you're flying it manually and a big gust of wind comes up, you'll see it and you'll respond appropriately. And if you're next to a building, you'll probably fly away from that building to make sure you don't get blown into it. And if you couple that with a low battery or something else, there's just a lot of things happening. And one of the challenges has been building a system that can reason intelligently in all these different situations. And are you using cameras to do that too? Or what are the other ways you collect instant data about physical movement? So we actually sense that through a sensor called an IMU, an inertial measurement unit.
This is an example of where phone technology has kind of transferred over into robotics. Your phone has an IMU in it, so it can tell which way is north, and it can tell what its orientation is. And so we use basically exactly the same component, and you can actually use it to sense the force that's being applied to the vehicle externally. So it's constantly seeing the force that it's experiencing from the world, and then using that to estimate what's causing it and what it should do about it.
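As a rough sketch of that idea: an accelerometer reads specific force, the sum of all non-gravitational forces on the body divided by its mass. If you know the vehicle's mass and the thrust its motors should be producing, whatever is left over is the external disturbance, such as a gust. A minimal version with made-up numbers, not the estimator actually flying on the product:

```python
import numpy as np

def estimate_external_force(accel_meas, R_body_to_world, thrust_body, mass_kg):
    """Rough external-force estimate from IMU data (a sketch, not flight code).

    An accelerometer measures specific force: the sum of non-gravitational
    forces on the body, divided by mass. Subtracting the thrust the motors
    are known to be producing leaves the unexplained external force,
    e.g., a wind gust.
    """
    # Total non-gravitational force, rotated into the world frame.
    f_total = mass_kg * (R_body_to_world @ accel_meas)
    # Thrust the controller commanded, also in the world frame.
    f_thrust = R_body_to_world @ thrust_body
    return f_total - f_thrust  # leftover = external disturbance

# Example with made-up numbers: hovering level (identity rotation), motors
# producing exactly hover thrust, but the accelerometer reads a sideways push.
mass = 1.0                                  # kg
accel = np.array([0.5, 0.0, 9.81])          # m/s^2 specific force (z = up)
thrust = np.array([0.0, 0.0, 9.81]) * mass  # N, hover thrust
print(estimate_external_force(accel, np.eye(3), thrust, mass))
# -> [0.5 0.  0. ]  i.e., ~0.5 N gust pushing the vehicle sideways
```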
And the drones are constrained on the sensors in a different way than with cars. Yeah, there are much stricter limits on the sensors and compute that you can carry. And that's because of the size of it. Because of the size and the weight. Everything that you're carrying, you have to have propulsion to lift off the ground. So, for example, LiDAR is very popular with self-driving cars and is not feasible for drones. Well, it's feasible for drones, but it's not going to work very well. We did a lot of work with LiDAR, but the vehicles were super heavy, super expensive. So when we started Skydio, we made a big bet on vision, because we felt like the progress that's happening in computer vision now, especially with deep learning, but even in sort of traditional geometric vision, is just super fast, and the amount of information that's in images is incredibly
rich. Extracting that information is challenging, but the tools to do that keep getting better, and we're really riding that wave. We have 13 cameras on the device, they see in every direction, and then at its core it has a super powerful computer. We're using the NVIDIA TX1, which is basically like a deep learning supercomputer that uses the same GPU design, the same architecture, that's found in their cloud compute systems. And all of that is necessary to run the software and flight algorithms that give it the autonomous behavior. So break it down: what does that actually mean in Skydio's case? You're using cameras plus intelligence instead of LiDAR, right?
What are the layers that make that work? We've developed what we call the Skydio Autonomy Engine, and that's basically the complete system that does all the perception, decision-making, and control of the vehicle. So it's a technically very complex thing. But it's actually fairly intuitive to understand what it's doing, because it's similar in a lot of ways to what people do. Like, it basically processes visual information.
It uses the visual information to figure out the 3D structure of the scene, so where everything is. And then it builds up sort of a deeper understanding of what the different kinds of objects are. So in particular, we care about people, and we care about objects that we might run into.
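The conversation skips the math, but the textbook way to recover 3D structure from cameras is stereo matching: find each pixel's match in a second camera's view and triangulate. A minimal sketch with OpenCV, where the file names and calibration numbers are placeholders, and the real multi-camera, learning-heavy pipeline is far more involved:

```python
import cv2
import numpy as np

# Left/right images from two rectified cameras; file names are placeholders.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# For each pixel, find its horizontal shift (disparity) between the views.
# Parameters are illustrative; numDisparities must be a multiple of 16.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

# Depth falls out of similar triangles: depth = focal * baseline / disparity.
focal_px = 600.0    # focal length in pixels (made-up calibration)
baseline_m = 0.10   # distance between the two cameras (made-up)
depth = np.zeros_like(disparity)
valid = disparity > 0
depth[valid] = focal_px * baseline_m / disparity[valid]
# 'depth' is per-pixel distance: the raw 3D structure that the obstacle
# map and the planner consume.
```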
And then all of that information goes into a planning system that balances a bunch of different objectives about what the drone's trying to accomplish. It's trying to capture smooth video. You definitely don't want to run into things. We need to obey the maneuvering limits of the device. So all of these things are constantly being traded off for it to decide what it should do. Thinking about how this tool is navigating when it's following you around and
mapping its way through space, a different kind of space, right, because it's air. How does it actually do that? How does that work? How does it choose where it's going? So one of the keys to getting intelligent behavior is you actually have to predict what's going to happen. Because if you don't predict what's going to happen, you end up being purely myopic and you just react to whatever just happened. The path planning algorithm is actually predicting about four seconds into the future. So it's looking at what you're doing. It's looking at the environment around it. It's reasoning. And then it's using all of these sort of future possibilities to figure out what it should do. And even though it's predicting four seconds into the future, it's not waiting to the end of
those four seconds to then decide what to do. It's constantly doing that. So many times a second, it's making these predictions with the latest information about what you're doing and what the environment around it is doing, and then updating its notion of what it should do based on that.
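What's being described is, in spirit, receding-horizon planning: predict the subject and the scene a few seconds ahead, score candidate motions against several weighted objectives, execute only the first instant of the winner, and immediately re-plan. A toy 2D version, with every model, objective, and weight invented for illustration:

```python
import numpy as np

# Toy receding-horizon planner in the spirit of what's described above:
# predict a few seconds ahead, trade off competing objectives, execute only
# an instant of the best plan, then re-plan with fresh information.
# The point-mass model, objectives, and weights are invented for illustration.

HORIZON_S = 4.0                              # how far ahead to predict
T = np.linspace(0.05, HORIZON_S, 80)[:, None]  # prediction timesteps

def rollout(pos, vel):
    """Constant-velocity prediction of a 2D position over the horizon."""
    return pos + T * vel

def cost(traj, subject_traj, obstacle, vel):
    follow = np.linalg.norm(traj - subject_traj, axis=1).mean()  # stay with subject
    clearance = np.linalg.norm(traj - obstacle, axis=1).min()
    collision = 1e3 if clearance < 1.0 else 0.0                  # hard no-fly margin
    effort = 0.1 * np.linalg.norm(vel)                           # prefer gentle motion
    return follow + collision + effort

def plan_step(pos, subject_pos, subject_vel, obstacle):
    subject_traj = rollout(subject_pos, subject_vel)  # predict the person
    candidates = [np.array([vx, vy])
                  for vx in np.linspace(-5, 5, 11)
                  for vy in np.linspace(-5, 5, 11)]
    best = min(candidates,
               key=lambda v: cost(rollout(pos, v), subject_traj, obstacle, v))
    return best  # executed for one tick, then everything re-plans

v = plan_step(pos=np.array([0.0, 0.0]),
              subject_pos=np.array([2.0, 0.0]),
              subject_vel=np.array([1.0, 0.0]),
              obstacle=np.array([5.0, 1.0]))
print(v)  # a velocity that keeps up with the predicted subject path
```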
How about some of the design choices you're making as you're combining these different types of technology, right? The propeller aspect and the weight and the cameras and how they're all fitting together, the whole stack: what were your considerations? So I think one of the overall things here is there's this general idea of, like, the RC helicopter that's gotten better and better. But then there's this other sort of paradigm of, like, what if this thing's a flying computer? And we've kind of come at it more from that direction. And that's reflected in a lot of the design choices we made. So we really do think about it more like a flying computer. And we've tried to make aesthetic design choices that reflect that. So one example of this is that we built in this perimeter blade guard structure as a first-class citizen of the device. So it's made out of carbon fiber to be super light and stiff. And it means that you don't have exposed propellers around the perimeter. Right. You don't chop off your finger. Yeah. So it gives it a significant safety benefit for flying around people. But it also, I think, is just closer to the kind of things that we're used to sliding into and out of backpacks, because it's just sort of this
single compact thing that you can, like, hold anywhere you want and is easy to manage yourself, as opposed to having propeller blades sticking off and landing gear hanging down and things like this. So we've talked a bunch about the under-the-hood technology. How about, on a more basic level, what the personal experience is like, of how the technology actually works when you first begin using it? So, like, the goal is to make it really easy to capture amazing videos from this dynamic perspective that you wouldn't otherwise be able to get of you doing your favorite activities. So there's a lot of complex technology under the hood to make it all possible, but the end user experience is pretty simple. The way this works is you take it out of your
backpack, you turn it on, and you're controlling it from your smartphone. So you can hold it in your hand and swipe up to take off, and it'll take off from your hand, it'll fly away from you, and turn around and look at you. We have a deep neural network that's been trained on a bunch of different datasets, some of them our own, some of them open-source datasets,
to recognize people robustly. So when it looks at you, it'll know that you're a person. It builds up a unique visual identifier of what you look like based on your appearance, your clothing, things like this, and it'll use that to tell you apart from other people in the scene.
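A rough sketch of that "unique visual identifier" idea: represent each detected person as an embedding vector and follow whichever detection is most similar to the descriptor stored at takeoff. The embedding network itself is abstracted behind a function here; this is an illustration of the pattern, not Skydio's implementation:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class SubjectTracker:
    """Follow one person among many by appearance similarity (a sketch)."""

    def __init__(self, embed):
        # embed: a function mapping a person crop to a descriptor vector,
        # standing in for a trained embedding network.
        self.embed = embed
        self.descriptor = None

    def enroll(self, user_crop):
        # Built once at takeoff, when the drone turns around and looks at you.
        self.descriptor = self.embed(user_crop)

    def pick_subject(self, person_crops, min_sim=0.7):
        # Among everyone detected this frame, follow the most similar person;
        # below the threshold, report the subject as not currently visible.
        if not person_crops:
            return None
        sims = [cosine_similarity(self.embed(c), self.descriptor)
                for c in person_crops]
        best = int(np.argmax(sims))
        return best if sims[best] >= min_sim else None
```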
And after that, you can put your phone away and you're done. It'll just follow you, it'll avoid obstacles, it sees in every direction, and you can go for a run, you can go hiking, biking, skiing, things like this, and it will move itself in a nice, smooth way and give you video that literally looks like it was filmed by a professional film crew. You no longer have to be staring up at it. Yeah. I mean, it looks like you have a Hollywood crew there with, like, booms and dollies and these things, moving a camera around, but it's just doing all of it itself through software.
It's like a two-step magic trick. The first step, people see the drone fly around and follow you. And when the drone does that, it's moving a lot, right, to navigate. With all the stabilization, both hardware and software stabilization, people are shocked at how the video looks afterwards. It's not just the tracking. It's the actual quality of the image that you're getting.
How do you think storytelling starts to change when we're now enabled to be the star of your own movie like that? You know, when the point of view just becomes sort of you in your life. And it's technologically possible to just let that unfold. How do you think that starts to change? I think it's really exciting. I mean, I think we're still very early days on all this stuff. But if you just think about how fundamental cameras are to our daily lives now, like the major use case for smartphones is taking pictures and videos, this is the feature that gets the most attention there. We have, you know, huge social networks that are built up primarily on top of sharing pictures
and video. And the ability of a camera to understand what it's looking at and move itself and autonomously capture footage, I think, is going to be a really powerful ingredient in the world in a lot of different ways. It's a totally new tool. Yeah. And, you know, the most interesting pictures and video are generally of people. Over the last six months of development, we've had a bunch of prototypes around the company. We've done a lot of internal testing. We've also had external beta testers. And one of the exciting things to us is seeing the footage that comes back. There's these wacky, creative ones where... Yeah, what stuff do people do? So we had an employee who basically made a music video in his driveway where, like, he had the camera flying,
he was dancing around, and there was this interaction between him and the device that resulted in this video that's kind of captivating to watch, that you really wouldn't get any other way. Like, it's not like he was going to hire a film crew to come and film this thing in his driveway. But with the device, with almost no pre-planning, he could just put it in the air and get this kind of amazing thing. I want to talk about how this actually feels, you know, how it first recognizes you and the relationship you kind of develop with your drone, right? Because it's really now a relationship thing, kind of. How does that play out? What does it feel like? I think it's impossible not to personify it, really.
And we like that. I mean, it responds to you. It acts intelligently based on what you do and what's around it. And so there is this kind of fun aspect where it's sort of your companion, and that's a real element of the product experience. Well, it also makes it part of your life in a different way, right? You're not just a bug under a microscope.
It's also interesting that you talk about the point of view changing, right? There's a perceptual distance that you're getting. There's a perspective. There's a perspective shift. It seems to me like that's an interesting kind of mental, consciousness shift that happens when you start seeing yourself from that other perspective. The broader trend is kind of a new thing in computing where,
basically, for the entire history of computers, they've been these static objects that you, like, type on or tap on, and at the end of the day, almost all computing results in just showing you an image on a screen that you then react to. And so the ability of a computer to sense its environment and act in the physical world, that's basically what a robot is. But we're entering this new sort of world where these kinds of things are possible.
It's a much more dynamic relationship, just from the technology point of view, that people have started to see with, like, Alexa and products like this. And generally, there's sort of, you know, there's been all of this incredible progress in AI in the last couple of years. But I think a lot of it still hasn't reached, you know, production products yet. But we're going to see, more and more over the next couple of years, computing devices that do feel like active agents. Active agents.
That's a really good way to put it. And most of the public is thinking of AI now as this, like, dialogue. But the relationship with you is silent. You're not talking to it. You're not telling it to follow you. It's all, like, implicit and silent. Yeah. I mean, it's kind of, like, physical.
Like, it's in your space, and it's responding to what you do. I mean, in a lot of ways, you control it through your motion, like, not through gestures, but through how you're actually physically moving. So right now the drone is designed primarily for consumer use cases. Imagining down the road, what do you think other possible use cases for autonomous drones are? There are a lot of exciting things on the commercial side, and in a lot of ways, I would say it's kind of less mature and wide open than consumer, even. There are general possibilities of using drones to automate collecting data
that either isn't collected today but should be, or is kind of difficult and manual and slow to get. So we mentioned roof inspection. There's this fairly manual, slow, painful process where people actually have to get on a ladder, climb onto a roof, and look for physical damage to inventory it. For insurance purposes, you know, it's not a particularly efficient thing.
It's pretty dangerous to the people that do it. This could be done super efficiently by a drone, but flying it manually to do that would be hard. People's roofs have trees and power lines and things like this around. And so you don't want, like, the world's best drone pilot to have to go out to your house to safely fly this thing around.
That doesn't scale. To make it work. There's a lot of other things like that where autonomy could have a really big impact. Construction sites have actually become a popular use case for drones. The drone flies over and looks at the state of the construction, makes sure things are put away safely, you know, sees how the progress is going, checks the inventory levels.
Yeah, I mean, I think, you know, infrastructure inspection, mining, bridges, you know, making sure it's not going to fall. These things are very hard to inspect with humans. A lot of these things are basically about efficiently digitizing the physical world in some way, so that that data can be tracked over time and issues and errors can be found. There's something on the order of 10 to 20 million jobs in the U.S. where people have to climb things. For example, the most dangerous job in America is climbing cell towers to inspect them.
They have to be inspected every six months or so to see if the equipment's working, and for safety issues, things like this. There's 100,000 oil rigs in the world, and the salt water can corrode the steel, and then you have gas and fire, and you can imagine the bad things that can happen if you don't inspect it properly. And today it's a very manual, intense process.
I met a company recently using drones to do that. In all these cases, they're all using kind of manually piloted drones, and so the big constraint is they have to have these expert pilots do this, right? Because imagine, obviously, if you hit something, it could be catastrophic. It's precisely for all the things
that the human scale is not enough, but we've still, until this point, been held back by the human scale. Now suddenly that's out of the picture. So I'm just going to ask: when those humans are out of the picture, in this new autonomous drone ecosystem, where are the roles for the humans in that? I think there are still humans. It's just the human doesn't have to do the dangerous job, and you can do it more efficiently. But you're still going to need all sorts of things around, like, in an insurance case,
around kind of analyzing the claim and paying it out and doing all that kind of work and bringing the drone out there. I think it's all leverage. So it means that, like, the people who are involved are working at a much higher level of abstraction and commanding a lot of resources to get the data in an efficient way. It's about strategy of data, kind of, and analysis. Yeah, and then making decisions afterwards about, like, what do we need to do, based
on whatever we've learned. The history of technology, you know, there's a sort of fundamental asymmetry where it's easy to imagine the jobs that go away and hard to imagine the ones that are created. Yeah. But the history of it shows that, you know, with every new technology wave, yeah, there are things that were done before that become obviated, but there's a whole new set of things that come along. Drone pilot wasn't even a job description five years ago.
And fundamentally, I think drones are likely to enable a lot more new areas of creativity, new kinds of value to be created. If one of the big limits to the drone space, which we're now achieving, is autonomy, what's the next state of the art? What are we pushing up against next? I mean, I think you can look at birds as an example of what's possible. Like, how far is a drone that you buy today from what a bird is capable of? I would say pretty far. These things are kind of bulky and difficult
to manage. And autonomy and intelligence seem like one key element of that. But a lot of the hardware design aspects also, I think, are relatively early and immature. And it hasn't been possible until now to build a small, lightweight device with a bunch of onboard intelligence, a bunch of sensing, and an electric power system all packed into one. There are some really powerful new combinations of those things that we're going to start to see emerge over the next few years. This is part of a broader wave, which is, if you just go back and look at the history of computing, we had the mainframes, where you had sort of one computer for every 10,000 or 100,000 people. Then you had PCs, sort of one for every 10 people, and then you had, you know, smartphones.
We now have 3 billion smartphones, sort of a computer per person, or we're going to get to that point. And now we're going to start to see kind of 10-plus computers per person. And that means, you know, computers embedded around your house, at the office, in the air, in drones, in your car, you know, VR, AR headsets. I think we're on the cusp of this kind of Cambrian explosion of computing devices. A little galaxy around each of us. Yeah, a little galaxy around each of us, all powered by AI. AI is the critical ingredient, which lets these devices understand and interact with the real world. And new interfaces like speech and gestures and the ability just to walk around and have it follow you. Gestures you might not even be aware of, right? That's right. And then just the ability to understand
the environment. And the second-order implications of this, I think, are really profound, which is, you know, we don't even know yet what people are going to do with these devices and all the applications. You know, if you said to somebody in 2005, there's going to be these amazing smartphones, I don't think people would have predicted some of the applications that people developed. So, for example, ride sharing, it wasn't a widespread prediction, right? And so I think there will be things like that. Once developers and creative people have drones in their hands, drones that are fully programmable, sort of flying computers,
what are all the new things they come up with? It's going to be, I think, a really exciting time over the next three to five years as we discover that. Well, if you had to take a wild guess and say, okay, this is the human behavior that I think is going to change. What would you say is
one of the first ways we're going to start seeing human behavior change because of this self-flying camera? People call me an optimist, but I think a lot of what computing does is democratize things that were previously only available
to, you know, super wealthy people. Today, someone with an iPhone and Google has more information than the President of the United States did 20 years ago, right? Similarly, you could film things like you can do with Skydio, but you had to have a Hollywood film crew,
right? And now you're democratizing that, in the same way that iMovie sort of democratized film editing. What happens then, right? What happens sort of second order to that? Well, I guess we're on the way to soon finding out. Thank you both so much for joining us on the a16z Podcast.
