The Data Stack Show - 24: Demystifying AI with Duc Haba

Episode Date: February 10, 2021

On this week’s episode of The Data Stack Show, Eric is joined by Duc Haba, an AI researcher and enterprise mobility solution architect consultant who most recently did AI consulting work with Cognizant. Their discussion revolves around demystifying artificial intelligence and why so many people either fear AI or place too much trust in it. Duc talks about some of the AI projects he has worked on, some successes and some failures, and points to how the data biases that humans bring into the models can radically alter the outcome of those endeavors.

Highlights from this week’s episode include:

- Duc's background with AI and getting to work with LeVar Burton (1:44)
- Demystifying AI and coming up with a definition for it (3:34)
- Misplaced fears of AI (7:53)
- Misplaced trust in AI (10:36)
- Public versus hidden AI (13:58)
- Acquiring the data needed to train AI models (23:11)
- Examples of interesting AI projects Duc has worked on (27:58)
- Where to go to learn more about AI (35:06)
- Thinking of AI as something that can help your business do better with what it's already been doing (39:53)
- Anticipating the near future of AI (44:16)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show, where we talk with data engineers, data teams, data scientists, and the teams and people consuming data products. I'm Eric Dodds. And I'm Kostas Pardalis. Join us each week as we explore the world of data and meet the people shaping it. Welcome to the Data Stack Show. Today, we are going to talk all about AI. We have a very accomplished AI practitioner, Duc Haba. He was most recently at Cognizant doing AI consulting work. And my burning question for
Starting point is 00:00:41 Duc is around how people perceive AI. And he has some really interesting thoughts on the way that AI is perceived by the public. And he talks a lot about sort of misplaced fear and misplaced trust. We'll dig into that. Kostas sends his regards. He had some trouble with his internet connection, so he'll join us on the next show. So let's dive in and talk with Duc. Welcome back to the Data Stack Show. Eric Dodds here. Kostas had some internet issues,
Starting point is 00:01:11 so unfortunately he can't be here to co-host, but I'm really excited to talk to our guest, Duc Haba, who is a consultant in many things AI and came into consulting from Cognizant. Duc, thank you so much for joining us on the Data Stack Show. I'm glad to be here. All right. Well, give us, you have a varied background in AI. You've worked at small startups. You've worked at big companies.
Starting point is 00:01:39 So just a quick one or two minute overview of your background would be great. Sure, sure, sure. So I started school in Omaha, Nebraska. I did CS majors, go to Xerox, the Upstate New York, then go to Xerox PARC where I do most of my research on AI. Back then we called it Expert Systems. I had a big dream on a rule-based system, but it didn't turn out so well rule-based system wise. Then after a while joined Oracle working as the principal architect in their web groups. From there on they do bunch of startups, some very successful, some not so successful. Oh I forgot to mention one of the ones who is somewhat successful is like I can work with Lavalar Burton, which if you tracky friend know,
Starting point is 00:02:26 that would be Jody from Star Trek called Reading Rainbow App. Reading Rainbow. I was going to say LaVar Burton. He's a part of my childhood. Same here. Literally, I'm Star Trek. When I work with them on that project,
Starting point is 00:02:41 I like, oh my God, LaVar Burton. And the crazy thing is you know every time we go we in san francisco time we go to lunch is that i can take him to the place i go 10 times before no one ever know who i am he come in and just everybody stopped every just line up talk to him i like dude he's like you know there's enough time no one know who i am but it's just amazing the guy's super nice the guy guy is just super, super nice. Love all. Yeah.
Starting point is 00:03:07 So anyway, yeah. So anyway, after a bunch of startups, I'm not working so well. Join Cognizant, find help you're doing senior solution architect, well, senior mobility solution architect and AI scientist. Then recently, work on my own with a bunch of friends doing pretty much neural networks or artificial neural network type of work, data cleansing, building model for image recognitions and what have.
Starting point is 00:03:34 Well, Duke, when you and I connected before the show, you brought up the phrase demystifying AI, and I couldn't be more excited to do an episode on this because, you know, I'm far, far less technical. And I think I fall into a group of people for whom, you know, you sort of know a little bit of what's going on under the hood with AI. And then you see all of the publicity around AI. And those two things don't line up a lot. And somehow AI has developed a little bit of a shroud of mystery. And one of our previous guests who does some AI stuff, we talked about AI briefly on the episode.
Starting point is 00:04:16 But he said AI is kind of developing a bad brand because people don't necessarily understand it so first question what is AI give us the non mysterious definition that you know both technical and non technical people can understand yep the same I've been working in export system in AI for years and years now, so in college onward, is that there is no clear definition of AI. That means no one agrees on what AI is. So the short answer is anything is AI, from chess program to, you know, from Facebook to Alexa to Siri to self-driving cars, because we have no clear or conclusive definition of what AI is. So anything
Starting point is 00:05:06 is AI. In colleges and universities, a lot of professors and a lot of us who write white papers will use the term AI also very broadly. And the thing to AI is really broad as a field. I'm not expert in AI. I'm expert in artificial neural network, which is a sub-branch of that whole big field. And so the clear definition is that if you have something as a baseline, and if this machine can do better than your baseline, therefore it's an AI. Let me give you an example, clear example. Let's say you and I play flipping coin, right? The baseline is that there are 50-50 chance we can call head or tail. If I build a program or a neural network or just computing algorithm programs, that can predict 75% correctly.
Starting point is 00:06:02 That's pretty good because that's better than what we human perceive as intelligence. And using the same example, if I actually can do that, I can take the same idea, go to Las Vegas, and play roulette. I can put in black and white. Well, it's not exactly 50-50. The house has a slight percentage over it. But I still, if I got 75% correct predicting red or black, I can make a lot of money.
Starting point is 00:06:30 Yeah, that's the type of AI that everyone wants. Exactly, exactly. And people are like, why can you do that? Well, in my field, I need a lot of what are called, first I need a baseline, see what a baseline is. The second big thing I need is called label data. I need a bunch of label data if I can train on that field. So, I mean, the thing that makes it so confusing, it's not only Hollywood make it like Terminator 3, AI kill everybody, to business who claim right now, everybody claim to have AI.
Starting point is 00:07:03 And nobody can tell them they're lying because there's no clear definition. You know, it's not like apple pie. I have apple pie. Well, it can prove it, what an apple pie is. And the thing that make consulting like myself or a lot of work in business, when we do build AI for other people, become a big problem. Because you cannot tell me,
Starting point is 00:07:25 hey, do go build me an AI portal site on AI curated podcast. Well, what exactly is the baseline? You know, what I'm trying to do. So a lot of business, when I do consulting work, I always have problem with, I have to tell the client, well, first let's start with what the baseline is, what the base model can do, and it's data for me to do better
Starting point is 00:07:52 than the baseline. So I know it's a field that got bad rap because there are so much things, sort of mistrust and misplaced fear put into AI just like last november i'd uh on tech talk a really knowledgeable person come up and talk about what his topic is
Starting point is 00:08:15 intimate danger of ai and i listened to the whole thing i like he write in so many ways but he misplaced all the fears so his whole thing for example is in with with facebook news feed right yes yes there's an ai magazine who recommend hey duke you should watch more of this video hey duke you should watch more of this more sports or more do yourself videos byi so it's AI. And he said that actually changed people's behavior. And it changed how people learn about new things as well as how people behave in society. And he is right. But his conclusion, there you go.
Starting point is 00:08:59 See, that's why I was afraid of AI. Sure. I don't think that's totally misplaced, right? Because that model of AI, the newsfeed, the decision to make, make sure people spend more time on YouTube and make sure people click through more of the related products next to it.
Starting point is 00:09:23 So that was the thing that the AI trained to do on YouTube. But that is a human decision. That's some DP who made that decision. It's not the AI who made the decision, right? Exact same setup, the same model. If I put into different data and I label differently, let's say my goal is for Duke to explore new educational stuff on YouTube. If that's my goal, I can use the exact same model to train it. And the result will be totally different. Right. So the fear is misplaced.
Starting point is 00:09:57 It's not the AI fault. It's the VP fault who make that is their goal. That make sense at all? Really interesting perspective, right? It's the intention behind the technology, but people sort of place the, they ask the technology to bear their responsibility for what happens. Okay, so misplaced fear, and then you mentioned misplaced trust. So that was a really great example of misplaced fear. Give us an example of misplaced trust in AI.
Starting point is 00:10:36 What does it look like on the other side of the coin? Yes, so the other thing too, because of Hollywood, we keep thinking of AI, artificial intelligence. It's super intelligent. Like, you know know it's big giant brain somewhere in Google. So people always expect the highest behaviors, right? So there are AI systems that are able to look at skin lesion and tell you whether it's cancer or not. And it's better than human. And Google translation, also done by neural network, is on par with expert humans, right?
Starting point is 00:11:17 So the mistrust is when AI stumble, make a mistake, like self-driving car, right? The thing is that we should not have that high expectation for for what I call digital intelligence. Right. I play chess, for example. Yes, I can write a program that beat me in chess and I'm nowhere close to master level. I mean, I like the junior league level, but I am intelligent. So machine can beat me. Therefore, it's also intelligence. But I don't expect it to be beating the grandmaster of chess. So I think the mistrust came when AI made a mistake. Again, AI made a mistake because people like me don't, don't code data well enough or didn't
Starting point is 00:11:59 do some data biasing. So two big things I keep going back and forth is data biases and baseline. Because of how I feed in data, how I teach the models, it's not perfect. So if your expectation always something is perfect, if you make one mistake, you throw it out. I think that's so unfair to the digital intelligence, right? Sure. And our expectations always, like you have to be better than human, have to be like perfect, perfect, perfect,
Starting point is 00:12:30 which is impossible for an AI to live up to that expectation. Yeah, you know, it's reminding me, you know, I come from a background in growth and marketing and there is a lot of talk in that arena about applying AI to advertising, and that will solve all your problems. And people charge an immense amount of money for products that do that. And there are a lot of people who have been very disappointed because of all the factors that you mentioned.
Starting point is 00:13:04 You have to have a lot of the right data. You have to set your expectations. There's time for the model to learn. I mean, there are just so many factors that go into it. It's not necessarily plug and play depending on the context. But I will say, thinking about AI as plug and play, you had mentioned before the show, this concept between sort of publicly available AI and then hidden AI. And that to me was, I think, a really insightful point on people group AI into sort of a single term, but the ways that we experience it and sort of the, you know, the products or experiences that leverage AI are really, really fundamentally different. And could you just tell us a little bit about sort of public versus hidden AI? Yeah, yeah, absolutely. Public
Starting point is 00:14:00 AI, for example, like Apple series, Alexa, self-driving cars, and those are other that we do on a daily basis. And in most cases, Siri is pretty good, right? I mean, beginning all wrong now, it's pretty good, but people still can trip it up, right? Yeah. Oh, the expectation is still high. It's like something like, you know, you have to be perfect. Like, if I call to you, Eric, and ask you a question, I expect you to give me the right answer all the time. Sure.
Starting point is 00:14:28 That would be like, I can't put expectation on you, but yet I put expectation on Ceres. However, on business side, we do what I call Hint AI, with AI run behind the scenes. Actually, you'd be surprised to learn how much AI actually take over the business world in terms of some key decision. And let me give you a clear example.
Starting point is 00:14:50 I worked with one insurance company who trying to build basically an adjuster. When you're not got to call accident, got all data together, first design, who 50 but 1% more from which side of the parties and working all to the plan that I buy and how much I should pay and how much a company should to reimburse me on it. And that is behind the scenes. That actually, it's a neural net who's designing that goal. And it's something people scare because that's something you cannot opt out. You and I can't opt
Starting point is 00:15:26 out from Facebook or Twitter, but we can't opt out when it comes to payer insurance, right? Interesting. Sure. And the thing is, a business like they know that it's not perfect systems. That's why they have people in between. So it's not like we submit the data and outcome your payment and you have to follow it. Someone, a person in between, So it's not like we submit the data and outcome your payment and you have to follow it. Someone, a person in between, we look at that recommendation and okay or not okay based on that.
Starting point is 00:15:52 But the expectation in business and even to Facebook recommend feed or YouTube recommend feed, they don't expect to be 100% correct, right? But behind the scenes, companies say, well, that's better than what we thought now the better now baseline right if our baseline just randomly picked out what popular
Starting point is 00:16:12 out there and so that's a baseline but if this can do better than that we'll use it instead we don't expect it to be 100 correct we don't expect you to see that anything feed to your Facebook is exactly what you needed, when you need it. Same thing with YouTube. But the expectation for public AI is that we want it to be perfect. I want Siri never make a mistake when I ask the questions. I want self-driving car never crash or have fender benders or never speed above the speed limit. So I think the two different have different expectations and there are more work in the hidden AI side on the public AI side. It you know it sounds like and it makes sense as I think through
Starting point is 00:17:00 this that fear seems to be more associated with hidden AI, public AI. I'm just thinking about Siri and I mean, it is shockingly good, right? And Alexa is shockingly good, right? I mean, just things, I mean, it's weird because you think about, well, you know, of course, you know, you know, I can understand different voices, right. From people in my family. Right. But, but the fact that Alexa can do that is pretty amazing with the level of accuracy, but it isn't perfect, right. It gets stuff wrong. And so that I haven't, I haven't thought of this before this conversation, but that actually makes me more comfortable with it because I know that it's not perfect, right? Whereas something
Starting point is 00:17:45 behind something that I can't see, I don't necessarily know when it's getting things, you know, right or wrong, or even where it's at relative to the baseline, whereas that's like a visceral experience with public AI. Yeah. And I think that mistrust is in place of fairness, right? So it's on to how I train the data biases, right? If I know my data is biased, some clear example gave, you know, like skin colors, people look in training data only with Caucasian. But actually the more realistic example
Starting point is 00:18:24 is that right now there are people training AI to do triage. When you first come in, you would say for COVID-19 or regular triage or in war, let's figure can Duke be saved? Can Duke get enough pain?
Starting point is 00:18:39 Eric, more pain, get him in first. Triage is a really important thing, right? And really life and death situation in some time. Can we trust a neural network to do that, right? So my point is that if you set up a baseline, how well can the nurse or the doctor do that in the hospital?
Starting point is 00:19:00 Let's say, I don't know percentage. I'll just make one up. Let's say the 80% are correct. If an AI network can build up to 90, 93%, I think we should use it because it's better than work out expert. And it doesn't walk down with the day-to-day problem that might affect a person. But let's say, what if the data is not trained correctly?
Starting point is 00:19:27 Let's say, I never train the data with old people. I mean, that's obvious bias. Let's say I never train them with old people. So when opal come in, they're always diagnosed as wrong. So I think the mistrust is not of the AI. They should mistrust on how well the data represents when you drink that bottle. Does that make sense? Sure. Yeah, you know, it's interesting.
Starting point is 00:19:53 And I want to move on to a couple of real-world examples just from different contexts. But I will say, just thinking about sort of mistrust, trust, public, hidden AI, it is really interesting. AI seems to sort of press on moral or ethical questions way more than lots of other technologies. And it seems I'm not an expert on this, and I haven't spent a ton of time thinking about it, but when we generally experience technology as something that we have full control over, or at least sort of have the ability to configure at a very detailed level, but when technology begins to make decisions, it really pushes on sort of this, these moral and ethical questions that we have, right?
Starting point is 00:20:45 Because it's, it's, it is making decisions, right? And even the way that I talk about it, right? I'm personifying code, right? Like, which is crazy. Yeah. And I think the thing is, I want to demystify AI. A lot of people know, like, you know,
Starting point is 00:21:02 we, the AI scientists, well, we're the same as everyone else, make the same mistakes people do. We de-mystify, so that way you can say, hey, wait a minute, this system is biased, biased against me, you know, for any number of reasons you think. And you can write them and say, I think your data is biased. Did I have you train your data with my profile, with people like me? If something dealing with, you know, personal matter, like me if something dealing with you know a personal
Starting point is 00:21:25 matter like medical or something like that so the more people understand about how ai build then the less fear you have about it right and sure yeah i could see why i could see why people mistrust ai for me is that you know the the casing point on YouTube is that it's a senior VP who make that decision. You know, it's not AI magically. And AI doesn't have consciousness, by the way. Sure. Yeah. Well, you know, before we jump into the examples, I actually think it's worth digging into the technical side a little bit because one thing that I think, and I mean, we have our audiences full of really smart developers, many of whom are immersed in the world of AI and AI practitioners.
Starting point is 00:22:14 But we also have people who are more on the data engineering side, right, or maybe less exposed to the specifics of AI. But one thing when you talk about training a model is that there's a threshold for an amount of data that's needed in order to produce sort of what you like the desired results on the other end, right? And so there can be in many ways a quantity problem, right? I mean, a lot of if you're're a startup and you want to do AI with your data, I mean, many startups are limited simply because they just don't have enough of their own data to train a model, right? It takes a lot of data. You do this type of work every day. So help us understand what is the economy of scale that you need in terms of data, right? And we can even use specific examples. So let's say, you know, if you want to build a triage system or insurance claim adjuster,
Starting point is 00:23:09 how many data points are we talking? Yep. So that actually was a good question. That question also happened in business all the time. And the real answer is, I don't know, we don't know until we build your system. Oh, they have enough have enough data. They didn't converge. But the good news is that we have something called transfer learnings. For example, let's take a real example of something really easy, like whether the mole on your skin, you know, does cancel or not, or recognize faces in your family, make sure all these people are from your family. In those cases,
Starting point is 00:23:46 sometimes you don't need a lot of data at all because when you transfer learning, meaning you like a ResNet-34, ResNet-50, ResNet-101, those models have been trained over millions and millions of images already. So it already knows a whole lot of things about color, edges.
Starting point is 00:24:04 It just knows a lot. So enable for me to train, I take that to the model. By the way, if a simplistic term, a model is like a big function. So I take that big math function. I already have preset parameter for me already. I could train it a little bit on what I need. For skin cancer, you probably need something less than two or 3000 images with label images before you can actually train it
Starting point is 00:24:32 and convert correctly. And of course you set up your base model and I'll show you base model better than that. More data or we can get your model more accuracy above beyond that. So the number of data you need is really important. It's how the model, how the accuracy model come out to be. But with today's technologies, especially transfer learning
Starting point is 00:24:54 and fast AI make it really easy to use transfer learning, you can surprisingly build a lot of fun things. For example, for fun with my niece and nephew, we want to build something recognized, form animal. So you can walk around with a phone, take up a picture and say, there's a cow, a goat, a duck, and what have you. Oh, fun. Yeah. So I did that. And I got all the niece and nephew run around with their camera, take pictures, and they can do some searching some searching on on google and and bing search that's something but mostly from their own pictures that and all we need i think
Starting point is 00:25:30 all together we have a little bit over 8 000 images and we can tell different between uh between nine different animal form animals and it converts to 98 accuracy, wow. So to let people know is that the easiest way for business to start out with AI is just try it. Yes, data is important, but you don't know how much data you need until you try to train it. And your model will quickly can tell you, oh, it will not converge. It needs more data than that. Or you find out the data you have wasn't labeled
Starting point is 00:26:05 correctly or it have too many biases. So the whole point, if I take only pictures, for my example, of the farm animal, if I only take only picture from Google or Bing, they're all perfect photograph picture of chickens and ducks. But that is how kids take pictures, right? Kids picture never perfect. They never lightning put anything. So if I train only on those model, it may converge while giving high accuracy, but in real world test, it's just not well. I mean, they are too biased for professional pictures. Does that make sense? Yeah, it makes perfect sense. And, you know, it's, I am actually, just during this conversation, just becoming more aware of my own sort of biases when it comes to how I view AI, right? Because like many things, and you didn't exactly say it this way, so I'll
Starting point is 00:27:02 extrapolate a little bit. But like many things, the answer to a lot of these questions is it depends, right? And your baseline and then the specific thing that you're trying to accomplish drastically, you know, can drastically change the way that you're sort of leveraging AI, you know? And so it's just interesting, right? It's very easy to think of AI as, well, you just give it anything and it'll figure out how to make use of it, right? But that's not right. And I know that from a technical standpoint. But as I hear you saying that, I'm like, yeah, well, can't AI just figure that out, right?
Starting point is 00:27:41 But then in the back of my mind, I'm saying like, of course not. I'm having so much fun in this conversation. I skipped over some examples. So I want to be sensitive to time here. That was a really fun example, but why don't you give us maybe one of the most interesting AI projects you've worked on in a business? Yes. Actually, I think before this, I wrote down three.
Starting point is 00:28:01 Let me pick one of those three. Okay. I'll pick, this is a project, a little background of project. So I was with this car company, really big car company in Detroit, automobile in Detroit. So working on some, you know, data transformation,
Starting point is 00:28:18 IT work behind the scene with them. And when I meet one of the marketing person, just during lunchtime, and we're talking like hey and they know I'm in AI and what have you say hey you know we need something fun with AI on on our trade show worldwide card trade show that do every year and I was like hey you know we'll do some fun and we just do long we're talking over and I was like hey what is something like this just on the spot if you we have a big screen display on on the tray show if you walk by it will say hey eric you are the full focus guy dude walk by and say hey look you are the four tundra pickup jeep guy and it got a
Starting point is 00:28:59 crack so people like oh my god you know this is like yeah this ai of intelligence ai wow that's fun yeah it will be a great idea right then then he said well give us some thoughts actually she said give us some thought and come back maybe a real proposal so i i gave some thought and say hey you know is i know this company will will not only require all their employees to buy their own brand of car. You cannot buy a different brand of car. But even a consultant, if I drive a different kind of car, I couldn't park in the parking lot at the park. I don't know where, somewhere way off. So I said, what if you gave me the data?
Starting point is 00:29:43 I don't need to know the name. I just need to know a picture of them, their family, their faces, however they want to pose with their vehicles and send it to me. So I got pictures of themselves posing with their car, their family in the car, however they want to take it, give me all their pictures
Starting point is 00:30:02 and give me the label of the car that that they actually on and i'll train it and and basically see how well the model converge surprisingly the model converged really well so we do a few tests and get this interesting part of it right so we do a few tests within their power plant we set up camera and everything people and in the in the power plant walk by and say hey look at this they look and they smile at it and say ah i think you are driving a full focus you know and and we take note on it when it's right when it's wrong so in our model we got 89 percent accuracies then later on we're trading to 94 percent accuracies which is tremendous. And everyone was happy. She was so excited. The whole time working well, I was like, go ahead, do it. We'll pay for developments of that. And we need it for
Starting point is 00:30:53 this day for our trade show. Wow. That is really fun. Yes. But here's interesting part, it failed. Oh, really? Yeah, yeah, yeah. And the story is that that so i'm in a trade show and we're really happy i spent uh four days before that prep in the system they even okay for me to to be there at a car uh tracer which is crazy crazy big new little people walking by so we set up the system and we do it and of course we we ask people politely there's a button on the bottom you say am i correct am i wrong you know just two buttons so let's see we can collect some data so turn out to be we literally about only 46 percent correct in the threshold but in the test
Starting point is 00:31:38 case we prove a 94 percent accuracy so it's like, oh, what do you do wrong? Did you do something wrong? And it didn't occur to me until I looked at, you know, the image that got wrong to the image that got correct. Turned out to be in their bias, Eric. Because the ethnicities and the people working in debt companies in Detroit, pretty much similar in terms of ethnicities and race. But in a trade show, they got all kinds of pure truth, all kinds of different ways. Oh, interesting. My data set does not account for that.
Starting point is 00:32:22 Interesting. I wouldn't train on multi-international data set. I trained on data set that people live and work in Detroit. Wow, yeah that's fast. I mean again going back to the point of you know even though you had a model that was working well with a certain data set like you can't just apply that in any context and make it work yeah and i should have known that i mean oh it was such a tense time after what they you know my company i worked for at a time was not very happy with me the client not very happy with me no one happy with me but but legally i'm okay because you know we set a baseline and during the during the the show, we, the baseline
Starting point is 00:33:05 is 76% accuracies. We're doing a lot of math and statistics in front of that, very good numbers. Sure. I beat the baseline. So, you know, it's a successful project deliverable per customer specs. And I just think the odd thing is I, I know, so I have to find out the problem. I talk to my team, my team's like, no, don't, don't tell them that's the problem. Because I'm willing to tell, hey, your company never diverse.
Starting point is 00:33:28 No company want to hear that. Right, right. Sure, sure. So, you know, I give them some general excuses on the model wasn't trained long enough and I need more extensive data, blah, blah, blah, blah, blah, blah. But it's always stayed with me as it as one of the real projects I'm working on that I thought is really successful. And I, I, I dream of it at the beginning.
Starting point is 00:33:52 I follow it through, I code it. I work with some friends who deploying it and in general have all a lot of fun throughout the whole process. Right. Because it's a fun thing. It's not like life and death thing that I worry about. Sure. But that's for sure a success.
Starting point is 00:34:09 Then like, and that leads to the mistrust of AI again, like, oh, AI sucks. Sure, sure. AI didn't suck. Duke sucks. He didn't think about international data he's looking at in real world.
Starting point is 00:34:21 Sure. Well, I mean, I can just tell you that, Duke, that I really appreciate having an inside view into sort of the human elements behind AI and you just being very transparent and vulnerable and sharing that story. I appreciate that. Yeah. Well, we are on time here, so let's get a little bit tactical. So again, some of our listeners, you know, what we've talked about will be old news to them, but we probably also have some listeners who have maybe dipped their toe in the water of AI, but would love to learn more.
Starting point is 00:34:58 Where should they go to learn more? If you want to get into the practical, sort of, I want to try this out, how do you do that? Yes, yes. I will definitely answer the question, but I want to say something before that, which is AI is more difficult than web or mobile development
Starting point is 00:35:17 because during those phase, anyone can go to a bootcamp for three to four months and become proficient in programming web or mobile app. AI is a little bit more difficult than that. So anyone telling you, go to my boot camp and three months later you become an AI scientist, most likely not true. With that said, the way I look at AI is more fun when you pick up a course or language that you can write code and program in from day one. A lot of people teaching AI say, hey, you need
Starting point is 00:35:58 four years of college background information or doctorate degree before you can jump in AI. Learn all the theory first. And as a researcher way back when in Zod Park, that's such a long way to do AI. So pick up something like AI normally all start with Python. On top of Python, you have either TensorFlow or PyTorch. TensorFlow from Google, PyTorch from Facebook. On top back, you have another layer you can build. You can use that directly to build out your layout. But I2 Fast AI would sit on top of PyTorch and it's easier because Jeremy Howard has an excellent course on that. And they are free, by the way, online from San Francisco. They teach you more than just memorization,
Starting point is 00:36:44 what all the terminologies and memorize how to do things. They teach you more than just memorization with all the terminologies and memorize how to do things. They teach you how to do it exactly in the coding from coding point of view. So the way I'm keep explaining this to my colleague is that I say, this is not learning chess by memorizing all the move, this is by learning chess by learning
Starting point is 00:37:00 how each the pieces are moving. Now the pawn go up to the root go horizontal vertically, the bishop go diagonally. So these fast AI or pie torch will teach you how the thing actually moves. Then you can learn, you know, oh so this is a chess, you know, Romanian defense. I'm, oh now I can draw that makes sense. You can learn, oh, so this is why different from linear algebra doing the forest assessments. So I think the best way to learn them is get to a good course
Starting point is 00:37:34 and make sure that course is have more hand-on practice than theories. Because theories do go outdated very quickly. Give an example in the real technical example. In neural network, the fit rate is something super important. If you got wrong, your mind never got to be using. Three or four years ago, all AI scientists, we just like educated guests on that, and it's just really bad. Like some success not bad but now with fast ai using uh known no method how to drive at the fit rate and it's always converged so something like you learned
Starting point is 00:38:13 three years ago it's outdated or sometimes when you update it before you can use it again with more success so for anyone out there who want to learn AI, I wish everyone should learn AI because it's a lot of fun to work with, but the field is so new that nothing you can do is wrong. I mean, you can do whatever you want to do. You can invent new things. You can apply to different fields. Once you learn that, you can apply to your own professional field field from music to, you know, to environment conservation. You apply to all those fields. I want to learn, you know, the basics, how to build an AI model. Sure.
Starting point is 00:38:54 Well, that is, that's really helpful. It makes me interested in doing it. If I had gobs of extra time, I might look into it. Two more questions for you. So one is something that we talked about a little bit before the show, but some advice on hiring a data scientist or an AI scientist. You've just seen so much of it is one. And then after that, the last question I want to ask is about the future of AI, but we'll save that. Help us understand, especially for companies, and I'm kind of thinking about maybe some of our listeners who are leading tech companies or who are in tech companies, and they know that data science, having a data science practice in their organization could really help them accomplish some things. But they don't have a data science
Starting point is 00:39:46 practice inside the organization yet. So what are your top pieces of advice for a company that's just starting on an attorney? Yes. Before I actually answer the question, I'd also like to say that if you're a company, you should start working on AI product now. And don't think of that product as something that revolutionizes your business. Think of AI as something that can help you do better what you've already done, even with your internal IT. So don't wait for AI to be omnipotent and all super smart and Terminator type. Start AI now. Try to apply AI in your company. Back to answer your questions, I have on so many interviews, Eric, that so many people ask what I call the unicorn candidate. They need to know, I need to know everything about AI. And I keep telling them I'm only expert in artificial neural network, a subset of it.
Starting point is 00:40:40 And they keep throwing term out to me. So a lot of time I just do interview on my laptop, my older laptop, typing on Wikipedia. So I can, oh, I remember that term now. I give them the answers. But that's no way to interview AI scientists. Look at their history, how much they've been worked on, what they've been published, what they've been writing about, you know, their project that they put on GitHub. They give you a sense idea, but more importantly, looking for someone who you can communicate with. Because the first thing someone should say, should ask you when you say, I want to do AI in my company is that, what's your baseline? You know, what is the thing you're trying to do? Do you have a baseline? Like flipping coin,
Starting point is 00:41:25 do you have 50% in your baseline, 75% in your baseline? You know, so what is your baseline? What are you trying to do better than that? And AI science can help you come up with that baseline first. Then you think about, okay, how you're going to be on top of it. Hire people in diverse teams is always better than every from academics or every from consultant field, for example. Have diverse team of people. You do need a few who are hands-on coding, right? The lab AI, we talk about data, for example. Anyone can talk about data and figure what data biases is that.
Starting point is 00:42:02 Our data set has these biases. And all data have biases. So never try and get a data without biases but all people will have input in what that data is. So you should hire someone who, yes, who have technical programming experience but also all kinds of people who have experts in their field to apply to AI because once you come up with a valid baseline, the perceived success of your AI when you launch it in public, it really is a perceived success.
Starting point is 00:42:36 It's not my model, you know, has been validated for 94% accuracy or 95% accuracy. It's how people are perceiving that. So having a diverse team in your group is really, really important when you put together a team for AI. Does that make sense? Very helpful. Very helpful. Yeah, and I mean, that just sort of builds on your previous point around or your example from the trade show at the automobile
Starting point is 00:43:06 right the diverse perspectives are really going to really going to help you oh yeah i really lost my job on that one i mean that is an incredible story i was you know i was waiting for this you know grand conclusion but you know i i always say that failure is a much better teacher and the lessons last way longer than success. From now on, I never do a project without thinking, oh, what's my data bias? How people use it in the real world, right? It's not popping my head, it's not. Sure. Well, we're about at time, but I think we can squeeze one more question in here. AI is, as you said, a fairly new field. What is it going to look like in 10 years, 20 years? And I know that technology is going to advance, but we have really emphasized sort of the
Starting point is 00:43:57 human components of AI and the people behind it. So I'd love your perspective on, as a consumer, how am I going to experience AI? You know, what's going to be revolutionary for me in my day-to-day life as a result of AI in 10 or 20 years? I think, for me personally, I would love to have more AI-enabled system or robots who help me to do my work. For example, to do the work I don't like to do, like cleaning the backyard, those sort of things.
Starting point is 00:44:37 But at the same time, I want to have input into the future development of AI so everybody has their input. So it won't be blindsided on just data bias on one group or one ethnicity so bad that by the time it launches, unfixable. So I think it should be AI would build by really by communities, right? So I would post my thing, I put it up there, like hey I tried this it looked good but guess what you completely forgot about this part or it didn't build for this I like I don't tell them no it wasn't built for that it built for this so I think the future AI definitely are here and I it will be it will force upon us behind the scenes a lot of hidden ai gonna happen because i like it or not the business needs to be more efficient so they will use ai for that so the outside my control and i use that company services you know like google youtube twitter what have i use the company services so
Starting point is 00:45:37 i will have to ingest those but i hope that in the future, more people get involved in the human aspect of AI. So it's not built by, you know, by a giant group somewhere who build Terminator and boom, they take over the world the next day. It's more like, hey guys, let's fully understand what AI, how to build it. And there are systems right now in Amazon SageMakers to Microsoft Azure AI platform. They let you almost like drag and drop your component and they build AI for you. The best thing about that at this stage is that it's really confusing to use and so limited.
Starting point is 00:46:26 I wish in the future that someone had to build an AI behind it. So when I'm reading in a dashboard, I talk with an AI to build me an AI. Because right now I'm talking with a GUI to build an AI, which is really not working out well. Sure. So I think more for my work too, right? I will have more impact with those AI that I can tap on to do most of my work for me. Enable me to do more creative things, more spend more time with family, do more fun things.
Starting point is 00:46:57 Yeah. So, I mean, one quick note, in the time I did try to write an to nlp to answer all my email but i have like 10 plus year of email save up right so i thought i can train a model how to answer my email oh interesting is that right yeah you know the bad thing is eric i wrote horrible i mean just email alone i don't pay attention to grammatical error or it's full command error. Like this system is not really smart. My AI is stupid. Well, it talked like Duke that way. I'm like, hey, how about you e-order me to an API called, a system called Grammarly, which I use on a daily basis to correct the grammar files in our email, what have you. How about I e-order through that, I come back with perfect English. Then I said, well, that's no longer me talking then, right?
Starting point is 00:47:47 Sure. Which brings us right back to the human element. Yeah, yeah. So I think in the future, I really hope more people not afraid of AI, not mistrusting AI, and to get involved in it. You don't have to be an AI scientist to get involved in it. You can get involved in gathering data, do data laboring, to look at biocelum data, to think about how this can
Starting point is 00:48:08 be misused or what other more important, what can be more useful in this sort of environment. So, hey, guy, if you can do that, can you do something like this in environmental world would help. Sure. Like for another quick example, my friend who wanted me to
Starting point is 00:48:23 engage into conservation of the Saga, which is one of the endangered species, people cut up their horn and sell them in the market. Can you do a quick AI that anyone can hold up their phone and say, hey, I identify this is a Saga horn. They're not supposed to be here. No, it's illegal. And then let people know about it. Yeah, I think that's doable. So a lot of those,
Starting point is 00:48:48 I want people to get more involved in AI. Don't think of AI can do for you and reach out to people like myself who can build an AI or teach people how to build an AI and come up with those models. So that will be more good AI, I feel called ethical AI, than the evil terminator AI behind the scenes.
Starting point is 00:49:07 Sure. Over our brain. Sure. Well, you know, we're inside of a decade away from 2030, which is, I think, the year in the Terminator movies. So maybe we'll see where we're at, you know, closer to the end of the decade. Well, Duke, this has been a really wonderful show.
Starting point is 00:49:28 It was really refreshing. I mean, we love talking about the technical stuff. And I know we got a little bit technical today, but I think it was really fun just talking about the theory and just sort of public perception around AI. And I think it's a really healthy conversation to have. And I think you bring a really helpful, informed and balanced perspective. So thanks for joining us on the show. And keep us posted on
Starting point is 00:49:51 what you work on. If you come across anything cool, let us know and we'll have you back on the show to tell us some more good stories about AI. I would love that. Thank you so much. Well, that was a fascinating conversation. Duke is very engaging and has some great stories to tell. I think that my biggest takeaway was that the more that you learn about how AI actually works, I think the more that you can appreciate the difficulty of doing important things with it and sort of wielding the technology. And the fact that there are lots of humans behind it. That's a point that Duke repeated multiple times.
Starting point is 00:50:31 It was that there are people behind the actual technology and code that's running the model. And that's really important to keep in mind as we think about AI. We'll talk more about AI and data and tooling in future episodes. Subscribe on your favorite podcast platform to get notified of new episodes of the Data Sack Show, and we will catch you next time.
