Programming Throwdown - A Chatbot with a Brain

Episode Date: October 15, 2020

The September episode did arrive.... somewhat delayed, but it's worth the wait!! We have NEW INTRO MUSIC by amazing recording artist Eric Barndollar. We have a NEW PODCAST DEPLOYMENT SCRIPT which means the podcast timestamp will be correct and people won't have to hunt for our latest episodes. Last but not least, we have an AMAZING EPISODE where we interview Peter Voss, founder and CEO of aigo.ai and inventor of the term "Artificial General Intelligence", to discuss chatbots and general AI. Geeking out about AI may be my favorite thing to do on Earth, so I can't put in words how incredibly excited I am to share this episode with everyone. Show notes: https://www.programmingthrowdown.com/2020/10/episode-105-chatbot-with-brain-with.html Teamistry: https://link.chtbl.com/teamistry?sid=podcast.throwdown ★ Support this podcast on Patreon ★

Transcript
Starting point is 00:00:00 Programming Throwdown, Episode 105: A Chatbot with a Brain, with Peter Voss. Take it away, Jason. Hey everybody! So, um, yeah, I think this is gonna be an amazing episode. I think a lot of people are amazed when they work with chatbots. So either you call them on the phone and you get that automatic response or, you know, even just putting a search query into Google is, in a way, a very kind of short-form chat. And so this idea of, like, this human-computer interaction is something that's really taken off. I mean, so many people have Alexa in their home, right? And so, you know, I'm super excited about this area. And we have Peter Voss here, who's the founder and CEO of Aigo.ai.
Starting point is 00:01:02 And we're going to just dive really deep into this topic, how this actually works, cover a lot of different sort of areas here. And so thank you so much for coming on the show, Peter. Yeah, thanks for having me. Cool. How are you surviving with COVID and everything else? How are you dealing with the quarantine? It's actually worked surprisingly well for our company. We, you know, we used to work in a very small office crammed together and do a lot of brainstorming.
Starting point is 00:01:33 And now everybody's working remote and I thought it would really be very detrimental. And we somehow managed to adapt. So I've got an awesome team and so far so good. We still miss some of this close collaboration brainstorming, and I think we're missing out on some of that. So hopefully we'll be able to get back to that. Yeah.
Starting point is 00:01:55 Have you tried any of these whiteboarding, like collaborative whiteboarding apps? We've tried some over, you know, sort of over the months, but nothing. Yeah, I don't think we have quite figured out how to get the same dynamic as people sitting around the table. Yeah, we've had the same experience where we've tried a few of them. There's actually a physical one that Microsoft sells, or maybe it's Google. I don't remember, but there's a physical board where you write and as people write, it shows up. Yeah. Hugely expensive. Yeah. Yeah. It's insanely expensive. And there's delay. So some people who are listening to the show probably can't see this. But in this case,
Starting point is 00:02:39 I'm in my house. Patrick's on a spaceship. You can imagine the kind of delay that that's going to have, you know, so many light years. But, you know, you feel it, you know, when you're working with your team and you write something or you see someone write something delayed, it's kind of this uncanny valley where it almost would be better if it just had an artificial delay. It wasn't just kind of always following you a little bit behind.
Starting point is 00:03:00 Anyway, it is what it is. So we'll, you know, have'll have to just live with that. But yeah, we're doing okay. Cool. And so iGo.ai focuses on conversational assistance. And so how does this work? Is it kind of like a single assistant that you build that many people use, or do you have custom assistants know assistants for different you know consumers yeah so our product is actually now in sort of second
Starting point is 00:03:31 major generation we launched the first generation actually in 2008 12 years ago and there we focused on automating call center calls intelligently you know everybody hates calling into a call center and having a robot, you know, press one for this or, you know, doesn't understand you. So that's the market. We basically, these things are called IVRs, interactive voice response. And basically what we're offering in a company called Smart Action is an IVR with a brain. So it remembers what you said earlier in the conversation. You know, it has deeper passing, deeper understanding, has some reasoning ability and so on. So they're much better conversations.
Starting point is 00:04:10 That was our first generation focusing on automating call centers. The second generation with Aigo AI, we just launched commercially last year. Here we're focusing actually on chat channels initially, because there's huge demand. That's the biggest sort of growth area in customer interaction. And we are focusing on enterprise applications right now. So, you know, large companies that want to give hyper-personalized customer support. But we're also targeting things like a personal assistant for somebody who wants to manage their diabetes, for example. For coaching, we're working with a company that does VR coaching. And they basically need an intelligence engine to to simulate the conversation that you're having so there are many different applications but right now our focus is on enterprise working
Starting point is 00:05:12 with large enterprise companies getting into consumer market we'd obviously love to have our product available to to the consumer but it's just very hard to break into that market. You know, it's very expensive, very risky. So we're choosing to first do, you know, commercial applications. Yeah, that makes sense. And so how I imagine this, and tell me if I missed the mark here, but so you go on, let's say, Amazon. Actually, we did this the other day where we had to return something to Amazon, and we had a bit of a complicated return.
Starting point is 00:05:45 And so they said, would you like to chat to a customer service representative? And so we did, and we had this quick chat within the Amazon website. And then they were able to sort of sort it out for us, right? And so this is kind of the market where you say, someone says they're going to chat with a person on a website, but it's actually this bot. And then at some point, maybe it hands off to a person if it needs to.
Starting point is 00:06:09 That's exactly it. And basically, we're taking that whole experience to another level, because the current chatbot technology just has severe limitations for doing things like that. And I think we'll talk at length about what those are and why they are. But yes, that's exactly the kind of applications we have. And moreover, we look for applications that are hyper-personalized to the individual. Now, what I mean by that is repeat. It's more likely that they actually repeat engagement so that you can also have build on the loyalty of the customer. Or it's like, you know, as I said, diabetes management where you'd actually be using it every day. But it could also be, you know, your your bank or or a company that you I mean, Amazon would be a good example that you you may be interacting with on a regular basis. And it gets to know you. And it's hyper-personalized to you as an individual, not as a statistic of a demographic
Starting point is 00:07:10 group. But it remembers what conversations you've had and takes that into account. It uses that as context, what you said earlier, what you said last week, what, you know, what kind of interactions you've had before. And that can be omni-channel as well, that, you know, our conversational AI can take into account, you know, other interactions you've had via email or, you know, other channels as well. Yeah, that totally makes sense. I mean, one thing that I was thinking about, actually, when we were having this chat with this either customer service representative or AI. We don't we don't know because we probably failed that Turing test. But when we were having this chat, one thing that went through my mind was how, you know, when I was, let's say, early in university, I had tons of time. And so I wouldn't mind sort of like sort of kind of optimizing these kinds of
Starting point is 00:08:07 things like okay let's figure out the best way to do this let's um uh you know we could do different things we could try different things you know like send us one and if it doesn't work then you will send it back he says and now you know things are just so chaotic right now there's just so many things going on you know we have kids we have all these other people sort of in the house and so um now it's like you know just send me all three things and we'll just pick the one that works and said like time is much more important and uh um you know and that's an example of where sort of the customer context really matters because different people have different sort of states of mind when they might be doing the same thing, such as returning a product and trying
Starting point is 00:08:49 to get another one. Yeah, absolutely. Context is so important. So even, you know, even if you talk to a human, they, I mean, they wouldn't remember what other conversations you've had. You know, they might look it up in their CRMM but normally there isn't time for them to do that but with an intelligent assistant like like we providing it's it's as if you had a dedicated support person on the other side that actually remembers the last conversation you had in a good example there is you know for example telco companies where you might have problems with your cable or whatever. So you called yesterday and you had a problem and then, you know, suggested, well, try this, you know, try moving the router and put it in a different position or whatever.
Starting point is 00:09:35 So, you know, next day you call in and then it can say, well, did that help? You know, and say, no, well, you know, I know you've already tried rebooting it. So we want to ask you to do that again, you know, and say, no, well, you know, you've already, I know you've already tried rebooting it. So we want to ask you to do that again, you know, and so that's the kind of, it's even better than with a human because, you know, I go or whatever chatbot with a brain that you have will remember the conversations you've had and what you've done and what you've tried and what you've spoken about and what equipment you have, you know, whether it's been, is on the latest version and, you know, it can do tests, automated tests. It can say, hey, let me test the line.
Starting point is 00:10:17 And, you know, all of that can happen smoothly and efficiently. Whereas I know we've been working with companies like that. And usually these calls, these service calls, take half an hour, take an hour with a human agent. And it's just like super frustrating, you know? Yeah, yeah. Even resetting your password at a bank, you know? It's like insane of what you have to, what hoops you have to jump through, you know?
Starting point is 00:10:41 Yeah, yeah. Yeah, we had something recently where, yeah, I mean, the short story was yeah we we kept getting asked for different information and we were in the car and we were kind of in a hurry so we had some mixture of the information but not all of it and then that caused you know some fraud thing to get triggered which meant that then you know they shut down the whole account and then things got a hundred times worse. It would have been much better to just know, you know, you could even through the phone, you could know that, I mean, we were calling through the app, through the bank app. So the bank actually could know that we're in the car.
Starting point is 00:11:18 They could look at the accelerometer on the phone and say, okay, maybe we shouldn't be asking for documents that this person just wouldn't have in their car right now. So let's dive into that from a technical standpoint. You know, this is a really complicated area. This is one of the most interesting areas for me personally, this idea of knowledge representation, right? And so, you know, what we've seen over the years are sort of two distinct approaches, both kind of evolving separately, in my opinion, separately, and people constantly trying to sort of harmonize them. But I haven't seen anything. And these two representations are, you know, the symbolic representation. So imagine, you know, expert systems like Open Psych, the what was it called, the knowledge graph at Google, DBpedia. So all of these sort of
Starting point is 00:12:09 graphical structures for knowledge. And you can use statistical methods to sort of generate these structures. But at the end of the day, what you're left with is kind of this sort of structure or database, right? And then on the other end, you have this sort of embedding idea. So if you look at, you know, GPT-3 from OpenAI, if you look at, you know, BERT or any of these other models, you end up with just layers and layers and layers of embeddings. And so these embeddings are kind of projections
Starting point is 00:12:39 into some latent space. And so, you know, I think both of these think both of these are extremely powerful, right? I mean, we've seen extraordinary things come out of language translation, for example, where they feed in a sequence of words in one language, it goes into this huge soup of embeddings, and then a decoder emits words in the other language and it works remarkably well amazingly well um but you know to to the point you made earlier it's very hard to go into
Starting point is 00:13:13 that soup and say what do you know like it's very hard to extract from this embedding like oh this person uh you know had an issue with their router earlier And so it's sort of like once you go into that embedded space, it becomes this kind of black box, right? And so I noticed on the, you know, on iGo.ai, they talk a lot about, you know, deep learning and the challenges there. And yeah, I totally, you know, can level with that. I think that you end up with this system that's very expressive, but not very interpretable. And so what are your thoughts on that whole dynamic? Yeah, so there's a lot, quite a lot to unpack there. It, you know, it really depends on the application that you have, how useful, let me call them statistical models are. And, you know, for sort of a simple stimulus response
Starting point is 00:14:06 type thing, like a search or FAQ or, you know, something like that, that can be very powerful if you have the right training data and, you know, it's tagged correctly and so on. So that can be, you know, just very, it's the right tool for that. But there are some really fundamental limitations of statistical methods, certainly the way they're employed right now. And one of the key ones I can highlight is that it cannot learn interactively in real time. You basically train the model at the factory, as it were, and then you deploy it. But when you deploy it, it's read-only. And that's just a death blow for any intelligent conversation that you have. I mean, if you had a personal assistant, if you were talking to another human and you
Starting point is 00:14:57 said something, and the next sentence, there'd be no recognition of what they've said, or it would just be some gobbled kind of thing you know i mean let's take something that is trivial for a five-year-old child if i say just five words six words you know my sister's cat spock is pregnant you know so a five-year-old child wouldn't immediately know okay peter is speaking i have a sister a sister has a cat the cat's name is spock you might think it's male from the name, and then you hear it's pregnant. I know it's female. So that information is immediately available to a human and to any intelligent conversational AI. It should be as well.
Starting point is 00:15:40 So the next sentence might be, she's really big. Okay, well, she, the cat's big, because she's pregnant, you know, or we might ask, when will the kittens arrive? You know, and that sort of thing is just impossible. And it's trivial. I mean, it's trivial from from a human point of view, what you expect in a conversation to remember what to remember, and not statistically, but to actually integrate what you hear to immediately integrate into your knowledge base knowledge graph that information and with statistical systems a they cannot even learn uh certainly the the bulk of the ones that are out there they simply cannot learn uh do one-shot learning interactive learning um, you know, in real time.
Starting point is 00:16:30 So you're already dead in the water, basically, you know, from that. But even if they could, they would need a reasoning engine as well to say, okay, where does that knowledge that I've just heard actually fit into my knowledge graph? Right. And, of course, I'm being a black box i mean it's like impossible to know how you would even you know where you would put that into in in your in your network it's like an impossible task to do that so so it's really a complete dead end for having intelligent conversations you really do need um a knowledge graph that is not opaque and that can you have a reasoning engine that you can have one-shot learning and i mean we see the same limitations in uh in image learning even not
Starting point is 00:17:11 that it's not an area we're focusing on now but you know again uh a child can learn a giraffe by seeing one picture of a giraffe uh and it'll be able to recognize toy giraffes you know pink giraffes and you know baby giraffes and and and so on no problem with one one picture you know with deep learning you need you know hundreds or thousands of examples and and even then it can easily be be fooled so there's something something just fundamentally wrong with or limiting using big data statistical approaches that don't have one-shot learning, that don't have reasoning involved. That makes sense. If I could interject for a moment and just explain to the audience what one-shot learning is. It's this idea where imagine sort of a competition where you can bring any machine learning model you want,
Starting point is 00:18:06 and you could have done anything to this model. You could have trained it on every image in Google. You could do whatever you want. But the catch is when you go to this competition, they're going to show you, let's say, five pictures of animals and tell you what those animals are. And they're all going to be different. Say giraffe, cat, mouse, dog, right? And then that's it. So you're going to have to disambiguate, you know, a ton of giraffes, cats, mice, and dogs from each other just from seeing one example.
Starting point is 00:18:36 And so I don't know too much about this area either. I'm not a computer vision person, but I did see some work from Fei-Fei Li at Stanford on this one-shot or few-shot learning. And I think what they do is, as I suggested earlier, they train this giant model that tries to understand sort of the, I guess, the core essence and structure of everything that could be seen in the universe. I mean, I don't really, it's hard to wrap my head around that. But, you know, they take this sort of soup approach, right? And again, I think it might work in terms of you might, it might, the one-shot learning for image might be doable, but then again, there's zero interpretability.
Starting point is 00:19:19 Right. Yeah. And even you get zero-shot learning, which basically means you figure things out, you learn new things just by thinking about it, basically by cognitive processes, you know, sort of essentially inference. You have a number of pieces of information and then from that you synthesize them and you actually come up with some new piece of information. And that's also related to automatic concept formation being able to you know see a few examples of of something you know i've never seen a trike before or something you know and then you see something with three wheels and you have just you know very few examples and you can form this new concept you may have seen bicycles before and
Starting point is 00:20:02 cars and you know whatever and you can form that concept very quickly. Somebody could even explain it to you that you've never seen one before. You know, you've never seen a trike before. Somebody says, yeah, it's something like a bicycle or motorcycle, but it has, you know, two wheels at the back or at the front or whatever the case may be. So, yeah. Now, of course, the other side of things, and actually, interestingly, DARPA, a few years ago, started giving some presentations that they talk about the third wave of AI,
Starting point is 00:20:37 so that basically there are three waves of AI. And we find that quite a useful model. And the first wave of AI is what people call good old-fashioned AI, which is basically all the stuff that was done for decades in AI. That would include Deep Blue basically being the world champion chess game. It's basically centered around expert system, formal logic, and things of that nature. And, you know, a lot of AI was basically done that way, narrow AI. And that's the first wave.
Starting point is 00:21:11 And the second wave hit us like a tsunami about eight, nine years ago when people finally figured out how they could use neural networks in a very, very impressive way, partially down to just the massive amounts of data that large companies had accumulated that they never had before and the massive amount of computing power. So two of the ingredients that were missing before is having that massive amount of data and computing power.
Starting point is 00:21:43 And then a few tweaks to the algorithms. And that really is what brought us the revolution of deep learning, you know, the specialized field of machine learning. And that's been a real revolution. And that is the second wave of AI, according to DARPA. But the third wave is basically what we would call cognitive architectures or cognitive. I can't call it cognitive computing because IBM stole that term and messed it up. Yeah, that's right. But, you know, it's basically having cognition. It's being able to learn and reason, which, interestingly enough, is sort of full circle going back to the original ideal of AI, when the term was coined 60-odd years ago, that's can basically, you know, learn interactively,
Starting point is 00:22:46 use context, you know, dynamic context as it changes and learn and use interactively. Yeah, I think one thing would be really useful is explaining the difference between, you know, what you've just described and just online learning, right? So you could do, you could have a deep learning system that is, you know, at serving time, it's batching serving requests up into mini batches, and there's some loss there, and so the model is updating in real time. But it sounds like that's not, you know, that's not sufficient to cover the kind of adaptability you're talking about. No, I mean, the model updating in real time, I mean, it's really you're building a new model. I don't, you know, there's obviously, you know, whatever I say, whenever you talk about
Starting point is 00:23:33 limitations of machine learning, deep learning, there are obviously examples where people are working on overcoming those limits. So, you know, there's a paper here and an experimental system there and so on. So, you know, you can't say it can't be done. Yes, there's probably some one-shot learning. But there really isn't truly incremental real-time learning. I don't know of any system that could do that because, you know, you already called it as batching up.
Starting point is 00:24:00 So you're already putting things into your bucket. It becomes a statistical soup uh you know so you know my sister's cat spock isn't is going to get lost there you know uh yeah that's right in that training model and and it has to happen uh not just in batches but in real time i mean just the conversation we're having now any conversation you you, you know, we may not remember everything that the other person said, but generally, you know, the conversation has a trajectory based on what was said, you know, what we think, what the objective is, and so on. And so, you really, the model needs to be updated instantaneously with everything that you hear.
Starting point is 00:24:45 And if you can't make sense of it, so, you know, if you let's say my sister's cat, Spock, is pregnant and the person, the child didn't know what pregnant was, you know, the child might stop and say, what is that? Yeah, right. It couldn't integrate that, you know. So we have an example on our website, for example, where, you know, we talk to, we have our own sort of version of Alexa. Basically, the brain behind it, you know, we say, put guac on my shopping list, you know. And then Aigo says, well, what is guac? I don't know what guac is, you know. Oh, you know, guac is the same as guacamole. So it's all the system needs to be smart enough to know what it doesn't know and to then be able to, you know, automatically disambiguate. And not some hard-coded thing when you say, you know, call Bob and it says, do you mean Bob this, Bob that from your, you know, from your list because somebody hard-coded it, but it needs to be an inherent cognitive capability
Starting point is 00:25:45 that when the system gets input and it doesn't know how to interpret that, it basically doesn't understand it or there's an ambiguity, it needs to deal with it there and then, like we have to in a conversation. If you have an important conversation, sometimes we all gloss over it and we don't understand what the person is saying.
Starting point is 00:26:05 OK, fine. We'll move on. But if you really wanted to have a useful conversation, you better understand what the person actually says and what they mean. And if it's ambiguous or you don't understand the term, you really need to sort that out there and then. So in that sense, a good old-fashioned approach of having a knowledge graph, an ontology that gets updated, having a reasoning engine, sort of cognitive architectures are in that sense the right approach. But of course, we also know the limitations of good old-fashioned AI of expert systems. And that's really why we need a third wave. It's the brittleness, primarily of formal logic, that's really the killer there. Yeah, that makes sense. So there's a lot to unpack there. I think,
Starting point is 00:26:57 yeah, actually, well, let's kind of work backwards. So the last thing you said, the brittleness, I think that, yeah, that seems to be one of the big issues is, you know, with the statistical models, what you get is sort of this integration over millions of different experiences or millions of different documents. So if you imagine, so just some context the way google image search at least used to work is is um they would look at images and they would look at words on the same web page and they would say these things are weakly connected and if you have enough web pages um you know that weak connection can get stronger and stronger it's and so know, you might have a dog on a page about cats or just some random page, a page about astronauts or something. But chances are you have a lot more dogs on pages where you have the word dog and the word canine and things like that.
Starting point is 00:27:58 And so versus, you know, in the expert systems, at least the, you know, the first wave, let's say expert systems, at least the first wave, let's say, expert systems, they were brittle in the sense that they couldn't, at least in my opinion, they couldn't handle mixtures very well because a combination of, you know, it causes this combinatoric explosion in the different possibilities. And also because there was a lot of sort of human in the loop. You know, I think for OpenPsych, a lot of sort of human in the loop um you know i think for open psych a lot of that is written by people um i know some dbpedia and some of these other ones have been kind of scraped off off the internet and um um but but you know there's a lot of hand
Starting point is 00:28:36 engineering there and so how do you think they're going to sort of we're going to sort of overcome that and maybe sort of i guess take the best of these sort of symbolic systems have been traditionally curated by hand and uh i like word net comes to mind right and and some of these statistical methods where you're using a lot of electricity but you're not using a lot of of human effort right right right um so the i mean i i believe the answer lies in what darpa calls the third wave you know and one way to describe it is as a cognitive architecture so you basically start off by asking yourself what is intelligence what does intelligence entail and you know you can kind of make a practical laundry list of things and you know we've already spoken about um some of the key ones is to be able to
Starting point is 00:29:33 learn interactively to be able to understand deeply understand and know when you're not understanding stuff and to be able to reason about things. So those are maybe three key things. And your architecture needs to inherently be designed to be able to cope with that. Now, some of the cognitive architectures have been around for a few decades and they haven't really worked that well. But primarily the reason they haven't worked,
Starting point is 00:30:01 well, I just want to interject here, neural nets nets also for decades people worked on it and you could say well neural nets don't really work you know yeah they don't they don't work until they do you know yeah until somebody figures out how to get them to work and i i believe that's sort of the where we are with cognitive architectures you know people have tried to get them to work and they haven't worked for reasons which I believe we understand quite well. So one of the things is they are based on formal logic. Most of them are based, or pretty much all of them are based on formal logic. And formal logic, like the problem that Psych had, can't handle contradictory information.
Starting point is 00:30:43 So what Psych ended up doing having micro domains where within the micro domain it had to be consistent but the the domains themselves didn't necessarily have to be consistent that's sort of roughly how they they dealt with that but that's that's not nearly fine-grained enough you know you you can um you know you can you can read that google bought deep mind for you know 400 million and and another article, Google bought DeepMind for $600 million. And you say, well, and your system really just needs to be able to take it in stride as we can. Okay, well, there's something not resolved until you find out the one was a newspaper report from the UK and the other one from the US. So the one was in pounds and the other one was in dollars.
Starting point is 00:31:23 But in the meantime, you need to basically capture that knowledge and form a logic system which just fall apart, you know, if it had this contradictory information. You have fuzzy information, you know. You know, some guy is a short basketball player, you know. Hey, the poor guy is only 6'5 or something, you know. Yeah, right. So he's a short person, you know, hey, the poor guy's only 6'5 or something, you know. Yeah, right. So he's a short person, you know. So being able to contextualize what, you know, the meaning of things, those kind of things. So formal logic hasn't been able to deal with that.
Starting point is 00:31:58 So the answer lies in basically having a knowledge representation and a logic engine, a reasoning engine that can inherently deal with contradictions, with fuzziness in an effective way. So I think that that's really the way to go. Now, the question you asked about not having a human in the loop to get the knowledge, this is really, really tricky. We as humans just pick up such an enormous amount of common sense knowledge, you know, even a two-year-old has it, about how the world works. And to get that knowledge represented properly in an AI is really, really hard. And especially if you don't have embodiment, you know, if it's not actually a robot that crawls around or walks around in the real world and learns. Do you feel like it's genetic?
Starting point is 00:32:58 Do you think that a lot of this information is multi-generational and it's encoded in that way? Or do you think that we are able to really kind of bootstrap? Because, yeah, the two-year-old thing blows my mind, or even just a baby is able to have so much common sense reasoning. You know, I don't have any data, but just like intuitively, it feels like there's got to be some kind of genetic stamp there. Yeah, I think there is. And that's, you know, it's kind of a very difficult thing to unpack, basically, nature versus nurture.
Starting point is 00:33:32 And so we're clearly not a blank slate. You know, babies are not a blank slate. That's for sure. But is there, you know, are we born with knowledge? I don't think there's evidence that we are born with knowledge per se. But again, it's very tricky. Where do you draw the line of what you call knowledge? I mean, a baby knows where it's going to get fed, what it needs to do to get milk.
Starting point is 00:33:59 So it has those instincts that are obviously obviously in DNA as they are for animals. But I think the biggest part is, so yes, there are some built-in instincts that, you know, allow us to survive and kind of bootstrap ourselves. But I think the bigger thing is sort of built-in feature extraction. And I think the very good example, a very obvious example there is, you know, in our visual system that we basically clearly have feature extraction for, you know, lines, for vertical or horizontal lines. If we didn't have that, we'd have a much, much harder time being able to, you know, recognize objects and to basically to get our visual system to be effective. You know, I'm not an expert on what all the feature extractors are,
Starting point is 00:34:48 but clearly there is hard-coded feature extraction that helps to get us going. So from an AI point of view, to get back to that, it'll be a combination of having to kind of hard code, you know, preload the system with enough information to bootstrap itself. And, you know, that's kind of one of the things we're doing in our company. I've been working on the project now for more than 20 years. And, you know, the approach we use, we call artificial general intelligence, as opposed to narrow AI. I actually coined the term AGI together with two other people in 2001,
Starting point is 00:35:34 to make that distinction that narrow AI is really not what we're after. You know, narrow AI, you take the intelligence that the human has, we were talking about human in the loop, you take the intelligence that the human has and you're turning that into code, or you take the intelligence, the insights that the human has to help you build a model that you can then deploy. But it's really the human intelligence that's being turned into code or model. The intelligence doesn't reside in the code or in the model, at least not to a significant extent. So artificial general intelligence is all about building systems that can learn more
Starting point is 00:36:14 and more by themselves. But you need a certain level of bootstrapping that basically the system can hit the books, as it were, and hit Wikipedia and actually have enough of of a background understanding to make sense of what is actually reading you know yeah i mean i think in the you know in the in the deep reinforcement learning or let's say reinforcement learning community you know there's there's world building models there's there's sim to real right so in this case there's examples where you say, you know, given the current state of, let's say, my robot, I'm going to predict if I take this
Starting point is 00:36:52 action what the next state's going to look like. And so I'm going to build this state to state function, right? And then there's even embed to control where I'm going to say, okay, the state is kind of difficult to output because maybe it's some sparse state. Maybe it's a graph or something like that. I'm going to embed this state in this latent space. And in this latent space, I can have simulators. I can hallucinate in this latent space all these alternatives, right? And so people are trying to do that, you know, that sort of sim to real idea.
Starting point is 00:37:26 But again, it's this soup, right? I mean, what you're left with is this really odd, you know, embedding of, let's say, Pac-Man and the Pac-Man universe. And it's completely uninterpretable. And it's whether it adapts or not is unknown unknown i mean it's it you can't know that in advance um and you can't extract uh you know a knowledge graph or anything meaningful from it at least not not not uh directly right um and so yeah i i guess you know how does the cognitive architecture allow you to sort of build upon allow it to be sort of agglometric and continue to build on itself? Right.
Starting point is 00:38:10 You know, without falling back to this embedding soup thing. Right, right. I got from philosophy, epistemology, theory of knowledge, that is really crucial in understanding, trying to figure out how to build the systems, is the importance of concepts. And that is basically how your knowledge is represented. Deep learning machine learning systems basically optimize on whatever features they can use, minimal features they can use to to categorize something. So, you know, if orange pixels are the thing to categorize it, then so be it. But it may be that all your examples were orange things, you know, and when you get across the green, there's the famous army tank example. Right. You built a model for this to detect Russian versus U.S. tanks. When they took pictures of the Russian tanks, it was at night.
Starting point is 00:39:12 And so the model ended up being a nighttime predictor instead of a Russian tank. Exactly. That's right. Whereas humans, we we extract. What are the essential features in a contextual basis and, you know, what we want to form the concept for. And yes, in your example of the tank, yes, the fact that it's day or night is totally irrelevant. So therefore, that's not a feature that we would encode for a tank. And we know that because of the background information that we have and the purpose of why, you know, this is all subconscious, of course. We know what this concept is supposed to be for, used for, and that can adapt, that can change over time.
Starting point is 00:39:58 You know, you may, as a child, the concept of a dog or a cat may be, you know, I had something to play with and, you know, can bite you or whatever, you know, the case may be. But then you become a vet and suddenly the concept changes very, very dramatically, you know, in terms of what kind of cat is it, you know, what what it looked like and can you detect a particular ailment or whatever. So the concepts themselves can also change and adapt to the use that you want to put the concept to. But the concepts themselves need to represent the key features, the essential features for the purpose that you're trying to put it to. But in addition to it, you also embed it in that you have the concrete examples so you can fall back on the concrete examples you know when you when when for example that and you know we don't have photographic memory and but certain things we do remember well so you know if that first picture of a giraffe really made a big impact on a child,
Starting point is 00:41:11 then it can probably recall that picture. It'll still have the example. And yet at the same time, it'll have a representation of a giraffe that has extracted the essential features, you know, the long neck, the long legs. That's the essential feature that makes it obviously different from other animals. You know, it obviously different from other animals you know it's an animal but you know what is different one elephant with a big trunk and big ears you know those are really the key features that makes sense and so the way you figure out the way you it's the frame problem right so you look at a giraffe and there's a million things you could say about this giraffe but until you look at a bunch of other animals it's hard to know what separates this giraffe from the other animals like if you're in the savannah you might start by saying oh well giraffes are yellow but then you
Starting point is 00:41:55 find that lions are more or less yellow or i guess brown and tigers are or not lionesses are brown maybe warthogs are brown it's okay maybe you say okay color is maybe not the defining characteristic of the giraffe right um and and so how i imagine this and i'm definitely not an expert in the cognitive architecture although i want to be after this after hearing the talk so far it sounds really exciting but but uh it sounds like you have this sort of this like quantum, almost like quantum superposition of all of these things that could be interesting. And over time, you're having to sort of refine that and continue to update that. Yeah, exactly. I wouldn't put it anywhere as mysterious as, you know, quantum phenomena. Yeah, right.
Starting point is 00:42:45 But absolutely, you can basically, as you see, also one of the other problems with deep learning statistical systems is they throw away the examples. The instances are not encoded. So whereas in a cognitive architecture, it's cheap and easy, you know, with memory to actually have a photographic memory and remember the original sentence that you heard the utterance in or the sentences that you heard. So you have the original instance or with images, you have the original image. So you could then retrospectively refine or redefine what the essential features are. You know, you might, as you say, the giraffe might be the first yellow animal you see. And so you think maybe yellow is like really key. But, you know, as you see other animals, you know that yellow is not a distinguishing feature. And then you see toy animals which could be purple or, you know, purple unicorns or whatever.
Starting point is 00:43:45 Right. Pink elephants. You say, well, yeah, actually, you have that subconcept of, yeah, anyway, you have that overlapping concept of toys, which, you know, automatically have some different kind of weird features that can modify the other features. You know, they can be a completely different size, they're much smaller, they're soft, they don't bite you, and they can be a weird color. But you can still tell
Starting point is 00:44:15 a giraffe from an elephant, a toy elephant, by just extracting using different features within that context. Yeah, that makes sense. So one thing, so a lot of these systems you're building, you have to have conversations or are designed to have conversations with real people. And so humans have set up kind of our own vocabulary, our own sort of common sense
Starting point is 00:44:43 reasoning. And that might not be optimal or even close to it, but it's sort of it is what it is. If you want to talk to a human, you're going to have to use the sort of common sense atoms that a human would use. Right. And so how do you sort of reconcile that? I imagine if you if you really started from scratch saying we're going to build this kind of knowledge graph, then it's going to say, oh, you know, giraffes have, you know, a bend in their ear and other animals don't. But no person would ever think about a giraffe in that way, right? How do you deal with that disconnect? Right, right. It's a very interesting question because when I started on the project, initially we were in pure R&D mode.
Starting point is 00:45:27 We basically took the ideas that I'd come up with of cognitive architecture and built various prototypes. So we had a virtual mouse living in a virtual environment and learning, you know, it had, you know, virtual ears and nose and whiskers and so on, and to basically to learn in the environment. And we then had different prototypes with different animals, dogs, and then we had child, and we tried that. And then we decided we would focus on natural language, you know, but at adult level, basically. And so we spent several years teaching the system a whole lot of different things
Starting point is 00:46:04 that we thought it should know, the kind of background knowledge it should know. But then when we commercialize, when you try to now use the system in a real life situation that's going to do a job, a useful job, you actually in a way have to dumb down the system. Yeah. job, you actually, in a way, have to dumb down the system. But it's actually not that different from you might be offshoring your call center operations, you know, and you don't want the person in wherever they are in India or Philippines or whatever say, hey, I know how to sell this product much better than my boss told me to do. And, you know, I'll go off script and just do that. No, no. You want the person to stay on script, you know. Yeah. That's what the company has learned. So
Starting point is 00:46:50 we need to basically constrain Igo to stay on script, as it were, and to say things that people would expect and to listen for things that we would expect people to say at that time and not try and learn new things, you new things and form new concepts. So in that sense, you're kind of dumbing the system down for that. So the kind of research passed to give a true open intelligence and the commercial practicalities do tend to diverge, and that's one of the challenges I've had since commercializing it is to make sure we can still continue cranking up the IQ and make the system more and more
Starting point is 00:47:33 intelligent, but without it going off script, so to speak, you know, without it trying to be too clever and obviously not clever enough. So that's kind of one of the challenges. And I actually call it the AGI trap, that to build a commercial system, it's almost always easier to hard code things and to do them in a narrow way than to try and do them in a sort of really cognitive architecture way. And so we need to strike that balance.
Starting point is 00:48:02 So yeah, you pointed on a very important issue there that general intelligence at a baby level is not that useful. Yeah, right. And I think there's that actually, you could extrapolate that to a lot of different domains. And look at robotics, uh you know people are so worried about robots taking over jobs but it's it's not just that the robot has to be able to do the job but it has to be able to be more efficient than a person i mean if if you have a robot but it's going to require uh you know i don't know twenty dollars an hour worth of you know maintenance and maybe it fails every now
Starting point is 00:48:45 and then. And well, then then and if it's doing a job that you could, you could have someone do for $15 an hour, then that's no good, right. And so I think you can find almost any domain where what you said is true. I mean, even if you look at video games, so so video game AI, you could spend a ton of time, you know, I mean, my PhD was on playing various games and you could spend a ton of time making a really complicated video game opponent. zone and it's much easier to do that by having a knob it's like if the person's doing too well then you just crank up the knob and there's just more of the same dumb enemies show up next time you know right right yeah no it's i mean the way our the way we're thinking about it is to our goal is to get closer and closer to human level capabilities, because I believe it's going to be enormously beneficial to humanity to have that. And we're sort of bootstrapping it through kind of a skeleton where the system, we hard code, well, we're not hard code, we don't code it,
Starting point is 00:50:00 but we teach the system sort of rough common sense knowledge in an ontology. But we know that knowledge is not the way it should be represented, but it's kind of just to help it bootstrap. So think of it as like a scaffolding that the system should then build its real knowledge around the scaffolding, and eventually the scaffolding itself just dissolves because it now has, it's acquired the knowledge in a way that it really should have acquired it in the first place you know so that's kind of how we we see the bootstrapping process to higher and higher levels of intelligence but it's it's hard yeah i mean i make sense i mean i think you know you see this in the real world i mean in in with real human
Starting point is 00:50:41 beings and you see this in uh industry settings as well, where in the real world, you don't let your child put his hand on the stove. You don't let them learn that the hard way, right, or stick a fork in a socket or something. So what that means is they're not going to fully understand what it means to sort of put your hand on the stove until they're older and they understand heat and what heat could do to things and so on and so forth. But the alternative is someone gets seriously hurt. So you kind of have to sort of bias someone's knowledge, right? And what you see in industry sort of decision-making systems is something similar where, like, for example, if you were to build a system to decide how many ads to show on a Google landing page, right, it might want to try to fill the whole page full of ads, right? Because that could be interesting. But that would kind of hurt your brand, right? And so you kind of have to start with, you know,
Starting point is 00:51:46 something that's very focused, very guided, and then allow kind of a little bit of creativity. And then, as you said, eventually, you know, as the system starts to learn and reason, especially starts to reason counterfactually and starts to be able to understand, oh, if I filled the page full of ads, no one would come back, right?
Starting point is 00:52:05 If you can reason that from a limited set of experiences, then you can sort of, you know, take the training wheels off. Right, right. Yeah, absolutely. Those guardrails need to be there. But it's having an architecture, having an approach that inherently allows you to increase the level of intelligence and to, you know, that has the features that are needed for intelligence. Yeah, so one last sort of technical question, and we should definitely jump into iGo as a company, as a product, and as a place to work. So, you know, how do you sort of,
Starting point is 00:52:41 it sounds like the cognitive architecture is, it's an evolution, but it's also kind of a unification of the past. Right. You have these knowledge graphs, these these sort of symbolic systems. You have the sort of primordial embedding soups. And this is sort of, you know, taking kind of the best of all of that, where you're using statistics and reasoning and and things that that you've observed to sort of have this living knowledge graph and then are you the part that i wasn't totally clear about yet is are you also then using uh the expert systems approach like using search and ir methods on that graph that you're that you're updating in real time is Is that part of it? Yeah, I mean, from a practical
Starting point is 00:53:27 point of view, and that could be, you know, other people may have a sort of, what should I call it, a cleaner theoretical approach than we have. You know, I'm as much a businessman as I am a scientist or researcher. And so to me, I look at it as an engineering thing, is what's the best way of getting it done? You know, what's the best piece of technology or algorithm or whatever I can use to get a particular component to operate efficiently? Now, if you took a theoretically cleaner approach,
Starting point is 00:54:07 you might say, well, no, we can't use a brute force approach because the brain doesn't do it or it shouldn't really be using it or because ultimately not going to be scalable or whatever. So I'm less concerned about that. And we'll use search for some of our algorithms. So we'll use whatever tricks in the book. This is hard enough. We don't want to say that because this is not theoretically as clean as you might like to have it.
Starting point is 00:54:39 We've got to get it to work. So, yeah, so we'll use, we'll use whatever we need, but the, the, the core foundation that we found is, is really having a knowledge graph and having the right knowledge representation, knowledge representation that has the flexibility to deal with contradictions and with fuzziness with limb, you know, and, and, and um and context and and so on and that knowledge graph has to be super high performance uh that's uh just i'll talk a little bit about why cognitive architectures haven't worked up to now um and i can mention three reasons uh for it one of them is the brittleness of good old-fashioned ai of of you know formal formal logic most of them is the brittleness of good old fashioned AI of, you know, formal logic.
Starting point is 00:55:26 Most of them tried to get away with formal logic and they just hit a wall on that. The second thing that we're finding, we've spoken to quite a few labs that are trying to do conversational AI and to use knowledge graphs interactively in the conversation, you know, to basically get context and to update the knowledge graph and to basically use it interactively in real time. And they found they had severe performance issues. So we developed a knowledge graph technology that's two orders of magnitude faster than any commercially available graph database, 100 times faster. And that makes all the difference that basically, you know, where you could have a sub-second response to something that somebody says, you know, versus having a, you know, 500, you know, 50-second response, you know, a minute response, which clearly is unacceptable.
Starting point is 00:56:24 Because we might hit that knowledge graph hundreds of times or thousands of times while we're trying to make sense of an utterance. So those are two reasons that the performance of the knowledge graph and not being constrained by formal logic approaches, those are two reasons. But the third one is actually more an accident of history. And that is that nobody's working on cognitive architectures. And why is that? Well, primarily because deep learning has been so incredibly successful in so many areas. So it sucked all of the oxygen out of the air. I remember when I was studying deep learning, it was in the deep learning winter. So I got my PhD in 2009, and you actually couldn't publish a paper in deep learning.
Starting point is 00:57:13 You couldn't submit a deep learning paper to NIPS. They would reject it. It would have to be truly remarkable to get into NIPS. And now all of NIPS has just neural networks applied to everything. And to your point, I mean, I think that's the goal of academic or, you know, in this case, like a company that's really forward thinking is to do that thing that nobody else is doing. So I think that's pretty cool. Yeah. And it's, I mean, we had, as a concrete example, we had a brilliant intern from Germany working on our project quite a few years ago. And he really got the cognitive architecture concept and, you know, he, you know, and understood the thing.
Starting point is 00:57:54 Was with us for a year, went back to Germany to do his PhD on that. He couldn't get a sponsor. He ended up doing a PhD in deep learning, you know. Yeah, yeah. Want to earn the big bucks. You want to get your paper published. You want to, you know, get a PhD. You want to earn the big bucks, you want to get your paper published, you want to get a PhD, you want to get funding for your company, deep learning is the only game in town.
Starting point is 00:58:12 So all of the big decision makers at the big companies and the VC firms and that, they're obviously all deep learning, that's the only game in town. So that's actually, quite frankly, one of the big reasons I believe that cognitive architectures haven't made much more progress. And, you know, I've just been blinkered. I mean, I spent five years doing my own research before I started on this project to really deeply understand what intelligence entails. You know, how does our intelligence differ from animals? How do children learn? What do IQ tests measure? Are they meaningful? I studied epistemology, you know, theory of knowledge, and even what is reality? What is our relationship of our knowledge and
Starting point is 00:58:57 certainty to reality? And really deeply understanding cognition from all these different angles, from cognitive psychology and, of course, the different fields of AI. And then, you know, I came to the conclusion: okay, we need what I now call a cognitive architecture. And it's just been so obvious to me that that's required, that there's no shortcut, that it has to learn interactively, in real time, one-shot. That's not negotiable, you know. And if you're working on any system and you're not solving that problem, you're not going to get an intelligent system. It's as simple as that.
Starting point is 00:59:36 So, a generally intelligent system, and a few other things like that. And that's why I've been blinkered, working on this with a limited budget. I was fortunate enough to have a good commercial success with a software company I formed, did an IPO on that. So that gave me enough time and money, when I exited that company, to actually embark on this project, and I've been able to fund it, with about a dozen people doing research. Then we commercialized the first generation, and now, for the last six years, we've been working on the second generation, on Aigo.ai. But you really have to be certain that this is the path. Like some of the pioneers of
Starting point is 01:00:29 deep learning, who were sure that deep learning, or neural nets of some sort, were really, really powerful and would be useful eventually, and they turned out to be right. Yeah. I mean, one of the things I've been talking a lot about at work for years, and we haven't really been able to make progress, is this idea of, if you have an embedding, recovering an English explanation of that embedding. So imagine if you have an embedding of all of a person's activity on Amazon, could you write a bio about that person? Like, could you just take that embedding and emit a bio: this is Bob, Bob loves soccer products, and Bob buys things in the evening, right?
Starting point is 01:01:16 And again, of course, you could do it with a lot of human engineering, but the thought experiment here is, could we just organically go backwards? Clearly there are encoders that can embed English and translate it into French; we've seen that. But could you do more? Could you encode experiences and translate them into English? And I think that is something that's proving to be extremely difficult. And I think what we're starting to realize
Starting point is 01:01:49 is maybe something you've realized for a long time, which is that symbolic representations are not things that can just be emitted from embeddings. These are things that have to work together in harmony. Yeah, well, I mean, the example you're giving is actually a very powerful one: could you do it? And I think the way statistical systems are set up, even the fact that they're statistical, immediately negates that possibility, because you are trying to average out, over many instances, what is the average thing that happens.
Starting point is 01:02:30 So you have to lose the original instance information, and therefore you couldn't reconstruct it. You could only reconstruct the average. And one of the huge failures, or limitations, I shouldn't call it a failure because they're not trying to do that, but one of the limitations is you can't deal with the outliers. You know, say my personal profile of what I buy or what I do, my activity, let's just for argument's sake say it has a 99% confidence that I would be interested in cricket. I lived in South Africa before.
Starting point is 01:03:12 Oh, okay. Well, there you go. Whatever. I drink tea. I love tea. So whatever. My profile, let's just for argument's sake. But I don't care for cricket at all.
Starting point is 01:03:22 But my profile would suggest it, and the statistical system simply cannot deal with that. I mean, all of them require regularization to perform. That was one of the hallmarks of deep learning: dropout, batch norm, ways to eliminate outliers. It's their strength. But therefore you can never deal with these outliers, the people who don't fit the statistic. And this is also one of the reasons why it inherently can't work in an ongoing conversation. I mean, apart from the example I gave, that you have to learn in real time. But even setting that apart, if you have a lot of the normal kind of conversation, if a statistical system has an 85% accuracy or something on what response
Starting point is 01:04:12 to give, well, if you're three or four or five sentences into the conversation, you multiply the 85%. Yeah. Yeah. You're down to gobbledygook, basically. Yeah. And it's not designed to recover either; as you said, it's not designed to ask the pointed questions, to say, hey, I'm lost here, explain that to me.
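To make that compounding concrete, here's a tiny sketch; the 85% per-turn accuracy is the hypothetical figure from the conversation, and treating turns as independent is a simplification:

```python
# If each response is "right" with probability p, and errors never get
# repaired, the chance the exchange is still on track after n turns
# decays geometrically: p ** n.
p = 0.85  # hypothetical per-turn accuracy

for n in range(1, 6):
    print(f"after {n} turn(s): {p ** n:.0%} chance the conversation is still coherent")
# after 1 turn(s): 85%
# after 2 turn(s): 72%
# after 3 turn(s): 61%
# after 4 turn(s): 52%
# after 5 turn(s): 44%
```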
Starting point is 01:05:02 And there's actually one last thing I'd just like to come back to, because you've mentioned a few times the relationship between embeddings and the symbolic. The problem is how information is encoded in deep learning systems: it's not encoded in a conceptual format. But you need to have it in a conceptual format to be able to then link it to the symbols, because the symbol refers to the concept, and the concept itself has to be represented in the right way, with the right features, you know. And that's why it will think that, you know, this stop sign is a fridge, or whatever. Yeah, right. One thing that I saw from Google that really resonated with me was early on in the embeddings, this was maybe 2011, you know, deep learning was still kind of catching on.
Starting point is 01:05:38 And the person, I can't remember their name. Patrick, you would know this. The person who is the MacGyver of Google, totally drawing a blank. There was a whole page. Jeff Dean. Thank you, yeah. So Jeff Dean came and gave a talk on embeddings, and they actually were trying to draw symbolic inferences from embeddings. So, for example, you could minus king from queen and you could minus lion from lioness and you would get
Starting point is 01:06:11 the same vector, right? And that seems so powerful, but in a way, you know, now in hindsight, and we have nine, ten years of hindsight, that was quite the red herring, wasn't it? Because it really suggested that you had the symbols there, you just had to do some algebra to get them. And here we are 10 years later with almost no progress, right? Yeah. I'm actually so glad you mentioned that, because I love talking about that exact example. It was seen as a revolution, and I always called it a parlor trick, because that's exactly the kind of example I gave. You could take Paris, subtract France, add England, and you'd get London, or whatever. When this first came out, we actually tried to use it, and this is word2vec, basically.
Starting point is 01:07:03 We actually tried to use it in our system to see, you know, would it help us bootstrap the system? And we very quickly found out that not only wasn't it useful, it was actually counterproductive. And I can give you a simple counterexample: in vector space, cat and dog are much closer together than dog and puppy. Yeah, probably. Because in text, you'll find cats and dogs mentioned in the same sentence, and not usually dogs and puppies. And, you know, you very quickly find those kinds of limitations. So, yeah, it's sort of just a parlor trick. Wow, yeah, it seemed amazing.
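If you want to try the parlor trick, and the counterexample, yourself, here's a rough sketch using gensim's pretrained GloVe vectors; the exact similarity numbers, and whether cat/dog really beats dog/puppy, depend on the corpus and model, so treat it as an experiment rather than a guaranteed result:

```python
import gensim.downloader as api

# Small pretrained word vectors; downloads on first use.
wv = api.load("glove-wiki-gigaword-50")

# The famous "parlor trick": king - man + woman ~= queen
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Paris - France + England ~= London (the example from the conversation)
print(wv.most_similar(positive=["paris", "england"], negative=["france"], topn=1))

# Peter's counterexample: co-occurrence, not meaning, drives similarity,
# so "cat"/"dog" can score closer than "dog"/"puppy".
print("cat~dog:  ", wv.similarity("cat", "dog"))
print("dog~puppy:", wv.similarity("dog", "puppy"))
```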
Starting point is 01:07:47 You hit another point, which is that substitution is devastating to embeddings, because if one word can be substituted for another, you'll rarely see them together, and then that drives them apart, but they're actually the same exact concept. Right, right. Yeah, but interesting times we live in, you know. Yeah, I mean, this is super cool. What you said really resonated with me, having been a big proponent of multi-layer perceptrons back when they weren't popular. And I do think that we need to advance stepwise. I think we're starting to hit the limit.
Starting point is 01:08:31 I mean, GPT-3 is so many gigabytes, right? Every time I see models get really large, it's a sign that we're starting to run out of low-hanging fruit. And so I'm really excited to see, yeah, I'll definitely be doing a bunch of reading on this. And I think you have a bunch of posts on Medium with some really good content. So what we'll do is we'll put that in the show notes so that folks can read more about this area. Great. Yeah, that'll be good.
Starting point is 01:09:01 And I welcome any questions or feedback on that. I have quite a few articles on medium.com, including on risks of AI, and ethics, and so on. Cool. So tell us a little bit about the company. So you told us it was founded, I think you said, 20 years ago? I founded my first R&D company in 2001. We then launched our first-generation commercial product in 2008, with a company called Smart Action. Smart Action is now about 100 people or so. Okay.
Starting point is 01:09:40 And, you know, providing an IVR with a brain. But I found that the commercial pressures of providing a SaaS service reliably, with security and redundancy and performance and integration and all of that, kind of sucked up all of the time and energy. And we really didn't spend any more time on cranking up the IQ, because that wasn't the bottleneck in the company. So I exited that company seven years ago and started Aigo.ai, so that I could concentrate again on cranking up the IQ. So I hired a new team of 12 people. I have an awesome team now. And basically, for five, six years, we were just in development mode, not in R&D mode, because we'd already done the research. We knew what needed to be done
Starting point is 01:10:35 to basically make the system more intelligent, more capable. And then about a year or so ago, I got to a point where I said, okay, we're now ready to commercialize the second generation. And that's basically what Aigo.ai is. We just launched late last year officially, and we're now fully in commercial mode. We're 16 people right now.
Starting point is 01:10:57 We're actually aggressively hiring right now. So we hope to be hundreds of people in the not-too-distant future. Cool. So where's the company located? We're in Los Angeles. Okay, cool. Yeah, it's an exciting time. So, as I said earlier, the commercial thing sort of pushes you in a different direction from just focusing on
Starting point is 01:11:27 artificial general intelligence IQ, and so we're working very hard to balance that. But the commercial activity is also a very good reality check, because you can have all of the theoretical things in the world, you can write papers about it or beat benchmarks or whatever, but ultimately the rubber hits the road when you're trying to use it in the real world. So the commercial aspect I find invaluable. But from what we learned previously, we need to make sure that we have enough resources to keep our development team going full speed while we're also, you know, commercializing and getting that feedback, and the money. Yeah, totally. Did you have two sort of separate branches, like a fundamental research branch? And I guess the company is
Starting point is 01:12:18 still smaller, but as it grows? Yeah, right. Right now, I mean, the people we have were all hired for development roles. But as we've gone commercial, they've all been very excited to be part of the commercial team. But yes, we'll be separating it more as we move forward. But we don't want to silo them. We want to make sure that there's a lot of cross-communication, for sure. Yeah, one of the biggest follies I've seen in research labs is where the communication only goes one way. So the idea is you have this fundamental research team, and they're going to write papers, and they're going to email the papers over to the applied side.
Starting point is 01:13:01 Yahoo was like this. So there's Yahoo Research; they would send the paper to Yahoo Labs, who would then try to fit it into different products. But then no communication went the other way. And that actually hurt the research team more than anybody, because they were just operating kind of in the dark. Yeah, in a vacuum. Yeah. So we're in a fortunate position that, you know, I'm as much of a businessman as I am a cognitive scientist or AI scientist. So I'm really super excited about both aspects. And being able to guide that from the top, to make sure they don't get siloed, will definitely help us. Yeah, totally. So if there's folks right
Starting point is 01:13:47 now in university, are you hiring? What kind of roles in general are you looking for? Do you have internships? Yeah, so we have had interns in the past as well, so potentially. Of course, it's a little tricky right now, because doing internships without physical presence is hard, so we don't know yet how that's going to pan out. But the interesting thing in our company, which is very unusual, is that 70% of our staff are what I call AI psychologists. It's actually a profession I invented. They're basically either linguists, with linguistics training, or they have cognitive psychology training. They're not programmers. And this is, again, because in our cognitive architecture,
Starting point is 01:14:36 a lot of the work is being done through training Aigo, not programming Aigo. So, you know, giving it a curriculum, building the ontologies, teaching it and giving it tasks. And then one third is basically engineering, where we do the actual coding and integration and so on. That makes sense. Yeah. And, this is a generalization, it's not absolute, but I actually generally prefer people who don't have a lot of AI experience, because otherwise there's too much to unlearn.
Starting point is 01:15:13 Yeah, right. Yeah, everything will be framed around, you know, what's the loss function, what's the embedding? I mean, you just get asked that every day. Yeah, exactly. You know, so we're really just looking for people that are smart and motivated, and that are particularly excited about working on this kind of project, you know, that don't want to work on some ad optimization algorithm, or, you know, Uber for dog food, or whatever.
Starting point is 01:15:45 Yeah, Uber for dog food. Yeah, so I think the... So can you kind of walk through what the job... I mean, actually, what you said really resonates with me, this idea of training the AI. So someone who... I mean, clearly there's different roles. There's DevOps, there's people who are optimizing the database. And so those roles are, you know, pretty well defined. I think most people who've been following the show know
Starting point is 01:16:16 what that's going to be like. But tell me more about the training. I mean, so this AI psychologist, you said? Yeah, AI psychologist. So what is a typical day like for that person? Yeah, so there are quite a lot of different specialities within that, but they all kind of revolve around analyzing different kinds of conversational scenarios. So, I mean, before we were commercial, we would make up our own scenarios, like, you know, my sister's cat Spark is pregnant, and being able to understand that. Now, on the commercial side, it's more like, you know, I'd like to change the delivery address on my
Starting point is 01:16:58 order, you know. Yeah, yeah. Or, I'm having trouble with my internet service. So it's resolving ambiguities, hopefully automatically through context, basically saying, okay, here's something ambiguous, but do I have the context that can disambiguate it? And then, does the system already have the knowledge and the algorithms to do the disambiguation effectively? And that disambiguation could also be done through reasoning. It could be, you know, send an email to Bob. Now, the system might say, okay, I know two different Bobs. The context may not tell me which, but there may be other things that you can reason about, to say, well, it can't possibly be that Bob because, whatever, he died. Yeah, right. So there could be reasoning, or context, or something. So: disambiguation, and deep understanding.
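As a loose illustration of that reasoning-plus-context disambiguation, here's a toy sketch; the names, the "deceased" flag, and the rules are all invented for illustration, and this is not Aigo's actual knowledge representation:

```python
# Toy contact disambiguation: filter candidates by hard constraints
# (reasoning), then by conversational context, then ask if still ambiguous.
contacts = [
    {"name": "Bob Smith", "deceased": True,  "topics": {"golf"}},
    {"name": "Bob Jones", "deceased": False, "topics": {"project-x"}},
    {"name": "Bob Lee",   "deceased": False, "topics": {"family"}},
]

def resolve(name: str, context_topics: set[str]):
    candidates = [c for c in contacts if c["name"].startswith(name)]
    # Reasoning step: a deceased contact can't receive email.
    candidates = [c for c in candidates if not c["deceased"]]
    # Context step: prefer contacts tied to what we've been talking about.
    in_context = [c for c in candidates if c["topics"] & context_topics]
    if len(in_context) == 1:
        return in_context[0]["name"]
    if len(candidates) == 1:
        return candidates[0]["name"]
    # Still ambiguous: a cognitive system should ask, not guess.
    return f"Which {name} do you mean? " + ", ".join(c["name"] for c in candidates)

print(resolve("Bob", {"project-x"}))  # -> Bob Jones
print(resolve("Bob", set()))          # -> asks a clarifying question
```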
Starting point is 01:18:03 One of my articles is "Understanding Understanding". It's actually a very interesting topic, just what understanding entails. So a lot of the work we do is in deep understanding of what the person is saying. But then, on the action side of things: okay, how do you respond to it? There's natural language generation, and, you know, doing certain skills,
Starting point is 01:18:28 doing certain tasks, which may entail subtasks. Does the system know how to do those subtasks? Do we need to teach it? You know, does it need to gather more information to do it? So, you know, for example, if you say, you know, I want to change the delivery address on my order, well, then it can do that task, but only if it knows what the new delivery address is. So it automatically, without us having to program it, it should know that here there's a piece of information that's missing to do this task.
Starting point is 01:18:58 I need to ask: well, what would you like to change the address to? So it's basically working through that, making sure that Aigo has the right knowledge, and then highlighting fundamental cognitive deficiencies, you know, where we say, hey, Aigo should already be able to figure that out itself, but the cognitive mechanisms aren't strong enough. And then we sit down and figure out, okay, how can we make it smarter, so that next time we don't even have to teach it, it can basically already figure it out itself, or ask for the information. So that's kind of the iterative process. And, yeah, it's super interesting stuff.
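A very rough sketch of that missing-information behavior, in generic task-and-slot terms; the task definition and field names are made up, and the point is just that the clarifying question falls out of the task spec rather than being hand-scripted:

```python
# Toy task model: each task declares the slots it needs; the dialogue
# layer compares the declared slots to what the utterance supplied and
# asks for whatever is missing, instead of hard-coding each question.
TASKS = {
    "change_delivery_address": {
        "required": ["order_id", "new_address"],
        "prompts": {
            "order_id": "Which order is this for?",
            "new_address": "What would you like to change the address to?",
        },
    },
}

def next_step(task_name: str, known: dict):
    task = TASKS[task_name]
    missing = [slot for slot in task["required"] if slot not in known]
    if missing:
        return task["prompts"][missing[0]]  # gather info before acting
    return f"OK, executing {task_name} with {known}"

# "I'd like to change the delivery address on my order" -> order known
# from context, new address not yet given:
print(next_step("change_delivery_address", {"order_id": "A-1042"}))
# -> "What would you like to change the address to?"
```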
Starting point is 01:19:45 Yeah, that makes sense. I mean, a lot of what you said reminded me of active learning, right, where, and I guess I'm doing the thing that I shouldn't be doing, I'm going back to loss functions and all of that, but you sort of look for things that are on the hyperplane, or in this case, for things where the system is entirely ambiguous, and you try to tease apart those examples explicitly. Right, exactly. Yeah. Cool. Wow, that's totally wild. So the team right now is kind of centralized in its office in L.A. Are there any plans to grow?
Starting point is 01:20:36 So if someone is, you know... are all the positions you're hiring for in L.A.? Are there any remote opportunities? Yeah, right now they are. We haven't yet figured out how to effectively interact with and manage remote people, but I'm sure we will. We just need a bit more management structure and so on. So yeah, right now we have everyone in Los Angeles. Cool. But that'll change, hopefully, in the near future. Yeah, that makes sense. Cool. And so if people have questions, or they want to write to you about, you know, careers, I'm sure there's a careers page on Aigo.ai, so people can check that out first. But if they have anything they wanted to chat to you about, how can they reach you? Yeah, medium.com; they can always respond there. But I'm also easy to find on Facebook.
Starting point is 01:21:25 I'm on Twitter. You know, Peter Boss, I think you'll be able to find me quite easily. Cool. Great. Thank you so much. This was an absolutely fascinating talk. I definitely have a lot of research to do. And I'm sure the listeners out there have a lot to do. There's a lot to unpack.
Starting point is 01:21:46 But, you know, we have a lot to do. There's a lot to unpack, but we have a really motivated community and so I think you really kick-started them into the next year, so I appreciate it. Great. Well, this was fun. I enjoyed the conversation. Programming Throwdown is distributed Thank you. to Patrick and I and share alike in kind.
