Big Technology Podcast - AI's Research Frontier: Memory, World Models, & Planning — With Joelle Pineau
Episode Date: February 4, 2026
Joelle Pineau is the chief AI officer at Cohere. Pineau joins Big Technology Podcast to discuss where the cutting edge of AI research is headed — and what it will take to move from impressive demos to reliable agents. Tune in to hear why memory, world models, and more efficient reasoning are emerging as the next big frontiers, plus what current approaches are missing. We also cover the “capability overhang” in enterprise AI, why consumer assistants still aren’t lighting the world on fire, what AI sovereignty actually means, and whether the major labs can ever pull away from each other. Hit play for a cool-headed, deeply practical look at what’s next for AI and how it gets deployed in the real world. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
Where is the cutting edge of AI research leading today, and how are some companies already putting it into action?
Let's talk about it with Cohere Chief AI Officer, Joelle Pineau, right after this.
This episode is brought to you by Qualcomm.
Qualcomm is bringing intelligent computing everywhere.
At every technological inflection point, Qualcomm has been a trusted partner helping the world tackle its most important challenges.
Qualcomm's leading edge AI, high performance, low power computing, and unrivaled connectivity solutions
have the power to build new ecosystems, transform industries, and improve the way we all experience the world.
Can AI's most valuable use be in the industrial setting?
I've been thinking about this question more and more after visiting IFS's Industrial X Unleashed event in New York City
and getting a chance to speak with IFS CEO Mark Moffat.
To give a clear example, Moffat told me that IFS is sending Boston Dynamics Spot robots out for inspection,
bringing that data back to the IFS nerve center, which then, with the assistance of large language models, can assign the right technician to examine areas that need attending.
It's a fascinating frontier of the technology, and I'm thankful to my partners at IFS for opening my eyes to it.
To learn more, go to IFS.com. That's IFS.com.
Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond.
Today we're going to look deep into the state of AI.
research where the cutting edge is leading, whether there are limitations with the current
methodologies, and how some companies are already putting this technology into action in a practical
way. We're joined by the perfect guest, Joelle Pineau is here. She's the chief AI officer at
Cohere. Joelle, welcome to the show. Thank you. Glad to be here. So for those that don't know,
Joelle, she is a, you know, a researcher who's been at this for a long time. You and I met actually
Maybe a month after ChatGPT was released and everybody was asking whether AI was sentient.
You at that time were the head of the Fundamental AI Research division at Meta.
You're also a professor at McGill, and currently you're the chief AI officer at Cohere.
We've had Aidan Gomez on the show.
He founded the company in 2019.
He's also one of the authors of the "Attention Is All You Need" paper, which basically kicked off the generative AI moment.
So Cohere is seven years old at this point, six, seven years old for the kids out there.
It's raised $1.6 billion.
It's worth $7 billion, and it sells AI to enterprise.
So that sets the stage.
Yes.
Let's talk a little bit about AI research.
There's so much discussion.
People have been talking about whether AI research is going to hit a wall and whether
these new methodologies, things like putting reinforcement learning on top of large
language models, going through reasoning, teaching the models to use different tools.
There's so many different opinions of where to focus right now.
So what, in your opinion, is the cutting edge of AI research and where do you think it's
going to lead?
Well, I'm certainly not worried about research hitting a wall.
There's so many questions that we need to work on right now.
And I'd separate it into two interesting angles, right?
What are the right problems to be solving right now?
What are the things that the models,
the current generation of models we have can't do?
And then there's a question of like,
how do we go about it?
Right?
Like what's the hypothesis that may give us the clue
of how to solve some of these problems?
So in terms of what problems to solve,
I think an important one is
what do we do about memory?
Machines have the ability to remember
tremendous amounts of information.
You're just like stocking it in there.
The hard part is knowing like when to pull on what piece of information to make a prediction, to generate information, to reason.
And so having this ability to be a lot more selective about all the information you've seen in context is super important.
And already transformers were an important piece of that.
You know, attention is all you need.
Well, it turns out it's not all you need.
You need a little bit more than that.
You need the ability to reason about information at different
time scales, at different granularity, and so on and so forth.
So there's definitely a good piece of work to be done there, which really involves,
and now we talk about the how, you know, the choice of architecture, the choice of learning
mechanisms, the type of data sets, the type of use cases that we need to look into.
Another big research theme is on building world models. We hear a lot about world models,
which are essentially the ability to take in all this information and predict
the effect of actions.
So when we talk about causality,
how are actions transforming the world?
This is what a world model should be able to do.
World models are absolutely essential
when you want to build agents
because these agents are going to take actions,
which is going to change the world.
You want to be able to predict these effects.
So whether you're building robots,
and then we talk about physical world models,
but also the agents getting deployed on the web.
You need to build digital world models
so that these agents,
you know, whether they're making financial
decisions, communicating on your behalf, organizing meetings, that they have the ability to
predict the consequence of their actions. So that's a big theme, and there's a lot of different
hypotheses about how to go about building these world models. And the third theme I'll highlight,
and there's many more, but these are at least my top three picks, is about how do we
build in reasoning efficiently.
And right now, a lot of the reasoning methods are still quite thorough based on sort of
forward search methods and learning the right reward function.
But I do think there's, you know, like the transformer moment for reasoning and choosing action
and being able to plan at different levels of granularity.
We're still far away from doing that.
And so there's all sorts of ways that it's being baked in, you know, LLM-as-a-
judge and things like that, where AI systems give feedback to AI systems in order to train them.
It's still very early days. Okay, I want to dig into a lot of what you just said. Let's start
at the beginning. Let's start with memory. Are memory and continual learning two sides of the same
coin? I mean, there's this idea that the models can search the web and they can find something
in a session, but as soon as you close that session, they forget it. And I guess the reason why I'm
going there is because a way that some people have suggested solving both of these is just making
the context window massive and then just becoming efficient in the way that you navigate that.
What do you think about that hypothesis?
The two concepts are related, but they're not exactly the same.
And so memory is really about how do you address sort of what information to pull in
in the context of the task you're trying to solve?
continual learning makes the assumption that the context keeps on changing, therefore what you've
learned keeps on changing.
So there's a notion of non-stationarity that is really key to continual learning.
I confess I have a little bit of trouble with continual learning as a concept because I feel
the community has never been able to nail like how do we articulate the problem in a way
that we all agree on it.
And so everyone who does work on continual learning takes a different flavor
of it, which makes it, at least in my eyes, and I haven't worked a lot in this area, but makes
it a little bit hard to know whether we're making progress or not.
On memory, it's a little bit more standardized.
The tension really is about, it's a question of efficiency and relevance, so the way to measure
whether you're doing that is a little bit better standardized, and you don't want to be just
sort of remembering everything, and so it's a little bit better standardized how we articulate
the tasks.
Let's go. We're going to touch on both of those now and then we'll keep going down the list.
With continual learning, maybe I'm, you know, so far removed from it, I'm not struggling with it.
So I'll give you my caveman thought about what this is and you can help us break it down a little bit.
I mean, the problem has been articulated that the models, they don't change as they go about all these interactions.
I mean, think about how powerful it would be.
If, let's say, the GPT model, which is speaking with 800 million people a week, or maybe more by the time this
comes out, could, I mean, it might be scary actually, but could internalize
those conversations and learn from the discussions that it's having. That would almost, you know,
I agree with you that the wall, we're not at the wall, but the question is, is there going to be
enough data to keep making these machines smarter? And, you know, as they have these conversations,
that opens up that ability to continue to grow and learn. But the model stays static despite
all these conversations that it's having with people. Isn't that the problem?
I mean, don't get me wrong, right?
Like, I absolutely believe we need to address the fact that these models need to keep on evolving.
I have no doubt about that.
I just mean right now the progress in the research community that's working on continual learning
isn't necessarily connecting to the work that's going on on scaling.
Now, the models that are released, right, you know, they keep on evolving.
I would say, you know, the generative models we have today, whether it's ChatGPT,
whether it's Gemini, whether it's the Command models that the Cohere team is building.
These models keep on improving.
It's just we don't necessarily let them improve online, but we ship at different times,
like a release of a model, which has a particular characteristic.
The advantage of doing that, frankly, is you can really test the model before you put it out there.
You can put it through its paces in terms of performance, in terms of safety, and so on.
And I would be a little bit reluctant to just let the model keep running on its own,
because the learning can go very, very fast,
and you can switch out of a mode
that seems completely reasonable very quickly,
which we have seen a few times in the past.
Yeah, I think we might be thinking about
one of the same instances
when Microsoft had this bot called Tay.
I'll tell you a story.
I actually broke the news of Tay,
that Microsoft had this great bot
that spoke with people,
and I wrote the first story about it when I was at BuzzFeed.
Which is exciting.
I pinned it to my Twitter profile.
I went to sleep on the
West Coast, and I woke up with all these messages being like, hey, that chatbot that you wrote
about, the fun teen chatbot, is actually espousing Nazi ideology. You might want to unpin that tweet.
And it was because it kept learning. So, okay, maybe continual learning, you know, if it's done,
because it has to also be done with some sort of fine-tuning where you want to make sure of that
behavior, maybe it's preemptive fine-tuning. Well, let's not release continual learning until
we've achieved continual testing. That sounds like a very reasonable
plan. All right, memory. What makes it so difficult? I'll tell you one story.
My Friday co-host, Ranjan Roy, and I, we both went into Gemini on Google and
on Gmail. And we asked, can you find the first email that I ever sent with my wife? Couldn't do it.
Is it just because there are so many emails in there that
actually, like, applying AI to try to figure out what conversations have been had
is difficult? Or is it kind of a product problem from Google?
Like, why is memory so difficult, and where are we going to end up?
Like, how is the research community going to tackle this?
I mean, like it's a little bit difficult to diagnose just from your description.
I feel like I'm a little bit like, you know, a surgeon who's, you know, on the phone hearing
the description of the patient.
So have you asked ChatGPT about your symptoms?
So I won't necessarily venture a precise diagnosis for your case.
But nonetheless, I don't think it's that difficult to figure out.
I mean, I'd have to know what information the bot is pulling from, right?
Just in terms of visibility and privacy, did you give it access to all of the information it needed to answer that?
And that's the first one.
And, to go back to what we're building at Cohere,
actually, like, we do a lot of deployments on site.
So sometimes it's just a question, like we didn't activate the access
to the right information to do it.
So you need to figure out whether that access
to the right information is there.
And there's all sorts of reason
that you may not want to give the bots
access to all of your information all the time.
So that's one practical consideration.
The other one is like retrieving the right information.
And so, you know, did the query match
how the information was encoded?
Because in most of these,
you may not want to just leave the information in raw form.
It gets very expensive.
I mean, you're one person, but at the scale that some of these companies are operating,
you have to compress it, which is what we often call embeddings.
So you create like embeddings of this representation.
And so it may not have embedded the information properly.
And then there's like retrieving that information.
And maybe it retrieved like 10,000 different items and didn't rank this one close to the top.
And so it didn't generate the right response.
But it could be that it knows of it.
It just didn't show up at the top.
So there's like a few different reasons,
which makes it hard. One of them is like the access to the information, then how it encodes that
information, and then it's like retrieving the information at the right moment.
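To make the pipeline Pineau describes concrete, here is a minimal sketch of embedding-based memory: encode items once at write time, then rank them against a query and keep only the top results. The embed function below is a dummy stand-in, not any real model or Cohere API, so the rankings it produces are arbitrary; the point is only the plumbing of encode, retrieve, and rank.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Dummy embedding so the sketch runs end to end; a real system would call an
    # embedding model here instead, so results from this stand-in are meaningless.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(64)
    return vec / np.linalg.norm(vec)

class MemoryStore:
    def __init__(self) -> None:
        self.items: list[tuple[str, np.ndarray]] = []

    def add(self, text: str) -> None:
        # Encode once, at write time: the compressed representation described above.
        self.items.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        # Rank everything by cosine similarity; only the top-k reach the model's context,
        # which is where the "retrieved 10,000 items but ranked this one low" failure bites.
        scored = sorted(((float(vec @ q), text) for text, vec in self.items), reverse=True)
        return [text for _, text in scored[:k]]

store = MemoryStore()
for email in ["Trip itinerary for March", "First note I sent my wife", "Quarterly report draft"]:
    store.add(email)
print(store.retrieve("earliest email exchanged with my wife", k=1))
```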
But when this stuff works, it's pretty magical. I was just in Claude, actually, and I noticed
that Claude's memory capabilities have really improved. I was speaking, so I love to upload
the transcripts of my interviews and like, just, you know, get a grading out, like, give me a rating
on a variety of metrics. You decide. I tell the bot. You decide. Do you agree with the ratings?
It's pretty good.
The bot's giving you?
Definitely.
Okay.
Usually.
You've trained it well.
I did.
So some are good, some are bad.
I actually had Gemini do a bunch of ratings and it was like five of five on all categories.
And I was like, that is wrong.
And then I went to ChatGPT and Claude and they were actually much more reasonable about it.
But one of the interesting things that Claude did when I asked it this week, it started comparing it to the other interviews I had done.
And it said, you know, you actually hit better points on this
one. And this is why this one didn't resonate, in my opinion. And did you benchmark it with a
sample of your audience? That is probably the next. And it'll probably when it, when I, because I'll take
data out of the podcast analytics and drop it in these bots. It's going to be able to cross-reference.
So when it works, it's magical. And, you know, you've identified this as one of the areas where
AI research really needs to, you know, concentrate. And this is the cutting edge. How
good can this get? And what do you think? Do you think that it's at a moment of real progress,
or is it sort of party tricks to be able to get Claude to be able to do the things that I talked
about? The question on rating specifically, like analyzing the information and sort of distilling
some feedback? More about the memory, the fact that it can call back, memory in particular.
No, I mean, we're making good progress on that. You know, extending the context length is kind of
the easiest way to go about it. But there's quite a bit of progress that is being made on this.
Okay. Let's talk about reasoning. You mentioned reasoning as a cutting edge moment. The problem is
efficiency. Is that really the issue here? I mean, so reasoning is the model basically goes
step by step. It tries to answer, checks the answer, tries a different answer, then eventually
decides, okay, this is probably what they want, and then it spits something out.
Yes. I mean, that roughly happens this way. I think the challenge is really being able to plan at different levels of sort of temporal granularity.
So in terms of how you execute actions, let's say, you know, you're planning a trip, right? You're not going to start by thinking of like, what are the shoes that I put on to go on my trip, right? You're going to start by thinking like, roughly what season, roughly what, you know, part of the world do I want to
go visit. You start from the top level and then you take it down a notch, which is like,
okay, you've identified like a rough time, a rough place, like let's get more precise on the time
and the place and maybe the activity and like maybe who you want to go with. And then you take
it down another notch, right? And that's when you start booking your reservations and so on.
But sometimes, you know, you'll hit a blocker on the reservation and you can't get the flights
or the hotel you want. And then you'll pop back up and say, like, do I change my dates? Do I change
my place? Do I change who I go with? I'm not going to bring the kids because
then we have more options.
So we can pop back up in terms of level of resolution.
That's the part that the reasoning models don't do.
They do really well at like one level of granularity.
So you've got a robot, you give it all these like motions for the hands,
the body motions.
It can plan essentially to control the motors at that level of granularity.
But the going back and forth between different levels of sort of resolution
of action, it's really hard.
So on the technical terms, we call it hierarchical planning.
It's really hard to do that decomposition and keep the information relevant as you go
back and forth.
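As a rough illustration of that hierarchical planning idea, assuming a toy, hand-written option table rather than a learned planner, here is how coarse-to-fine refinement with backtracking can look: plan at an abstract level, refine downward, and pop back up a level when a concrete step is blocked, as in the trip example.

```python
# Toy sketch of hierarchical planning with backtracking; everything here is illustrative.
OPTIONS = {
    "plan a trip": ["winter dates", "summer dates"],   # coarse level: roughly when and where
    "winter dates": ["flight A", "flight B"],          # finer level: concrete bookings
    "summer dates": ["flight C"],
}

def bookable(step: str) -> bool:
    # Stand-in for the world pushing back: some concrete steps simply fail.
    return step != "flight A"

def plan(goal: str):
    children = OPTIONS.get(goal)
    if children is None:                  # concrete action: try to execute it
        return [goal] if bookable(goal) else None
    for choice in children:               # refine one level down
        sub = plan(choice)
        if sub is not None:
            return [goal] + sub
    return None                           # everything below failed: pop back up a level

print(plan("plan a trip"))  # ['plan a trip', 'winter dates', 'flight B'] after flight A fails
```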
Is that just a limitation of the large language model?
Because the fact that an LLM can even do this in the first place, like, again, like it
started with predict the next word.
They do it at the word level, right?
Right.
And out of the word level, you do get the higher level.
It is really impressive.
I think that's the part that probably shocked a lot of people.
They expected, back in 2023, that as you're generating tokens,
you're not going to be able to generate sort of big ideas or bigger plans.
And yet it's pretty remarkable that it does it,
which is why you get sort of different opinions, in terms of some people thinking,
like, hey, like it's already impressive.
Like, let's just keep on pushing that way of doing things.
And we will unblock this.
and other people being a lot more skeptical that you'll achieve it.
Explain that a little more.
So as it's typing, as it's, I mean, I think Andrej Karpathy basically explained that
the transformer is a computer and every time you generate a new token, you're going through
a piece of computing.
So the more you type, the bigger the computer is that you use?
Yes.
The more, I mean, the more information goes in and the bigger your representation is.
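A minimal sketch of that point, with a placeholder standing in for a real model: each new token comes from a full pass over everything generated so far, so the context, and the compute spent per step, keeps growing as the output gets longer.

```python
# Rough sketch of the "transformer as a computer" idea: every new token is produced
# by running the model over the entire context so far. `model_forward` is a stand-in.

def model_forward(tokens: list[str]) -> str:
    # Placeholder next-token predictor; a real model would attend over every token
    # in `tokens` before emitting a distribution for the next one.
    return f"tok{len(tokens)}"

def generate(prompt: list[str], n_new: int) -> list[str]:
    context = list(prompt)
    for _ in range(n_new):
        next_token = model_forward(context)  # a full pass over the growing context
        context.append(next_token)           # the new token feeds the next pass
    return context

print(generate(["Write", "a", "poem"], 4))
```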
Okay.
And so, but are you saying that as this happens, the computer is effectively
already thinking ahead? I'll give one example.
Claude, just to go back to some Anthropic research,
they published this amazing research
where they asked Claude to write a poem.
And as it's writing the first line,
it's already activating features in the model
that's thinking what rhymes with that,
which is amazing because, again,
it's technology that predicts the next word.
But as it's predicting the next token,
it's already thinking the next sentence,
which to me is just mind-boggling.
Yeah. And so, I mean, this is why, to some degree, the emphasis on code and the ability to build representations of code and generate code is so interesting.
Because when you look at code, and for people who've programmed before, the code has that structure, that hierarchical structure, it's encoded in.
Anyone who looks at a bunch of code, even if it's not necessarily a language you understand, you understand the notion of functions and variables and libraries
and so on.
And so those different levels of granularity of the project, it's encoded in there.
And so there's some hope that by training enough on code, the machine essentially like infers
these kinds of structural cues.
Fascinating.
So that, I mean, like you talked about, the fact of this technology, and this is sort of
the thing that sort of makes my head explode a little bit, the fact that this technology
is able to do these things that you wouldn't
think, given the architecture, it's supposed to do. Same with, if you think about video models and
image models. And by the way, one of your former colleagues, Yann LeCun, would always talk about how
to generate, and I know he has some criticisms of video models, but to be able to generate AI
video, you really have to be able to predict and plan what's going to happen in the physical world.
Absolutely. And there's some embedded intelligence that even leading researchers, I don't think,
fully get, that when you, for instance, ask a model, just to use Yann's favorite example, to drop a pencil,
there's so many permutations of where that can go. And now the models, without like, I mean,
without having lessons of physics, understand that it drops and maybe hits the table and might bounce up.
Yeah, because it's seen enough data from objects that are dropped that have these kinds of behavior,
but try to predict what's the behavior of a similar object dropped on a different planet
and probably the prediction is wrong because all of the data was taken with our gravity constant.
Yes, I mean, I will say as I'm talking about this,
I did just see a video generated where a man's fingers came out of a styrofoam cup as he was
holding it.
So there's still a lot of room for improvement.
Now, there is some talk, Demis Hassabis was on recently talking about how Google's video
models in some way have these world model capabilities, they do understand
the physics. And you brought up world models as another area where this technology really has
the potential to grow. It's the cutting edge, still kind of undefined, I will say. And, you know,
going back to the caveman here, I'm a little bit confused about why, for instance, like, one of the
examples that you brought up earlier was that if you want a model to be able to, like, go out and,
like, complete financial transactions
and understand the implications of financial transactions.
It has to know how the world works.
But can't you just teach that in text?
Can't you teach it?
Like, if you use my credit card and, you know, buy anything online,
I will go bankrupt, like in text or even number logic,
and therefore don't do it.
Like, why does, and I think world models is like these models need to understand gravity.
Why does a model need to understand gravity to learn these basic rules of sort of the way that the world works?
Well, and this is why earlier I sort of distinguished between like physical world models and digital world models, right?
It's possible that you can actually build really effective agents, web-based agents that don't understand the concept of gravity.
And it's possible you can build physical world models for robots that don't need to understand, you know, the functioning banking system.
And so you can define the world as being like a contained environment.
But if you want to deploy the agent in that environment,
then it does need to understand the rules of that environment quite well.
The challenge is getting enough coverage of data for all the possible futures,
right, and all the different ways that the world could evolve subject to
various events happening.
So a lot of the cases today where it's actually most
beneficial is where there's like a place for the human at the table. And I'll give you an example,
right? People talk a lot about using chatbots for customer service, right? Like chatbot
should be, you should just like plug them in. They will answer all your questions. They'll be
available 24-7 and so on. In reality, and there will be, of course, many chat bots deployed for
these kinds of cases. But, you know, like one of the use cases we've seen that works really well
is actually to have the bot, like pull together all the relevant information.
You do customer service.
You pull together all the relevant information from many different sources.
As opposed to, like, following a script, being just a chatbot.
Pull together all that information about, you know, the documents,
the documentation that accompanies the system, the case on the client,
the different problem, that description that you have.
You pull all of that together.
And then you pose a diagnostic.
And then you pose a few suggested actions.
and then you keep a human in the loop to validate the plan and to carry out the action.
And so that means that the human, you know, can, and these are more complicated cases than just
like your cell phone plan or something like that.
But nonetheless, in those cases, like what would have taken a long time, you know, maybe,
you know, half an hour to pull together all that information, distill it for a human,
now you can reduce that down to like a 20-second, you know, analyze, verify, and carry out the action.
So if you have that ability to combine the human and the AI agent,
actually you get often some much more powerful results.
And it means if your world model isn't complete, humans in the loop,
they figure out the pieces that are missing, they give that extra information,
and then you bring that information back to train your agent.
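A schematic of that human-in-the-loop pattern, with made-up function names and data standing in for the real systems: the agent gathers and distills the context and proposes a plan, but a person validates it before anything is carried out.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    diagnosis: str
    suggested_actions: list[str]

def gather_context(case_id: str) -> dict:
    # Stand-in for pulling the product documentation, the client record, the issue description...
    return {"case": case_id, "docs": "...", "client": "...", "issue": "..."}

def draft_proposal(context: dict) -> Proposal:
    # Stand-in for the model distilling all of that into a diagnosis plus next steps.
    return Proposal("likely plan misconfiguration", ["reset the plan", "issue a credit"])

def handle_case(case_id: str) -> None:
    proposal = draft_proposal(gather_context(case_id))
    print(f"Diagnosis: {proposal.diagnosis}")
    approved = input(f"Run {proposal.suggested_actions}? [y/N] ").strip().lower() == "y"
    if approved:                                   # the human stays in the loop
        for action in proposal.suggested_actions:
            print(f"Executing: {action}")          # stand-in for carrying out the validated plan
    else:
        print("Escalated for manual handling")     # feedback that can later retrain the agent

handle_case("case-1042")
```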
Then you get continual learning.
There you go.
It's there.
We're getting there.
Do you buy that the models need to understand gravity for AGI to be reached?
I mean, there are basically like a couple schools of thought: some say that you could basically train AGI on bits and, you know, letters and images and stuff like that.
And then there are others that believe, you know, you really need these models to understand, like, you know, not just the rules of poker, but like what happens when a person puts their hand on a poker table?
What do you think?
Yeah.
I mean, I tend to actually
place my bet not on the fact that we're going to reach like a single superintelligent agent,
but on the fact that we are much more likely to live in a future where there's going to be
many agents for many things. And so some agents will absolutely need to understand gravity.
You know, if we're going to have physical robots that are moving around in the world,
that are going to be hitting objects, that are going to be picking up objects and so on,
they will need to understand that. Other agents that are dealing, for example, with our digital
life may not need to understand that. And we also need to have a protocol for these agents to
interact with each other and to talk to each other. I actually think that's a much more likely
scenario rather than have the Uber agent that needs to understand everything and have a fully
encapsulated world model. There's a popular thing that AI lab leaders have been saying recently.
They've been talking about how there's a capability overhang, how the AI technology can do a lot more
than it's being used for.
Do you believe that?
Absolutely.
Yeah.
Say more about it.
Talk about what do you think
is not being done that could be done?
I see it every day.
I mean, I'll open up a little window.
One of the reasons I was super excited
about joining Cohere is because it's one of the few places
where we have a team that does research.
So I get to see day-to-day what's happening in research.
We have a team that does modeling.
So I get to see the models that we're building, look at the evaluations,
a full spread of evaluations.
And we have a product.
That product is an agentic platform that is going to real clients.
So you get to see the whole thing.
And I see something that our models can do.
And I see some things that we've built into the products.
And then we go and there's a lot of customers that are not using the full functionality
for all sorts of reasons.
So I think that between like what we have in terms of capacity versus what's being deployed right now,
there's a big gap between that.
Sometimes the reasons are capacity questions.
Like, actually,
we talk a lot about superintelligence, big models.
In reality,
paying customers want like a good trade-off
in terms of performance for efficiency.
So, you know,
we'll train bigger models,
but we'll deploy smaller models
because it gives us that trade-off.
It's like good enough intelligence to get the job done.
And I'm like, well, we could give you so much more.
No, it's good enough.
So, and it's a perfectly, you know, rational position for them to be taking.
So some of that is for efficiency reasons.
Some of that gap is also because you're going into organizations which have systems and processes in place.
And sometimes there's like a mismatch between what those processes are set up to do today
versus what would be a, I think, a, you know, a more welcoming environment for an AI agent.
So there's these kinds of things.
And then the other one is often, I think there's a lot of intelligence that is not encoded.
So the agents go, they plug into a bunch of internal systems.
They leverage all the business intelligence with privacy, security consideration.
They leverage all that information.
But sometimes there's big pockets of information that we're not leveraging right now.
And if we did, if we connected into that, then we would be able to do a lot more.
So that, like, impedance mismatch in terms of the information sharing from the organization,
or from the individual, to the AI is another case that leaves a lot of, you know,
a lot of machine intelligence on the table.
So we're going to talk about enterprise in a moment, but let me ask you one question about
how this applies to consumer.
Obviously, we talked about a lot of technology, and the vision is there within the big tech
companies to have a universal assistant, something like an Apple Intelligence or an Alexa
Plus.
You know, both of them have rolled out in their own way,
but both of them, and I guess Meta has their own product, Google has their own product.
None of these are lighting the world on fire.
Do you think that is, is this another example of a capability overhang, or is it that the
technology is just not there yet?
I think both are true.
I think, you know, people are expecting, you know, have basically been promised
superintelligence.
So, you know, they are expecting magic out of these AI systems.
It is not magic.
And so I would say, like, there's a big gap between expectations and what they can do today.
And then there's also a mismatch between, you know, what people try to do versus what might be the strength of these agents.
I compared a little bit, you know, you're working in a team.
You get a new teammate in, like, day one.
You may not know exactly what this person is capable of, not capable of.
And it takes some time working together.
And sometimes that person gets a lot better when you give them a lot more information.
And sometimes you discover they have a new skill that they didn't
have. But at the end of the day, you know, often that person isn't able to do everything
everywhere all at once. And so I think there's both, both these things are true at the same time.
Yeah, there's also, I mean, a lot of corporate politics. I just wrote this.
Of course. I wrote this story recently in Big Technology talking about how there's like these
two basic, and actually you're in a great position to talk about this or give us the real story here.
From my vantage point, there's basically two trajectories that a lot of companies are on.
The companies themselves, and I'm not talking about your customers, but if you think about, like, companies overall, many of them have struggled to put this technology into place.
But individuals are starting to see the benefits.
So you actually have like these companies with these pilots that are not getting into production, but then you might have somebody, you know, lower down using Claude Code who's like actually getting things done.
So what do you think about that, and what do you think it means if we end up
seeing that divergence continue?
I think that is absolutely true.
We see this all the time, even within our own companies.
Yes, people's ability to leverage the technology varies a lot.
I mean, the reality is we are moving towards a world where there's going to be more and
more of that technology, and so the people who have the ability to understand and leverage
the technology are going to have an edge.
Okay.
I agree.
All right.
Last question, before we take a break and go on to some more of, like, the practical
applications, some of the Cohere stuff. I still can't wrap my head around the fact that the
AI labs are so close together in terms of the technology they produce. One builds some innovation.
The next has the innovation. One seems like it leaps ahead. The next seems like it leaps ahead.
Can you envision a scenario where one of the labs just like kind of hits on something and
can actually open up a lead against the others? Or is it
just going to be neck and neck forever?
I think it's really hard to keep ideas in a box,
especially because in many ways,
these ideas,
they reside in people's heads.
And I mean,
you've seen as much as me the movement of people
between these companies.
Like,
they're always, you know,
ping-ponging back and forth.
Every five minutes.
They carry the ideas with them,
you know,
even if the code stays on one side.
Like, once you've seen some insight,
you can't unsee it.
Right.
And so, you know, they may need to re-implement.
They may need to articulate in a different way.
They may give it a different name, but ideas just circulate.
You can't keep ideas in a box.
And that's why, honestly, for many years, I've been so much an advocate for open science.
I just don't believe that you can keep these ideas boxed in unless you're willing to keep people boxed in, which we are not willing to do.
And so I don't think we have a way to close off the ideas.
We should embrace the fact that when you let the ideas circulate, all of us progress faster.
And then the question is, let's say all these labs do reach super intelligence.
You know, it's been asked, well, you can't hoard it.
So where's the economic value in developing it?
Yeah.
We're still very, very early days in the technology, and we're even earlier days in terms of, like,
what are going to be the dominant economic models, what is going to be the right business
strategy and the age of AI. I think we need to give ourselves the time to experiment.
You know, now we have 30 years or so perspective on the Internet and the economic impact
of that. And it's going to take a number of years before we figure that out. But often,
you know, those who develop the technology are not necessarily the same as those who scale
the technology versus those who actually commercialize it versus those who actually control
it and regulate it. So there's a pretty complex
ecosystem that is all going to arise out of that.
Okay.
Well, at the other side of this break, we're going to talk about some real economic impact
of this technology already.
Talk a little bit about what Cohere is up to.
And then we'll cover a lot more.
So we'll be back right after this.
Starting something new isn't just hard.
It's terrifying.
So much work goes into this thing that you're not entirely sure will work out.
And it can be hard to make that leap of faith.
When I started this podcast, I wasn't sure if anyone would listen.
Now I know it was the right choice.
It also helps when you have a partner like Shopify on your side to help.
Shopify is the commerce platform behind millions of businesses around the world and 10% of all e-commerce in the U.S.
From household names like Allbirds and Cotopaxi to brands just getting started.
With hundreds of ready-to-use templates, Shopify helps you build a beautiful online store that matches your brand style.
You can also get the word out like you have a marketing team behind you.
Easily create email and social media campaigns wherever your customers are scrolling or strolling.
It's time to turn those what-ifs into with Shopify today.
Sign up for your $1 per month trial at Shopify.com slash big tech.
Go to Shopify.com slash big tech.
That's Shopify.com slash big tech.
You want to eat better, but you have zero time and zero energy to make it happen.
Factor doesn't ask you to meal prep or follow recipes.
It just removes the entire problem.
Two minutes, real food, done.
Remember that time where you wanted to cook healthy but just ran out of
time to do it? You're not failing at healthy eating. You're failing at having an extra three hours.
Factor is already made by chefs, designed by dieticians, and delivered to your door. You heat it
for two minutes and eat. Inside, there are lean proteins, colorful vegetables, whole food ingredients,
healthy fats, the stuff you'd make if you had the time. There's also a new muscle pro collection
for strength and recovery. You always get to eat fresh. It's ready in two minutes. No prep,
no cleanup, no mental load. Head to FactorMeals.com
slash bigtech50off and use code bigtech50off to get 50% off your first Factor box, plus free breakfast for one year.
Offer only valid for new Factor customers with code and qualifying auto renewing subscription purchase.
Make healthier eating easy with Factor.
And we're back here on Big Technology Podcast with Joelle Pineau, the chief AI officer at Cohere.
And of course, this is part of our Davos series that we're hosting at the Qualcomm House here in
Davos and running over the weeks following. So, Joelle, it's great to have you. Let me give you what I've
gathered as the use cases in business for AI. And you tell me if I'm missing any, and then maybe
what you think is the most valuable. All right, I wrote four down. One is external chatbots,
the customer engagement type of chatbots, the type like Bret Taylor talked about at Sierra.
The other is internal knowledge. So let's say a company has knowledge within the company
and it's all fragmented, and maybe there's a bot that you can start to query internal knowledge.
Third is papering over systems that don't work.
I don't think that needs much more explanation.
I'm more skeptical about that, but still.
And then the fourth is automation.
Yeah.
Am I missing any big categories as far as AI in business,
and where do you think the real value or the biggest category is right now?
I think there's like different ways to slice it.
I think that's a perfectly reasonable way to slice it.
I think another way that I've seen
it sliced is between, like, predictive AI and generative AI versus agentic AI, which is like a whole
other level of opportunity. And then the other way I've seen it sliced is more by application
domains, right, like, you know, whether it's what AI is going to do in healthcare, what AI is
going to do for scientific discovery, what it's going to do in banking, what it's, you know, doing,
for example, in the public sector and so on. So that's the other way that people have looked at
the different cases,
the different classes of opportunity.
And so what do you think the biggest is?
There's so much potential.
I hesitate to pick one.
I will say,
you know, quite frankly,
where Cohere has placed its chips
and the core hypothesis
is on the case of enterprise AI
that needs really high privacy
and security guarantees.
I think
there's a big cluster of applications, which falls a little bit in the second category that you
outlined, where you know, you have a lot of internal business intelligence information,
perhaps fragmented. You want to be able to leverage all that information to empower your
employees. And so in that case, especially when that information is something that you don't
want to pop up on the web through an API, there's an opportunity to build agentic systems
that work in-house with the local data that inform the employees and are essentially
like close partners to the employees.
Can you give me like a use case or a case study?
Yeah, I mean, we do a lot of work, for example, in financial services, because as you can
imagine, a lot of that data is quite sensitive. In terms of very concrete use
cases we're seeing, one is financial analysis.
So, you know, we have people whose job it is to advise various clients,
and they need to pull on a diverse set of data.
Like what's the information that's relevant to this particular customer?
What's the information that's relevant in terms of like the current landscape,
the possibilities and so on and kind of pull all of that information
to make up like a personal plan, a financial plan for a client,
is the kind of application that this technology can make much easier.
And you can essentially then query your plan and decide,
do I have enough information?
Do I need to gather more sources of information?
and you can combine the internal with the external information,
but the output of that stays private, it stays secure,
it stays in the hands of just the people who need to see that information.
You know, I'm glad you brought that up because I was asked recently
by someone in the financial services industry,
what's going to happen to entry-level employees who were doing a lot of that,
you know, collating and pulling in the external information.
Yeah.
And I didn't have a great answer.
I, you know, because, you know, you pay entry-level employees,
less than your standard employees.
And you anticipate there's going to be some learning on the job and some productive
things.
And now the question is, what are these people going to do if the AI can do it for them?
If these entry level employees are able to use AI properly, they're skipping ahead to
the level where they can actually be fully functioning analysts and they can essentially
do 10x the job with the tools.
And so their growth in terms of their ability to deliver value to the employer has just been
magnified by giving them the AI tools. So is then the threat really to the middle, the people who
are mid-career who are going to get, I mean, it's like, it's the old story of the social media
intern who comes in and all of a sudden is managing like PR or marketing for a company. Is it the
Gen Z kid who, like, uses, who knows how to prompt and can use Cohere. And all of a sudden,
the person who's been doing things for 15 years in a certain way has to look over their shoulder?
I do think that whenever you introduce a completely disruptive technology, that is a lot of what you see.
You see the younger generation for whom that technology is native and is very intuitive and they really, you know, learn how to use it very quickly.
And that just makes them so much more effective and productive.
And folks who are not able to engage with the technology as quickly are finding themselves at a disadvantage.
I just remember being early in my career,
and maybe this is why I didn't last very long in a company
and had to go start my own.
But having the energy and wanting to do things,
if I would have had something that could build a prototype
and I could bring that to the meeting and show it
as opposed to like, can I have like a couple hours
of the developer's time to work on this side project,
that would change things.
Absolutely.
And to be honest, right,
that capability is afforded to anyone in the company, right?
It's not just the more junior staffers that have access to it.
It's also the people who are in leadership position,
which instead of writing out a memo suddenly can go out and produce a full-fledged prototype.
They don't need 10 people, 10 staffers to help them produce their prototype.
They have an idea.
They can quickly prototype it, and they send that to the team to get moving with a project.
So I think that kind of capability is going to open.
up new ways to, new ways to set up projects across the organization.
This cloud code thing has been interesting to watch.
It went like overnight from something that will, like, autocomplete developers' code
to like will go out on the internet and do things and build things to accomplish specific tasks.
So is this idea of AI systems going out and doing things?
Like on one hand, you know, I hear, I see the story of like, and I've said this on the show,
a couple times, but like, you know, the former Amazon CEO of Worldwide Consumer going out
and vibe-coding a CRM over the weekend, you know, that's cool, but I'm also just like,
you know, how real is that? So I'm curious, oh, okay, you're giving me a look like, yes,
it is real. Well, I think that goes back to my idea, right? Like those who are able to prototype
in this way, it doesn't mean that whatever you vibe-coded in a weekend suddenly turns into
a hundred-million-dollar business, right? But it's a way to communicate
with your teams, your intention. So as long as you have good ideas, you're able to share
these ideas in a way that's much more real and to start prototyping much faster. Now, there's
other ways to communicate your ideas. There's other ways to direct your teams, but that suddenly
opens up so much more. It is interesting how AI is many things, but it's a communication
technology. It's becoming that, right? And this is the kind of thing that the new coding
agents are opening up.
Does Cohere have like a version of this that it's working on?
I would say like, yes, we're working on the same kind of capabilities.
We're building core generic models.
I would say that's, it's a bit of a different experience right now that we're offering in terms of the North platform.
But there is a lot of that sort of collaborative work.
There's a lot of this, like, you know, going out and essentially deploying agents,
leveraging external information.
So there's some, there's some elements that are similar.
but we're less focused specifically on coding use cases right now.
Cohere obviously has raised a lot of money,
more than a billion dollars.
But, I'll just like draw it out,
OpenAI sneezes that over a weekend.
You have a world now where AI is being developed
by a handful of very big companies.
Your former employer, Meta, is a big player,
Amazon,
Google, of course, Microsoft, and then OpenAI and Anthropic, which raise, like, an entire year's worth of VC money in a round now.
What do you think about the risk of the fact that so much of this is being concentrated in so few hands?
Honestly, I do think it's beneficial to the ecosystem to have multiple groups who are able to develop models and to deploy them.
Just to give you a concrete example, right, Cohere was very early on working on multilingual models.
So the ability to understand information, digest information across multiple languages, 20, 30 languages, and so on.
We had a line of Aya models that is really well respected,
open-sourced and so on. It's just not on the radar of some of these companies that are
very focused on, you know, on English-centric information. Completely fine, you know,
different space for different companies. When we get into markets in Asia, when we get into
markets in Europe, suddenly it matters to have a model that is actually state-of-the-art across
languages or across the local language. And so that opens up completely new market. Right now,
the opportunities are so broad that actually there's space for, you know, up-and-coming players
to really keep on growing, to have a very healthy revenue, to bring in talent,
to actually build new things that are different from some of these other companies
are building.
So I tend to think it's super healthy to have more rather than fewer companies that are building
AI. And I think we're seeing the fact that, you know, going back to my idea of, you know,
many different AIs who do many different things, even at the company level, this is what's
happening. There's a number of players who are building different things and learning from
each other. But the fact that big tech has so much of it, not a worry?
It doesn't worry me. Okay. And I mean, you know, we could have a much longer discussion about it,
but it doesn't cause me to lose any sleep over the fact that like what we're building at Cohere
has like an amazing, amazing path to be successful.
Okay.
By the way, I mentioned Anthropic and OpenAI,
in which Microsoft and Amazon and Google have massive stakes.
And there are many more.
Yes.
Somebody who does worry: Dario Amodei from Anthropic.
Well, maybe he's not worried about the fact that he got all those billions from Google and Amazon.
But he does have some things to say about the big tech companies.
Here's the thing that he said recently.
Some of these companies are essentially led by people who have a scientific background.
That's my background, Demis Hassabis' background from Google DeepMind.
Some of them are led by the generation of entrepreneurs that did social media.
There's a long tradition of scientists thinking about the effects of the technology they built
and not ducking responsibility.
I think the motivation of entrepreneurs, particularly the generation of the social media entrepreneurs,
are very different.
The way they interacted, you could say manipulated consumers is very different.
So basically, I don't think he wants them running these companies.
Strong opinions from Dario.
Which is, I guess, not something out of character for Dario Amodei.
Do you think that's a legitimate concern?
Because it's so interesting, you're a research scientist who also worked at a social media company.
So if anyone knows the answer to this, it will be you.
I mean, I think what's really important, like, no one is going to be good at everything.
Right? The question is, like, how do you get others in the room to advise you on how to build something great?
And, you know, I spent some time at Meta. I would say there was a very strong channel from researchers to the leadership team, and the opinions were brought into the room.
I think, you know, I've seen that certainly at Cohere where, you know, the research team, the modeling team, the product team, like there's a room where all these points of view can
come together. I go back to this thought, I can't expect one person to have all of that
information. And as long as they're building up the teams that are diverse, that are listening to
these diverse voices, like they will build better products at the end of the day. Okay, on that note,
as ads have started to enter the picture for generative AI, there's a wonder among outsiders
like me about whether these companies will do things like engagement max and try to optimize
for time spent so they can get those numbers up. I don't want to ask you whether you think that's
going to happen or not, but I want to ask you as a researcher whether that's even economically
feasible. Are the models now efficient enough where like a visit, let's say you were to
show an ad, a visit to just like to serve that visit with an LLM?
could be a profitable thing.
Or is it still so expensive to serve these use cases
that this even notion of engagement maxing
doesn't make sense because economically it's not valid.
I mean, in general, right, like through trial and error,
we find economic models that are viable, right?
Like, that's still how it is.
So it depends a lot on the pricing model
and so on and so forth.
Can be some expensive ads to buy.
But, you know, it depends, you know,
it depends on how the model is set up.
So I don't know that this is the way that that gets rolled out initially.
We'll have to see what's the progression of that.
I do think we have the ability to tailor content based on the information we have.
That is there.
That is a lever that's going to continue to be used from an economic point of view.
AI sovereignty.
Before we go, countries are starting, and institutions like banks are starting to build their own models
or they're not relying on off-the-shelf
stuff. So talk a little bit, because this is something Cohere is working on. It's something I
don't know a lot about the fact that there is this push, or at least it's something that is
being discussed. So what is AI sovereignty and how is it playing out? Yeah, sovereignty has been used
in a few different ways. In some cases, it means the ability to have your own model. So in the case
of financial services and banks, that is definitely something that they spend a lot of time
investing in, thinking about looking for solutions. They see the opportunity. They were, I think,
early adopters even of, you know, previous generation AI technologies, predictive models, for
examples, statistical models and so on. And so they see this as the natural evolution. So they're
pretty advanced in terms of their sophistication and their readiness for AI. And often they have
the means to invest in it. And so we're definitely seeing a lot of interest there.
Often, though, I think the talent gap makes it a little bit harder for them.
So sometimes they've tried to build their own models and so on,
then they come to us, and they're looking for solutions
that are a little bit more mature out of the box and so on.
And so we have really solid partnerships going on there.
The other way to think about sovereignty that we're hearing a lot
is that companies want a robust plan for AI.
And so they want options.
They may be using one model,
but they actually want to have another model to be able to compare, to benchmark.
If when model access gets cut off or too expensive, they have another one.
And so there's an aspect of sovereignty that's really about building a robust strategy.
It's not about just using your own or using one thing, but it's about having control over the access to the technology.
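One way to read that robust-strategy version of sovereignty in code, with hypothetical provider names and a made-up failure condition: keep more than one model behind a single interface so you can benchmark alternatives and fall back when the preferred one is cut off or too expensive.

```python
class ModelUnavailable(Exception):
    pass

def call_model(provider: str, prompt: str) -> str:
    # Stand-in for a real client call; assume it can raise when a provider is down.
    if provider == "external-model":
        raise ModelUnavailable(provider)
    return f"[{provider}] answer to: {prompt}"

def answer(prompt: str, providers=("external-model", "in-house-model")) -> str:
    for provider in providers:            # try the preferred option first
        try:
            return call_model(provider, prompt)
        except ModelUnavailable:
            continue                      # fall back to the next option in the plan
    raise RuntimeError("no model available")

print(answer("Summarize this quarter's portfolio risk."))
```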
Yeah, as you speak about it to me, it's just amazing how fast this has moved.
And going back to our first meeting in 2022, the fact that we're,
it's 2026, so it's been three years and change.
But it's just a world of difference year to year to year.
Yeah.
So last question to you, can the pace keep up?
It is still moving very fast on so many fronts, you know, just the size of the investments.
I think on adoption, we are so early in the curve.
And so that's going to be the next challenge to see how do we enable this technology to sort of disperse through society,
through the business world in people's lives and how do we do that successfully.
But yeah, I think the pace, especially when it comes to commercialization and adoption,
is really very, very early days.
So got a long way to go.
Seriously.
Well, Joelle, we've spoken a handful of times.
I always appreciate how you're able to take these big things that a lot of us are wondering about
and ground them in the research and the practical side of things.
So you're always welcome on the show.
And thank you for coming on.
Thank you.
Always a pleasure.
Everybody, thank you so much for watching and listening, and thank you to Qualcomm for having us
here at the space at Davos and we'll see you next time on Big Technology Podcast.
All right. Thank you. That was great. Thank you so much. Thank you.
