Microsoft Research Podcast - Abstracts: February 29, 2024

Episode Date: February 29, 2024

Can how we think about our thinking help us better incorporate generative AI into our lives and work? Explore metacognition's potential to improve the tech's usability on "Abstracts," then sign up for Microsoft Research Forum for more on this and other AI work. Read the paper.

Transcript
[00:00:00] Welcome to Abstracts, a Microsoft Research podcast that puts the spotlight on world-class research in brief. I'm Dr. Gretchen Huizinga. In this series, members of the research community at Microsoft give us a quick snapshot, or a podcast abstract, of their new and noteworthy papers.
[00:00:24] Today, I'm talking to Dr. Lev Tankelevitch, a senior behavioral science researcher from Microsoft Research. Dr. Tankelevitch is co-author of a paper called "The Metacognitive Demands and Opportunities of Generative AI," and you can read this paper now on arXiv.

Lev, thanks for joining us on Abstracts.

Thanks for having me.
[00:00:45] So in just a couple sentences, a metacognitive elevator pitch, if you will, tell us about the issue or problem your paper addresses and, more importantly, why we should care about it.

Sure. So as generative AI has rolled out over the last year or two, we've seen some user studies come out. And as we read these studies, we noticed there are a lot of challenges that people face with these tools.
[00:01:08] So people really struggle with writing prompts for systems like Copilot or ChatGPT. For example, they don't even know where to start, or they don't know how to convert an idea they have in their head into clear instructions for these systems. If they're working in a field that maybe they're less familiar with, like a new programming language, and they get an output from these systems, they're not really sure if it's right or not. And then, more broadly, they don't really know how to fit these systems into their workflows. And so we've noticed all these challenges arise, and some of them relate to the unique features of generative AI, and some relate to the design of these systems. But basically, we started to look at these challenges and try to understand: what's going on? How can we make sense of them in a more coherent way, and actually build systems that really augment people and their capabilities, rather than posing these challenges?

Right. So let's talk a little bit about the related research that you're building on here, and what unique insights or directions your paper
[00:02:05] adds to the literature?

So as I mentioned, we were reading all these different user studies that were testing different prototypes or existing systems like ChatGPT or GitHub Copilot, and we noticed different patterns emerging, and we noticed that the same kinds of challenges were cropping up. But there weren't any clear, coherent explanations that tied all these things together. And in general, I'd say that human-computer interaction research, which is where a lot of these papers are coming from, is really about building prototypes, testing them quickly, exploring things in an open-ended way. And so we thought that there was an opportunity to step back and to try to see
Starting point is 00:02:38 how we can understand these patterns from a more theory-driven perspective. And so with that in mind, one perspective that became clearly relevant to this problem is that of metacognition, which is this idea of thinking about thinking, or how we sort of monitor our cognition or our thinking and then control our cognition and thinking. And so we thought there was really an opportunity here to take this set of theories and research findings from psychology and cognitive science on metacognition and see how they can apply to understanding these usability challenges of gender AI systems. Yeah. Well, this paper isn't a traditional report on empirical research, as many of the papers
Starting point is 00:03:16 on this podcast are. So how would you characterize the approach you chose and why? So the way that we got into this, working on this project, it was quite organic. So we were looking at these user studies and we noticed these challenges emerging and we really tried to figure out how we can make sense of them. And so it occurred to us that metacognition is really quite relevant. And so what we did was we then dove into the metacognition research from psychology and cognitive science to really understand what are the latest theories, what are the latest research findings, how can we understand what's known about that from that perspective, from that sort of fundamental
Starting point is 00:03:51 research, and then go back to the user studies that we saw in human-computer interaction and see how those ideas can apply there. And so we did this sort of in an iterative way until we realized that we really have something to work with here. We can really apply a somewhat coherent framework onto these sort of disparate set of findings, not only to understand these usability challenges, but then also to actually propose directions for new design and research explorations to build better systems that support people's metacognition. So Lev, given the purpose of your paper, what are the major takeaways for your readers, and how did you present them in the
[00:04:25] paper?

So I think the key, fundamental point is that the perspective of metacognition is really valuable for understanding the usability challenges of generative AI and potentially designing new systems that support metacognition. And so one analogy that we thought was really useful here is that of a manager delegating tasks to a team. A manager has to determine what the goal of their work is, what sub-goals that goal breaks down into, how to communicate those goals clearly to the team, then how to assess the team's outputs, and then how to actually adjust their strategy accordingly
[00:04:58] as the team works in an iterative fashion. And then at a higher level, you have to really know what to actually delegate to your team and how you might want to delegate it. And so we realized that working with generative AI really parallels these different aspects of what a manager does, right? So when people have to write a prompt initially, they really have to have self-awareness of their task goals. What are you actually trying to achieve? How does that translate into different subtasks? And how do you verbalize that to a system in a way the system understands? You might then get an output, and you need to
[00:05:28] iterate on that output. So then you need to really think about what your level of confidence in your prompting ability is. Is your prompting the main reason why the output isn't as satisfactory as you want, or is it something to do with the system? Then you might get an output that you're happy with, but you're not really sure if you should fully rely on it, because maybe it's in an area that is outside of your domain of expertise. And so then you need to maintain an appropriate level of confidence, right? Either to verify that output further or to decide not to rely on it, for example. And then at a broader level, this is about the question of task delegation. So this requires having self-awareness
[00:06:03] of the applicability of generative AI to your workflows and maintaining an appropriate level of confidence in completing tasks manually or relying on generative AI. For example, is it worth it for you to actually learn how to work with generative AI more effectively? And then finally, it requires a kind of metacognitive flexibility to adapt your workflows as you work with these tools. Are there some tasks where the way that you're working with them is slowing you down in specific ways? Being able to recognize that and then change your strategies as necessary really requires metacognitive flexibility.
Starting point is 00:06:34 So that was sort of one key half of our findings. And then beyond that, we really thought about how we can use this perspective of metacognition to design better systems. And so one sort of general direction is really about supporting people's metacognition to design better systems. And so one sort of general direction is really about supporting people's metacognition. So we know from research from cognitive science psychology that we can actually design interventions to improve people's metacognition in a lasting and effective way. And so similarly, we can design systems that support people's metacognition. For example, systems that support people in planning their tasks as they actually craft prompts.
Starting point is 00:07:06 We can support people in actually reflecting on their confidence in their prompting ability or in assessing the output that they see. And so this relates a little bit to AI acting as a coach for you, which is an idea that the Microsoft Research New York City team came up with. So this is Jake Hoffman, David Rothschild, and Dan Goldstein. And so in this way, generative AI systems can really help you reflect as a coach and understand whether you have the right level of confidence in assessing output or crafting prompts and so on. And then similarly, at a higher level, they can help you manage your workflows. So helping you reflect on whether generative AI is really working for you in certain tasks or whether you can adapt your strategy in certain ways. And likewise, this relates also to explanations about AI. So how you can actually design systems that are explainable to users in a way that helps them achieve their goals. And explainability can be thought about as a way to actually reduce the
Starting point is 00:07:59 cognitive demand because you're sort of explaining things in a way to people that they don't have to keep in their mind and have to think about. And that sort of improves their confidence. It can help them improve their confidence or calibrate their confidence and their ability to assess outputs. Talk for a minute about real-world impact of this research. And by that, I mean, who does it help most and how? Who's your main audience for this right now? In a sense, this is very broadly applicable. It's really about designing systems that people can interact with in any domain and in any context. But I think given how Genvai has rolled out in the world today, I mean, a lot of the focus has been on productivity and workflows. And so this is a really well-defined, clear area where there is an opportunity to
Starting point is 00:08:46 actually help people achieve more and stay in control and actually be more intentional and be more aligned with their goals. And so this is an approach where not only can we go beyond sort of automating specific tasks, but actually use these systems to help people clarify their goals and interact with them in a more effective way. And so knowledge workers are an obvious use case or an obvious area where this is really relevant, because they work in a complex system where a lot of the work is sort of diffuse and spread across collaborations and artifacts and softwares and different ways of working. And so a lot of things are sort of lost or made difficult by that complexity. And so systems that are flexible and help people actually reflect on what they want to achieve can really have a big impact here. of paper, I noticed that as I read it, I felt like this was how researchers can begin to think about what they're doing and how that will help downstream from that. That's exactly right. So this is really about, we hope, unlocking a new direction of research
Starting point is 00:09:55 and design where we take this perspective of metacognition of how we can help people think more clearly and sort of monitor and control their own cognition and design systems to help them do that. And in the paper, there's a whole list of different questions, both fundamental research questions to understand in more depth how metacognition plays a role in human-AI interaction when people work with generative AI systems, but also how we can then actually design new interventions or new systems that actually support people's metacognition. And so there's a lot of work to do in this. And we hope that sort of inspires a lot of further research. And we're certainly planning to do a lot more follow-up research. Yeah. So I always ask, if there was just one thing that you wanted our listeners to take away from this work, a sort of golden nugget,
Starting point is 00:10:40 what would it be? I mean, I'd say that if we really want GenRI to be about augmenting human agency, then I think we need to focus on understanding how people think and behave in their real world context and design for that. And so I think specifically, the real potential of GenRI here, as I was saying, is not just to automate a bunch of tasks, but really to help people clarify their intentions and goals and act in line with them. And so in a way, it's kind of about building tools for thought, which was the real vision of the early pioneers of computing. And so I hope that this kind of goes back to that original idea. You mentioned this short list of open research questions in the field, along with a list of suggested interventions. You've sort of
Starting point is 00:11:26 curated that for your readers at the end of the paper, but give our audience a little overview of that and how those questions inform your own research agenda coming up next. Sure. So on the sort of fundamental research side of things, there are a lot of questions around how, for example, self-confidence that people have plays a role in their interactions with generative AI systems. So this could be self-confidence in their ability to prompt these systems. And so that is one interesting research question. What is the role of confidence and calibrating one's confidence in prompting? And then similarly, on the sort of output evaluation side, when you get an output from generative AI, how do you calibrate your confidence in assessing that output, right?
Starting point is 00:12:08 Especially if it's in an area where maybe you're less familiar with. And so these interesting nuanced questions around self-confidence that are really interesting. And we're actually exploring this in a new study. This is part of the AI Cognition and Economy pilot project. So this is a collaboration that we're running with Dr. Clara Komobata, who's a researcher in University of Waterloo and University College London. And we're essentially designing a study where we're trying to understand people's confidence in themselves, in their planning ability,
Starting point is 00:12:35 and in working with AI systems to do planning together, and how that influences their reliance on the output of generative AI systems. Well, Lev Tankalevich, thank you for joining us today. And to our listeners, thanks for tuning in. If you want to read the full paper on metacognition and generative AI, you can find a link at aka.ms forward slash abstracts, or you can read it on archive. Also, Lev will be speaking about this work
Starting point is 00:13:02 at the upcoming Microsoft Research Forum, and you can register for this series of events at researchforum.microsoft.com. See you next time on Abstracts.
