The Vergecast - I just want AI to rename my photos

Starting point is 00:00:00 Support for the show comes from Retool. Too many companies run critical operations on duct taped spreadsheets, Slack workflows, and whatever else they could cobble together. Not because they want to, but because building internal tools means weeks of waiting on someone else's backlog. That's where Retool comes in. Build custom internal tools just by describing what you need. Prompts something like,

Starting point is 00:00:22 build me a revenue dashboard on our Salesforce data. And Retool actually builds it on your company's data, in your cloud with enterprise security built in. Go to retool.com slash Vergecast. We all need to retool how we build software. Welcome to the Vergecast, the flagship podcast of using AI models to rename all the files on your computer for better and for worse. I'm your friend David Pierce,

Starting point is 00:00:52 and this is the first in a two-part series we're doing about AI. And more specifically, how people who are building AI tools are thinking about the AI tools that they're building. Basically, we're at this moment in time where everyone who makes any kind of app, any kind of software, any kind of hardware, for goodness sake, is trying to figure out the ways to put AI into this. And on some level, I think that's silly, right?

Starting point is 00:01:19 Like, there's a lot of stuff out there that actually does not benefit from having ChatGPT shoved into it in some ridiculous way. But on the flip side, there are actually a lot of things that become better and more useful and more functional with these kinds of tools. One thing I think about a lot is text transcription. It's a simple thing, but right, OpenAI put out this Whisper model that does really good, really fast transcription of audio. That ends up being really powerful for lots of things. There's this feature in

Starting point is 00:01:53 Todoist, the app that I really like. It's a to-do list app. And they have this feature called Ramble. I think I've talked about this before, but you can just talk your to-do list. All the things you're thinking about, all the things on your mind, all the things on your shopping list. You just sort of yell it into the app, and then it will attempt to go through and structure it all and make sense of it. And there's a couple of different layers of AI in there, right? But the first one is just take your voice and reliably, successfully transcribe it. That's very powerful. There's also an app I use called mymind that is using AI to do really great

Starting point is 00:02:28 search. So instead of having to like make a bunch of notes and then file them into folders or give them tags or do any kind of organizing, you just put it all in and trust that you'll be able to sort and search and find things as you need to. This stuff can really work. So for the next two Sundays, what I'm going to do is I'm going to talk to two people who are making apps that I think are doing a smart job of AI. It's going to sound in both of these interviews like I like the products. And I do. That's why they're here. Because I think they're thinking about AI not as just something to like shove into the app to charge you more money or juice their stock price or whatever. But because there's something it actually makes possible. And sometimes it makes those things

Starting point is 00:03:14 possible in ways that are complicated and messy and privacy threatening and maybe even threaten to like ruin the vibe of the thing you're trying to build in the first place. But that also have upsides that make the things more useful and more fun and more discoverable. So we're going to talk about all of that. And my guest for this first one is Thomas Paul Mann, who is the founder and CEO of a company called Raycast. Raycast initially was a Mac app. It's now on iOS and on Windows. The way I would describe it is it's sort of a launcher and then some, right? So you use it to replace Spotlight on your Mac and it then will let you launch apps, you can use it to store like text expansion things. I have one set up so that when I type

Starting point is 00:03:59 H-H-O-M-E for home, it just immediately spits out my home address. That's a thing that lives in Raycast. You can also use it to like manage the windows on your computer and move stuff around. But increasingly one of the biggest things that it can do is access AI models. And you can use it just to chat with ChatGPT inside of Raycast, but you can also use ChatGPT to use your apps. I can go in and I can type, you know, @browser, download all of the tabs as a CSV and put it into a text file that I can then send to

Starting point is 00:04:34 somebody. And that's like a thing it is in theory capable of doing. I can open it up. I can say @finder, show me all the files that I have created in the last 24 hours. And it's actually an AI system that can use your other apps and even use your computer. We've talked a lot about AI browsers.

Starting point is 00:04:52 We've talked a lot about these sort of tools that have lots of additional context. Raycast has more context than just about any other app. I've been using this app for a long time. I really like it a lot. I have not made that much use of all of the AI stuff inside of it. So I wanted to have Thomas on to both talk me through how he thinks about putting AI into this product and also what it can do for you when it really starts to work. I really enjoyed talking to him.

Starting point is 00:05:17 I think you'll enjoy hearing it. We're going to take a quick break and then we're going to get to my interview with Thomas. We'll be right back. Support for this show comes from Shopify. Every thriving successful business has to start somewhere. A good place to start is a relatively simple question. What if, given the right tools, I've really put my all into this. One tool that can help grow your sprouting business to new heights is Shopify.

Starting point is 00:05:43 Millions of businesses around the world rely on Shopify for e-commerce. They offer a host of helpful tools you can take advantage of, from payment processing to analytics to website design. Their design studio includes hundreds of templates to help you create the exact website you've been envisioning for your business. If you're wondering, what if I need help, then no worries, because you're never left to fend for yourself. Shopify's award-winning customer support is available 24-7. It's time to turn those what-ifs into a thriving business with Shopify today. Sign up for your $1 per month trial today at Shopify.com slash vergecast.

Starting point is 00:06:23 Go to Shopify.com slash vergecast. That's Shopify.com slash vergecast. Thomas Paul Mann, welcome to the Vergecast. Hey, thanks for having me. You and I have talked many times, but we've never talked into a recorder like this. And I'm very excited to have you here. This is when we were like,

Starting point is 00:06:48 we're doing the series about people who are sort of building and thinking about AI and what AI can do, which is a conversation you and I have had versions of many times. So now we're just going to do it again, and I'm excited about it. Sounds good, yeah. Sounds like we have done it a few times already, so let's see. Indeed. So first, give me a sense of, I think you've been thinking about AI inside of Raycast for a while. And I would say just sort of rewind to like the early days of when you started thinking about how AI models fit into what Raycast

Starting point is 00:07:23 was doing a couple of years ago. Like, what were those first conversations you were having like? Yeah. So yeah, Raycast is like sort of this global search bar on your Mac, right? And actually now also on Windows. But like basically what we realized when ChatGPT came around and suddenly everybody talked about a prompt and everybody was looking for like a text box to feed in a prompt, was that we were really well positioned for that because Raycast itself is like a massive search

Starting point is 00:07:53 box, and so you can open it anywhere, and so you can just type something in. And it used to be just very static text, like, oh, you're searching an app or like a command or you want to do something. But then it felt quite natural to extend this and just put in natural language, like a prompt, and then get going.

Starting point is 00:08:11 So pretty much right after ChatGPT came out, we said, hey, wait a second, that suits us really well. And so I think the first model was GPT-3 that we integrated, which was back like end of 2022. Okay. So the thinking even then was like we can solve some of the way people talk to their computers. Because I think to me it's like the very first good thing that any of these models did was make it slightly easier to like speak in English to your computer. Was that kind of your thinking too, that like we can just make this make a little more sense?

Starting point is 00:08:47 Yeah. I think like the very first thing that people did was just like asking questions, right? And so it's kind of funny because all the way back when we started at Raycast, we were like, oh, like we're programmers, we sometimes have questions, how do I do X with that? And then we used to go to something like Stack Overflow and find those questions, right, that somebody else asked and then you like go over it and read the answer yourself. And you kind of had to be very good at basically keywords to find a proper question that leads you to the answer.

Starting point is 00:09:16 But now this whole thing got like basically flipped upside down. And so the very first thing we saw was like, oh, people just ask questions. And so they get the answers and then carry on with whatever they do. And this can be little things, can be fun things. But like it helped them to basically stay informed. And so one of the first big challenges to overcome was like, oh, but sometimes those models hallucinate and how do we get over that?

Starting point is 00:09:41 And I think then the next sort of wave was very easy. Like, oh, well, let the models do what we would do and search the web and then take that information and distill it down into the style and the tone of voice we wanted to have. So that was sort of the very first things that we've seen picking up. And since then it became more and more advanced, right? And then it was like, okay, maybe not just search the web, but search your calendar, search your files, read your files, do actions like organizing your folders and files on your Mac or like other things.

Starting point is 00:10:15 So I think like as with every new technology, people kind of adapt to like, what's possible, and then they take the next step and, like, push the boundaries a bit more and more. And given we have, like, quite advanced users, like, they're oftentimes on the forefront. And so they're pushing really hard to the extremes, and then we can kind of see what they wanted to do and then integrate it quite nicely and make it accessible to many more people. So did you have to make a decision early on about, like, how much of the stack of AI do we want to be part of? I assume there was never a question of like, should we train a Raycast LLM? But it does seem like, you know, you could build something that is essentially just a text box that replaces the ChatGPT text box.

Starting point is 00:11:01 Or you could build something on top of it. You could try to integrate with an API and be a sort of developer. There just seemed like a lot of different ways you could try to do that thing that you just described at like vastly different levels of complexity. Was it obvious to you where to land early on? No. I mean, this thing came out overnight, right? Like, suddenly the thing was there and was like, oh, wow. Like, what used to be sort of sci-fi and the movie thing was suddenly like somewhat possible.

Starting point is 00:11:36 But like, you kind of need to get started somewhere. So early on, we kind of were just playing with the APIs and then integrated them, initially just OpenAI because this was literally the only API that was there, right? There was nothing else that was even available. And so that took several months until somebody else popped up.

Starting point is 00:11:57 And then it became clear that models are all a bit different, not even talking about which ones are smarter or faster, but they just have nuances and people prefer the one over the other for oftentimes personal reasons. Like, oh, I like how this model talks to me. And so really quickly we said like, oh, it probably makes sense to integrate all the different models

Starting point is 00:12:18 because people are going to have personal preferences, and there are going to be better and worse models. Or like better models for certain cases. Like sometimes you just don't need the full intelligence. You just want to do something simple like, oh, summarize this blog post or rewrite this message. Those don't need to be the cutting-edge models. You want to have them rather fast. And then sometimes you want to have like a model that goes on for several minutes and does a bunch of research and then comes back with like a big research paper for you.

Starting point is 00:12:49 And for that, you probably need to have a better model. And so early on we said, like, okay, let's integrate with as many models as possible because we have a quite technical audience. So they will help us also guide which models they prefer. And what we now see after like a couple of years, whenever there is a new model dropping, everybody goes to the new model, tries it out and basically wants to have the latest and greatest. And then they're using that for several months until the next model drops from a different company most likely

Starting point is 00:13:18 and then they're going over to the next one. So the switching costs between those models are extremely low at the moment, at least for us, and for our users. And then I think they're building up a bit of muscle memory or even learning how to get the most out of those models and those things are sometimes a bit more tied to, let's say, a model family.

Starting point is 00:13:42 Earlier on we said like we're not going to go and build our own models. What we did, we did some optimizations on the prompt level and also some fine tuning to make the models really good in our case. And so that is, for example, making a lot of like agentic workflows. But nowadays, like a lot of the models are pretty good at that on a basic level already. What were you doing in those early days? Like that was before everybody was talking about agentic stuff. But it also seems like kind of perfectly up the alley of Raycast to figure out, okay, how do we

Starting point is 00:14:13 have this unique access to your device and your files and your data. How do we teach this model how to do stuff? Were you poking at that stuff even in these sort of early days before everybody was talking about agentic AI? Yeah, so we had this pretty early where we said like, this

Starting point is 00:14:29 makes total sense for us. Like we have this extension platform, so there are like over 2,000 extensions that are publicly available. You can integrate Raycast with Notion, Linear, Google Docs, GitHub, you name it, anything you can really think of. And also on your local computer, like it can see your files and your calendar, et cetera. And so we had all of those extensions lying around and we're thinking like, oh my God, the obvious thing is instead of like you're doing everything manually, you say what you want to do and the computer does it for you. Turns out it's a bit harder getting there, right?

Starting point is 00:15:05 But the promised land is like quite nice, right? So you flip it upside down essentially and kind of change how you use computers in the first place, because if I think about how I used to use a computer, it's like, okay, I have an idea what I want to do. So in my head, I'm kind of transpiling that into clicks and keystrokes and navigating around my computer. But now it's like instead of doing that, I just write down

Starting point is 00:15:32 or even like speak into my microphone for several seconds and then let the computer handle it for me. So we had this idea early on. Getting there was a bit harder because one, we wanted to make sure that those things work really well, so that isn't that easy. Two, you also had to figure out a bit the UX because with a prompt, where you can say anything,

Starting point is 00:15:57 that is great, but also we used to have great UI that guides us, right? We have buttons that we can click and that help us basically navigate. And now suddenly the computer pretends that everything is possible. It's a bit of a lie because that's oftentimes not the truth.

Starting point is 00:16:14 And so figuring out sort of the middle ground between when it makes sense to have UI and when it makes sense to just have an open prompt field, that was like a bit of a challenge. Well, that's also kind of an essential Raycast problem, right? Like, this is the thing you and I've talked about before, the how do you discover what this thing is able to do problem, because you open it up and it is just a text box.

Starting point is 00:16:36 You have the exact same problem that ChatGPT has, which is that you open it up and it makes clear that you can do things, but it's not super clear what things to do or how you teach it to do things. And again, this is where like all the agentic companies get really excited because they're like, you just say it and we'll figure it out for you. I'm extremely suspicious of that as a concept. But it does seem like you're sort of

Starting point is 00:17:05 stacking discovery problems on discovery problems here. Is there a way to start to push through those things? I think so, yeah. We've got to learn how to use this new technology, right? And it changes in how it behaves. If I look, for example, for a moment, at coders, right? So they're probably a bit further ahead in this adoption curve. And it's like, they're very close to the technology. So I think that's why it progresses there really quickly.

Starting point is 00:17:27 But programming used to be like you write text and if you write something wrong, that's bad, and then at some point somebody, like a compiler, tells you that that's bad, and then you correct it, right? And then we had, like, oh, we can do some autocompletions. So, we kind of know what's possible, or we can show you possibilities, and then you pick

Starting point is 00:17:45 them. And that's great. And then the first LLM use case was like piggybacking those completions and say, like, oh, maybe I can tell you a bit longer what you could write and kind of suggest you that and, like, predict that for you. And then that worked really well. And then now is

Starting point is 00:18:01 like, well, I don't even write code anymore. I just like write what I want to do and let the LLM do it for me, right? And so I think you see like sort of the pattern, which I call oftentimes prompt first. Like if you know what you do and you know the system, you get actually really good results. But you're right. Like there is a discoverability phase where you need to know what a system can do, right? And I think we had this like not that long ago when we had all the voice assistants. I don't mean like the ones we have now, but like back in the day, the Alexas and all those things, where suddenly like voice interfaces were the hot thing. And then everybody was like, oh my God, this is amazing. I can order

Starting point is 00:18:39 an Uber and God knows what. And then everybody got them and well, we all know how that one turned out, right? Like it wasn't that useful after all. So I think like the tech is obviously like much better now. But people still like need to learn how to use the tech. And that just doesn't happen overnight. And so prompting is still like a skill thing. Like oftentimes you get user feedback, oh, this didn't work. And then you look at it, it's like, well, I'm not surprised that it didn't work because

Starting point is 00:19:09 you didn't really tell it. So yeah, discoverability is something that I think is extremely important. I think as those systems become much more proactive, I think this will be better. Like when a system pushes to you like, hey, how about that? Or you start typing and it suggests to you, oh, I know

Starting point is 00:19:25 kind of what you want to do. And I know what the system can do. So I can suggest intelligently what's possible. and kind of guiding you into my direction. So one place I think that approach could be really useful, and I'm sort of surprised to have not seen more people try to do it, is what you were talking about earlier with the fact that there are lots of different models

Starting point is 00:19:44 that all have lots of different skills. It seems to me what we need is not just a sort of model switcher, right? And Raycast offers you access to lots of different models. There are lots of apps out there that are just like, we have all the models in one place. And that's something. but what I actually want is something that is like an intelligent router between the models that's like, okay, this is actually the one that is going to do better image generation.

Starting point is 00:20:08 And oh, what you prompted me is actually a huge research project. Let me funnel it to this. Like, I think the idea that we all have to understand which model is best for which thing is like ridiculous and just bad UI. And it seems like this is a thing that you're actually in theory in a pretty good position to orchestrate. Is this like a possible thing to do? Like, why doesn't this exist yet? Yeah. In fact, we started doing that, right? I think it's basically, first you kind of need to understand what are the best models for what thing, right? And like some of them you can measure, but others

Starting point is 00:20:45 are also like a personal choice. But we started doing this and noticed like, oh, there are some models that are just better at, like what you said, image generation, or some are better at agentic workflows where they use certain tools to get a job done. And some are better at, yeah, recognition of images and all this kind of stuff. So we started basically now abstracting that away. And we sort of think about it as disclosing the complexity over time. So we think the best experience is like you sort of have an automatic mode which just does what you want, right, automatically. And you don't need to worry about it. Like if you ask a question where it needs like a deep research, it does it for you. If you want to generate an

Starting point is 00:21:31 image, it picks the best model for that. But then also like sometimes when you get more advanced, you maybe want to have the flexibility. So you can go a level deeper and we want to give you still the configurability where you say, hey, I kind of know what I'm doing. So I want to have that specific model doing that job. And so you kind of like go up and down in the configuration and can pick what you want. And I think you see this in the industry, where you saw this with ChatGPT, they put out like an auto mode

Starting point is 00:22:03 and then everybody was freaking out that you can't select models anymore and then they kind of like had to broaden it up. And I think like this is something which you will see more often. Because I mean if I think about it, like we have these massively smart systems and we still need to figure out which one we need to use.
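
Editor's sketch of the "automatic mode with a manual override" idea described above. This is a minimal illustration in Python, not Raycast's implementation: the model names, the keyword lists, and the functions classify and pick_model are all hypothetical, and a real router would more likely use a classifier or a small model than keyword matching.

```python
import re
from typing import Optional

# Hypothetical model names, used only for illustration.
ROUTES = {
    "image": "image-model",
    "research": "deep-research-model",
    "quick": "fast-small-model",
}

# Assumed keyword lists; a stand-in for a real task classifier.
IMAGE_WORDS = {"draw", "image", "picture", "illustration"}
RESEARCH_WORDS = {"research", "report", "compare", "investigate"}


def classify(prompt: str) -> str:
    """Guess the task type from whole words in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & IMAGE_WORDS:
        return "image"
    if words & RESEARCH_WORDS:
        return "research"
    return "quick"


def pick_model(prompt: str, override: Optional[str] = None) -> str:
    """Automatic mode by default; an explicit override always wins."""
    if override is not None:
        return override
    return ROUTES[classify(prompt)]
```

Calling pick_model("Fix the spelling in this message") falls through to the fast model, while pick_model("Draw a picture of a cat", override="my-favorite-model") returns the user's choice. Keeping the override as an optional argument is one way to get the "go a level deeper" configurability without making everyone choose a model up front.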

Starting point is 00:22:20 As you mentioned, it's a bit ridiculous, right? So over time I think this will just go away and it just does what you want and figures it out, which is just not there yet in a way. Do you think Raycast is in a uniquely good position to do that? Because it seems to me, like, again, when I think about Raycast, I think about it as like, it's just a box with access to all of my stuff, right? And I think on the one hand, it has access to all the files on my computer, which I think is a thing that is sort of unique to Raycast. On the other hand, you have, like you said, all of these extensions and all of these apps that I'm plugging into,

Starting point is 00:22:59 and I'm like, you know, typing my API keys for all of these apps into Raycast. So like you have this access. And then on the third hand, you have access to all of these models. And so what it seems to me is, at least in theory, there's nothing you don't have to just be able to orchestrate all of this stuff on my behalf. Like, is there some big blocker here that I'm missing? Or is it just a matter of figuring out how to make all of this stuff work? Yeah, it's more the latter, just basically for us.

Starting point is 00:23:28 I guess you're right, like, we're basically in the perfect spot, right? Just, like, happen to be there at the right time, at the right place. So, yes, we have access to all of that. On top, we also kind of see what you're doing, right, throughout the day, which actions you perform through Raycast, and so we have, like, the usage patterns and those things as well. So connecting those things together is, I think, the magic sauce here. It's like what basically makes this really personal and unique

Starting point is 00:23:58 and tied to you, right? Because we're all somewhat unique in using our computers differently and we use different apps and we write differently and we're interested in different things. And so you kind of need to have this personalization layer. But we're also spending like hours a day in front of a computer and performing things, right? And so collecting those and analyzing those

Starting point is 00:24:22 and making sure that we basically can predict the next things you want to do and becoming smarter over time and having this sort of reinforcement learning for you personally, I think that's what really makes a difference. And we call this sort of the contextual AI. We talk a lot about context generally with AI, right? Like I think when you talk to people like prompting is providing context to a large language model so it can produce the best results.

Starting point is 00:24:47 But sometimes it's really hard having the relevant context. But if you happen to be always around while somebody works on a computer, you can collect a lot of that and basically become smart over time and can help steer the person in the right direction. And kind of ideally the computer knows the same as you do, right, from like what you have read and what you have consumed and building that up over time. So it can basically like take the same resources

Starting point is 00:25:16 and throw them back at you and connect the dots because it has read all the things and consumed that. And then also tie together and use the same apps and tools you use that are already connected to Raycast. How do you start on a project like this? I think a thing that I see a lot of AI developers and people building stuff with these tools do wrong is they try to do the like 100% version of the thing all at once. Right? And like I can say with great

Starting point is 00:25:49 confidence, most agentic workflows don't work. They just don't. And there are a lot of perfectly valid reasons for that, but there's also a lot that this stuff can do. And I think to me, it seems like the struggle right now is figuring out how do we sort of sequence this thing that eventually gets us to the place that we think and hope this technology is going, but that actually works today. And what I see is everybody either builds stuff that doesn't work or it ends up just being a ChatGPT text box and you just sort of offload the, does it work or does it not, to ChatGPT.

Starting point is 00:26:26 Yeah. Like, have you figured out how to sort of sequence your way to this magical thing that may someday be true, but clearly isn't yet? Yeah. I think that's the tricky part, right? Like, we've all seen the shiny demos in launch videos

Starting point is 00:26:40 and then they fall apart the moment you use it, right? And it's kind of annoying. And I mean, sometimes those things don't even ship, like they're just videos and they never get materialized, right? It's science fiction, right? And it is like, the dream, this is the interesting thing about this moment is like, I think people mostly agree on what the dream is.

Starting point is 00:26:58 But it is still a dream, right? Like, it is, it's a plausible future, but it is still the future. And I think like everybody would do well to just remember that a little more often. Yeah, I think I wonder like if it ends up being this like, you know, like self-driving cars. Like, oh, it's just like one more year. and then it's going to be self-driving, right? And then this goes on for 10 years, right? The progress that LLM's made shows a bit different the trend, right?

Starting point is 00:27:29 The trend is more like, oh, wow, we're, like, having, like, a lot of progress in a very short time, and it doesn't seem to stop anywhere close. But I think, like, it boils down to, like, sort of the usual things, like, do the simple things first, right? So try things out, because it's such a new technology. You kind of need to get an understanding of what's possible and what is not. And so it's a lot of like just prototyping and seeing kind of what sticks

Starting point is 00:27:55 and what brings real value and it's not just like science fiction as you say, which maybe works one out of ten times and then that's not going to be useful. And so I think like finding that middle ground is extremely hard. And you see like some of those things that are happening, right? So where people see value,

Starting point is 00:28:13 which is maybe not a sci-fi thing that we all dream up, but like say meeting recordings, like those happen now basically on a regular basis, right? So there's some true value in here that was not possible before. Or like just, yeah, the research case is like just consuming and finding information about topics that you would otherwise not have looked up. So there's some of those very concrete examples. And I think there's a lot more out there.

Starting point is 00:28:37 Like coding is another one, right? It makes just so much progress in such a short period of time. But it's not this like super general stuff, which I think for us is in a way like a challenge because Raycast is just like an everything app, right? You open it and then you can type something in. And so finding that middle ground is sometimes hard for us. But it's coming back to like, okay, let's see what people do

Starting point is 00:28:59 very, very often every day, see how we can improve those workflows, and then go sweat the details through the prototyping and see what actually makes a difference. And then when you find something like that, you kind of can bring it back to your users, and that usually like resonates as well because that's what people then are used to.

Starting point is 00:29:20 But yeah, it's like there's a lot of like exploration and at the end of the day everybody cooks with the same water, right? Yeah. Yeah. Is there an example of that in the product now that you can think of that feels like that sort of medium measure that you either got right or are kind of in the middle of getting right? Yeah. It's weirdly the simple things sometimes. Like, oh, you open it and like I use it for meetings, for example, all the time. There's like a word that pops up, you don't know it, you open it, you ask, you get the answer, done.

Starting point is 00:29:51 Like, it's those, we also optimize Raycast as a tool for something that you basically use like hundreds or thousands of times, right? Like, it does a lot of like little things that pile up. So it's for those short interactions. Things that we see people use all the time is like just like plain reformatting

Starting point is 00:30:09 text and fixing spelling because like, well, we're still typing all day long, right? So making those things easier and faster to do. And then people, when they get comfortable with it, they're getting a bit more adventurous, right? And then it's like, oh, I just happen to download a bunch of files. I need to move them in a separate folder and also rename them so they make more sense.

Starting point is 00:30:29 And so they then type in those prompts and see like, oh, this works as well. And then I take the next step, right? Yeah. Okay. Renaming files is actually like a perfect example of the kind of thing I want to talk about, about Raycast specifically. Yes. Because we've been talking a lot on the

Starting point is 00:30:45 show and elsewhere about the idea that Satya Nadella and Microsoft have right now that before long you're going to barely use your computer and it will sort of use itself on your behalf. Just to put my own cards on the table, I think that is not correct. At least not in any sort of near future. But I do think that there is a lot of room for like doing computer tasks without having to do the tasks. Right. And I think about like all of the things that, you know, we've spent 20 years downloading little tiny utilities to do, that were these sort of like one-off apps that are like batch resize a bunch of photos. Simple example. Or like rename these photos with all the same name in sequential order based on when I took them. Like these are the kinds of things that we do a lot on our computers that are not hard tasks and they're not particularly like mentally complex tasks. But it's like a constant part of computing life.
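
[Editor's sketch] The "rename these photos with all the same name in sequential order based on when I took them" utility described above can be approximated in a few lines of Python. This is only an illustration under stated assumptions, not how Raycast or any app mentioned in the episode does it: the function names `plan_sequential_renames` and `apply_renames`, the `photo` prefix, and the extension list are all hypothetical, and file modification time stands in for the capture date (reading real EXIF dates would need an extra library).

```python
import os
from pathlib import Path


def plan_sequential_renames(folder, prefix="photo",
                            extensions=(".jpg", ".jpeg", ".png")):
    """Return (old_path, new_path) pairs ordered by modification time.

    Assumption: modification time approximates "when I took them".
    """
    folder = Path(folder)
    photos = [p for p in folder.iterdir()
              if p.is_file() and p.suffix.lower() in extensions]
    # Oldest first; the file name breaks ties so the order is deterministic.
    photos.sort(key=lambda p: (p.stat().st_mtime, p.name))
    # Zero-pad the counter so names sort correctly in a file browser.
    width = max(3, len(str(len(photos))))
    return [(p, folder / f"{prefix}_{i:0{width}d}{p.suffix.lower()}")
            for i, p in enumerate(photos, start=1)]


def apply_renames(plan):
    """Rename in two passes so no new name lands on a file not yet moved."""
    staged = []
    for index, (old, new) in enumerate(plan):
        # Hypothetical temporary name; assumed not to exist in the folder.
        temp = old.with_name(f".rename_tmp_{index}{old.suffix}")
        old.rename(temp)
        staged.append((temp, new))
    for temp, new in staged:
        temp.rename(new)
```

Splitting the work into a plan and an apply step means the plan can be shown to the user before anything on disk changes, and the same plan produces the same result every time it is run on the same folder.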

Starting point is 00:31:45 It seems like you're in a position where I should just be able to say to Raycast, rename all of the photos on my desktop based on what they are and when I shot them and put them in an order that makes sense. Just clean up my desktop for me. Yeah. Are we almost there? Are we there? Are we nowhere near there?

Starting point is 00:32:03 Where are we? We're 90% there, I would say. Really? In fact, you can do this today in Raycast. Like, we have that, right? You can do this. And then the 90% I'd say, like, every now and then it doesn't work, right? Thomas, I'm going to try this right now while we sit here.

Starting point is 00:32:19 And it's not going to work, and I'm going to be mad at you about it. I'm scared. But so, but like it's possible, right? And you mentioned, like, this sort of, like, super-agentic OS, I think, is what you mentioned, right? So, like, where the computer does everything for you. I mean, when we reach that state, we talk about AGI, right? Then there's a question, like, why should I even open a computer? Like, what is a computer at that moment, right?

Starting point is 00:32:44 Sure. I think how we think about it is more what's an intelligent OS, what's sort of the AI OS, right? So how our operating systems will change to like adapt to this new future where everything can be smart and it's not necessarily static. So you mentioned like you maybe want to have a little app to do something. What if you could have this app just like by asking AI and it builds this little app for you? And then you have it for yourself, right? And then you use it for the job. And then the job is done and then it maybe gets disposed.

Starting point is 00:33:16 And then it's like, that's fine, right? And so it's like this one-off software, this personal kind of software that is like personal to you, but maybe also to your team or your company that is like very tailored to the use case you want. I think that's like something which is quite fascinating. As things get smarter and software maybe gets cheaper to build, I think there is something quite fascinating when your operating systems become similar, right?

Starting point is 00:33:42 So where you can just like prompt things into existence for like a short period of moment when you need them. And then when the job is done, you just like don't need to use them anymore. And then tomorrow you have a different one. Or maybe at some point you have apps

Starting point is 00:33:57 that are just appearing there as you, as you progress with your day. And it's like, oh, I saw David needs to like do certain things. Hey, here's a little app for you that you probably can use. All right. We got to take one more break, and then we're going to come back, and we're going to finish my conversation

Starting point is 00:34:12 with Thomas Paul Mann. Be right back. Support for the show comes from Grammarly. You don't need reminding that the world moves fast, but work today requires clear communication, and when every message counts, sounding rushed or generic can mean getting lost in the shuffle.

Starting point is 00:34:30 Grammarly gives you one place to think, write, and finish your work where you already write, while giving you access to agents that help you sound natural and engaging. No matter what kind of writing you're doing, Grammarly helps you get ideas done faster and move from draft to done with less friction. You can use Grammarly's AI chat to brainstorm ideas, outline a solid draft, then refine it with context-aware suggestions that fit what you're working on.

Starting point is 00:34:57 See why 90% of professionals say Grammarly has saved them time writing and editing their work. In a world of generic AI, you don't have to sound like everyone else. With Grammarly, you never will. Download Grammarly for free at grammarly.com. That's grammarly.com. All right, we're back. We're talking AI with Thomas Paul Mann. Let's get back into it.

Starting point is 00:35:27 You bring up another thing that I've been wondering about, which is, I think a thing that Raycast did really well early on was make it really easy to build Raycast extensions. Like, it's just a little bit of fairly straightforward JavaScript, and you can have something up and running pretty fast. And so you've built sort of an app store on top of Raycast in a way that seems to be working really well and there's a lot of stuff and it's pretty easy to do. Does that all eventually go away if we get agentic AI that is good enough to just go do all this stuff on my behalf and I no longer need this sort of interim step of somebody built an extension that helps go do it? Or is actually

Starting point is 00:36:04 what we need lots and lots? Like should I be using AI to build JavaScript extensions for Raycast or should I be using Raycast to just completely obviate the JavaScript extensions? Yeah. I mean, fair point. So yeah, extensions were really what put us sort of on the map because we realized really quickly, okay, people just want to integrate Raycast with everything, basically. And there's no way we can build all of that.

Starting point is 00:36:29 So we gave it out to the community and then we made it like super easy to build them. And that allowed us to like have over 2,000 extensions now in the store. So every day there is like new contributions coming and so on and so forth. But if you take a step back, what we really wanted to do is like build a productivity platform. That's sort of like what we wanted to do. And extensions is almost like an implementation detail or JavaScript itself,

Starting point is 00:36:53 but even extensions are an implementation detail, right? So imagine like those wouldn't exist for a second, but services still exist, right? You still want to do something with like Google Docs or Spotify or you name it, right? Or your files for that matter. And so the idea was always like, how can you integrate with those things really easily, something that can do the job for you. Like the solution that we did is like,

Starting point is 00:37:16 oh, people can build extensions, you can use them. But you could even equally think about it like, oh, like an AI can build something like that for you, so you can use it. And then your extension might behave differently. So the notion of extensions becomes almost a bit blurry, right? It's just like, that's evolving software in a way. And even for yourself, you're probably like just downloading some extensions,

Starting point is 00:37:39 but you haven't built them in the first place or somebody else built them. So it's not too far off, like, for you prompting an AI, to come back with a solution for you, but it's like tailored towards you, right? The key thing I think is like to make it all cohesive. Like if everything is like different and you can't find yourself around, it becomes quite annoying and not useful, right?

Starting point is 00:38:01 That's like why people prefer like apps in the first place and why apps on mobile phones won, because they're like optimized for the phone, right? They follow the same UI and UX patterns and people know how to use them. And so then the mobile app is like also kind of like more and more catering to that. And I think that's going to be similar.

Starting point is 00:38:19 Like you, to make it really useful, we want to integrate with everything around us and make it extremely easy for you to consume that information. And then also because software becomes like free in some way to create, at least like little apps, you can transform that however you want to consume it. And I think that's super exciting because we're all like slightly different and we have maybe different preferences.

Starting point is 00:38:44 I maybe want to see like a graph, whereas you will have a different representation, and like if you could just change that with like just a little prompt and then you have it your way, I think that's super exciting, where basically software becomes malleable and you can change it ad hoc

Starting point is 00:38:59 and it becomes just what you want and becomes really this personal touch and that's what I'm personally really excited about and that's what I feel like operating systems will evolve into something that is like a personal operating system to you and they're not looking all the same and software is not all the same.

Starting point is 00:39:16 They're like tailored really to the person that sits in front of the screen. Yeah, it's funny. One of the things I talk about all the time with AI stuff that I think is actually really powerful is just like simple CSS stuff for styling apps and web pages. Just the idea that all of a sudden

Starting point is 00:39:33 what I now have is the power to tell this app that I want it to be blue and it can be because that's a thing that like Claude Code can do, right? Is change the CSS to make it blue? That is a thing it is capable of doing. And then what you need on the other side is basically just the hooks that give me that tool to do. And I think what it's been before is like, okay, you have to build a bunch of complicated things.

Starting point is 00:40:02 And you have to come up with a whole like, how do we display the color wheel do we do? And it's like that's not like an impossible thing to do, but it is a thing to do. But if you just let people plug in that way, you give them all kinds of opportunities and options just by opening it up to we're going to let you build this however you want to build it. Yeah, I think that we have all the building blocks, right?

Starting point is 00:40:26 Right. But I think what I'm getting at with the extensions thing is like as you're thinking about AI and I guess just to go back to this, I want to rename a bunch of photos in a folder on my computer, which is the thing Raycast is very well set up to do. If I prompt Raycast kind of out of nowhere to just do that,

Starting point is 00:40:47 you have a bunch of tools and you have a bunch of, you know, agentic systems that will go try and figure out how to do that for me. Or should I build the thing once, like vibe code my way into a Raycast extension that renames files on my computer and then just use that over and over because now I've built a thing that is like reliable and robust and stable and it will do the same thing every time. And the problem with a lot of these AI systems

Starting point is 00:41:14 is they don't do the same thing the same way every time. And sometimes that's exciting and interesting and leads you down different roads. But other times, I just want it to rename the photos. Like, I don't need new ideas about renaming photos. I need you to rename the photos. And in the same way all the time. Exactly. So I think, and especially as you're thinking about this stuff, you're like, okay, well, do we

Starting point is 00:41:32 want to use all of these AI models in a way to like build rigid, structured things that you can then do on your computer over and over reliably or is the kind of open-endedness of the system

Starting point is 00:41:47 a feature, not a bug? And I just can't quite figure out where I land on that spectrum. Yeah, it's a tricky one, but I think like for tools, having something unpredictable, it's like a no-go, right? Like you wouldn't use, let's say,

Starting point is 00:42:03 I don't know, something complex like Photoshop and half of the time the pixel turns red and half of the time it turns blue, right? Like, you couldn't work, right? Right. And so I think that's a strong argument for software, right? So let's say, if you can generate a software once,

Starting point is 00:42:17 you don't need any AI anymore. It just works, and then it does the job perfectly all the time. I think it's like a feature, right? It's not a bug. Like, it's great. So I think, like, leaning much more towards that because that's kind of what the world runs on, right? It's software.

Starting point is 00:42:34 It's like getting written once and then you use it. Right. And you can always adapt it. Or tomorrow you say like, oh, rename the files like this way now, and then you can use this. And I think that's like something which is quite nice when you get out an artifact that you can use. And that's like what we have at the moment as extensions, right? You get this artifact out. You can use those extensions and use them over and over again.

Starting point is 00:42:56 Where we sometimes struggle with is like, yeah, sometimes those non-smart things, how you do them, just because they're so reliable and fast and become muscle memory, are somewhat better in a way. So you kind of want to find a middle ground. And I think for tasks that are very concrete, you want to have what you mentioned, like just you have an app, an extension, whatever it is, but it does the job, it does it all the time the same way, great. And there's some other tasks, and I feel like they're oftentimes more open-ended. They don't have a single solution.

Starting point is 00:43:28 They have, like, nuances to it. You don't even know exactly what you want. And those, I feel like are the ones that are really good with AI, where it just goes out and does something for you and you come back, oh, I haven't thought about that. That's cool. That's a nice solution. So yeah, I think like there is something nice about the concreteness of

Starting point is 00:43:48 software. You write it once and then it works the same way the whole time. Yeah, that makes sense. Does your quality bar have to be higher than some others because you have this kind of access to all of the apps and even like the system? Like, if you

Starting point is 00:44:04 wanted to break my computer or allow ChatGPT to break my computer, you could. But you have an unusual level of access to my computer in that sense. Do you have to treat this kind of nascent technology differently because of it? There is certainly a lot of scrutiny there. And then when users come to us, they oftentimes ask us like, oh, is AI running in the background? Can I do something? And so we had actually to put a lot of like just like even UI and callouts

Starting point is 00:44:38 into the product to say, okay, this is secure, this is not running if you're not triggering it, you're in control. So if there is a destructive action, for example, like deleting a file, you will be prompted and you can say yes or no to that. And I think that's like, that's like definitely something that we need to maybe do more than others, which others can go a bit YOLO in a way. And because we have this like system that you mentioned, like we can access your system in a very deep fashion.

Starting point is 00:45:10 And so we kind of need to build up this trust. And that's also what people expect from us. Like they used it for years already. And it becomes, well, it always works, right? It's this app that basically can never fail because it's like always there. And if you don't have it, people, like, feel like they can't use their computer anymore. And so we put a lot of effort into, like, making it, like, super stable. And so that's, like, in here the same way.

Starting point is 00:45:32 Like, if you use that, it needs to work basically all the time, which as we discussed is really challenging, right? And I think this is with machine learning and AI generally, it will never be 100%, right? This is just the technology doesn't get you there. So it's always like how far you can push it. That's why we have all these benchmarks, where all the model providers try to climb them up

Starting point is 00:45:53 and be on top of each other. But you will never be 100% correct. And for that, it's even more important to have the guardrails, right? So if something goes wrong, you can easily recover or in an ideal world it never goes off the rails, and you basically give the user the control, which is often described as like having the human in the loop, even though that feels like, again, a bit of a sci-fi term, the human.

Starting point is 00:46:15 I mean, yeah. Yeah. So do you have to be extra careful about that stuff kind of at every time? Like, does it make building Raycast harder because you have built in this AI stuff that can do so much but is kind of unpredictable in that way? I wouldn't say necessarily harder, but it's something which we think about. From the get-go, we say, like, hey, we want to build a private company.

Starting point is 00:46:43 We don't want to collect your data and this kind of stuff. So that is something that we build trust on. You just need to be smart to know what you build and maybe what you shouldn't build. And then when you build it also in an elegant way and give the user basically the choice of, like, do they want to use it? and then if they use it, give them control. You can also say, like, hey, always delete my files. Don't ask me for confirmations. That's like user configuration, right?

Starting point is 00:47:12 But by default, that's not turned on for reasons. And so, basically giving flexibility. Yeah, yeah. Full nihilism, just whatever, delete anything you want. Go ahead. See what happens. But then you also want to be smart, right? If it's like a rename that you could undo,

Starting point is 00:47:29 you don't want to, like, prompt a user for that. So this is, I think, the complexity. you may be referring to, you maybe need to think a bit more differently about certain things to make sure that the users build up confidence over time. Okay. What's something you wouldn't build? Like, you mentioned things you can and can't do because you have this kind of thing. Is there something that feels obviously over the line to you on that front?

Starting point is 00:47:52 I should watch out now what I say, obviously. But I think like it goes to like the privacy aspect. We had certain things, like for example, to give you like a sense of what we felt was like quite cool, we have this feature called Focus, and the idea of it is like basically you can block distractions like websites and other things and then it basically blends them out

Starting point is 00:48:15 and if you go there you see like a warning and so on and so forth and then initially we had the idea is like hey wouldn't it be cool to make this smart so that you don't even need to configure what you want to block, it just kind of like detects that this is probably a distraction. And then how you would do this is probably like you do a screen recording all the time

Starting point is 00:48:36 or some screenshots and then you send them out and then you analyze them and then you come back. But at the end we felt like, yeah, this is maybe stretching it a bit too far on like analyzing your screen all the time which we don't really want to do. And I think what we realized, like users probably would be very hesitant. And then we thought about using like local LLMs

Starting point is 00:49:00 for that. And then we said, like, actually, the person that sits in front of the computer kind of knows what the distractions are. The better solution is probably just letting them define it. As boring as it sounds, I feel like sometimes that's the right thing, right? Like, I mean, we have still intelligence, we can think. So sometimes maybe we can also put in what we want. So that was like just one of the things which came to mind, which we like sort of first started off like, oh, let's make this super cool AI solution. And then you ask yourself like three times why and you end up as like, yeah, maybe a more traditional solution actually cuts it here. That's such a good example because that is the sort of thing that at first glance, you're like,

Starting point is 00:49:37 yeah, it would be useful if Raycast or my system could understand the places that I'm wasting my time, right? Because it's going to be slightly different for everybody. I spend too much time on Reddit. You might spend too much time on Instagram. And if I could just be like, just delete all the places that I waste time. And it could do that. There's something that is cool about that. And there is something that is like immediately horrifying and off-putting about that. Exactly. What a lot of companies have said forever is just we're going to push through that discomfort and trust that actually if people will eventually get used to it, we've made it so convenient

Starting point is 00:50:12 that they're going to get past the ick factor of this. And I think, A, this stuff just doesn't work reliably enough yet to do that in a really sort of predictable way. And the minute I go to like my work email and my focus session is like, no, uh-uh, I'm out, right? Like, you've now broken this system. But also, I think, frankly, every developer has some responsibility here to say,

Starting point is 00:50:38 it's actually okay that we're not comfortable with this. And maybe I shouldn't be pushing you to get comfortable with this. Maybe I should be asking you to make decisions because you're a person capable of making decisions, not to get over the fact that I'm going to make them for you. And I think we're about to go through a million versions of that with all of this AI stuff. It's like, should we just,

Starting point is 00:50:56 just bet on the tech getting good enough that everybody will get used to it or have to or should we like continue to make an effort to let people be in charge of their own existence. And like this gets big and heady and existential really fast, but it does feel like we're encountering that question kind of a million times every day. And even like I just keep thinking back to this thing Satya Nadella said

Starting point is 00:51:18 about like we're not that far away from people mostly not using their computers and just directing their computers to use themselves. And I think philosophically there are ways in which that feels wrong to me. I feel like it's always sort of this value exchange, right? What do you put in and what do you get out, right? And so if it's like super valuable, people are willing to put certain things in, right? I mean, people upload health stuff to chatbots nowadays and all this kind of stuff, but they're getting something out of it, right? So I think it's always the

Starting point is 00:51:47 question like, what is the value exchange here? I think it's at this moment really hard predicting the future. If I would look back two years ago, when we basically just started this whole AI wave, right, like, would you think like the world is as it is right now where everything is AI? I don't know. Like, it's really hard to predict. Like, would you think that coding has changed that much? Would you think like whatever, pick any topic really, right? It's really, really hard to predict. And I think it's the classic, we overestimate the short term and underestimate the long term in this case. I think it's really like that. I think no idea what's going to be happening in the next six to 12 months.

Starting point is 00:52:29 I mean, everything changed so rapidly. One thing is clear that those things are here to stay. You hear sometimes, like, even if no models progress any further, we by no means have reached the limit of what you can do with even the state of the art, right? And I think that's kind of nice for everybody in industry because, I mean, before AI, let's be honest, like there was a bit of a dry phase in tech, right, where everything was hyper-optimized and nothing really radical changed, at least in the terms of software. And so now there's a lot of buzz, and like every week there is something new.

Starting point is 00:53:04 And I think like even if everything stagnates, we haven't reached sort of the limits, what we can do with all the technologies that we invented in the last two years alone. Yeah. I know it is strange that it feels like everyone is so busy. I mean, the self-driving cars thing is a perfect example, right? Like everybody is so busy trying to invent the absolute end state of this. where it's like, what if it reshaped society? It's like, no, no, no, no.

Starting point is 00:53:28 What if my car parked itself? That's awesome. Let's do that. Like, let's figure out how my car can park itself and then how my car can, like, run more efficiently. And there are like a million things along the way that are cool and exciting and powerful that don't require, like, rethinking the way an economy works.

Starting point is 00:53:45 And like, let's not skip all the steps because those are interesting things on the way to something potentially bigger. Before I let you go, let's just spend a couple of minutes talking about how you use AI in Raycast and in general. Like where does this stuff fit into sort of your day-to-day life and workflows right now? Yeah.

Starting point is 00:54:04 I think the biggest change for me is like, for me, it's prompt first now. Basically everything I do is start with a prompt. Like, well, we launched something. Okay, got to write a blog post. Let me ramble for five minutes into my microphone. And then that's my starting point. And then iterate on that. Like, that's one of the things.

Starting point is 00:54:24 Oh, I need to like answer emails, which I do a lot. Okay, I'm going to do a lot with AI here. Writing code, same way. One of the things that changed for me quite radically is that you can sort of do things in parallel in the background. Like I can just kick off a bunch of things. Oh, there is a feature request on Twitter. Okay, let me kick something off and address that right away. Oh, there is another one here.

Starting point is 00:54:51 Let me do that as well. Oh, I have this idea. Let me like kick off some deep research and figure out what's a good solution for that. And it's like, oh, I need to like prepare for the board meeting. Oh, let me put a few things together. So I think my brain is like completely rewired

Starting point is 00:55:06 and it's like I'm prompt first by now. And I basically just, like, start with a prompt and then see. Do you then, wait, I have a procedural question about that. Oh, yeah, please. Do you, if you start everything with a prompt, is the goal then to kind of filter everything out into somewhere? Or do you find yourself like living more and more of your life kind of inside the chats of these LLMs? Oh yeah, there are sometimes things that are just like inside of our AI chat in Raycast,

Starting point is 00:55:34 but like this never really sort of produces an output, right? It's maybe me like chatting for a while through something. Oh, like I have this like pick any topic I have like, oh, I want to think about how we can land a deal. These are sort of the points we have. Like what are elegant ways to like maybe continue the conversation. So how could I like find a solution to like reach our customers better? And like sort of it's almost, I think about it as like a thinking partner, like throwing things back and forth and like talk to somebody for like a bit and sharpen myself up in a

Starting point is 00:56:08 quicker fashion. That's how I use it like a whole lot. And so that's what I see also like in our company is just changing where more and more people just start with a prompt. A big change that we've seen in the company is all our designers, they code now. What used to be basically all static designs, they more and more become interactive prototypes directly in our product.

Starting point is 00:56:33 They can get something where you can feel it and see it and it works and then oftentimes an engineer like brushes it up but all our designers are basically also halfway developers now, which is an incredible change and

Starting point is 00:56:49 I think that like just it's really nice for creative people as well because there was always this barrier of like oh you draw a few pixels and then somebody else needs to rebuild them to make it interactive and so now we cross that bar essentially. It's just like it's a blend,

Starting point is 00:57:07 like if you're a creative person and you have the will, you can make things happen, which I'm super, super happy to see, that basically coding becomes more accessible in a way. And the LLMs are still failing in a lot of ways in that regard in programming.

Starting point is 00:57:23 But I think that's like something that we've seen in our company happening really heavily that designers become also developers. Okay. Yeah, I think to me, part of the reason I ask is because one of the things that was most sort of unlocking in my brain was the thing in Raycast where you can like, you can basically at mention one of your apps. Oh, yes. And then prompt it.

Starting point is 00:57:45 And it's like that, that to me is like, okay, now we are, now we are getting to like the sequence of things that make sense together, right? Where I don't now need, I don't know, I don't, I don't now need a bunch of different, very specific apps. I can just ask AI models to talk to the apps that they already have access to. It sometimes works. It sometimes doesn't. My whole cleanup the desktop thing has not worked at all as we've been sitting here. Just, nothing. It gave me a bunch of semi-helpful information about the files that I have. Got to improve it. See, that's the way. But I can do more prompting.

Starting point is 00:58:22 I'll figure some stuff out. But I think like there's just there's something that unlocks when you start to see, okay, here are kind of the things that are available to me. And you've just seen more of those things than most people. So I was curious to know, like are you just constantly doing computer activities through prompts now? Like you're starting by everything starts with a prompt. Yeah, pretty much. Like basically for me it's it's a lot of like.

Starting point is 00:58:50 I'm in a browser. I have a few tabs open. I pull them in with my @browser, essentially. I get all the tabs in. Then I start from there. Then I say, like, oh, by the way, put this in a Notion page. So then it ends up in a Notion page, and I can share it with my team. Then I iterate on the Notion page.

Starting point is 00:59:09 I do those things, like, quite a lot. But also, like, I let it write code for me to do certain tasks. Like, I had recently, a bit of a silly example, but I had to do my tax return. Well, I didn't do the tax return with AI, right? But for that, I needed to download all my payroll documents, and all of them had a password. So I was just asking AI, it's like, hey, take those 10 PDFs,

Starting point is 00:59:31 and here's the password, can you remove it so I can send it to my accountant? And it did it for me, just wrote some code. I didn't really look at the code because I kind of like know, okay, that's like what it would do. And then it's like, perfect. Otherwise, I would have spent like, I don't know, five minutes going over each PDF, first of all, figuring out how to remove a password, which I have no idea.

Starting point is 00:59:52 And so I think that's like, that's, I think, the change, which I'm quite happy about. And it's like, for programmers, this has kind of existed for a long time. We call these things scripts. It's like little things that a programmer, every programmer you ask, they have a script for various random stuff that they do multiple times a day. What if this script is just natural language? Like, what if you just say this? And then, to your point, if it solved a problem once, just reuse it,

Starting point is 01:00:17 so you can use it like many times, right? That's, I think, like, those kind of little things that will make a big difference. And that's what we do with Raycast. We want to speed up every little thing, and you use Raycast hundreds of times a day. What are the next hundred things you should do with Raycast? That's how we think about it. What are the problems we can solve that you use actually super often and not just, like, once a year or whatever. And that's like, I think, the journey we're on.

Starting point is 01:00:42 That's pretty cool. Yeah, it's like once you have computer access, the number of things that can start to comprise becomes just enormous. Yes. And you have access to the browser. And it's like, again, this is why I think Raycast is so fascinating. Because you have, you can see the whole stack in a way that is very hard to do for almost any other app. It means the trust bar for you is very high.

Starting point is 01:01:09 But it also means like, we talk a lot about, you know, these AI agents just can't see and do all the things that they need. Raycast kind of can. Yeah. I think like that's the nice position to be in, like being at this position to do all of this kind of stuff. But we've still got to connect all the dots and build up the discoverability, as you mentioned, make sure that people get it. And also make sure people get real value out of it.

Starting point is 01:01:40 I've seen so many demos of cool stuff, but then you're never going to use this day-to-day, only so little that it doesn't really play well. And so that's like, for us, really the challenge. Like, natural language is great,

Starting point is 01:01:53 but discoverability is hard. You don't know what's feasible, and so on and so forth. But yeah, I'm excited about this, basically making your computer smarter by using the same apps and tools you have, by having one AI that kind of follows you around

Starting point is 01:02:08 across your journey on your computer, and not having like an AI in every app and it's like everything is like isolated. We've been there with apps, right? It's kind of like annoying. And we don't want to spread that again, that all our knowledge, memory, context lives in each and every app. And I get it. Like, every company of those apps wants to have this, right? They want to lock you in. So you stay in that single app. It's like the financial things that they want to have, right?

Starting point is 01:02:35 They don't want to give it away. But if you purely think from a user's standpoint, AI should be on the operating system level. It just makes so much more sense to be there instead of like in every app and every app needs to rebuild it. It just happened to be this gold rush that everybody sees. But truly from a user's point, I feel like the best thing is if you have a smart operating system that helps you to get your job done.

Starting point is 01:03:02 Yeah, I agree. All right, Thomas, this has been very fun. Thank you so much for doing this with me. Well, thanks for having me, long-term listener, and finally making our way here somewhere together. We did it. All right, that's it for the show. Thank you to Thomas again for being here. And thank you to all of you for watching and listening, as always.

Starting point is 01:03:20 If you have Raycast extensions you want to tell me about, if you have thoughts, concerns, feelings about any of this, I want to hear all of them. You can call the hotline, 866-VERGE11. You can email Vergecast at theverge.com. I'm David at theverge.com. Hit us up. I think this question of how AI belongs in our software is big and fascinating and messy, and I want to know how you feel about it. So get at us, ask us all your questions.

Starting point is 01:03:45 We have another one of these coming up next week about a very different kind of app that I'm very excited to talk about. We'll get to that, but for now, the Vergecast is a Verge production and part of the Vox Media Podcast Network. The show is produced by Eric Gomez, Brandon Kiefer, and Travis Larchuk.

Starting point is 01:03:58 We'll be back on Tuesday and Friday with all of your usual good Vergecast stuff. We'll see you then. Rock and roll.
