Tech Brew Ride Home - (TWTR SPC) The Big AI Discussion

Starting point is 00:00:00 On April 4th, 2023, around 2 in the morning, a man was found stabbed multiple times on a sidewalk in downtown San Francisco. Hey, who did this to you? What happened next turned the story into a political firestorm. Reports have identified the victim as Bob Lee, the founder of Cash App. From Bloomberg Podcasts, this is Foundering, the Killing of Bob Lee, beginning April 16. Welcome everybody to the TechMe Ride Home Experience for January 12th, 2023. This is the very first recording

Starting point is 00:00:42 of the experience in 2023. We're already seeing a shit in the AI space, whether it's generative AI, whether it's conversational AI, chat GPT is everywhere. Microsoft wants to throw all of its money at it. Bing may make a resurgence, Clippy may come back,

Starting point is 00:00:58 and we're here to talk about how easy it is to actually build these things. Personally, I can say that over the last last month or two, the community on product has not slowed down. And in fact, every day, I'm seeing probably anywhere from five to ten different products launching using these APIs, using these tools, using these services. And so Brian's also been doing his own experiments. And so we wanted to get a sense from people actually in the field building on the stuff to tell us how this is changing their approach, their process, their offerings, they're just the way that

Starting point is 00:01:29 software is already being built differently in 2023. Or maybe even. Or maybe even. specifically, is it ready yet? Which I'm curious to hear about maybe from Braden's first. I was going to say, so Braden is here from Voiceflow. I just helped him launch a new sort of suite of offerings, tools on product time. And I don't know, I'm super stoked about what he's building. And I think we just get him up here. Braden, tell us a little about what VoiceFlow is, who your customers are,

Starting point is 00:01:57 what you guys have been doing, and then what you've launched just this week. So I'm Braden Ream, C-O-N co-founder of a company called VoiceFlow. I think that's like Figma for conversational assistance or conversational interfaces. It's actually a really easy way for teams to design prototype and build conversational assistance for any channel. So as Chris mentioned, we just launched WhatsApp, but we do web chat, SMS, call centers, you name it. Yeah, I mean, we're used by over 100,000 people now. Some of the biggest companies in the world, JP Morgan, Amazon, McDonald's, Home Depot. You can kind of go to our website, but a pretty large chunk of the Fortune 500 now using Voice Slow.

Starting point is 00:02:36 to design their assistance and that spans, you know, drive-thrus, call centers, web chat, conversational commerce, kind of you name it. So, yeah, that's a little bit about voice flow. And tell us about what you launched this week and how you went about building it. Sure, yeah. So this week we launched, we launched our WhatsApp integration, but I think what's probably more exciting is our AI assist feature set. So this is using opening eyes large language models.

Starting point is 00:03:04 So when you think about conversational, assistants, there's really like two sides to the coin. There's like the generative features and then there's also your assistants at runtime, right? And so, you know, imagine like a chatbot you're talking with. In voicel, you don't have tools to make it easier to build your chat box. So things like response generation, sample data generation, entity generation, all that kind of stuff. Almost similar to like what you'd see like a Jasper cell like content, um, content creation tool, but just applied specifically for conversational systems across all those.

Starting point is 00:03:34 I want to unpack this a little bit. I think, you know, again, our audience is learning about the stuff in real time as well. So one of the things that I want to point out first is what voice flow is, as Braden said, is kind of like Figma, which is a multiplayer collaborative design space for these conversational assistants. So whether it's, you know, checkout support or whether it's customer service, whatever it is. The thing about designing these types of bots is that you have to anticipate, one, on the inbound side, everything that someone might say. And then those are called intense. and then tie up those intents to sort of necessary or appropriate response. And then those responses, in order for them not to get super stale or boring or repetitive

Starting point is 00:04:15 or just sound like a robot, you need to have lots of variations. So what some of this technology allows you to do is essentially to say, okay, you know, hey, you know, whatever. Hello, Chris. Like, you know, hope you're having a good day. But then have like 50 variations generated for you automatically, as opposed to you having to come up with them and be creative or whatever. the tool starts to actually do that for you.

Starting point is 00:04:38 Is that sort of, am I getting that part right? Yeah, no, you nailed that. So, like, on the, at the creation side, it's the generation of the responses. And also a really interesting thing you can do is these intents, as you mentioned, you know, for everyone, think of like an intent in conversationally eyes, essentially like, what's the goal of the assistant? Or like, what's the goal of the user? And as Chris said, you, you pair, like, you know, an intent might be purchased. Okay, so what you need to do is essentially create these things called utterances.

Starting point is 00:05:03 and these are all the possible things that the user might say to indicate that, hey, this user has the intent of purchase. And so this used to be a huge pain creating utterances, and it used to be a manual process. So you would sit there as a conversation designer, and you would manually create all the things that you thought that a customer could say, and then you'd go do user testing, and you'd try to find out more things that a user might be able to say. And then lastly, in production, you're constantly looking at the data to see, like, what are users actually saying?

Starting point is 00:05:34 And so, like, managing these intents was a huge pain, pain in the ass, to be frank. But now with some of the generative AI stuff, what you can actually do is say, hey, give me a thousand variations of how someone might want to give us their purchase intent. And it's like unbelievable, like it's such a 10x unlock to build to essentially create this like synthetic training data for your NLU model. Like it's, it has been explosive usage of that particular feature within voice flow, is now everyone's able to quickly create, all this synthetic data without having to go to tons of rounds of user testing and actually

Starting point is 00:06:05 having to launch to production. They can basically take the scope of the internet, which is what these large language malls are trained on, to give you all the variations of how someone might actually want to purchase something. So, yeah, that's on the creation side. Are you, forgive me if I didn't hear this, are you able to tell us what platform you're working with or what tool you're using for this 10x or 100x improvement? Yeah, so we're using OpenAI as GPT3.

Starting point is 00:06:32 It's funny, we actually, before all the generative AI hype, and to be honest, we were actually building our own model. So we actually have a machine learning team at voiceless that we've had for a while now. And we were building our own model off our own data, but the problem that we faced as a tool, so we have about 100,000 users, we had like a little bit of data, but it wasn't enough to make the model good. And then the issue became the amount of data that we had, because we didn't have enough, the model wasn't good enough to actually actually then have more users use it in the way that we were comfortable with because it gave bad recommendations. And so suddenly we're in this like, you know, kind of catch-22. And so what's really nice is now using the open AI models, we're able to have a really high quality model that we can now start collecting data on and start to build our own flywheel on top, which I think is

Starting point is 00:07:19 a pretty standard model that most companies are looking to adopt when it comes to like, you know, using these base models. One of the things that's, I think, very interesting about what you just said is, like, one, can you quantify relatively the amount of data. that you had collected versus what you get with Open AI. Because one of the things that I think is seen as an opportunity is to collect your own kind of proprietary data set and then train your own language model on the dataset. But I think it would be helpful to sort of understand the order of magnitude of data that you actually need in order to go down that path.

Starting point is 00:07:54 Yeah. We had millions of conversation transcripts. And I also think it's important to know, though, that like there's like your base model and there's the fine tuning on top, right? And so we had millions of conversations, but that was for like our base model as well as our fine tuning, right? Now we can actually like we can use these base models, which produce good enough results that were comfortable pushing it in production to like, because it has a good enough user experience now. And then we can also start to fine tune another model on top of that one.

Starting point is 00:08:22 That's going to be, you know, specific to industry, specific vertical use case, whatever it might be. So yeah, before we just didn't have the base model, but now we have a really strong base model. we can actually start to fine tune our own models on top of it. Would it have been possible for you to, over some period of time, and I don't know if we're talking about like a year or like a thousand years, to generate a similar size language model that Open AI provides you with the API? Because, again, we're trying to sort of assess the step change that these new APIs that are now becoming publicly available

Starting point is 00:08:53 are offering to the industry, right? Like, if you had done the same thing that Open AI did, how long would it have taken you as voice flow to get there? I think it would have been impossible because... Okay. When you think about, it's about the variety of data, right? Not just like, like, you know, with our customers, generally speaking, it's going to, it's going to have a tighter clustering than like the width of variety you're going to see on the internet, right?

Starting point is 00:09:15 Which is what these large language models are trained on. So like, even if it was a million years, I bet you at a certain point you just have like, you know, decreasing marginal returns. And like you're just not getting, you know, the richness of these variety, right? Exactly, which is what. Okay. That's what makes these large language models so special is that. They have such a wide variety of data that's being trained on that has broader understanding of the world. Then, you know, like all we can do and sort of train on that voice flow.

Starting point is 00:09:39 And the reason why you need that is because when you have an open, like this is the crazy thing, right? Like you might think that if you're building a conversational or customer service bot that humans will communicate in a fairly standard way. But it turns out that actually we're fucking insane. And we'll just say all sorts of random things in a box. And then especially if you're a brand, you need to respond in a way that like, one, doesn't like tank your brand and lead to lots of screenshots floating around on social media about how you, you know, invoke Nazis or something in response to something that a customer said. But that you can kind of nudge the conversation back to the, as you said, like the customer's goal, right? So the customer might be like talking about the weather all of a sudden.

Starting point is 00:10:19 And something is like, oh, that's nice to hear that, you know, there's flooding in California or whatever. But let's talk about this return that you want to process. So how does that, I guess, like, factor into this? Because I think that's the thing that's different, right? You said it would be impossible for you to generate a language model the size that you would need to handle all the variations of what people say. But now you can actually take those things from the breadth of the full Internet or from the full web and then apply them to actual customer problems. Yeah, totally. So I think the way we think about this at voice flow is conversational assistance have essentially like when using large language models, there's like a four layer model.

Starting point is 00:10:55 This is just a conceptual framework. Like you want to have what I would describe is like your first layer, which is your base model. And that's the large language model, you know, from an open AI or whatever vendor you want to use. And that is like the knowledge of the world. So imagine you were training like a Starbucks barista. Like what makes conversation really fluid is when you have a human who has knowledge of the world so that like you can throw at them any conversational experience and they'll be able to handle that, right? Because, you know, they have that knowledge. Now that's what conversational layout has missing and now has been added.

Starting point is 00:11:26 The second layer, though, is the knowledge about your business, right? Because you can't just take anyone off the street and throw them in a Starbucks apron and expect them to be able to take orders or work at Starbucks. You could, but it wouldn't go that well. Totally, right? And then you have, so, you know, those are your two models, right? You've got like knowledge of the world and knowledge of your company or knowledge of the domain, whatever it might be.

Starting point is 00:11:48 Then you have your third layer, which is like, what are the objectives, right? Because like, it's one thing to know about Starbucks, it's one thing to know about the world. It's another thing to know, like, what is your job, right? Because that might change between a barista and a manager, right? They have different objectives. And so that's your third layer. And then lastly, is like knowledge of the customer. And so these four things in conjunction are what allows these assistance to be fairly valuable.

Starting point is 00:12:09 Because if you just plug in like a chat GPT on your, you know, your own brand's website, you know, it can chat about anything. And that's awesome. But it doesn't have goals, right? It doesn't have like the ability to like loop the conversation back to, okay, hold up. This is like so Westworld, by the way, you know, where there's like sort of agents and they have like, you know, goals. Anyways, yeah. Totally.

Starting point is 00:12:27 I mean, what's cool is like at voice flow, you know, we, I think we're one of the first companies to do this. We have, like, our customer support assistant is running on a large language model. And you can actually go try it out. And what you'll find, you know, at least we hope, you know, it's going to get better and better over time is if you say like, you know, okay, you need to book a demo or you need, you know, to get product support, whatever it might be, that those are the pre-program goals that we want our assistant to do.

Starting point is 00:12:50 And that's just from looking at customer transcripts and seeing like what should the goals of our assistant be. However, if you start to go off the beaten path, like, you know, today I think I asked it, like, you know, who scored, you know, who scored the most points in the 1990s, like NBA season? It can give you that answer, but then it's going to be like, okay, but back to the topic, what support do you need, right? Like, what do you actually want to achieve in this conversation? And, like, that's what's truly valuable is when you start to pair a very rules-based deterministic dialogue manager, which is essentially all those intense and those goals, and you pair that with a large language model where it can handle, like, the fluidity of conversation. pairing those two things together is what gets really, really special. Braden, we know you have to go.

Starting point is 00:13:29 So before you do, can I squeeze in the question I'm most interested in, which is we'll get to my experiments with this later. But I wanted to find out, like, is this ready for prime time? Like, are these tools, can you really start to build a business on them? So, number one, it sounds like there are, like you can do that. It's easy to plug in and do. real things with it. Is the pricing good for you? Just on a high level, building a business on top of a tool like this, is it ready for prime time? What is your experience?

Starting point is 00:14:05 So Voiclo has been around for four years. I think we've raised like 20, 24 million dollars or something. We were a business before large language models. We view it as very much additive, right? This is a tax improvement to certain parts of our business, but we're not a business that was enabled purely by large language models. Okay, well, I'm sorry. I didn't mean to hope you don't take that the wrong way. What I'm asking is, is anyone listening that might want to be like, hey, can I start filling around with these tools,

Starting point is 00:14:35 but not just fiddling around, not just running experiments, actually building a business, that's what I want to know. Are we there yet or are we getting there? Like, what's the state of play? Yeah, I mean, these are ready for prime time. Like, we were building our own models and we quickly switched over to other vendors models because we realized it was ready for prime time. Right. We've sort of been watching the space. And it was, you know, to be honest,

Starting point is 00:15:00 the big unlock for us was when we saw chat GPT, same as everyone else, because you start to realize, like, hold up, I'm not just generating off prompts anymore. I'm able to wrap the conversation with context. And that was when we're like, okay, it's, you know, these live to conversational AI, because you can actually hold conversation context, a passive context, and be able to get, you know, more contextual responses. So, yeah, I mean, it's, it's ready to go. Without, without revealing it, the pricing makes sense. Like, you, the pricing is where you could also build a business and still have margins and

Starting point is 00:15:31 things like that. Oh, totally. I mean, like, the value, you know, when, when you look at our customers, they're not paying for the large language model. They're paying for voice flow of which the large language model is a component, right? It'd be different if our service was a large model and we had to have a margin on top of these things. Right. But like we've, you know, we've been building voice as a platform for years now.

Starting point is 00:15:54 And like the vast majority of the values, you know, it's real time collaboration, it's commenting. It's like all those workflow features. It's not necessarily the large language model itself. And I think like, I think basically the closer your functionality gets to mimicking a large language model is where you start to lose your margin power. But if you are purely like, you know, it's additive. Like, I don't know. Like, let's say you're like, Aurora or something. Like, you know, like the large language model is not your entire service, right?

Starting point is 00:16:22 There's so many like peripheral features that customers are paying for. You're just making it better. And I will also say too, like they're really not that expensive. If you're passing in, like you have to be very thoughtful on how many tokens you're passing in. It's like, for example, an utterance generation, which is that like synthetic data was, the sample phrases I was talking about before. Let's say a customer comes in, it's very common to have like a thousand sample phrases to train in and test, right?

Starting point is 00:16:46 If you pass in all 1,000 sample phrases, you're going to be burning a ton of tokens in order to generate the next thousand. But if you take a sampling of maybe 15, and you're very smart on how you do that. And then on the front end, so not on the language model, you're then removing duplicates. Like, that's like, you know what I mean? Like you can get really smart about how you're actually,

Starting point is 00:17:06 how much data you're feeding it. in order to essentially maximize the value that you can get and not burn a ton of money. So, yeah, that was just one example, just, you know, particular synthetic data. If you don't pass it a ridiculous amount of context, you pass a representative amount, you can get a lot more bang for your buck. How is your strategy changing with regards to product development now that these APIs exist? Like, I guess I'm wondering how much code are you deleting versus adding in support for new functionality and new capabilities because now you have access to like, you know,

Starting point is 00:17:35 these superpowers that didn't really exist in a way that, you know, would have required you to build them out yourself. We're not deleting any code. The way we view it is like our customers were basically doing a lot of this stuff manually before and now they're able to automate a lot of it, right? So large language models are going to change everything, but it will be slower than people think. It's like, it's like really, why is that?

Starting point is 00:17:56 It's a lot of it comes down to like the hallucinations, you know, that's sort of like the word that people like to use. These large language models have understanding, but they don't have like reasoning. yet. That is severely limiting. So I think, you know, the phrase I would give everyone is right now, the current state of play, and this might change, by the way, in two months or three months, you know, things are moving, things are moving fast. And so anything I say could be, you know, maybe we're done it quickly. But like large language models, at least for a lot of enterprise and business use cases are frankly a matter of curation over creation, right? You know, before on voice

Starting point is 00:18:27 slow, customers used to have to sit there and like create, right? They used to have to create a hundred different sample phrases. But now they're curating. Generate me a thousand. I'm going to pruning it down to the best 100. And so the jobs are changing. But until we have reasoning, you still need that human to do the creation side, or sorry, the curation side of it. Okay. So I think actually this creates a perfect segue to get Sean into the conversation because what you were just alluding to is something that I've started to observe, which is a type of like prompt engineering. Oh, yes. I want to get to that too. Brayden, by the way, you're free to go at voiceflow.com. But stick around if you want. I'm not kicking you

Starting point is 00:19:04 Well, but yes. I was going to say, yeah, thanks, everyone. It's been a pleasure. Follow me on Twitter. I'm basically a voicemail billboard. That's awesome. Thanks, great. Thanks for showing on up, Braden.

Starting point is 00:19:14 Yeah, so, I mean, I think, like, what it was alluding to there at the end is, and I think the way to sort of wrap your mind around this. And I've been thinking about this, you know, he obviously used the concept of, you know, Figma for, you know, conversational services. If you imagine this actually is literally being something that you might integrate with Figma itself, and that is actually happening. You can imagine if you're designed. let's say an e-commerce platform or app, right?

Starting point is 00:19:40 Previously, you would use things like Laura Mipsum, which is just kind of gibberish text, or you'd use kind of random-ass unsplash product images for your demos. Now, both through generative art APIs and through GPT, you can synthesize like entire collections of products and images and multiple images that can power, you know, essentially demos and app designs while you're in prototyping mode that give you a much

Starting point is 00:20:09 real or much more crisp way of interacting with, you know, sort of your designs ahead of time. However, that requires you to be able to have these magic incantations to tell these artificial intelligence APIs the types of things that you actually want as a result. And that is a whole new area of, again, sort of engineering, or maybe it's just trial and error that are starting to create communities around sharing these prompts that cause certain things to come out the other end. And I think, you know, maybe this is something, Sean, if you want to jump in on this end, you could talk about how you've been kind of exploring the space and how to get better or worse results as a result of this.

Starting point is 00:20:48 By the way, Sean, please introduce yourself because we said offline, I know you as at SWYX for years. And I did, I had to ask you, is your name, Sean? So please tell us anything about yourself and then go right at. Sean is just, uh, Swix is just my online moniker because I grabbed the four letter Twitter handle back when you could get four letter Twitter handles. Apparently they've, uh, turned it off now. But, um, hi. Thanks, thanks for inviting me. Um, so Brian, like, yeah, you, you've, you've seen me around because I was a huge fan of the internet history podcast. Uh, and it's funny because you did the, uh, distant past and now we're doing the very, very, uh, presence. I, yeah, I like to say sometimes to people that, um, I think I even put this on some ads and some, um, the internet history podcast if anyone downloads it. Hey, if you listen to the TechMeme Right Home, you can hear me talk about history today as it's being made. So yeah, thank you so much.

Starting point is 00:21:40 I'm just saying we seem to see transition from history to news. But yeah, so I've been a listener of the TechMead Right Home from day one, and it's a pleasure to be on. And my path is, so I come at this from the point of view of I'm a developer. I previously was in finance, I was in Hitchf, funds for six years and then switch to development mostly to start to try to build my own products and then eventually got sidetracked into working for infrastructure companies like Netlify and then AWS and I currently work at a data infrastructure company. I think this AI wave is huge.

Starting point is 00:22:20 It kind of caught me off guard because I kind of was observing the progress in AI and I have made comments in the past where like, you know, this is sort of the Moore's law of our time. Like whatever, there's kind of no name for this right now, but the exponential progress that we're seeing is following the sort of power scaling laws that we used to see in semiconductors, which is kind of tapped out these days. So roundabout last year, a few months, let's see, about five months ago, I started tracking all this stuff and I've been writing pieces, but also tracking a lot of research in my AI notes. And that's the main thing that people have been using me for because a big part of how I learned stuff is I learned in public. So I have a huge sort of repository of like just notes on everything and sort of organized in that sense. And yeah, I think the proximate cause for why you invited me on today was because I published this piece on reverse engineering for Notion AI. So maybe I'll introduce that context, unless there's anything else you want me to talk about.

Starting point is 00:23:26 Yeah, go go right ahead. Yeah. So, you know, I think this conversation with voice flow is indicative as well. Right. Like is AI a feature or a product? And it seems like it is actually kind of a feature. Like a bunch of companies are just building it in. You know, they're just kind of steamrolling over this new trend.

Starting point is 00:23:47 And it's it's kind of hard to differentiate yourself because all these research and all these APIs are, essentially available to anyone determined enough to figure it out. And so Notion has done it. And pretty much the only proprietary thing are the prompts. And I actually, I was digging through sort of the voice flow, Twitter feed. And I saw one of their developers talking about how, like, you know, their intellectual property is prompt design,

Starting point is 00:24:19 which is prompt engineering. But like, you know, a recent, pretty, pretty recent discovery. from Righty Goodside, who now works at Scale AI, is that you can pretty much ask the language model to tell you the prompts and therefore leave the effect of it. And prompt injection after SQL injection, which is a pretty much a similar thing in the traditional development world. And this was primarily theoretical until someone actually tried hard enough to do it on a live thing. And that was me on Notion AI. So Notion AI launched like a month ago.

Starting point is 00:24:56 I got access a couple weeks ago. And then I took two hours to get the entire source prompts of all of Motion AI. And I posted on Hacker News. And funny enough, the notion people were actually on there. And they said I got one wrong, which is the hardest one. There are some prompts that are more difficult to pown than others. But I think I've pretty much got most of them. And yeah, I think that illustrates the difficulty of building a business on top of AI.

Starting point is 00:25:26 First of all, if you don't own the model, you're building an API, you're building on top of an API, then you're right. Is the prompts, and the prompts can get potent. So what are you building really? It's user. But, okay, but let's set that aside for a second. Let's set aside the, because you could say that for anything. Building your business on top of another platform, you never are on firm ground or whatever. But when I ask, is it ready for prime time?

Starting point is 00:25:59 Do you feel like it is for making a go of it, right? What are we making a go towards? Building a business. All right. Okay. Let me do this. Let's jump into my experiments this past weekend because I did two YouTube videos, which I mentioned on the show.

Starting point is 00:26:23 And the reason is because I've been hearing about these things, and like everyone else I've been playing around and doing experiments and oh ha ha look at what it returned to me but I was like okay what as much as I'm loath to admit it what do I'm a content creator right so I wanted to see how far I could get creating content right so I did it was all with chat GPT

Starting point is 00:26:47 I did two topics that I had recently gone down rabbit holes with where it was it was Robinson Caruso based on a real story it is except I can't remember the guy's name and then the other one was the the Crystal Palace that was built for the the great exhibition of what was it 1855 or something like that and so everything in those videos in terms of the the voiceover that you hear and the content came from chat gbt the biggest part of it in terms of the workload was going to google images to find the images to put

Starting point is 00:27:24 into the video, which, by the way, that's another topic we could get into, like, perfectly positioned. You didn't even generate the artwork? Well, no, look, this was an experiment. I could go down that road to see if I could also. Although some of the tools I was using was suggesting, if you watch the one about the Robinson Crusoe, like those videos of the island and stuff, those were suggested by that particular tool. And I was using many tools.

Starting point is 00:27:48 Okay. But so this is the point. And I know I'm going on, but I'll get to in a second. First of all, I would describe sort of kind of what Sean was saying, too. Like, working with chat GPT is sort of like working with clay if you're wearing oven mitts. So it's like if you say, was Robinson Crusoe based on a true story, give it to me in 3,000 words. Well, it won't do that. It'll give it to you in 500 words, and it'll do it sort of like a high school essay where it'll have a lot of

Starting point is 00:28:21 of flowery words at the introduction and flowery words at the conclusion, but like the stuff that you want, like facts and stories, in the middle, it'll limit you. So you have to know the story and you have to say, okay, you just summarize this in 500 words. Tell me how he survived on the island. And then it'll give you another couple hundred words. Then you say, how, what happened to him after he was rescued, you know, et cetera, et cetera. You have to basically brute force, make it give you the detail, if you know the details ahead of time. The thing I think that what you're describing is, and again, if you just sort of understand a large language model, really it's a set of probabilities.

Starting point is 00:29:01 You know, it's a graph network and there's kind of, and maybe I'm oversimplifying and, you know, obviously I'm not as technical as I could be about this, but you're thinking about this like mesh of relationships between all the words in, you know, this grid of words, to sort of quote Alan Watts. And essentially there's a probability of what's going to come next. And essentially these things are all interrelated and sort of self-referential, et cetera, et et cetera. So you have to ask a question for which there would be a probabilistic set of kind of follow-ons as opposed to like an open-ended, you know, what is the purpose of the universe or something? And it'll hallucinate an answer, right? What I was not able to do was say, give me 3,000 words

Starting point is 00:29:42 on whether or not Robinson Caruso was based on a true story, right? And probably they're putting those guardrails there on purpose, right? Maybe the technology is already there. But again, the analogy of working with play with oven mitts on, I'll give you another example. I'm reading a book about the making of 2001

Starting point is 00:30:03 a space odyssey, the movie, right? And so now, as I'm reading that book, I'm thinking of how, okay, what if I wanted to generate a six-part podcast series about the making of 2001? I would have to,

Starting point is 00:30:18 to ask chat GPT for, you know, Kubrick's career before 2001, and then a separate query for how, like you have to tease out if you want it to give you the details that you're looking for. Do you know what I'm saying? So in a way, you kind of have to already be an expert in the thing. And so I'm going to finish up here by saying, as I was doing this, I'm not going to do this, but I was imagining, what if I started a business where I create things? things that are like history or explainer content things, right? Well, I know now, if I know the topic and I know how to write the correct prompts and

Starting point is 00:31:02 queries, I could create entirely through chat GPT enough content that then I could feed into these AI speaking tools, text to speech tools, that I could, a generate, let's say six episodes of a half hour long podcast on how 2001 of Space Odyssey was created. I could then also do the same thing with a YouTube video as long as I'm willing to take the time to find the images to sit with that. So again, I was going through the lens of a creator and I was thinking, okay, this is pretty far down the road. It's not there yet. But I was interested to see that, like, yeah. And also the interesting thing was, again, you can't just have the AI do it for you.

Starting point is 00:31:54 You kind of have to know the topic ahead of time to be able to prompt it well enough to get it to do what you want. Does that make sense? I mean, I think what you're, the funny thing about how you're approaching this is sort of procedurally how you as a creator might use this to generate, you know, content on a number of topics that you might be interested in. as opposed to imagining this from, and maybe I don't want to go too deep into the nefarious end or dimension of this, but you brought up SEO today on the pod. Yeah, there's so many different angles to this. Yeah, go on. And I'm very, very concerned. Like, let me just, I'll step back a little bit as a way to address kind of what you're saying, right? Because you're kind of taking this approach, which feels, you know, relatively, you know, positive, like productive.

Starting point is 00:32:37 Like even if you were to do this, let's say for your kids to, like, teach them something. And you have a certain way that you'd like to teach them Robinson Caruso or something like that. you can have a collaboration with one of these AI models, whether it's ChatGPT or something else. I've seen a number of launches on Productine of kids' books that will use generative art to bring in kids into the stories and regenerate, you know, like the Swiss Family Robinson or something artwork to include and incorporate members of the family once you've trained it, you know, on the faces of the family, right? Like, that's cool, that's interesting, that's creative, that's novel.

Starting point is 00:33:10 It makes media somewhat something like more self-expressive. On the flip side of that, there are some nefarious directions that this is obviously going to go in. And I think it's, I mean, it's already happening. I think you mentioned the story today about how CNET has been using AI to generate articles. As I mentioned, the SEO thing, which is, I hadn't thought of that. I don't know why because I've been finding an SEO battle professionally for 25 years now. But that's got to be happening and has been happening. We know that. I've fought those battles.

Starting point is 00:33:42 It seems to me that like Forbes is going to become like the first ever completely AI generated publication because I don't even want to say respect because some of their articles are just shit. But like, you know, to the degree that humans are applying any of their abilities to write these articles that are just clickbaity and full of nonsense, right? Like that is going to be completely overrun by essentially these AI content, you know, farms. And the question is how do they get better and better? And they're going to be super optimized just like TikTok. is to whatever people click on and whatever people respond to. So this raises a very interesting and important and somewhat profound question. I think in maybe we're sort of like zooming out now and Sean, I apologize.

Starting point is 00:34:22 But like to the question of what Google is going to do and to the hegemony essentially that Google has over people's ability to use the internet to find information. Because it feels to me like my relationship with Google and the trust that I have with Google has been eroded over years and years of commercialization and essentially kind of like content farming of the internet, moving it into this kind of like jail slash mall experience where everyone is trying to manipulate me to end up on their site to subscribe to some newsletter so they can continue to pummel me with information to eventually arrive at a sale of like a $15 or $20 item. And now if you put AI on the same goal, just like Brayden was saying, you know, if the goal of on the other hand, not the consumer, but the business is to convert someone to become someone who hands over their hard-earned dollars to buy something. And now we have an entire advertising apparatus, which is run off of or essentially is powered by a lot of these AI capabilities. the human mind, I think, is not necessarily, not that we aren't, but not necessarily fully capable of being able to distinguish and to deflect the onslaught that seems to me inevitable unless we put either some breaks on the system or, you know, like awareness of when these AIs are being used to generate content and to manipulate us into certain outcomes or ends. Yes. Yes. And, Sean, if you...

Starting point is 00:35:54 If you have thoughts on that specifically, please jump in here. If not, I have a question to bring it around that I want your take on. I am actually working on a piece about Google versus AI essentially. So I have a great, the great title is should Sundar have called the Code Red? Oh, tell us. Tell us, do you think you should have? Him personally, like he's actually taken over product management for Google search because this is such an existential threat for Google.

Starting point is 00:36:27 I mean, that's it, but like, you guys touched on a lot of elements, and I don't know where exactly to start. I want to shout out a couple of things from other people that I've been sort of posting up on the Jumbotron here. I think people call it a Jumbotron, right? Whatever. But I like Jumbotron, actually. Yeah, yeah. Yeah, the share notes.

Starting point is 00:36:48 So Benedict Evans had a really good take on this AI versus Google debate, which is like we use search for a number of things and you know we're we're primed to search on one from multiple dimensions and and these sort of chat interfaces will take part of that use case but not probably not you know a good chunk of the other of that google is already very good at and also if you want to have a glimpse of the future there are a number of chrome extensions out there that let you run chat dbt alongside of google so that every Google search that you do automatically runs both. And you can sort of see if that would be more useful already.

Starting point is 00:37:30 I think that is one element that people don't appreciate now that you know, UAI, Neva, Neva, all these sort of alternative search engines have already released AI search models and they're not that good. Like I think to some extent people are hungry for a real credible threat to Google because it's essentially been on challenge for 20 years. This is part of my point, which is that Google has gotten very bad in many respects. Like, for example, I'm planning a trip to Greece in March. And I'm very, very scared, literally, of typing anything into Google, not only because the ads will then follow me around everywhere, but because everything has been so optimized to try to convert that they are lowering the cost of dollars that they're spending on their service, whether it's a hotel or whether it's a car, whatever it is, in order to just get the click from Google.

Starting point is 00:38:22 to then convert me into a customer. And I would much rather the incentives be realligned so that really the best service in the world is the one that is promoted to me, if not promoted, like shown to me. And I feel like Google has lost the authority around that dimension. Now, for comparison, I've gone to chat GPT

Starting point is 00:38:42 and I've asked it to plan an itinerary for me increase and to give me different ideas for where to go for restaurants. Now, even if the data is a little bit stale, it's still presented in a format that is not offensive to my nervous system and allows me to just read through the information, consider it, and then decide for myself how I want to go forward, as opposed to seeing sort of a wall of, I mean, it's like walking into like Times Square

Starting point is 00:39:06 and trying to sort of like, you know, decide like where do I want to like sleep at some point? Like everything is blaring at me. So I do think that, you know, you're right. And that this moment in time kind of allows us to reconceive of what the purpose of search on the internet is supposed to be like and how it's supposed to feel, and how a conversational interface or paradigm allows us to have an interface that isn't a set of rectangles that, again, are trying to manipulate us. Instead, we can have kind of a more meandering conversation that allows us to explore the possibility

Starting point is 00:39:36 space as opposed to being directed as fast as possible to whoever paid the most. Well, that's true, but also what you saying that reminded me of something that other people have said is how much we've been trained by Google over the last 25 years to be like, okay, If you remember pre-Google, it was like, okay, I ask a question and I have to search through five pages to get something even relevant, or much less than the answer to my question. Google's gotten good. But we're still already so trained that like maybe the first page is it, or I know that the first result, I have to scroll through it and read to get my answer. You know what I mean? I do.

Starting point is 00:40:14 That's insane. Yeah, have already been trained in this. But it's like we've been trained in this like really kind of awkward dance for like, looking like opening multiple tabs, right, a first search results, kind of going through them, feeling, it's almost like, you know, spending three hours on TikTok, which fortunately you haven't done. But like at the end of it, you kind of feel dissatisfied. And like, you kind of ended up with a bunch of things. You're not quite sure if they were like the best result. And then, you know, it's sort of like imagining that meme where it's like the hot girlfriend or whatever and he's

Starting point is 00:40:42 like looking back and seeing the other one. It's sort of like suddenly chat GPT comes out and you're like, oh my God, that is like amazing. Like, why isn't it more like that? And even if to Sean's point, Like, you know, it kind of sucks. And like Neva AI and U.com or like whatever kind of like the early examples of this aren't really that good. Nonetheless, it feels like because of the abuse that we've suffered from Google, having not been checked, like seeing this other thing, which is, you know, not that good and is prone to hallucination and is prone to like brosplaining and like all sorts of other, you know,

Starting point is 00:41:11 things that you don't want in a search experience. It's so much cleaner and clearer. And it just is like kind of like a straightforward like, all right, this is a bot kind of like telling me some things and it kind of summarizes the web. but it's doing it in a way where the underlying motivation behind each thing that is being shown to me isn't someone else kind of like, you know, this puppet master pulling the strings in a way that I don't understand as the person who's performed the search. So we're talking about search and answering questions or like, again, creating content by asking questions or whatever. But, Sean, like, is what we're describing, and again, using the analogy of clay and oven myths, like, what about the other uses like, for writing code and things like that,

Starting point is 00:41:52 is it still sort of like that you got to beat the clay against the wall a couple times? Other use cases. What are people seeing with other use cases? Do we know? Oh, it is so good at code. Which one is so good at code?

Starting point is 00:42:07 Higher GBT variance of GVT family of models. It's really good. It's not perfect. There are a lot of bugs with the code that it often generates. But in my world, all the developers are just absolutely floored by how good it is. Everything from converting, you know, PHP to JavaScript or writing YAMO configurations and debugging, like, you know, AWS or Kubernetes configurations.

Starting point is 00:42:43 It just knows so much because there's so much code in the training data. And it's kind of funny because, you know, the first thing that you do as a programmer at Open AI is probably to try to train it on some code so that it helps you write more code. And so I think the code elements is definitely one of the more outstanding pieces. One of my pin tweets on the Jumbotron here is the second one where I was observing, you know, I think what we're seeing and bumping up against is it's still very early days for this chat. system and yeah, like it's it's a much better user experience, as Chris pointed out. But it's not super reliable, right? Like you actually, with all the confidence that you, you ascribe to the output, you actually don't know if it just made up something.

Starting point is 00:43:31 Therefore, it's fine if you can research it subsequently, but it gets very sketchy if you rely on it for something that you have no knowledge, domain knowledge about, which is something that we talked about earlier. Right. which is a bit almost like again if you don't if you don't know how 2001 was made you can't get the the the bot to tell you the facts that you know it should be telling you right yeah yeah but you know honestly like maybe it's just not there yet but i mean it's such early days like what we're just seeing is you know if you could just zoom out and and add five years to this

Starting point is 00:44:10 it could be it could very well get there right like so i want to say i kind of feel like people criticizing this stuff based on its failures are being very selective and not looking at the broader trend of the exponential progress is being made. Just wait a few years and whatever you're worried about might get solved. I'm not saying it's a guarantee. I will say I'll call out the code use case and then the transformation use case where you can internally check the validity of whatever it gives as output. You're not so much using it as search as just a different. general assistant that is currently you know i think people have been iq level they've been doing the bunch of yeah i saw that level the

Starting point is 00:44:55 cp level you know it's it's maybe like a middle schooler right but or high schooler um the intelligence is is growing every single year um so uh it's it's super impressive in that way and if you don't use it for things that it's not good at then then probably you'll have a better time of it you know one thing that i just want to like add to this that i think is interesting. I'm coming from the product lens because one of the questions that Brian has is, is it ready for prime time? I've been talking to a lot of makers and founders, of course, helping them launch on product time. And many are still, I would say, kind of in a world where they're behaving as though these technologies don't exist. And it's very hard for me to not

Starting point is 00:45:35 push back and kind of ask them, like what the GPT or kind of AI assistant angle is for their product. In other words, there are people who still want to launch, you know, website builders that are all manual. And especially if you're kind of like mocking a page with, like mocking up a page with titles and headlines and like body copy and testimonials, there really is no reason for you not to take advantage of these tools to enhance or enrich the examples and the samples to make it just more believable and more seemingly authentic. Now, that creates a different set of kind of, I think, concerns and considerations. But in terms of a tool, I guess what I'm going to be watching for this year is the degree to which there are still, you know, kind of, it's almost like, you know,

Starting point is 00:46:20 electric cars versus like, you know, conventional, you know, gas powered engines. Like, who is, is moving over fully to embrace these things with an assumption that this is the future and that this is how you build tool, software and technology going forward versus who are those who either are resistant? They don't want to learn it. They, you know, just don't see. the use case or they're like, yeah, whatever, it's a toy. And so therefore I'm not going to embrace it. And I think that that divide, like if you don't start learning now. A real generation gap.

Starting point is 00:46:47 Absolutely. Yeah, it's sort of like saying, well, I'm not going to build for the iPhone in 2007. Yeah. Right? And then suddenly, you know, everyone is like lapped you and, you know, you're really, really stuck. I mean, Chris, I know, I know folks old enough that they didn't want to develop for the web because, I mean, I'm sure there are people that are still alive that would only develop for mainframes or, you know. All the FAA. engineers? Oh, apparently. Let me, let me, let me do one more. This is my last one for my

Starting point is 00:47:15 experiments, but I, because I want to do this because it'll come into your concerns. Yes, okay. So I did, as people heard and everyone hated, I used three different tools. I was going to ask you about that. Oh, people hated it. Yeah. But actually, we'll get to that. We'll get to that because this is interesting. Yes. So, whatever show it was, Tuesday or Wednesday, I, you know, did a couple segments with different voices and I use different tools. And so this is different, it wasn't chat GPT. It was I took my script, put it into text to, uh, translate or text to audio or whatever, whatever it's called. Um, and people hated it, but it came out the same friggin day that Microsoft announced, well, no, because they announced it the week before it just hit the news that day.

Starting point is 00:48:00 Volley, yeah. Um, and, uh, number one, I really want people, Microsoft again, if you're listening, let me work with you on this because I could see already how, and I'm sure people have already tooled this out, like, just allow me to add italics or caps

Starting point is 00:48:22 or, like the ability to tweak the emotion behind the voice. Because if you listen to the... You're talking about intonation. Like, because one of the things that you mentioned, okay, let me clear the stuff for anybody who's listening. Like, essentially the experiment was to create voices that would,

Starting point is 00:48:37 essentially take Brian's script for the TechMeme right home show and would speak it either in a voice that was sort of like a quoted voice. And so it's sort of like you'd be like, oh. And so, you know, Sundar said blah. And then Blah would be said by an AI voice. In other cases, I believe you had entire stories or segments. Segments. Yeah. That were done in the AI voice.

Starting point is 00:48:58 And what would happen is sometimes, you know, you would have like dollar amounts and those would not be spoken correctly. In other cases, you would have names and they would not be spoken correctly. Or there wouldn't have the. motion of your attention. That's it. That's it. Because now if you listen to those two YouTube videos, as a as a sort of narrator, and we had the stories this week or last week about how audiobooks are moving to AI, right? As a narrator in those YouTube videos, I thought it was so good that I was like, okay, let me try this on the show. Well, on the show, it didn't work, not just because, you know, all of the gobbledy gook that I have to say about different, you know, languages. technical terms. Yeah, yeah. But also because, and I've said this a million times, like, one of the things I learned to do the show very early on is if you don't perform it, it sounds like I'm reading to you, right?

Starting point is 00:49:46 That's right. And everyone that got in touch with me and was like, I hated that, I turned it off. They're like, it put me to sleep, which I could hear that too. Because versus a narration of a YouTube video. Yeah, look, I think what you're saying is this. Like, essentially, I totally agree because I listen to those episodes and I was like, ooh, this is like, it's better than it's been. but it's still shit.

Starting point is 00:50:06 Yeah. What you're saying, and I think this is so interesting and so important in why creativity really needs to be an element that's, you know, preserved in this move is because if you're just reading it, just functionally, right? Like, if Siri reads an email to you.

Starting point is 00:50:22 If I had a Microsoft word like ribbon at the top where I could tweak things, like it would take a while to learn it and to tweak this. But again, add italics, put this in bold, make exclamation points mean something. Oh my God. The thing that you're saying is so interesting, right? Because you're almost like saying it's like, it's like, it's creating music notation, but for the spoken word.

Starting point is 00:50:44 And you imagine being able to conduct. And I'm sure some of this already exists because you have voice actors. And voice actors know, right, that there's some element and there's all these different, I don't know, the type of graphics of, you know, voice performance. And I know that voice technology actually does have some of this. But if you're saying to make this available, let's say, like in Descript, right, a tool that allows you to write the text and then perform it orally and then to add the emphasis, right? We use italics and bold in text to offer some of that, but there might be a whole new set of phonetics that needs to be developed for people to express that stuff. The music or the composer analogy is very apt. And what I want to report to you is I could see just by doing that experiment this week, that we're

Starting point is 00:51:31 we're almost there. We're 95% there. Now, do people want to do that? Well, I would because I have to edit an hour and a half every day. The audio after, it takes me a half an hour to record it, and I have an hour and a half to edit it. If I could get very adept very quickly, if you gave me the tools, if you could train the audio on my voice, and then, you know, like even, forget about italics and bowling. Like, what if, again, like, to use the, the music analogy, What if I could move a bar up and down like a wave form? Right. Or just like add emphasis, right?

Starting point is 00:52:05 To add emphasis, exactly. So what I want to report to you is we're almost there. Right. And this is how I'm going to bring it to your concerns. Okay, Brian, don't be lazy. Part of your job is to do the two hours of production after the writing that you have to do every day. But here's what, this is legitimately one of the things that I was thinking about when I was doing this. I haven't taken a vacation where I still.

Starting point is 00:52:29 stopped doing the show for a week since the show began. I've taken days off. I've had a guest host, but I haven't had a guest host since COVID. And we want to go to Ireland this spring. And I was thinking, well, what if, what if one of the voices was good enough that I could just take my computer, write the script, feed it into the thing, and people might hate it. But people hate it when I have a guest host.

Starting point is 00:52:56 Glenn Fleischman, God love him. Every time I had him fill in, people are like, well, and I want to say, you know, one of the suggestions I would make is you all are trained on my cadences. So anything is jarring to you. So you might possibly have the highest standards to meet, which are almost impossible, right? I feel like there's a bit of go post moving here. Like, not that long ago, text to speech wasn't that good. And now it's people are replicating like Joe Rogan voices. Right, right. You have descript. You know, you and I chatted before you did that episode. and you were like, these scripts is too slow, right? But like, it has overdub, and it could have done twice, but you didn't do it.

Starting point is 00:53:34 Well, okay, so think about the morality of this, and then this is my putting a bow on this part of it. What if I had felt like, no, that's good enough. And instead of hiring Glenn to do the week when I'm in Ireland, because I'll be up front with that. It's at least $3,000 that I have to pay somebody

Starting point is 00:53:55 to take over my show for a week. at least, and that's on the low end. So right there, you have a job taken away from a freelancer. For something, a service that I would pay at max $50 a month for. Yeah. And you're the prime market. There is no better market than you. Brian, I think Sean is right.

Starting point is 00:54:24 Like, not only is this a little bit about moving the goalposts and thinking about what is, because I understand you're taking the perspective of the creator. And I'm totally with you. I'm kind of going to that path. However, I guess like what I'm not so sure about is if this is the right use case for these types of technologies because of what I was saying about Google when it comes to trust. One of the reasons why people reject Glenn, even though he's great, or, you know, your synthetic media experiments is because they trust you.

Starting point is 00:54:53 They want to hear from you. Like there is a human relationship that has been built up. It may be parasympathetic that has evolved and expanded. Hold on over time. And so we're replacing you actually, like even if it's with a synthetic voice, lowers the ease with which someone can consume the content from me because they know that you actually produced it. And so if you use a computer to speak even the things that you have written,

Starting point is 00:55:18 that is cheapening the relationship in a way that actually undermines what the listener is expecting and wanting from you. Well, let me poke at that. Sure. I agree with you. I listened back and it made me fall asleep and I wrote the shit. So it already makes me fall asleep because I did it. But I could hear that, again, I perform things.

Starting point is 00:55:40 I write it to sound interesting. I perform it to sound interesting, not like I'm reading off the script. And it's not at that level. But what if it was? And what if I could do the things like, you know, you move the... Sorry, no, no, but let me finish my thought, which, and I agree with you, right? So if I expand or extend your thought to its conclusion, which is it gets good enough that we could allow you to go on your week-long vacation. If you knew that I wrote the script, if you knew that they were my thoughts and words, would you care?

Starting point is 00:56:10 So two things. One, if you set the expectation ahead of time, there still is going to be plenty of people who won maybe miss. Like, I feel like you'd have to be for a full week. Next week is the AI. The AI is going to be speaking to you. And so just deal with it. I'm still writing it. But like, I don't have the time to do the production.

Starting point is 00:56:25 Maybe you could get away with it. and, you know, people would forgive you and sort of be like, yes, Brian needs a vacation. And so we're willing to entertain this AI voice, you know, which has good cadence and kind of sounds like him for some limited period of time. What I'm also saying, though, is that the more, and actually think about this specifically for a number of things that you mention it is because the use case for this type of technology in the short term. And I mean, you know, this year is going to be to synthesize narration for tens of thousands, if not millions of written books that have never. an audio format before. And I'm saying that for a number of reasons. One is because we did just see Apple come out with this, which is called digital narration. And so obviously their text to speech technology has gotten better. Google uses this for their assistant for reading the news. So when

Starting point is 00:57:12 you talk to Google in the morning, you say, hey, Google, you know, good morning. It'll essentially respond and synthesize a set of news articles that were written in text. Which is what I do. Exactly. Right. So there's that. Then there's, I'm seeing Spotify doing the same. same thing. Now, obviously, Spotify is deeply in the audiobook space now, but now they're taking, actually I just discovered this today, they have a feature called Read Aloud, which is translating, I believe, written text and then converting it into an audio, like a synthetic audio form, so that they can reach the global market faster than, let's say, Amazon. And Amazon currently has the human narration and the Kindle book. So you buy both as two different purchases.

Starting point is 00:57:56 I do that all the time. Right. Spotify, I think Spotify wants to merge that so you have one license to the book in audio form and written form, and then that's how they will compete. Anyways, point is, what I'm saying is that these technologies are much better for vast troves of previously non-audio content. And as Sean pointed out in one of his pin tweets, to convert from X to Y, because it's just content and that the media of the content no longer is the barrier to moving it from one form to another and people are willing to consume less good versions if otherwise the cost would be

Starting point is 00:58:35 infinite because the thing didn't exist. In your case, the cost to the listener is infinite because they can get the real thing, which is you and your real voice and all the trust they've built up over time. So that's the way to differentiate this. What has trust and what doesn't require trust? and that is a better way I think to think about where synthetic media is going to have the most impact. Let me take this back to the experiment of, oh, what if I could do a six-episode podcast series about the making of 2001 of Space Odyssey? Let me name the book I'm reading right now. Space Odyssey, Stanley Kubrick, Arthur C. Clark, and the Making of a Masterpiece written by Michael Benson. I'm reading it right now.

Starting point is 00:59:13 If I read that book and I kept notes of, okay, prompts that I could put into. Chat GPT. And what that would do for me is essentially be able to do a six-episode podcast series that I only could do because I was reading the book in real time, but I'm not breaking the copyright of Michael Benson, right? Now, I wrote a book, and in a sense, being a writer in that not writing a novel, but writing non-fiction, that is kind of what you're doing. you read a bunch of inputs and then you put them into, you synthesize it and you put it into your own words. The only thing that's different is that they're not my words. They're chat GPT's words. And also, all of the things, well, not all the things. Michael Benson, I'm sure you did

Starting point is 01:00:08 tons and tons of research with the Kubrick. But a lot of the things that you could get came from the internet, which is what chat GPT was trained on. So already the copyright has been stolen. Look, so here's just a concept that I think, you know, we can bring this to a close. Like, more and more things are becoming like songs and music. You know, notes exist. You know, they're sort of like, it's like, this is going to sound so dumb to a lot of people, but like it's like physics. You know, there are molecules. There are, you know, there's the periodic table of elements, just like there are musical notes. And the combination of those notes over time, over duration, over all the different tools that you have to make music, allows us to have to have. It allows us to have.

Starting point is 01:00:50 have an infinite amount of expression. Now, what you're describing, and I think what we're talking about, is taking the thoughts in your mind and the specific and unique synthesis that you're able to provide over a corpus of information, you know, whether it's about Space Odyssey 2001 or whatever, and that the way that you emit that information, if you are collaborating with an AI to generate the sound, it's like a synthesizer, but it's like a synthesizer for words. And we've had since, you know, synthetic music since the 70s, the 80s, now. And those arguments happened when those instruments.

Starting point is 01:01:25 Yes, exactly. Yes. So we're starting to apply what used to be for more, almost, how to put it, like almost like mathematical signals, which is just music, now into like longer form thoughts and structures. And it feels like it'll, I mean, it is going to be disruptive to knowledge and the way that we consume knowledge and the way that we produce knowledge. But it also, and I think this is the thing that's the optimistic way, you know, to end this, is that it'll produce new forms of, and new types of.

Starting point is 01:01:50 constructs. Like, in other words, the act of you producing the show doesn't have to be the way, or the Ta-Mey Moran Hobe doesn't have to be the same way that it's always been. Just like writing a book and the act of writing a book doesn't have to be the way that it has always been. I mean, whether it's a ghost writer that is a human or a ghost writer that is an AI, the point is that we are synthesizing observations and awareness and knowledge and experience into structures and forms that other people can then consume to then share in our experience or the way that we see the world or the things that we know that other people don't know, and about propagating that information as far and wide as possible, while ideally having some, I don't know, controls or

Starting point is 01:02:29 structures or moderation on the veracity of that information and the applicability of that information. So I suppose, like, I think it's fascinating one to sort of be in this moment of time to be thinking through some of these questions, to be taking the approach that I think you're taking from it, which is like this direct question, I'm a creator, how do I use these things, to thinking a little bit differently about what it is that you actually do, right? What is the service that Brian provides the product that Brian provides? It is actually your perspective. It is actually the way that it is the zip file that Brian produces every day of all the news that is out there. You are the receiver of tens of thousands of pieces of information and you can compress it and

Starting point is 01:03:06 condense it in a way that then, you know, you write the script and it helps us to then, you do the predigestion. You're like one of those like little bird mommies, you know, that sort of like, you know, I spit it back into your mouth. That's a great thing. It's going right into my brain. And like, at least for me, obviously, and I think this is true for Sean, you know, we listen because of that service that you provide. And so by interrupting it with, I mean, the conveyance of the information actually is one of the ways in which you convey the trustworthiness of the information.

Starting point is 01:03:34 It's like the signature, right? It's like the, you know, the sort of, what is it, the public key encryption that allows us to trust the information coming out of your podcast is authentic. Yes. And so you're talking about fucking with that. And I think that's why this is like, hard to have this conversation because you need to be very careful to be like, by the way, next week the public key is not going to match, but it's still coming from me and then it'll

Starting point is 01:03:55 be back the following week. Well, listen, I will be very jealous of my franchise. It's not like I'm abandoning how I do things tomorrow. But, and by the way, you know, if listeners were wondering why I was doing this, if Chris you were wondering why I thought this would be good fodder for that. It's because of this. It's because I wanted... It was a practice for your vacation. I get it. I wanted to see, number one, how far the tools were along could you build a business around these things, you know, for my investing and things like that?

Starting point is 01:04:27 And number two, are there easier ways for me to do my job? But number three, these philosophical questions, because, and I'll end with this and then we'll let Sean have final thoughts as well. You just said that the public key or the value is... is how I regurgitate. And what I felt philosophically running these experiments was, yes, I'm the only one that will regurgitate tomorrow's news for you the way that my brain would do it. But you now have the ability to spin that dial 10,000 times with zero marginal costs

Starting point is 01:05:11 and zero additional effort. So that whether you need me to do it or not is not actually what's interesting to me. What's interesting to me is as the creator, if it's just a dial that I can spin, I can only do it one way for you tomorrow, which is the way that all of my inputs and efforts will result in tomorrow. But I'm seeing tools now for creating and developing where me as the creator, what my role becomes is, you know what? That 30 minutes of audio I don't like, spin the dial.

Starting point is 01:05:42 that 30 minutes of audio I don't like spin the dial and I already do that on a functional level where it's like I don't like that take edit let's do it again but now it's like that in the same way that when I went to film school in 1996 I still learn how to cut film manually with acetate and and you know you know what you know what sucks sorry like as you're saying this the thing that sucks is that actually you didn't capture and store all of the cuts that you didn't keep because that negative training data could have actually been super valuable to get to this gold standard. I thought of doing that. I thought of doing that. Yeah. So to end my point is philosophically, what this taught me this week is that again, and we've talked about this in other various contexts, you're still going to need sort of the priesthood and the conductor that can make the AI work and maybe that will become a skill that is valuable. but I'm seeing already the ways that it's just going to be spinning dials and moving levers. And you just do the input and then it's just how you, it's like cooking.

Starting point is 01:06:52 It's how much spice you want to add or whatever. Yeah. Totally. Sean? There's so many things to respond there. I would say I think that when people talk about problems engineering, it is a temporary fix. And the people who are building businesses are going to have to go into fine-tune and building proprietary models.

Starting point is 01:07:12 And you are already seeing examples of this from people trying to build businesses on top of it. I can give you a number of them, but Peter Levels is... Go ahead and name a few, sure. Yeah, well, okay, fine. So I would say probably the first most interesting example because, you know, I think you can come back to this question

Starting point is 01:07:32 of like, is it ready for prime time? And, you know, my question back to you would be, well, how much money is prime time? because it's $80 million in prime time in two years. And Jasper AI got that, right? Like zero to $0 to $80 million in two years is pretty impressive. And I mean, that sounds like prime time to me. And similarly for, you know, some of their competitors that copy AI and headline,

Starting point is 01:08:00 all of them reach extraordinary profits or revenues in a very short amount of time. But then also they seem very fattish. We don't have public numbers for the tax generation companies, but we do have public numbers for the image generation companies. And virtually all of them that went in for the initial gold rush have seen slumps of revenue after the initial hype. So Lensa AI is the primary company that won the sort of AI generated profile pick thing. And their revenues are public essentially on app store rankings. And Chris, have you seen that chart where their revenue spike to like a million dollars a day? Yes. Yes.

Starting point is 01:08:48 Yes. Pretty, pretty shocking because you could spend a few tens of millions of dollars, you know, building this stuff up and you have no idea how long these things will last. But also, you know, I'll point out some of the other players in this space. Peter Levels was, I think, first to market, but then the lenser guys just out-competed him on price and distribution. There was like four or five or, I mean, there was probably like hundreds of these profile photo generators that came out.

Starting point is 01:09:19 And, you know, I saw, I want to say, like 10 to 15 of them on product hunt. And it was crazy, like how they were all charging. I mean, I think, Brian, you had a story about how one of the top grossing app store apps was for this app called like chat, GPT, GPT slash chat with GPD. I didn't get to do. There's a, there's a ton of them. Yeah, exactly. They're flooding, like the app store because, again, there's all these kind of opportunists that are just throwing up these kind of rappers on top of, you know, what is essentially a webpage and then charging a subscription, which is, you know, horrible in anyway. So, so, but in terms of the,

Starting point is 01:09:56 what you're saying, Sean, I totally have seen that the same phenomenon. And it, it does raise a question as to whether or not those prompts, which are kind of like game genie codes, to date myself. I got the reference. The speed to cringe seems really, really fast these days. And I don't know if it's because of TikTok. Like essentially everyone had like really cool avatars. And then they were like, oh, now everyone can have them so they're not actually that unique anymore, you know? Exactly. And couple of that, right, so that's a top line. That's the revenue side. And then on the bottom line, you know, just to bring this back a little bit to the discussion with Brayden from voice flow that we started this conversation with, it's not just data that you have to accumulate. It's also compute infrastructure.

Starting point is 01:10:37 Before this call, I was actually looking at a paper from OpenEI. So they've actually published a little bit of information about their infrastructure for training GPT3. And it is incredible. So the cluster needed for Open AI was the top five super computer in the world. It's 285,000 CPU cores, 10,000 GP. And it costs some people estimating about $100 million to build. So thought data. It feels so hardware.

Starting point is 01:11:06 Wow. So this is a whole other thing in it. We got to end. But there has been this week I've seen some sort of pushback where it's like, oh, you think crypto is bad for the environment in terms of the energy it uses. Wait until you find out what this sort of AI stuff. And we've already heard Sam Altman say that it would make your eyes bleed to see the cost just for one prompt and one question on GDPT.

Starting point is 01:11:31 But yeah, I'll bring it home with this comment. Okay. Yeah. AI versus crypto thing. There's a big difference between training and inference. And inferences is orders. So crypto is, you know, essentially burn energy for every transaction, whereas AI is pretty much trained ones use infinitely.

Starting point is 01:11:56 Right. So there is a sort of fixed cost versus variable cost distinction. Oh, that's interesting. Can you continue to evolve a model once it's trained? Is that how that works? Or is it sort of like once and one and done? Right now, it is surprisingly primitive. It is one and done.

Starting point is 01:12:10 You can evolve as it trains. So when GPT4 is coming out any day now, like literally it's like we just throw that out and this is the new. It's almost like it's almost like getting old versions of Windows or something. Like now we've got Windows 95. This is the one we're using. Oh my God. GPD 95 will be nuts. I mean, Clippy is definitely going to make a resurgence.

Starting point is 01:12:33 Okay. All right. Listen, we got to get out of here. Sean, please promote anything. I know you have a newsletter. This is, you know, as we've referenced. Sure. You know, I was just honestly happy to be on because I'm a fan of the show and, you know,

Starting point is 01:12:46 want to give back. My newsletter for tracking AI stuff is LSpace Diaries is on substack. Lspace.6.com. com is the URL. You can follow me on F6 at Twitter on or Mastodon. because and by the way, all the A.m. All the A.m.s are in my dot social.

Starting point is 01:13:04 Let me be more clear about that. It's S.W.Y.X. As I said, you're pronouncing it. Yeah. SW. Yax. Yes. The four, four character thing. Yeah, you're the expert at the audio medium. Yes. Sox is my English and Chinese initials.

Starting point is 01:13:19 But thanks for having. Oh, thanks for coming on. I referenced the comedy show in San Francisco. It is Saturday the 28th. at 9.30 p.m. Yesterday's show has a link to the tickets. However, so that's at 930, January 28th, Saturday. What Chris and I thought we would do is, let's call it 7, Chris, if that works. The detour is apparently a bar that is directly across the street. Maybe we should reach out to the ownership or something and make sure that they're going to be open that day or something. But that's the plan as of right now is that at 7 p.m. That day, which is again, January 28th, we would have any, any and all listeners, and I'll promote this again. Even if you don't buy tickets to the comedy show, which, you know, who knows, could be terrible. But we're just going to do a really impromptu sort of cash.

Starting point is 01:14:19 Well, I mean, just having you, you know, out to San Francisco, you know, a good, like a whole change. Oh, people are going crazy. I'm going to have lunch with Sonal. there's a there's all sorts of things happening when I've been there I've been there three years since before COVID so pick I know podcast of the minotti yeah yeah yeah exactly well actually Sown and I need to talk about

Starting point is 01:14:44 podcasting stuff because we had a call this week and we were comparing notes on things and we're because you guys have heard that there's no ads on the show and whatever so we're going to put our heads together on that so exciting yeah all right all right well another great episode. Thank you for kicking off 2023. Obviously, like, we're just getting started, and this year is going to be all about this stuff. So, you know, if you guys have thoughts or feedback, please do hit us up and obviously come see us on the 28th. I love Sean for coming on. I love

Starting point is 01:15:15 Chris. I love everyone. All right. Thanks, everybody. We'll talk you next time. Bye. Bye.

Tech Brew Ride Home - (TWTR SPC) The Big AI Discussion

Our big AI discussion with @ReamBraden and @swyx. Shawn just posted this new essay drawing on what we discussed here: Every Google vs OpenAI Argument, Dissected Learn more about your ad cho...ices. Visit megaphone.fm/adchoices

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.