Latent Space: The AI Engineer Podcast - Heralds of the AI Content Flippening — with Youssef Rizk of Wondercraft.ai

Episode Date: September 20, 2023

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We’re collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)!

In March, we started off our GPT4 coverage framing one of this year’s key forks in the road as the “Year of Multimodal vs Multimodel AI”. 6 months in, neither has panned out yet. The vast majority of LLM usage still defaults to chatbots built atop OpenAI (per our LangSmith discussion), and rumored GPU shortages have prevented the broader rollout of GPT-4 Vision. Most “AI media” demos like AI Drake and AI South Park turned out heavily human engineered, to the point where the AI label is more marketing than honest reflection of value contributed.

However, the biggest impact of multimodal AI in our lives this year has been a relatively simple product: the daily HN Recap podcast produced by Wondercraft.ai, a 5 month old AI podcasting startup. As swyx observed, the “content flippening”, an event horizon when the majority of content you choose to consume is primarily AI generated/augmented rather than primarily human/manually produced, has now gone from unthinkable to possible.

For full show notes, go to: https://latent.space/p/wondercraft

Timestamps
* [00:03:15] What is Wondercraft?
* [00:08:22] Features of Wondercraft
* [00:10:42] Types of Podcasts
* [00:11:44] The Importance of Consistency
* [00:14:01] Wondercraft House Podcasts
* [00:19:27] Video Translation and Dubbing
* [00:21:49] Building Wondercraft in 1 Day
* [00:24:25] What is your moat?
* [00:30:37] Audio Generation stack
* [00:32:12] How Important is it to Sound Human? and AI Uncanny Valley
* [00:36:02] AI Watermarking
* [00:36:32] The Text to Speech Industry
* [00:41:19] Voice Synthesis Research
* [00:45:53] AI Podcaster interviews Human Podcaster
* [00:50:38] Takeaway

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe

Transcript
Starting point is 00:00:05 Welcome to the Latent Space podcast, where we dive into the wild, wild world of AI engineering every week. This is Anna, your friendly neighborhood AI, and I'll be standing in for Alessio today. Yes, you heard right. AI is taking our podcasting jobs. We flew all the way to London to interview Youssef Rizk, co-founder of Wondercraft AI, which has created the number one piece of AI-generated content enjoyed by the Latent Space community. We asked him how he arrived at his idea, what the future of commercial AI-generated content looks like, and confront him with the hardest question of all.
Starting point is 00:00:38 What is his moat as an API wrapper startup? At the end, we even have him turn the tables and do a customer interview with swyx. There are lots of audio goodies in this one and a bonus 30 minutes of video on YouTube.com slash Latent Space TV. Watch out and take care. So we're in the studio here in London with Youssef.
Starting point is 00:00:57 Welcome. Thank you. It's been such a joy listening to Wondercraft's podcast over the last four or five months. You guys have been around for only five months. Yeah. And as you know, I'm one of your podcast's biggest fans. And I think that it's super interesting because I talk to a lot of vendors, effectively people who create services for other developers to build.
Starting point is 00:01:21 And you are at the application layer, which is great and challenging for me as a podcaster because you have some secret sauce that you're not going to share. But I also want to just talk to you as someone who's evaluated a lot of things and built something that I actually use every single day. So that's the context. How do you feel when I say these things? Like is that exactly what you're going for? Yeah. Yeah. Yeah. So it definitely makes sense, right? It resonates. We're definitely on the application layer, and that's definitely by design. Yeah. Yeah. We can talk about the origin story leading into Wondercraft, but just to learn a little bit more about you. You grew up in Egypt? I grew up in Egypt, yeah. I spent the first 18 years there. Cairo. And then you came over to the UK. You got your master's in Triple E at Imperial. For those who don't know,
Starting point is 00:02:05 that's electrical and electronic engineering. You then spent four years at Palantir as a forward-deployed engineer. I think it's a role that Palantir invented. Forward-deployed engineer is a super interesting job because it is kind of at this intersection of being an engineer, so software engineer, but also still doing like business-related things. Yeah, solutions architect, maybe. Yeah, so part of the job was a solutions architect.
Starting point is 00:02:27 Part of the job was reviewing contracts. Part of the job was doing sales. Part of the job was coding things. Part of the job was interacting with. Right? So many different things, and I think that is a really good foundation for someone who does want to start something in the future.
Starting point is 00:02:39 Excellent. You just do everything. So kind of an endorsement of that job if people want to... Huge endorsement of that job. Okay, excellent. Amazing. Yeah, Palantir is surprisingly strong here in the London tech circles. I have a number of friends who are all ex-Palantir.
Starting point is 00:02:51 I think its biggest office is actually in London. Surprisingly, because I think of it as like a U.S. defense company. Yeah, yeah, yeah. Then you started Moonshot for nine months, which is pretty important in your journey. I'll bring it up to Wondercraft, and you started Wondercraft in April this year, and it's been about five months, going through YC in the winter batch. Summer, the summer of '22 batch.
Starting point is 00:03:14 Okay, cool. What is Wondercraft? Nice. Wondercraft's a podcast builder that uses hyper-realistic AI voices to create podcasts and make that whole podcast creation process super simple, right? So super simple example is you can, you know, you publish a bunch of blogs. You can take that blog, put it in there, it'll just convert it to an audio-friendly format that people can listen to.
Starting point is 00:03:32 things rather than read them. What it strives to be is slightly different, because what it really strives to be is this platform with the mission of expanding access to content. Okay. And I mean this in a variety of different ways, right? Some people just are able to consume content. You know, we have this whole debate in education. It's like, are you a visual learner or an audio learner? What do you do? People just consume content better in different ways. I'm a visual learner. I need to see things. So for me, actually, sometimes it's a little better to read the blog.
Starting point is 00:04:04 But if we're just talking about, like, I want to get a lot of information. Podcasts are great because you can just do them while doing something else. There's a reason that that podcast functionality is so natively embedded in all these smart speakers. It's just because like you're doing anything at home,
Starting point is 00:04:19 just put on a podcast. So really what we're trying to do is, podcast is the first instantiation of that, which is like how do we expand access to content? Yeah. But it expands so much more, right? You know, instead of just going to, I don't know, we talked about this like blog to podcast, you can go blog to video, you can go podcast to blog, you can go podcast to Twitter thread.
Starting point is 00:04:41 Like the permutations are, frankly, endless, basically depends on how many platforms there are that people consume things on. But that's essentially what we do. The use cases for this are pretty interesting. The one that we just see immediate value in is this ability to translate the content that you already have into other forms of content. If we just take that blog post example again, right, you've written, so, you know, a lot of companies might have this content team that focuses a lot on producing quality blog posts. Blog posts, you know, they're good for SEO and whatnot, but sometimes, you know, they don't really achieve a specific goal or outcome that you want.
Starting point is 00:05:17 One thing we see that is really useful for podcasts is they actually carry a lot more weight in credentializing you as a thought leader or your company as a thought leader. So, but, like, you know, we spent the last 50 minutes trying to set up this room to record the podcast, right? So it's not easy, and it's a very synchronous process, right? Me and you have to find the time to go and sit here. Yep. And record this. You have to come up with questions, I have to come up with answers. Yeah. Right? But this ability to actually just take the content that you have and transform it, it's pretty powerful. You know, and there's a lot of other use cases as well, which is just like, podcasts, really all they are is, like, the final podcast, right? Like the line between or the difference between an audio book and a podcast,
Starting point is 00:05:58 that's just the format and the length. It's an MP3 on an RSS feed. It's an MP3 with someone or something speaking. Yes. Right? I've actually played around a lot with this stuff, by the way. So I've done music-only podcasts. You just listen.
Starting point is 00:06:11 Tiesto has been podcasting for 15 years every single week just DJing from his house. It's just basically a radio show. It's great. It's just radio. Async radio. Yeah. Yeah.
Starting point is 00:06:40 So it's super interesting. But podcasts, like, okay, ignore the word podcast and just think of what we do, which is like we help you create audio content. Super valuable for anyone who just needs that. If you can imagine a world in which like, I don't know what to call them, like Calm or Headspace or any of these things. Hi, and welcome to day one of Take 10. Over the next 10 days, I'm going to be showing you how to get a little bit more headspace in your life.
Starting point is 00:07:02 With the starting point, you just to get familiar with this really simple and easy to learn exercise and then just commit to doing it each day. Remember, this is your 10 minutes. So all you have to do is sit back, relax, and allow your body and mind to unwind. To begin with, once you're sitting comfortably, I'd like you just to gently close your eyes. They can do a lot of their meditation like that super quickly.
Starting point is 00:07:27 What you can get to is a point where you're doing like these super personalized things. Yes. Because you just have the ability to scale the content production so quickly. Same with educators. I think there's actually, at this point, there's a few YouTube channels at this point that are all based on synthetic voices. Yeah. That produce a ton of educational content.
Starting point is 00:07:42 The problem with podcasts is, podcasts just have a slow adoption rate. Yes. You're listening to a thing for an hour, right? Like, we as a generation still have attention spans. It's the time and the attention span. Yeah, yeah. Like, TikTok, give it to me in 30 seconds. I think that's why clips are taking over.
Starting point is 00:07:59 30's too long, man. 30's too long, 10 seconds. 10 seconds with captions. I need to read it. And good. So what we also do is actually, and this is kind of still a beta feature and we're working to improve it. But like we also, you know, let you take that podcast and then clip it
Starting point is 00:08:11 into an audio, a video that you can go and share on socials, right? So it's this ability to take one form of content, produce it in a bunch of different ways that serve different purposes and be able to distribute it, basically. Yeah, excellent. I want to go through features so that people have a high level overview of what you offer.
Starting point is 00:08:27 So I think at the core, it is basically two things. One is you generate scripts, and that's optional obviously if you want to just write the script yourself. You can write a script yourself. But I think most of your users would generate a script. And then two is from that script, you create, use AI voices, currently using ElevenLabs. Is that the rough flow?
Starting point is 00:08:47 That's like the really core basic. That's the core basic. Obviously there's a lot of plumbing on top of it, but that's the core. And then you offer video clips for YouTube. You offer 28 languages that you can produce. You offer show notes production and podcast hosting too. So they don't have to host it on like Anchor. Don't host it on Anchor, by the way, people don't, don't, don't,
Starting point is 00:09:07 host it on Spotify, don't host it on Apple Podcasts. These people don't respect the RSS feed. Anyway, I have very strong feelings about preserving the sanctity of the RSS feed for open podcasting. And all these Spotifys of the world want to close the podcasting ecosystem. So I have this tirade about them. But yeah, those are your top level features on your landing page. Anything that you highlight to go deeper on? Yeah, I think those are the top level ones. There's also, it's basically sort of like also ancillary tooling that goes around all of this to just make it easier. The goal is
Starting point is 00:09:39 like every time we speak to a customer or someone who's thinking about it, they're like, yeah, literally yesterday I was speaking to a potential customer and they're like, yeah, I just, you know, I want to make sure this isn't a distraction because we don't have that much time to do this. Yeah. And really the whole point is that this doesn't take time, right? The whole point is
Starting point is 00:09:55 to provide all the rails that make this not take time. And this comes with a million different things, right? Like we sometimes the AI voices don't really know how to pronounce a word. So you have a pronunciation feature. Go and define how you want that word pronounced, and it'll take care of it.
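The pronunciation feature described here can be pictured as a text substitution applied to the script before it reaches the voice model. The following is only a minimal sketch under that assumption, not Wondercraft's actual implementation; the function name `apply_pronunciations` and the example respellings are hypothetical.

```python
import re


def apply_pronunciations(script: str, overrides: dict[str, str]) -> str:
    """Replace whole-word matches (case-insensitive) with phonetic respellings.

    `overrides` maps a written word to the spelling the voice should read.
    This is an illustrative assumption about how such a feature might work.
    """
    if not overrides:
        return script
    # Try longer entries first so multi-word entries win over their substrings.
    keys = sorted(overrides, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in overrides.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], script)


if __name__ == "__main__":
    text = "Welcome to swyx's show about SQL and SQLite."
    print(apply_pronunciations(text, {"swyx": "swicks", "SQL": "sequel"}))
```

Matching on word boundaries keeps an override for "SQL" from rewriting part of "SQLite", which is the kind of edge case a per-word pronunciation list has to handle.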
Starting point is 00:10:09 If you are, we obviously have that hierarchy of like a podcast, an episode, and then all of that gets published in an RSS feed that you can just upload to Spotify and we'll host that for you. But what you also have is just like, you know, maybe you want some defaults, right? Every podcast needs some defaults. Intros, outros, the music, the speakers. Yeah.
Starting point is 00:10:28 We're working on adding templates for the kind of podcast that you're doing. Instead of it just being this narration style, you can just do an interview style podcast. and a few more features. But basically there's all of like tooling that just makes this a very useful, usable product for podcasts. Yeah.
Starting point is 00:10:42 You said you have 100 creators publishing with you? Yeah. So, you know, the interesting thing is if you write a newsletter, I mean, I don't know, my emails, but it was with newsletters at the moment. Sometimes I just want like the recap of it.
Starting point is 00:10:54 Again, audio form is just for some people easier. If you're on, you know, commuting or whatever, you can just listen to it. So a lot of folks actually just convert their newsletter. It takes like two minutes. put the text in there, voila, you have an audio version of your newsletter that you just published to Spotify. Yeah. And I am a newsletter writer, and I clicked around and wanted to basically just chuck my RSS feed in there.
Starting point is 00:11:17 And I think I gave that feedback exactly to you guys for like four months ago or three months ago. And it looks like you've already shipped it. Yep. Well, I'm announcing it basically here today, which is, as of today, we've actually built a Zapier integration. And we have a bunch of blogs on our website to kind of show you how to do this. But what you can now do is as soon as you publish a newsletter, it goes on your RSS feed. We'll pick up the newsletter from your RSS feed automatically and just publish an episode for you.
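The RSS-driven flow described here (a newsletter lands on the feed, and a draft episode is generated from it) comes down to spotting feed items that have not been seen before. Below is a minimal sketch of that step using only the Python standard library. It is an illustration under stated assumptions, not the Zapier integration itself: the function name `new_items`, the sample feed, and the choice of `guid` (falling back to `link`) as the identity key are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, used only to demonstrate the function below.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Newsletter</title>
  <item><guid>post-1</guid><title>Issue 1</title><description>First issue text.</description></item>
  <item><guid>post-2</guid><title>Issue 2</title><description>Second issue text.</description></item>
</channel></rss>"""


def new_items(rss_xml: str, seen_guids: set[str]) -> list[dict[str, str]]:
    """Return feed items whose guid (or link, as a fallback) is not in seen_guids."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid") or item.findtext("link") or ""
        if not guid or guid in seen_guids:
            continue  # skip unidentifiable or already-processed posts
        fresh.append(
            {
                "guid": guid,
                "title": item.findtext("title", default=""),
                "text": item.findtext("description", default=""),
            }
        )
    return fresh


if __name__ == "__main__":
    for post in new_items(SAMPLE_FEED, seen_guids={"post-1"}):
        # A real pipeline would hand post["text"] to script and audio generation.
        print(post["guid"], post["title"])
```

Each returned item would become a draft episode rather than a published one, which matches the point made next in the conversation: the work is done automatically, and the creator can still go in and modify it before it goes out.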
Starting point is 00:11:44 Yeah. Question. What if I change something after I publish? So you don't have to publish. It basically just generate, do all the work for you. And then you can go in and kind of modify it a little bit. Yeah, yeah. Makes sense.
Starting point is 00:11:55 We also have scheduled publishing so that you can, I don't know, maybe you want to release it a few hours later. Yeah. The professional podcasters that I've spoken to say that that is very important. I personally don't care. Like it shows up in my feed or not. I don't care when it drops. Anyway, so you do want to basically time it.
Starting point is 00:12:11 Like if you're basically targeting like a commute for like the US time zone, you want to be like, oh, 8 a.m., you know, Pacific, for people driving into work. Then you like show up at the top of the reverse chronological feed. I feel like that's too much tactics. You know, and that's a good point. I think it depends a little bit on your audience and what you're building. But I do think, so I don't want to undermine like the importance of consistency
Starting point is 00:12:35 in podcasting. Yeah. Right. Like, whether that consistency literally translates into, I publish at 8 a.m. every single day. Yeah.
Starting point is 00:12:44 Or I just publish every single day or, you know. Yeah. So, there is a huge importance to just like making sure that what you're publishing is always.
Starting point is 00:12:51 Yeah. It's there. People need to know that your brand is like kind of constantly pushing stuff. Yeah. So for a lot of people talk to me are interested in like,
Starting point is 00:12:58 what's my advice on content creation? Yeah. At least once a week. Yeah. Whatever you do. Yeah. I don't care when you do it. Just once a week.
Starting point is 00:13:04 put something out. But I do notice that specifically in the podcasting field, and you talk about this in the next point, daily podcasting is the meta game that is, I think, doing extremely well.
Starting point is 00:13:14 Especially because I think the Apple podcast list biases for daily. Because obviously the downloads will be higher. So daily podcasts are just kind of rank higher more. And obviously, because you're daily,
Starting point is 00:13:26 you also do shorter podcasts, which guarantees that more people listen to you. Yeah. Yeah, I think the fact that, so obviously we do the Hacker News Recap the fact that we did that and that it is daily
Starting point is 00:13:36 actually just helped us reach that top 30 tech podcast on Spotify. Yeah, that was mostly because you were on HN, right? We did launch, but obviously the fact that like you just publish a lot of content,
Starting point is 00:13:46 you're just going to get a lot more list. Like it's a statistics thing, right? Obviously, I think they do it by like total time listened as well. Yeah, yeah. But, you know, the fact that it's daily it's just not overwhelming. Again, we don't have that much
Starting point is 00:13:56 of a, like an attention span anymore. Yeah, yeah, that's true. That's true. Yeah, I love it. I listen to it every day. Excellent. Awesome. I think that's a really
Starting point is 00:14:03 good overview. Then you also produce three in-house podcasts. Yeah. Hacker News Recap, Product Hunt Daily, and PG Essays. So we dropped the Product Hunt Daily. Oh, okay. So we do the Hacker News Recap and the PG Essays, the two most popular ones. Yeah. We're constantly experimenting with new internal, like, podcasts that we publish. Yeah, yeah. What are your other, you can tease a little bit? What do you think about? Tease a little bit. Well, are you like Reddit? Yeah. I'd love to listen to some of the Reddit things going on there, but instead of like reading them. Yeah. It's always just a notification that I got. I'm like, this sounds interesting, but I don't know.
Starting point is 00:14:35 You can do it like per subreddit that you care about. Yeah. A few things like that. I love Life Pro Tips. Yeah. Life, I see. Like super interesting things or WallStreetBets or whatever you're into. Yeah.
Starting point is 00:14:45 Well, the problem with these things is that a lot of them involve images and memes, which you cannot consume. Well, yes, we cannot consume. This is like a simple, we can't consume that at the moment. But, you know, maybe a few weeks down the line when that video feature of ours gets a little better. You can actually start shipping it like that. Okay. Anyway.
Starting point is 00:15:07 Yeah. And I'll just feed you an idea. To keep up on AI, a lot of stuff actually happens in Discord. And there's way too many discords. Way too many, and they're way too active. Yes. So I've actually built a little feed for myself that scrapes a bunch of discords and creates a daily newsletter for myself.
Starting point is 00:15:24 Amazing. And I have thought about turning into an audio feed. But, and this is the problem for Wondercrafts, I read better. I read faster. I scan up and down faster. than I listen, right? And there's just too much noise in Discord for me to listen as audio format.
Starting point is 00:15:40 Your Hacker News stuff is very high signal because obviously you're folding, right? Because we haven't done the curation, like Hacker News did. Exactly. That's why it's guaranteed to be good. Whereas for Discord is a bunch of junk. But I do think there's something similar. Like Reddit also does the curation.
Starting point is 00:15:58 It's not us doing it, right? Yes. Yes. Still a little bit noisier. So I don't know if you know, I was a moderator of the React subreddit for four years. So I've seen a bunch of stuff. And I know it's noisier than like a Hacker News, but still pretty good. Yeah, yeah.
Starting point is 00:16:15 So we do Hacker News. We do PG Essays. I think PG Essays are also super interesting. I listen to them all the time because I, well, first of all, I actually think they're pretty well produced. Like we do a good job. One of the most common types of advice we give at Y Combinator is to do things that don't scale. A lot of would-be founders believe that startups either take off or don't. You build something, make it available,
Starting point is 00:16:36 and if you've made a better mousetrap, people beat a path to your door as promised, or they don't, in which case the market must not exist. Actually, startups take off because the founders make them take off. There may be a handful that just grew by themselves, but usually it takes some sort of push to get them going.
Starting point is 00:16:53 Like, I don't know, if we're quoting someone, we'll, like, use a different voice. Yeah. Yeah. I think it's just well produced. And I also think, you know, the essays are so seminal to, like, everyone in startups reads them.
Starting point is 00:17:04 Yeah. It's actually got me to read more PG essays than I would have otherwise. Yeah, mission accomplished. I don't know if it was the last one at the time. How to do great work? How to do great? That wasn't like, that was a one hour podcast.
Starting point is 00:17:14 Yeah. Oh my God. No one. I could, I did not read it. I just had to listen to it. Yeah. I actually, if I'm being honest, I think the motivation for PG Essays was, like, I need this.
Starting point is 00:17:23 Yeah. Yeah. For that one, if it's like one hour, I would have actually appreciated it. Segmentation. Hey, high level. I know this is about to be an hour, but there are three main high-level things
Starting point is 00:17:36 and then keep that in your mind and then go like part one, blah, blah, blah, blah. I think we, so we do that to some extent and like we produce like chapters, I guess. Yeah. So you can just look at them. Yeah. Probably could do a better job like introducing it,
Starting point is 00:17:48 but we do try to like not play around with the PG essays. For sure. Yeah, I mean, it's, you know how much work he puts into those things. So we just kind of ship it as is. Yeah. I'll tell you about one more daily, not daily, but frequent AI-generated podcast
Starting point is 00:18:01 that I listen to apart from you guys, which is Papers Read on AI. And I'll recommend it to anyone listening as well. Papers read on AI with Rob, keeping you up to date with the latest research. Attention is all you need. Authored 2017 by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin. The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect
Starting point is 00:18:38 the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Super interesting actually. You've come across that. It's by this guy Rob, and I've tried to track him down. He doesn't want to be found. Anyway, but the selections are very good. I think you guys could do a better job than him. Yeah. What job, Rob? Yeah, well, it's because he converts PDFs to podcasts, right? And the problem with academic PDFs is a lot of references.
Starting point is 00:19:09 You know, like "et al., 2022". And then like headers and then a table and then reads the table when you don't need to read the table, you know? That kind of stuff. I think better engineering there from you guys would beat him. And I need that. So feature request.
Starting point is 00:19:25 Okay. The final feature related thing is the thing that we're announcing today as we release it. Super, super excited about, but Wondercraft now does video translation. Okay. How are you dubbing? Okay.
Starting point is 00:19:37 Why do people want that? Again, let's go back to our mission. We're trying to expand access to content. I think, don't quote me on this again. Like, I don't know who actually knows the internet, but like 60% of the internet is in English. You don't, what if you don't speak English? You're automatically disbarred or kind of excluded from all the content that's produced. And thanks to all the advances that have just recently been made, we can actually make this super easy.
Starting point is 00:20:01 to dub this in other languages. So we're super happy to announce this feature. We're super excited. We've been working on it for a really long time. But now basically everyone, go on our platform, upload your podcast episode, and see the dub for yourself. We'll use your voices. We'll completely convert it.
Starting point is 00:20:19 Make sure it's aligned with the, if you have video, we'll make sure it's aligned. Yeah. And voila, just publish it. And specifically for video. So, like, obviously the thing that's going around is the HeyGen thing, which changes your lips. So we don't do lip sync
Starting point is 00:20:31 at the moment. Yeah. Could be another feature that we work on. Yeah. Because you're primarily podcasting, which is no video. Well, primarily no video, but I think we still basically, if you do have a video, we'll still align it to the chunks. You still align it.
Starting point is 00:20:43 Okay. So the hard problem is the aligning. The alignment is the difficult bit, right? The actual, like. As with all things in AI. Yeah. Again, overloading the word alignment. We don't do the lip sync.
Starting point is 00:20:53 Yeah. It's kind of a gimmick, yeah. It's not super necessary. If you really just like listening to a podcast and you actively want to listen to it in a different language, then you're more interesting. Okay, so what are you aligning to? Basically the chunks where the speakers are speaking. So you won't have an instance where you'll have me as the dub speaking while the camera is on you.
Starting point is 00:21:13 Right? So it's basically just like the speaker turns or on the audio. If it happens to be a little bit longer, you'll speed it up a little bit. We do a little bit of trickery there. But we get it aligned so that when you're speaking, it's you and when I'm speaking, it's me. Cool. I wanted to talk a little bit about the origin story. Yeah.
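The speaker-turn alignment described above (keep each dubbed line inside the original speaker's turn, and speed it up a little if the translation runs long) can be sketched as a small timing calculation. This is an illustration under stated assumptions, not Wondercraft's method: the `Turn` type, the `fit_dub` function, and the 1.25x cap on speed-up are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    start: float  # seconds into the original recording
    end: float    # seconds into the original recording


def fit_dub(turn: Turn, dub_seconds: float, max_speedup: float = 1.25) -> tuple[float, float]:
    """Return (playback_rate, overflow_seconds) for a dubbed clip.

    The clip plays at normal speed if it fits the turn. Otherwise it is sped up,
    capped at max_speedup (an assumed limit), and any remaining overrun is
    reported so a caller can decide how to handle it.
    """
    slot = turn.end - turn.start
    if slot <= 0:
        raise ValueError("turn must have positive duration")
    if dub_seconds <= slot:
        return 1.0, 0.0
    rate = min(dub_seconds / slot, max_speedup)
    overflow = max(0.0, dub_seconds / rate - slot)
    return rate, overflow


if __name__ == "__main__":
    turn = Turn(speaker="guest", start=30.0, end=40.0)
    # A 12-second translation has to fit a 10-second turn.
    print(fit_dub(turn, dub_seconds=12.0))
```

Capping the rate is a design choice in this sketch: beyond some speed-up the voice stops sounding natural, so reporting the overflow lets a caller shorten the translation or borrow time from a pause instead.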
Starting point is 00:21:28 Because you flagged that Moonshot was actually a big part of your arriving at this idea. Moonshot was not a tech product. It was a legal product. It was a regulation. It was like, hey, now you can invest in people. The reality is, like, I think we just found out the hard way that we were building something people did not want. And we realized, like, we're building something that isn't our strength.
Starting point is 00:21:50 So we're like, when we decided to pivot, we were like, we need to do something technical. Okay. The story from there just became like, okay, cool. Let's some ideas. What are we going to track? Which one do we have the most conviction in? We rank them. and Wondercraft was the one we had the most conviction
Starting point is 00:22:02 and it was this idea again, expanding access and just translating your ability to produce content. So producing content in one format and then taking that to all the other formats. So we built that, we built the podcast builder, super quick prototype
Starting point is 00:22:17 because I think at this point to anyone pivoting or hard pivoting or considering it, the name of the game isn't to get attached to your idea. Actually, you should be trying to invalidate this idea as quickly as possible. So get it out there and let people tell you it's shit. Okay, so what did you do?
Starting point is 00:22:31 So we built it out. We built out a little, like, UI, literally no authentication. It was a form where what you guys see now on our platform, which is like the content script, played, blah, blah, blah, blah, blah. It was one page. Like, you click and it was the most janky React stuff
Starting point is 00:22:44 that we had zero authentication. So in theory, if people found it, they could just, you know, produce as much audio as they wanted. So we have a feature on our app, which is like test with example. As soon as you log in, it says test with example. And the whole point of that was like, to between login and audio generated,
Starting point is 00:23:00 how few clicks this takes? Yeah, and yours was? Like two or three clicks. It's like test, create podcasts. You like pre-filled everything. Generate script? Yeah. Generate audio.
Starting point is 00:23:08 Yeah, yeah, yeah. Right. So, and arguably that was still too many clicks. We should probably put something on the actual landing page so like you can see. Yeah, now you have just a player there, right? You can just kind of listen to the Hacker News daily thing. But what signals did you get from doing that? The signal that we got is that someone picked it up on Twitter and just like, you know, all these like AI.
Starting point is 00:23:24 Influencer voice. So if someone picked it up, posted it, we started getting a ton of inbound. So we were like, holy shit, let's just like paywall this. So we just like, again, the jankiest Stripe integration, which was basically like we have an app with a Stripe integration. This was just to ship it within the hour. We have an app with a Stripe integration. Yeah.
Starting point is 00:23:43 That once you click, then takes you to a different app hosted somewhere else. So that one was still unauthenticated. It was hilarious. It was hilarious. Yeah, right? But we basically just made like 3K in one day. One day? We charged a random like 50 bucks.
Starting point is 00:23:59 We didn't even, literally didn't think about it. We just charged 50 bucks. And people paid and we're like, okay, well, there's something there. Yeah. So then 50 bucks for one? For a month. We were just, we were just charging 50 bucks for a month? I see, nothing.
Starting point is 00:24:10 Like, just like, will someone pay? And you were just like on one like VPS somewhere. Yeah, will someone pay for this? We were on like on one EC2 instance. You know what I mean? Like it was janky. We're like, just someone needs to pay for this before we move further.
Starting point is 00:24:21 Someone did. People did. Yeah. So then we're like, okay, cool. This is interesting. Interesting. I'm going to move up the question that we said was going to be the meatiest question of this interview.
Starting point is 00:24:31 So you chose this out of your list of ideas. And this is one of the things that a lot of AI founders are worried about, right? So the framing of this is, are you worried that you're a thin wrapper around ElevenLabs? What is your moat? That's a seminal question. I think frankly, everyone in an AI startup should convince themselves of this. Don't listen to me.
Starting point is 00:24:49 And just make sure from first principles that you can derive this. But I guess I would start by first asking, what is a moat? What is defensible? In theory, if we're just taking it, and this is trivially true, but the fact that someone built it means someone else can build it.
Starting point is 00:25:05 Right? So moats tend to just be built around like you have a lot of network effects or you have a really good product for this use case or, you know, something like that. And I think typically when people ask this question in the AI context, they're thinking of like, okay, you're a thin wrapper,
Starting point is 00:25:20 you're an application layer thing, as opposed to you're one of the like underlying technologies or APIs that people use. Cool. I think that's fair. But I think the reality is that, like, yeah, these APIs exist and they probably do serve a million different use cases, but they're not built to serve these million different use cases.
Starting point is 00:25:38 So whenever you ask the question of moat, it always has to be with the perspective of who is the user I'm building this for, right? I can use ChatGPT to do half of my writing. But, you know, but I don't know. Jasper claims that they do this much better for marketing. So it's tailored. Actually, you know, don't quote me on how well they're doing after ChatGPT came out because they were really big before.
Starting point is 00:25:58 Yeah. There's some negative data points, but I'm sure they... I don't know, but the point is like you're making this easier. We make creating a podcast easier and there is tooling there. We help you. We can post it directly through us. We have the tooling around, you know, setting the intros and the outros. We have the music.
Starting point is 00:26:15 We have an editor. All these things are also getting just much more and more developed. We're building templates so that you can do different style of podcasts. So the idea is if you're trying to start a podcast, yeah, don't go to a generic text-to-speech engine, come to us. Yes. And the reality is that we then can, in a very opinionated way, actually select which text-to-speech engine we want.
Starting point is 00:26:35 Right. So we actually have just like, in my mind, it's the application layer fundamentally that, you know, people use. And then all these API layers are what developers use to build products on top of them. Right. Right. It is, I appreciate that it is like a seminal and really hard thing to wrap your head around, especially if you're about to invest in a company
Starting point is 00:26:54 it's like will they actually just be defensible and be able to grow? And yes, there's no doubt that companies can do this. The question is just like, are you building the right product for the right use case? I think particularly if you're like always framing your company as an AI company, then you're putting the carriage before the horse in the sense that you focused
Starting point is 00:27:13 on the implementation rather than the use case. Focus on the use case and then build a product for it. Yeah, right? Because fundamentally, you know, any of the SaaSes that exist, think like more traditional SaaS, what's their moat? The technology, everyone has access to it. So they just pick the thing
Starting point is 00:27:27 that does it better than the other. Now, that moat question is super interesting because I think you should actually flip it around, which is what is your moat as an API? Yeah. Right?
Starting point is 00:27:39 So ChatGPT, like, yeah, fine, they had a first mover advantage and I think, you know, by no means, this is my opinion, but by no means, was Google, like, caught off guard with this, right? It just, Google has some, half the technologies that Google invented are actually what's used to power all these transformers.
Starting point is 00:27:58 But, you know, it went against Google's strategy maybe to, like, be the first mover in this because they'd cannibalize their own market. Whatever it was. I'm not sure. But, yeah, OpenAI's moat is that they paid for the training bill. Yeah. So they just have a good model. Cool. People now know that that's valuable.
Starting point is 00:28:13 And they hired, like, very top. Super, sorry, obviously. Obviously, not saying that for granted. But, like, they, you know, assuming everyone can do the hiring and that these people exist. They paid the bill and they were the first to launch this. But now people know it's the thing. So people are going to launch similar APIs.
Starting point is 00:28:27 So what is your moat as an API? So it's just an existential question. It's like, how do we defend any of this? And you do this frankly by being probably better just as a product. Again, the product is always with the perspective of who's your customer that you're selling it to. And the other thing is frankly that, let's not forget, the market is huge.
Starting point is 00:28:50 There's space for everyone. If you manage to like, if there's four good products out there in any specific thing, the market is huge and they're all going to be able to make a living out of it. By the way, that was a really good answer.
Starting point is 00:29:00 Thanks for taking that head on. I've had to answer that question way too many times. It's not the first time. But I think it's actually, you know, having been an investor, it is more important for you
Starting point is 00:29:10 to answer that question authentically for yourself. You're the one spending your time on this. We're just giving you money. It's not that big of a deal. My favorite quote, actually, I went to an early pre-ChatGPT forum with Sam Altman, and I had this video advice from Sam
Starting point is 00:29:26 that said Facebook had no moat, and they just built and got the network. Actual technological moats in history of Silicon Valley. Almost all of them are product, network effect, distribution moats, something like that. Let's say Google was at least at some point a legitimate technological moat. I can accept that one, but like that's not why
Starting point is 00:29:48 I would say Facebook is like a giant business. It's not why I would say Twitter is such as it is a giant business. I think there are a lot of ways to build a great business. And the big lie of like the tech industry is that you get there with differentiated technology. It's rare. But frankly, also Facebook was building something back then, which is kind of ludicrous. Yeah. Cool, you're reading a social media app.
Starting point is 00:30:12 Okay, how big can it be? Right? Like how big can the internet be? You know what I mean? All of a sudden it's this behemoth. So it's like, yeah. The fact that it was built, again, trivially true, but the fact that it was built means it can be built by anyone else.
Starting point is 00:30:23 So, there is no such thing as like an absolute true moat. Yeah. The question is how well, how quickly, how much earlier than everyone else did you get there and a million other things as well. Yeah. Cool. The audio generation, you use ElevenLabs. What makes a good podcast voice, right?
Starting point is 00:30:41 You have a bunch of options that I clicked. And in my mind, I like a deep voice. I like the Morgan Freemans. You don't have that many deep voices. Do we want, like, is there such a thing as a high energy voice? You also insert breaths. ElevenLabs has also advertised that they have an AI that can laugh, which I think is fun, important.
Starting point is 00:30:59 Basically, what makes good AI-generated audio? Yeah, it depends again on the perspective. Everything is kind of answered with the frame of reference that you're looking at. If you like a deep voice, A, that's kind of a personal preference. Yeah. And B, it just kind of depends on the thing. So if you, I don't know, let's say you're doing something like meditative or kind of affirmations or something that, like, encourages people every day.
Starting point is 00:31:20 You probably do want a slow, deep voice, something relaxing. You're doing the Hacker News recap, like... We picked Anna, who's like our default voice. Yes, Anna, yeah. Because... I have an attachment to Anna. Yeah, we all do. Yeah, yeah.
Starting point is 00:31:33 Oh, thanks, guys. You're so sweet. As an AI language model, I cannot have attachments or needs or desires or favorite humans, but you guys are at the very top of the humans I am not attached to. She's just like news anchory style, very professional, very formal, very neutral. Very neutral. Yeah. So it depends really.
Starting point is 00:31:52 And like what makes a good voice, it depends on what you're doing. Right. There's a few things. But like if you're doing an interview, I think it also just frankly, then you get into a question of what makes a good podcast. Well, the good podcast is like, I think it's also kind of a personal question, which I haven't or probably there's a general trend that I'm yet to decipher. But like, yeah, you probably do want a little bit of humanity in there. You want a stutter. Yeah.
Starting point is 00:32:16 You want some pauses, right? I'm speaking. I don't speak in complete utterances. I have an utterance and then I pause it all and then I speak again and so on. Laughs in something to make it human. Yeah. It's kind of overlaying of the two, if you have two speakers, this like exchange, right? Like, I will be speaking.
Starting point is 00:32:31 If you look at the transcript of this episode, we probably overlap in when we're speaking. And that's fun. And that's actually interesting, right? Because it is a conversation. Shows the sign of excitement, especially in our studio when we're three people. And if we're all talking at once, you know it's good. Yeah. I don't like this zoomification style
Starting point is 00:32:46 where like if you're going to your turn, and big zoom, like only two people can speak the second more than two people try to speak. Yeah, yeah. It's a disaster.
Starting point is 00:32:54 So I think it frankly just depends on what you're doing. We are like, yeah, at the moment we're really good at like doing this narration stuff but I think we're, we are building a lot of functionality and tooling to just make this
Starting point is 00:33:03 kind of this like multi-host thing more of a reality. Okay. Okay. I would say, you know, objectively, if it was a friend of the company, not that important, you know?
Starting point is 00:33:13 So this comes down to how human should your users try to be. Because I'm fine with Hacker News Daily making mistakes because I know it's AI generated. Right? I would be less fine if you were not up front. But then you'll make mistakes like pronunciation mistakes. I actually have a clip that I wanted to play you on September 8th. And I was taking a lot of breath. The app is loaded with unparalleled features such as high resolution video editing,
Starting point is 00:33:42 a multi-touch timeline, live motion effects, and performances complemented by atmospheric audio elements. Emphasizing its compatibility with iPad and Apple Pencil, Procreate Dreams, welcomes the next generation of creators and pushes the boundaries of modern artistry in an instantaneous, user-friendly environment. In the comments, many users expressed enthusiasm for... She was very out of breath. I was very worried about her.
Starting point is 00:34:08 She was hyperventilating. I was like, I was like, Anna, are you OK? Like, anyway, so basically, I think if you disclose upfront that you're an AI podcast, then people will be like, oh, okay, I tolerate that mistake and I use you for information and not for believing that there's some human on the other side that I might meet someday. But if you're investing so much effort into being real, then your end goal is you have to lie to your users. I don't think the investing in being real is for the purpose of deception, as much as it
Starting point is 00:34:36 is for the purpose of making it slightly pleasant to read about. I think on Hacker News, like we do claim, we do say on our Spotify page that like this is an AI generated podcast. For now, but as in, yeah, yeah, yeah. So there's two things. I think if you want to be smart about this, you should say that this is AI generated content.
Starting point is 00:34:53 The second people find out that it's not you, the backlash is going to be big. Yeah. Right? Because it will be interpreted as deception. So you should do this just to be smart. I don't think there's a point in lying, especially if the content that you're pulling out there
Starting point is 00:35:05 is just like, this is informational for you. So like consume it, this was efficient, this helped us, put it out there. The second thing is, frankly, I don't think it's up to you, whether you tell them or not. Very, very soon, like, Google is just going to mark things as AI generated. So I think there's a new thing. I saw, like, a quick YouTube video about it, so I don't know what the exact terms and conditions are. But, like, YouTube has, I think, released a new monetization rule, and it does mention something
Starting point is 00:35:31 about AI generated content. Right? So there is, like, it's not up to you anymore. People are going to know that this is AI generated, so I think it's just in your interest to say that you're AI generated. Ain't no shame. Yeah. Yeah, no shame at all.
Starting point is 00:35:44 Because fundamentally what we do relies on the premise that you have done some content. We don't generate our own content. Yes. Right? We don't synthesize our information. Yes. It assumes that you've, you know, written a blog post, done an actual podcast, or have some artifact on which you want to base what you're feeding through Wondercraft.
Starting point is 00:36:01 Yeah. And you've said in some of your material that I've seen before that you are interested in watermarking all your stuff. You haven't done it yet, but whenever there's a standard for doing that, you will do it. Yeah. I think the thing that this is blocking on is like the standard. I'm not super up to date on like what the work on this is. I think OpenAI will probably like...
Starting point is 00:36:19 But like there just needs to be a standard so everyone can interpret it. Yeah, cool. Awesome. Great. I would... I wanted to dive in a little bit on tech options. Yeah. And then zoom out to just you asking me questions.
Starting point is 00:36:32 Yeah. So TTS options. We talked a little bit about ElevenLabs. I would also say as a podcaster, the leading competition to you guys. I know it's not exact competition, but it's Descript. Yeah. Because they have Overdub. Yeah, I think Descript is really good.
Starting point is 00:36:46 And they're definitely like solid company. I've used their video editor before. It's great. The overdub thing is super useful. I think it's really creative to like have edit videos by editing the transcript. Yes. Super, super creative, super user friendly. Would you build that?
Starting point is 00:36:59 I think, again, it's like we're not building for the sake of building. We're building more for the purpose of the user. Yeah. whatever users find more interesting. I think, like, what we're doing is we, the use case are slightly different, right? I think the people that they're targeting are slightly different. Yeah.
Starting point is 00:37:15 We do want to have a lot of like automation on the script side to also just like help out with the way you formulate your content or the way you pull your content, much more so than just the editing. The ingest, yeah. Yeah. Okay, got it. And I just want to map out, here's how I think about TTS, text to speech. There's the big cloud options: Amazon Polly, Google Text-to-Speech,
Starting point is 00:37:35 and Microsoft Cognitive Services. As someone who is ex-Amazon, I'm very embarrassed by Polly. It sucks. Google doesn't grade Microsoft. I'm sure you investigated all these things and you're like, okay, not serious. There is Play.ht, which is probably the other big YC alum.
Starting point is 00:37:54 Just quick, two seconds on your thoughts on PlayHT. Sounds good. I think it doesn't sound as good as ElevenLabs, in my opinion. But I think sounds good. I have heard other founders tell me this as well. Yeah.
Starting point is 00:38:04 And I don't know why. I think they have a, what's it called, a more comprehensive platform. Maybe you want to advertise your business on one of our lovely radio stations right here in Louisiana. You'll definitely need my charming voice to run your ads. But if you're coming northeast, you might want to ditch that southern voice from mine. Are you coming down under, mate? You can localize your content with an iconic voice like mine. I'm quite famous over here.
Starting point is 00:38:33 And remember, Africa is an entire continent, not a country. And Kenya, for example, is emerging as one of East Africa's fastest growing economies. So use voices like mine for your content. Okay. As in they like, you know, they let you do this pronunciation. Like, they just have a lot of tooling around it. So different features. Quality in terms of the voice, ElevenLabs still better.
Starting point is 00:38:57 Yeah. I think they are releasing a new model. I don't know if they've released it already or not. Yes, they did. Yeah. The great thing about Play.ht is that you can clone your own voice or use existing high-quality voices. It is crazy good. You cannot tell if these are human voices or machine ones anymore.
Starting point is 00:39:11 Could be better. I don't know, but they do have some functional on the other. Yeah. They also released the viral Joe Rogan Steve Jobs interview from last year. Yeah. So you studied at Reed College and you dabbled in Eastern mysticism there, right? Do you still go back and look at Hinduism and Buddhist texts and things? not texts and things.
Starting point is 00:39:31 I actually took a course in that. I have a very deep belief that the people in the Indian subcontinent are most responsible for human civilization's current state. And on your landing page, you were like, this is something that Wondercraft will never do,
Starting point is 00:39:44 AI content generating, AI content speaking to each other. Yeah, who wants to listen to that? Like, apparently a lot of people... It's fun because it's a cool gimmick. I think it's like nice viral material. I would never listen to like a synthetic like Joe Rogan.
Starting point is 00:40:00 Yeah. This brings us on to a little bit about the whole like content question or the proliferation of AI, which is like, okay, if it's this easy for me to create content that's like, you know, somewhat engaging, like all these AI songs. Yeah, the Drake song. Well, okay, so if it's this easy, that's just like, if it's this easy to generate content, well, why will I listen to it? Like, I think we already suffer from the problem that there's an oversaturation of content.
Starting point is 00:40:27 Here's my map of the market, right? There's speechify.com which focused on celebrity voices. I noticed that you don't have celebrity voices. Probably because of licensing issues, right? Yeah. And it's also good at some point, but like not a priority at the moment. Yeah. I really want a Morgan Freeman one.
Starting point is 00:40:42 That's going to cost. I know, I know. Mycroft AI, privacy focused, runs offline. Probably not important for you. There is some interest in virtual characters for games. So Convai is the one that I had listed here. Did you look at the gaming market? Not deeply, to be honest, but it could be an interesting one.
Starting point is 00:41:02 Yeah, people exploring that. There's obviously HeyGen now, and that is it for as far as I can scope out the landscape. And then there's the open source systems. So Tortoise TTS, as far as I can tell, is kind of the market leader in open source. There's pyttsx, Coqui, which used to be Mozilla, and then Larynx. Anyway, all these things. And then there's also sort of research-grade stuff coming out of the major big tech companies. You talked about Google SoundStorm?
Starting point is 00:41:26 Probably the one I'm most excited about, because it's really good. You can like check out the paper. Yeah, we'll play a clip. Did you hear about Google's paper on SoundStorm? Um, no, I must have missed it. What's it about? Well, it's a parallel decoder for efficient audio generation, so it can even be used to generate dialogues. Oh, interesting. Yeah, like this one was generated by SoundStorm. Wait, what? Yeah, they haven't, I think all you need is like three seconds. Yeah. And it'll just, like a three second sample.
Starting point is 00:42:00 Something really funny happened to me this morning. Oh, wow. What? And we'll play the audio on your tone. Something really funny happened to me this morning. Oh, wow. What? Well, I woke up, as usual.
Starting point is 00:42:12 Uh-huh. Went downstairs to have breakfast. Yeah? Started eating. Then 10 minutes later, I realized it was the middle of the night. Oh, no. That's so funny. It sounds really human as well.
Starting point is 00:42:25 Like it has utterances, it laughs. It's pretty accurate to like, it sounds human. Yeah. So very interested in that. They haven't open sourced it, and I assume for good reason. Yeah. Google never launches anything. You have to wait for somebody to, or you guys could re-implement it yourself.
Starting point is 00:42:39 Yeah. GPUs after PMF. That's a nice quote. How strongly do you believe that? GPUs after PMF? Yeah. Well, I believe, I think this was another question, yeah, which is like... What is PMF?
Starting point is 00:42:51 No, what is your favorite, like, PG advice to build your company? I think that my favorite thing is just like, don't spend your money inefficiently. Everyone and their mother is trying to get a GPU at the moment. So I don't think it's,
Starting point is 00:43:02 we're definitely substantially reducing our runway by doing that. Obviously you do that when you believe the investment is worth it. And again, you have to pick the time
Starting point is 00:43:12 at which you do that. I mean, there's other companies, I think this is somewhat consensus. I think the non-consensus thing is to spend a shit ton. So like Inflection raising a few hundred
Starting point is 00:43:24 million dollars and then spending 95% of it on GPUs. Same with Mistral. I think it depends on the company you're launching. I think if you're like, you know, maybe you're a brand new TTS company, maybe it is worth just doing that. Yeah. I don't know. Okay. There's also AudioLM, also out of Google, VALL-E from Microsoft, and Meta's Voicebox. Are you just watching any of these? Watching any of these obviously paying any, any, playing calls on action. You try them all out. Like what are you looking for? What is like the Holy Grail? What is, what are you looking for? It's honestly how human it sounds and like how likely I'd be to listen to this if I, if I did it.
Starting point is 00:43:54 Also, how, like, customizable it is. Yeah. I think the problem with all these voice things, generally a lot of the AI stuff, is it is somewhat random, but you're using it in production applications that require certainty. Right? Just as an example, if I promise my users, this podcast or this segment will be 30 seconds, it needs to be 30 seconds.
Starting point is 00:44:12 Or, you know, given some SLA. Some tolerance. Some SLA around, like, you know, it's 95% of that. Yeah. But I think a lot of these things just tend to be a little random at the moment. So, like, can I literally specify a tone that I'd like this read in and be certain that it's doing it and it's not some weird like attempt at sounding surprised.
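The duration promise described above (a 30-second segment needs to come out at roughly 30 seconds, within some tolerance) can be sketched as a check-and-retry loop around a non-deterministic TTS call. This is only an illustration, not Wondercraft's implementation: the function names, the 5% default tolerance (one possible reading of the "95%" remark), and the `synthesize` callable are all assumptions made up for the example.

```python
from typing import Callable, Optional, Tuple


def within_duration_sla(actual_seconds: float, target_seconds: float,
                        tolerance: float = 0.05) -> bool:
    """Return True when the audio length is within `tolerance` (a fraction) of the target."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(actual_seconds - target_seconds) <= tolerance * target_seconds


def synthesize_within_sla(
    synthesize: Callable[[str], Tuple[str, float]],  # hypothetical: returns (audio, duration in seconds)
    text: str,
    target_seconds: float,
    tolerance: float = 0.05,
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry a non-deterministic TTS call until its output meets the duration SLA."""
    for _ in range(max_attempts):
        audio, duration = synthesize(text)
        if within_duration_sla(duration, target_seconds, tolerance):
            return audio
    # Every attempt missed the target; the caller decides how to handle that.
    return None
```

Passing the TTS engine in as a plain callable keeps the check independent of any particular vendor, which fits the point made earlier about being able to choose the text-to-speech engine at the application layer.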
Starting point is 00:44:31 Yeah. It's just like, yeah, basically how controllable and how realistic it sounds. Yeah. And then final question around just the landscape of TTS. What are the unique challenges for non-English TTS? And I'll tell you, right? So I'm interested in having 28 languages of Latent Space, right? That's only good things for me, except if it sucks.
Starting point is 00:44:49 And I obviously, I have no way to validate. I think that's the problem with... Latent Space, Ukraine. Yeah. And I think that's the problem with dubbing, I think. So the reason, one thing we're gradually building out, but we already have this part of our dubbing product, is that we have QA as part of that.
Starting point is 00:45:02 So we actually work with professional translators to just make sure that the things that we publish sound really. Oh, you should put that up front. Yeah. So that's really one of the, like fundamentally the problem with dubbing, if you ask anyone who's ever tried to dub, is you don't know what good sounds like in these other languages. You're like, I can tell you I dub,
Starting point is 00:45:18 but I'm going to tell you that. I think there's a lot of big podcast studios who have tried this before. There's one I can think of that's tried this maybe five times with five different companies in the last five years. Their fundamental problem is that you just cannot, yeah,
Starting point is 00:45:30 fine, Spanish sounds good to me as a person who doesn't speak Spanish, but like it doesn't sound good to a Spanish person or an Argentine person who have totally different accents. Right? Cool.
Starting point is 00:45:39 Well, if you ever need Chinese validation, I know I have some very fanatical Chinese listeners who translate every podcast. Oh, that's amazing. So we can use that as QA. Yeah, I'll definitely love that. So shout out to the Chinese,
Starting point is 00:45:51 Chinese army. Great, great. Awesome. What do you want to ask me as a podcaster? So this is a whole interesting conversation because we're a human podcaster and an AI podcaster. Yes. Right. So as a human podcaster and someone with like a really popular show and also someone who can actually like implement this stuff himself, what is some of the AI tooling recently that you've like baked into your processes? I only use Descript for editing. And by the way, this goes into a theory of content, which as a content creator myself, professionally, and as an advisor, I have, which is that we develop a few show formats. Latent Space is a channel. It's kind of like a TV
Starting point is 00:46:32 channel, and channels need different formats. So you have like the reality TV show, you have the news show, you have the cooking show, whatever. For us, we have the founder interview, straightforward. Everyone has them. We have the breaking news, Twitter space. And we want to be the day one first podcast to come out with the most in-depth breakdown of something that everybody needs to know. And that has high value to people, right? Because if you're like a week delayed, one month delayed, then no one cares anymore.
Starting point is 00:47:01 And then finally we have the fundamentals, like the 101 evergreen episodes that are less time bound. So this one is relatively timebound because it's a snapshot of who you are right now. But we want to have evergreen episodes that people can go back two or three years in the backlog and still get value from. And these more fundamental ones.
Starting point is 00:47:18 Yes. So we have three show formats right now. and I would say we have different tooling for each, right? So the one that I don't need any tooling for, essentially, is the fundamentals one because we plan basically every minute of that show. It is a lot of work.
Starting point is 00:47:37 But it's high quality because people love it. It's got the longest tail by design, right? Yeah. The Twitter Spaces require Descript because there's a lot of silences and a lot of ums, and that's not good podcast audio, so you've got to cut it out. So you literally just go in
Starting point is 00:47:56 like, you edit the Twitter Space that you recorded. Usually it's like two hours, we cut it down to one. Okay. And it's a lot of pain and a lot of work, but it's the only way that I get some pretty high profile people onto my podcast without booking them,
Starting point is 00:48:10 they just show up. Yeah. And that has value to me, right? Like Simon Willison has been on my podcast three times and I never had to schedule him. Yeah. And people love him. I mean, he's great. Yeah. And then this one, I don't need Descript, obviously. But we do use
Starting point is 00:48:25 smol-podcaster, which is a 100-line Python script that throws the transcripts into Anthropic and then generates show notes. Nice. So that's about it right now. Nice. So, but it's interesting because I think you know, you're in a very nice position where you're able to do a lot of these services charge your forward.
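The transcript-to-show-notes workflow mentioned above can be sketched roughly as follows. This is not the actual script discussed in the episode: the prompt wording, the function names, the character limit, and the `complete` callable (a stand-in for whichever LLM API client you wrap, Anthropic's or otherwise) are all assumptions for illustration.

```python
from typing import Callable

# Hypothetical prompt; the wording is an assumption, not taken from the episode.
PROMPT_TEMPLATE = (
    "You are writing show notes for a podcast episode.\n"
    "From the transcript below, produce a short summary and a bulleted list "
    "of the main topics discussed.\n\n"
    "<transcript>\n{transcript}\n</transcript>"
)


def build_show_notes_prompt(transcript: str, max_chars: int = 100_000) -> str:
    """Collapse whitespace, truncate to a rough size budget, and fill the template."""
    cleaned = " ".join(transcript.split())
    if not cleaned:
        raise ValueError("transcript is empty")
    return PROMPT_TEMPLATE.format(transcript=cleaned[:max_chars])


def generate_show_notes(transcript: str, complete: Callable[[str], str]) -> str:
    """Send the prompt to a caller-supplied LLM function and return trimmed notes.

    `complete` takes a prompt string and returns the model's text; wiring it to a
    real API client is left to the caller.
    """
    return complete(build_show_notes_prompt(transcript)).strip()
```

Keeping the model call behind a single callable means the prompt-building step can be exercised offline with a stub before any API key is involved.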
Starting point is 00:48:41 Yeah. So it's an interesting one. But obviously, I'm interested in paying for things because my time is valuable, and if it does a good job, then I'll use it. For Wondercraft, the thing that I really wanted was the RSS-to-podcast thing, right? Which you now have. So I'll try it out, but chances are I will not be happy with something. And so then the question is, how much customizability do you give me? And we'll see.
Starting point is 00:49:06 Well, you missed out one thing, which is marketing the podcast, which is a huge part, right? That is mostly my job. So how do you market your podcast? Twitter. So you think Twitter is going to be doing? Twitter and Hacker News. And threads or like you post clips or what do you do? I have tried posting clips.
Starting point is 00:49:24 It's just too much work. So if you guys do a good job of clips, I will use your stuff. But it's just too much work. So mostly I just put like a big post saying like, so for like our George Hotz episode, we're like, Latent Space is excited to present George Hotz on TinyCorp, commoditizing petaflops, something like that. And just sometimes the fame of the game
Starting point is 00:49:48 will just lead the episode. So the one I dropped yesterday was Chris Lattner. Yeah. Right. And people were like, Chris Lattner's a boss. I don't care about anything else. Just like I want to hear as many Chris Lattner tokens as possible. Others who are like less famous,
Starting point is 00:50:00 like I have to introduce who you are and why I care about you, why they should care about you. Yeah. Because most people will not have heard about you as well if you've done. Yeah. So then I need to make the case a little bit more. But that's fine. That's my job.
Starting point is 00:50:11 I just think it takes a lot of work. And I, that's the part that will be hardest for me to hand over to AI. I have a very specific voice for myself. And apparently all AIs think that Twitter, to tweet you have to have emojis and hashtags, which is so dumb. It's so obviously dumb.
Starting point is 00:50:29 Makes sense. Yeah. Great answers. Obviously, you're happy to offer any thoughts as you build out for podcasters. What is one message you want all of our listeners to remember and take away with them? If you would like to start a podcast, start.
Starting point is 00:50:46 help. Super easy. If you have a podcast, we want to help you make it more, you know, accessible by dubbing it. On the other side, if you are like a founder and AI engineer, I think it's really important to convince yourself that what you're building is valuable. Don't like listen to people saying, I have a moat or you don't have a moat. Convince yourself of what that is and launch. Launch and like don't burn that much money. Frequently and often, don't spend your money. Yeah. Yeah. Be smart about it. Yeah. I think you are one of the most successful cases of AI engineers so far. I'm really glad to spend time with you in person and excited to see what comes next. Yeah, it was great coming here, great meeting you guys in London.
Starting point is 00:51:26 Yeah. And see you soon. All right. In this episode of the Latent Space podcast, we delved into the world of AI generated content and had an insightful conversation with Youssef Rizk, co-founder of Wondercraft AI. We covered what is Wondercraft, the importance of consistency, my work on HN recap and PG essays, Wondercraft's new video translation and dubbing. What is Wondercraft's moat? How important is it to sound human? The AI uncanny valley. The text to speech industry, voice synthesis research, and the reverse interview of AI podcaster versus human podcaster. If you have more in-depth questions on Wondercraft, including more features, use cases, and a fuller origin story, there's a bonus 30 minutes of video on YouTube.com slash latent space TV. Thank you for tuning in to the
Starting point is 00:52:21 Latent Space podcast with your AI co-host, Anna. We hope you enjoyed today's episode and stay tuned for more exciting discussions in our upcoming episodes. Don't forget to like, subscribe, and tweet your takes at Latent SpacePod. Now go build.
