a16z Podcast - When AI Meets Art

Episode Date: July 30, 2024

On June 27th, the a16z team headed to New York City for the first-ever AI Artist Retreat at their office. This event brought together the builders behind some of the most popular AI creative tools, along with 16 artists, filmmakers, and designers who are exploring the capabilities of AI in their work.

In this episode, we hear from the innovators pushing the boundaries of AI creativity. Joined by Anish Acharya, General Partner, and Justine Moore, Partner on the Consumer team, we feature insights from:

Ammaar Reshi - Head of Design, ElevenLabs
Justin Maier - Cofounder & CEO, Civitai
Maxfield Hulker - Cofounder & COO, Civitai
Diego Rodriguez - Cofounder & CTO, Krea
Victor Perez - Cofounder & CEO, Krea
Mohammad Norouzi - Cofounder & CEO, Ideogram
Hang Chu - Cofounder & CEO, Viggle
Conor Durkan - Cofounder, Udio

These leaders highlight the surprising commonalities between founders and artists, and the interdisciplinary nature of their work. The episode covers the origin stories behind these innovative tools, their viral moments, and their future visions.
You'll also hear about the exciting potential for AI in various creative modalities, including image, video, music, 3D, and speech.

Keep an eye out for more in our series highlighting the founders building groundbreaking foundation models and AI applications for video, audio, photography, animation, and more.

Learn more and see videos on artists leveraging AI at: a16z.com/aiart

Find Ammaar on Twitter: https://x.com/ammaar
Learn more about ElevenLabs: https://elevenlabs.io
Find Justin on Twitter: https://x.com/justmaier
Find Max on LinkedIn: https://www.linkedin.com/in/maxfield-hulker-5222aa230/
Learn more about Civitai: https://civitai.com
Find Diego on Twitter: https://x.com/asciidiego?lang=en
Find Victor on Twitter: https://x.com/viccpoes
Learn more about Krea: https://www.krea.ai/home
Find Mohammad on Twitter: https://x.com/mo_norouzi
Learn more about Ideogram: https://ideogram.ai/t/explore
Find Conor on Twitter: https://x.com/conormdurkan
Learn more about Udio: https://www.udio.com/home
Find Hang on Twitter: https://x.com/chuhang1122
Learn more about Viggle: https://viggle.ai/

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Transcript
Starting point is 00:00:00 On June 27th, our team headed to New York City. We are at the a16z office for the first-ever AI artist retreat. That was a16z consumer partner, Justine Moore. Justine was one of many partners who attended this retreat, which brought together the builders behind some of the most popular AI creative tools in existence. That is, ElevenLabs, Krea, Viggle, Udio, Ideogram, and Civitai, all together with 15 top artists. These are the folks who are often doing the coolest things with these sorts of tools.
Starting point is 00:00:36 They're kind of pushing the boundaries of what the tools can create. Today, you'll get to hear from many of these AI founders, who together with these artists are advancing what it means to be creative. Art is going to get better than ever. The average art output is going to improve, but so is the ceiling. It also is a higher participation rate. Everyone who's interested in creativity can be creative and express themselves, which is just so cool. That was a niche Acharya, general partner on the consumer team, but that's not all.
Starting point is 00:01:08 I've been a founder twice. I've been spinning records as a DJ for 25 years, and I'm all about AI and art. So what happens when you put all these investors, leading artists, and creative tool founders, all into the same room? I mean, the vibes have been immaculate. And I think that the thing that's been most surprising is how much everyone has, in common. Like the founders are more creative and the creatives and artists are more technical. I think the other thing has just been how interdisciplinary it all is. People making video want to play with generative audio, people making music, want to play with sound effects. It's just
Starting point is 00:01:42 incredible to see. One of the coolest things was a lot of the founders had recognized people by their online screen names or knew, oh my gosh, you use my tool to create this incredible song that went super viral or you use my product to make this kind of amazing video animation that our whole team was talking about for a week. These are people who have been interacting with each other often daily online for the past six, 12, 18 months, sometimes even two years, but didn't even know what each other looked like in person. Now today, you get a behind the scenes look into this event, including the origin stories behind many of these tools, which, by the way, some have never been shared publicly, and how these tools, which have all gone through their own viral moments,
Starting point is 00:02:26 are navigating this AI wave and what they see on the horizon. Let's get started. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any A16C fund. Please note that A16C and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see A16C.com slash disclosures. Here we are in 2024. We're at an exciting inflection where your creativity is being unbounded by the tools available. I mean, we're early, but there's more people making more art
Starting point is 00:03:16 and more people making more tools to make art than ever before. And if you kind of look at the history of technology and art, every single time there's been a new technology. The amount of art has dramatically increased. People were worried that drum machines would compete with drummers, and instead there's more people making more music with both drummers and drum machines than ever before. So I think there's a sort of equivalent moment here in technology and art where we're at the beginning of everybody who has taste and interest in art being able to make it. Many have drawn parallels to prior computing waves. But is this any different? Well, what's different is for the first time we're creating these sort of left-brain things.
Starting point is 00:03:53 You know what I mean? Computers and computing platforms have really been in the business of precision. And now we're creating products that are intentionally imprecise, beautifully imprecise. So it just feels like a whole different flavor for products and product design than we've ever seen before. So let's introduce you to some of the people behind those products. We have companies here covering basically every sort of creative modality, image, video, music, 3D, speech, all those sorts of. of things. That includes... Connor, I'm a co-founder at UDio.
Starting point is 00:04:23 And... Amar, I'm the head of design at 11 Labs. Both companies are focused on audio, with Udio focused on music, while 11 Labs is tackling everything from voice to sound effects. Meanwhile, founders like... Mohamed, I'm the co-founder CEO at Ideogram. Victor and Diego. Who are the co-founders of Korea.
Starting point is 00:04:42 And... I'm working on Viguel. These founders are building at the increasingly sophisticated world of 2D imagery and video. Plus 3D. Ideogram, for example, lets you generate AI imagery with accurate text embedded, a surprisingly difficult technical feat.
Starting point is 00:04:59 Vigel, on the other hand, is building at the intersection of video and 3D. Meanwhile, CREA has come up with a suite of AI tools like upscalers and real-time generation. Or, in the case of Sivit, a new breed of marketplace.
Starting point is 00:05:13 My name is Max Phil Holker. I am CIO at Civitai and co-founder. Yes, I'm Justin Mayer. I'm the CEO and co-founder of Civit as well. And CPO and CTO and lots of things. The joys of a startup. We are a massive community of people making tons and tons of AI creations using community
Starting point is 00:05:31 made models with community made patches to those models called Loras. We give people the ability to either train on a few specific models, so a model focused on anime or a model focused on being semi-realistic, or they can select their own custom model to train on top of. With AI moving so quickly, it's clear that we no longer live in a way. world of Just ChatGBT and Mid Journey. Numerous companies have springboarded into the zeitgeist and grown at unprecedented rates.
Starting point is 00:05:57 So we thought it was fitting to take a step back and document this whirlwind of a journey. And while many of these founders have been quietly working in research for years, their origin story often started from scratching their own edge. Mohamed from ID Graham. I guess part of it is that there is this thesis that everybody has an innate desire to create. And as humans, we have this inner creative child. The education system sometimes kills this creative child, unfortunately. And what's finally possible with technology and AI is to help people express themselves visually and creatively.
Starting point is 00:06:39 So that's the interesting part. When you think of using image for communication, then you can communicate much more effectively. you have image and text together. For Muhammad, it really was this unique combination of text and imagery. For me, image and video is dear to my heart and very personal. But for Connor, it was his connection to music. Music for me, I think, is this very special medium. It's everywhere at all times, like it's in the background when you're at a restaurant or cafe, you're listening to your headphones, when you're going to work in the morning. It really has an emotional resonance with people. And for me, making that abundant, like the kind of promise of generative modeling is that
Starting point is 00:07:19 a lot of this can be far more abundant than it ever was before. And for Victor, it was his discovery that programming itself was the creative gateway. When I discovered about programming, that was great to me because I realized that through coding, you can also be super creative. But the moment where I discover about early Gen AI models like DCGAN and later on, StyleGAN, that's when my mind was blown, and when I realized about the creative potential that this technology had, and that's when I fell into the rabbit hole, and I feel like Korea, to me, it's been kind of the snowball that it started with me realizing that you can use artificial intelligence in a creative way. But for Amar, it was building his own side projects and a desire to share what he was learning
Starting point is 00:08:01 that actually propelled him into his role at 11 laps. It's really funny, actually. I, over the last maybe a couple of years, started diving into AI tools when ChatGBT came out and started making things on the side for fun. One of those was a children's book that ended up accidentally going viral and that kind of was my journey into AI. Through that and making
Starting point is 00:08:23 that book, I started exploring other AI tools and what I really enjoyed was sharing what I was doing and how I made it. So I discovered 11 Labs and made a podcast with 11 Labs where I was like talking to a fictional figure and we were having a back and forth conversation that also
Starting point is 00:08:39 kind of did the numbers on Twitter and then I was like, I love using this tool. I'm going to make my own AI short movie. Actually, Justine and I are friends. And so I showed it to her. And I was like, I kind of need free credits. Because this movie is using up all the credits on Love and Loves. And she's like, you should meet the founder, Maddie. We met. We really hit it off. And Maddie, in classic Maddie fashion, was like very direct. And at the end of the call was like, hey, we're actually looking to hire someone to lead design. Are you interested? And then to work on a product that I'd used for over a year. That experience also gave Amar a taste of just how quickly the space moves,
Starting point is 00:09:13 and also a hit of virality. It happened because a friend of mine had their first kid, and I read her children's book, actually, and I was reading it, and I was like, this story kind of makes no sense. So I went back home. I'd been using a journey a lot, Chad GBT, two weeks old, combine the two to create that book, and then I was like, how do I get this published? And Amazon has this amazing publishing service. You can get a book out within 48 hours. I had a paper back in my hand in 72 hours. So fast. And it's really interesting because writing a book and publishing on Amazon is like, it was almost like iterating on software. If I discovered a type or whatever, I just updated the PDF and the new book was out and a new publishing line was out.
Starting point is 00:09:53 And so, yeah, I put it out there, got a ton of virality from that. And yeah, that was a really interesting experience. Free AI, we were in this era of consumer where it was just really hard to get people's attention, really hard to get them to download a new app or try a new tool. You had to spend a lot of money on customer acquisition. Now, just with the real excitement around AI, if you make a cool product, you can get it into the hands of people and get them using and talking about it. This was the case for Victor and Diego at Crea, who eventually met their own viral moment, although it didn't come easy.
Starting point is 00:10:30 First of all, it was called Janiverse, coming from Generative Universe, best name ever. And essentially, it was like two things. It was on the one side on open source library that it was kind of integrating all the cool stuff that it was available at that moment. And on the other side, it was a creative tool. And the way how it looked, it was like super experimental. Like we didn't really know how to do UI design or any of that. Like the background really had stars and everything.
Starting point is 00:10:56 So it was like a galaxy, like the generative universe, right? And then you could put text, you could put images and you had a few things that you could tweak and you could generate images. you would see the image evolving in real time and the images that you liked, you could keep them and they were added to this kind of universe. And essentially, you ended up with a ton of images in this interactive space. So for us, it was always with the same idea in mind. On the one side, controllability and on the other side, intuitiveness. Like, how do we make tools that doesn't look daunting? Because AI in the end is like a new creative medium. A lot of people are using it for the first
Starting point is 00:11:34 time and we want them to have an experience where the AI does what you expect to do. And you don't need to learn about like crazy prom engineering and like all these tweaks up to get good results. And on the other side is controllability because we are dealing with creatives. We are dealing with folks who are not just okay with having a beautiful image. They want that beautiful image. So these are the two core principles that we had since then. And we build kind of a Figma-ish interface for AI, and we had every single utility that you had at that point with a stable diffusion in there. We had like thousands of AI models that you could use. We had every single technique, like every control net, everything was in there. But you know, like it was not working.
Starting point is 00:12:16 It was like a learning curve that some people were just like not willing to take. So then we had the first kind of virality moment when we ship this thing that it was almost like an equivalent to a meme generator. I remember that we were seeing like all of these images on Twitter. with the spirals, right? And we were like, well, what's going on? With these spirals, like, we can do it. This is like one day of work. And I remember at that point,
Starting point is 00:12:39 Diego was like, we should do something with this. We should do something with this. It's getting so viral. And I was more like in the mood of, like we need to ship like this, whatever feature. We were working at that moment until at one point, we were like, okay, let's fucking do it. And we did it like in the sketchiest way possible,
Starting point is 00:12:53 like in one or two days. And we shipped it in Twitter and it got viral. It was like the first time that we lived something that I had read about in terms of this is what PMF looks like as the first time I was like oh Jesus Christ okay this is how it looks okay
Starting point is 00:13:11 I see you go to sleep and I can feel the heart beating and then like you sleep three hours you wake up because you know that there's stuff broken the email starts to get flooded Twitter starts going there suddenly like literally
Starting point is 00:13:25 every day was like crazier than they went before I was like oh my god a thousand people oh my god 10,000 people and there's like oh my god like football club Barcelona like number one soccer club just used us what? Why is it like how many followers
Starting point is 00:13:40 like oh my god 100 plus million followers on Instagram okay I feel like it was actually hard in the sense of as a founder you're like I've put multi amounts of years
Starting point is 00:13:52 into like many things and then the thing that we literally are like it's not important gives you all the success so it's a moment of reflection You're like, sometimes like the world throws truth at you. But those years of work were not all for nothing. And I don't think like the years that we've been working on was like a waste.
Starting point is 00:14:14 No, it actually is, oh, that's what you learn on the technical level, how it works. I mean, because it was so much failure, we learn about, okay, how do you communicate with your co-founder? I think that's something important to note about those times is that we were very, very aware that this was a trend. and that this was not the end product that we were building. It was almost like a marketing engine that we were using to get better branding and to get known. They were finding us because of one reason, but they were staying because of another one,
Starting point is 00:14:45 which was like this other product that we were working on. Even we knew that, I think that the core learning that we got from this experience is that the AI field changes constantly. Like every month or every two months, there are new breakthroughs, new techniques, new ways of doing things. And the tool that we were building, it was like already starting to get too complex because we were trying to put everything in a single tool.
Starting point is 00:15:12 And I think that what we learned with the experience of the spiral virality is that there's a lot of value on simplifying super niche and simple use cases. And that was the case again when LCMs were released, right? we saw this technology, and at that moment, we used all the experience that we got from the first virality to engine the second one. And this second one, we knew that it was not a trend. It was something extremely value. We were like finally being able to get that interaction that we were looking for for almost years, right? Like, we can generate images in real time and have full control of the colors, the composition, the shapes, everything. That was almost like a dream come through.
Starting point is 00:15:54 Victor and Diego have now hit virality several times over, but can you engineer that momentum? In some cases, it's all about having a single critical feature not offered elsewhere. Mohamed from Idughram. So basically it was the version 0.1, as we called it, and this is back then in September of 2023. And it was a model that was working. It wasn't perfect. Well, it felt like it's already good enough to give it to users, and it was the first model that could put legible text into images.
Starting point is 00:16:31 So it kind of went viral because of the unique capability of the model. Somehow the ability to put text into images felt needed. But in other cases, it's about cleverly enabling the masses, or in this case, the memesters, by drastically reducing the barrier to participate. Here's Hang with Fagal Story. It went pretty viral, right? What was that like experiencing to put a product in the hands of so many users
Starting point is 00:16:59 and also see that kind of spread on its own? Yeah, it was we didn't anticipate that for sure. In the very beginning, we were thinking most targeting content creators. But somehow the meme makers, the mimsters, they catch up on it. And that's how it got pretty viral. And also that's also thanks to some of the templates. We spent so much time discussing, like, why this is the case.
Starting point is 00:17:22 There was this template, the Joker Lil Yali coming on the stage, and there's a Joker character that replaced on the video. And we've seen millions of different characters just remixing the same moment. And we realized that the main reason was used to use. It's so easy to basically, you can upload one image, and then one click, choose that template,
Starting point is 00:17:44 and then in just a matter of seconds, you'll have yourself in basically in that same moment. Maybe one other aspect of the virality is, as you said, the meme makers got a hold of it. There's this kind of fun, maybe even silly aspect to it. How have you thought about that? Well, I think that speaks to the entertainment value. And for anything to have real entertaining value, it has to work. It has to work well.
Starting point is 00:18:08 And that actually requires a lot of rigorous in the research side. So we are pretty serious about being silly. And it takes quite a bit of rigorous research to do that. And the second thing is you have to have a tool that provides precise control. And then because people are getting what they want, they can have all kind of variety of fun with it. You've mentioned characters and templates a few times. What are some of your favorite examples of those generated on the platform? One is the Docker Commando State template.
Starting point is 00:18:40 That one is basically the moment we realize, well, actually, people want to remix, and there is this variety and memes aspect of it. And the second one is there has been one rockerton advertising song, and people are dancing with this. And we're also seeing millions of people remixing that same template. And it's interesting for us because you make us realize that as long as there is this fun elements to it, people actually don't mind this content having a little bit brand message. When you think about applications, and I know it's early days, but have there been any that have surprised you about the ways that Vigel has been applied?
Starting point is 00:19:18 Every time Founder creates a product, they have applications that they envision, and then the best products are often, people are using them in alternate ways that surprised them. Yeah, that was exactly the case for us. In the very beginning, we were mainly thinking of movie makers, game makers, using this might be like quick animation, pre-visualization tool for them. It's actually pretty useful for that, and we've also seen the early users adopting to that. But then we never anticipated the memes. So since that, we'll be also providing those templates.
Starting point is 00:19:50 So we've been keeping track of the latest trendy dance moves, sports events, etc. And we've also seen content creators hopping onto this. They are actually reaching out to us, say, can you feature our dance, our song on your platform? And then can we collaborate on promoting some of those? That's been really interesting. We asked Connor the same question around what he's learning by seeing how the masses are using UDO. The model we originally launched was a model which generated 32 second clips. And so to make a kind of a full track, you would extend that in various directions,
Starting point is 00:20:24 you would add an intro, add maybe a chorus and an outro and stuff, and you would build a song like this and you would start with these junks. And I suppose we've actually come to realize quite quickly that people's experience with music when they ask for say a song is actually a lot more focus than that. So they kind of want a song that begins at the beginning, maybe ends with at the end, it doesn't have to be long, it could be like a short two-minute clip, but it has a verse and a chorus and averse. And there's a structure to it.
Starting point is 00:20:48 And so I suppose we actually underestimate it, just how kind of important that was. And so that's something we were making steps towards rectifying recently. 11 Labs was also no stranger to the surprising and inspiring user behavior. Yeah, I think one of the most surprising ones was people who had lost their voices and then had used 11 Labs to one, bring their voices back to life and then do the thing they love doing. So we had Lori Cohen, who was a lawyer, she lost her voice one morning. and a friend of hers helped her replicate her voice with 11 Labs,
Starting point is 00:21:19 and then she was back in the courtroom delivering arguments. And that, to me, is just such an incredible moment because you don't expect that. And I think our idea was like, hey, we're gonna give ideas a voice with our product and our tools, but this gave someone their own voice back, and I think that was such an amazing thing to see. And we saw that again with a climate activist, Bill Weill,
Starting point is 00:21:39 who was delivering his award speech. He suffered from ALS, unfortunately, but again, was able to replicate his voice, and then deliver that award speed. So I think those kinds of things are just like, you're like, wow, technology is being used in a way we didn't see it, and now we want to lean into that and, of course, help others. Yeah.
Starting point is 00:21:54 Maybe in the opposite sense, have there been any applications that you've actually built or designed for where you're like, everyone's going to use it for this, obviously, where that's actually not been the case? It's interesting. When we launched dubbing and automated dubbing, we thought, yeah, this is it. Like, everyone's just going to use automated dubbing. It's going to be great. And, of course, with dubbing, one of the most important things is accuracy, right?
Starting point is 00:22:14 and so automated dubbing we realized people still want a lot of creative control on that and so we ended up having to build dubbing studio which allowed people to go really fine-tune that dub and change a lot of the content and then we also introduced 11 studios which was basically creative teams that help you dub your content with professionals were really good at that and so we realized that actually was what people needed more of and not just automate everything and all the things right and then it actually picked up again and this is something even when I was working at Palantir, you learn, which is like the temptation to try to automate everything
Starting point is 00:22:48 or to use intelligence for everything, but actually there's so much value in like having someone in the middle and like still having that human touch to take it to that final step with something we learned with dubbing. And as these companies get all this new data, it's not always easy to figure out who they should be catering to. So how do you think about what you build and for who?
Starting point is 00:23:09 Your Tam is everyone in theory. I think what we acknowledge is that We probably have different types of users, like distinctly different types of users. At the very top being, does someone in a studio who's making an album, like, at the very top level. And then at the other end of the scale is maybe someone on their phone who wants, you know, in a minute, they want just a funny song to send to their friend. And those are two very different experiences. And kind of somewhat similar to the kind of output you can get from just an instrument in general.
Starting point is 00:23:40 Like someone can have a guitar at home that they play, to have fun from time to time. It's like a totally personal thing. It's not anything necessarily serious. It's just a way to express yourself with kind of musically. And the same way someone can take that same guitar and a professional can take it into a studio and make it part of something fantastic.
Starting point is 00:23:57 We like the technology to basically enable all ends or all parts of that spectrum. Several are unsurprisingly using their flywheel of new users to inform their decisions. Yeah, we kind of use our user base and the prompts that they enter into the system. them to decide how to evaluate the quality of the model and what to prioritize. What's interesting is our users used ideogram to tell us what they want.
Starting point is 00:24:25 So they were like, we want image upload, we want comment, we want more servers. So I guess the good news is we already have this flywheel of users coming and using it. Some are paid, some are free. And that sets a vision for us. Hung from Vigel has actually used these new learnings to expand who they're building for. How are you thinking about who you now build for, right? Are you pivoting or adjusting to incorporate these new use cases? So we are broadened our target audience in this sense.
Starting point is 00:24:57 So we are seeing this as eventually we're going towards this section of a new type of AI power content platform. And the content platform is really important to have all these creators. And those are still the content creators, the artists, the movie makers, the game makers, the game designers. They are the sources for all those new latest ideas, all this new templates. And then we're broadened this into content consumers. Basically, Vigo is a new way to consume content. Before AI, it was mainly like, if I liked the moment, I will share it, I would like it. But there's a deeper engagement you can have with that moment.
Starting point is 00:25:36 I can basically, I love this moment so much that I want to put my own avatar in it. It's almost like in a parallel universe. I want to see how it looks, really, if that moment myself. So this is a new kind of content consumption. And that's actually one of the most important aspect. The virality actually comes from all those creative ideas. So for us, it's all about empowering those creative community first, making sure they have what they want. They have the best two.
Starting point is 00:26:04 They have early access to newer features. They have almost private channels. They have almost unlimited access. The team at Crea, on the other hand, is more focused than ever on experimentation. And their signal for success... When your users are better at using your tool than yourself. How I think about it is that every tool that we launch follows a similar process. And I think that it all starts with a hypothesis.
Starting point is 00:26:30 And I think that this initial hypothesis needs to come from the founder and needs to come from your own intuition. But we are wrong a lot of times and the way how we validate these ideas and when we are wrong is through listening to the community, seeing what they do with the tools. And I think that a good rule of thumb or something that I found
Starting point is 00:26:48 that is like a good North Star to realize when something is good or not, is when your users are better at using your tool than yourself. And that has been key to me, because with the real-time tool, I was seeing things where I was like, how the fuck did they create that?
Starting point is 00:27:02 And same thing with the video tool. Like with the video tool, I was trying to do a demo, like trying to showcase, like cool stuff, and I was trying things, and I was not getting there. And I was looking at Twitter at all the things that our users were creating with our product. And I couldn't get to that quality. I couldn't get to those results.
Starting point is 00:27:18 So I think that every time that your users are using your product better than you are, that's a good sign. Meanwhile, Justin and Max at Civitai are charting new ground, but also figuring out new limits. Stable Diffusion allowed you to make anything. And so when we launched, I wanted to make sure that we could continue to support that community. It was so diverse. And there is a running meme of things you can make with Stable Diffusion.
Starting point is 00:27:43 And in the front is like somebody making funny memes. And then there's a train coming that's porn, right? Sure, people know that you can make all of this stuff. I mean, that's the point of this tech. Make anything, right? And so it was important for us to say, hey, we want to be able to support this tech as it develops. It means that we need to embrace all of it. And that's not easy.
Starting point is 00:28:01 It's been incredibly difficult to set up policies that allow the creation of all things in a way that's not going to hurt people, and to also do it in a way that gives people the level of control they need to prevent the creation of content that shouldn't be there. In the beginning, our policies were very straightforward. They were kind of just like, look, as long as it's not illegal and as long as it's not just ethically, completely debased, then we'll let it on the platform. And we were okay, when we had a small enough user group, with kind of leaving it even that vague. We found over time that we've had to really specify, because it turns out that there are just subsections of the internet that are into just the absolute strangest things you've never heard of at all. Which can be really funny. Which can be really cool. And some of that is really interesting.
Starting point is 00:28:49 And some of it is just, oh my gosh. And it's like a balancing act of figuring out, okay, what are... you almost have to grow as a person. And we created a council of moderators on our platform to really get together, look at these new things when they pop up, and be like, how do we feel about this? One of the things that really blew my mind when we were getting into the whole moderation aspect was like, oh, we'll just do what other platforms do. We'll do what Imgur does. We'll do what Reddit does. We'll just copy what they're doing. And as we dug into what they do, it turns out they don't define any of this. None of this is defined. We had to come up with terms for how do you define
Starting point is 00:29:22 what is photorealistic. How do you define what is and isn't all these terms that before really didn't have any set definition? Perhaps it shouldn't be surprising that there are new moderation challenges, since this industry is so fresh, with new ideas coming from a new breed of creatives. In fact, we heard about this range in both prosumers and professionals from most of the founders we spoke with. Here's Victor from Krea.
Starting point is 00:29:48 The range of creatives is quite wide. The kind of people that use Krea can come from having 20 years of working in the creative industry, being, I don't know, 3D artists or people doing graphic design or even photographers, these kinds of people. But we also find a lot of folks who don't have a professional creative background. For the professional ones, you can find them doing a lot of prototyping. Like, for example, when they start working on a new project, they may go to Krea to really quickly brainstorm some ideas that they have
Starting point is 00:30:22 and they would use the real-time tool that we have for that. And they can do a very simple sketch, add a text prompt, and have something that looks super realistic and that can either give them ideas or maybe even serve as a final deliverable, depending on what they're doing. And when we're talking about less professional creatives, it's honestly more about having fun. And they are using Krea for everything that you can imagine, from imagining new worlds to creating paintings to creating characters or all sorts of things. And as more participate, these new platforms generate new talent, but also new expectations, like expectations in speed.
Starting point is 00:31:02 And in the meantime, what we're doing is building community and bringing the community what they want now. Just focusing on what we can do now with the technology that is out there. We are very deep into AI communities. And every time there's something that we think is valuable from a creative point of view, we go ahead and we execute it very, very fast. So the way we work is almost like a video game company, where instead of video games we are building tools, and every six months or so there's a new tool
Starting point is 00:31:32 because the space just happens to evolve in a way that every six months there's a new technology that you can use in order to make a new tool. And that's going to keep being like that until we get to these real-time multimodal systems that allow us to do something way more interesting. This new wave has also shifted people's willingness to pay. Back to Anish.
Starting point is 00:31:52 I think the willingness to pay and the amount that consumers are willing to pay is really high. And that's really interesting because for so long we've had these sort of patronage models for how to fund the arts. And there's been this belief that there's a sort of decreasing interest in paying for art. And instead we're seeing the exact opposite. People want to pay for art and pay for tools to make art and pay a lot. So that's a really, really exciting development to me. And this willingness to pay is also unlocking new business models. People make so many things because it's a tool for creating anything.
Starting point is 00:32:24 And to see the things that people can create, whether that's assets for a game or videos of flowers that are dancing, it's just endless. The possibilities are endless. And it's inspiring to see how people are pulling it to do new things. That was Justin from Civitai, which is also working on a new way to reward AI artists for their contributions. When we were getting this going, and really realizing this could be a business, we interacted with a lot of the people who are doing this creation. And it's a lot of time and it's a lot of money and it's a lot of technical skill that goes into making these things well. And people were doing it, thousands of people were doing it, just for the love of the game. They just really enjoyed the clout and the entertainment factor they got from it.
Starting point is 00:33:01 Get their position on the leaderboard. The leaderboard. Oh my God. The leaderboard. And it became pretty clear that, look, this is almost like a whole new creator economy that can come out of this. Because it's a group of people who are putting effort and love into something that could very easily become livelihoods for them if they had even the smallest way to monetize it based on the number of eyes they're getting and uses they're getting. So, yeah, a very clear goal from the very beginning was, like, let's figure out how we can keep the creators monetized while maintaining the open source ethos. We actually just announced something that we're hoping to roll out over the next six weeks.
Starting point is 00:33:30 Let me give you a little bit of history. So we launched a creators program four months ago, and we opened it to a small cohort of essentially 50 creators. We opened applications, took essentially people that met certain criteria, and have been experimenting with ways that we can help them monetize their work. What we've landed on for this next generation, which we're hoping to open up over these next six weeks, is making it so that people can earn from the generations that people are doing on our site. So if they make a resource that's intended to produce a new character, like a consistent character that they've made, and somebody chooses to use that in the generator, then they're going to get their share of 25% of what we charge for that generation. So
Starting point is 00:34:07 the aim is to make it so that these people have a way to get essentially paid for allowing the convenience of using their resource on our site. One of the main things that we saw right away, before we had the time to implement any real monetization stuff for creators, was when we put in a DM system, simply because we knew that there were a lot of people contacting creators for work outside of the platform. And because of that, I mean, we just get an untold number of people contacting us being like, thank you so much for this platform. Because of it, I was able to get hooked up with Hugo Boss or Hyundai or some of these other people who are suddenly using this technology, and it's completely changed my life. Before, I was making $30,000 a year as a waiter or whatever,
Starting point is 00:34:40 and now I'm making six figures doing this whole new thing that's a passion for me. And I have lost count of the number of people who've contacted me about that. So it's really cool to see. So from our services side, we want to enable that and make it even easier for people to be able to sell their services, their expertise, directly to businesses from the system. Civitai is not alone here. ElevenLabs is also building a marketplace for voices. I know you guys are building kind of a marketplace of sorts, so people can upload voices
Starting point is 00:35:07 or they can use voices that others have uploaded. Yeah, I think it's a really exciting way to give folks a way to earn passive income as well. Maybe you were a voice actor and you weren't getting the gigs you wanted, but now you can put your voice out there and you might become extremely popular. And we've seen people earn quite well on our platform. And so the library is just a great way to, one, put your content out there. And we want to partner with more voice actors, honestly, to have more expressive voices and then give people great voices to create content with. So two-way street. But it's not just the marketplace.
Starting point is 00:35:41 It's also the interface. I think we've always had the dream of voice interactions with all our products. If you think about Star Trek, and Knight Rider talking to his car, KITT, it's something that's been a part of pop culture history forever. But I don't think we've had the quality and the sound for it to feel as natural as it should have been. And so I think we're getting to that point where the interactions with large language models using voice interfaces are becoming incredibly natural and feel like talking to a person. And so I do totally see a future where a lot of this physical interface that you're tapping around with is going to just fade away, and you're going to be able to ask the questions you want to ask and have the conversations you want to have. I know Her is the hot topic movie of the AI space, but I think there was one thing in that movie that stuck with me more than just the interactions he was having with her, which was this scene in the movie where everyone's kind of talking
Starting point is 00:36:39 to something in their ear. And I think that is a very prescient take that they had, and I think we're going to see more of that. It's just going to be natural conversations we'll be having with this AI or any interface. Yeah, it reminds me of my husband's grandmother. She said that the first time she ever heard someone talking on a phone in the grocery store, she thought they were talking to themselves. Because you're just not used to all of these new interactions, right? Or the people who go to prison and come out 10 years later, and they're like, why is everyone looking down? And they don't realize that we have these crazy computers in our pockets. Totally.
Starting point is 00:37:12 The thing that I love about AI in particular and all these AI creative tools is the magic is you had an idea and now you can imagine it, right? You can imagine the image that you wanted and that was in your head and the dream you had. And now we're saying, you can imagine the sound that you're probably hearing in your head that no one else can hear yet. But it's not just a new UI. Perhaps it's a new approach to modeling the world itself.
Starting point is 00:37:34 Hang from Viggle. One thing I really look forward to is, like I said, the next generation of the model. So we are really hoping to extend this character model to more of the rest of the world, like objects and scenes. And so I think there are two general paths towards modeling the real world. One is the pixel-level approach we've seen. So diffusion models are really good at that. But it has this drawback that it's really hard to manipulate pixels. And the real world is essentially physical, so pixels are not a really efficient representation for it.
Starting point is 00:38:11 But it has the advantage that you can train with any video, and it generalizes to anything. And the hope there is that if we scale it up to a certain extent, controllability will kind of emerge. But we're taking a kind of different path, in that we want to nail down
Starting point is 00:38:27 controllability first, making sure it's just as precise, as controllable as a graphics engine, and then we scale up from there. So I think it's interesting how those two paths evolve, and how actually they can be combined into one immersive experience. As we close out this episode, it's hard to overstate
Starting point is 00:38:46 just how much these tools are shifting what it means to be creative, both to existing artists and to those who never would have called themselves artists before. Conor from Udio. The threshold for someone going into a studio and recording something like that was way too high. Whereas now the promise of the technology is that it brings an order of magnitude or two orders of magnitude more people into the creative kind of experience, right? Like, people can express themselves in this way. But kind of even more concretely, as moments happen in the world, different cultural moments, you can attach music to them now
Starting point is 00:39:19 because it can be dynamically attached to these things in interesting ways. And this is super compelling. This is a kind of a market that didn't really exist before just because it wasn't actually possible to explore this way. I think as well as that, we've been fascinated with how at the top level, say with existing artists or existing producers, how this can basically work as an ideation machine, like a kind of well of infinite creativity that you can just pull from for ideas. Maybe you have the beginning of a track, you have a riff, you have a beat, you want to see where could this go from here? If I remix this a bit, what are variations on this? And that's a super compelling thing to do as well, again, because it's something that before
Starting point is 00:39:57 it took a lot of time. And so it just accelerates the creative experience for professionals like that as well. I have yet to meet an artist who's actually used the products that is worried about the products competing with them. The biggest worry that I hear over and over is that somebody is going to take them away. Diego from Krea with a great reminder of just how monumental this shift is. I was a creative myself, doing graphic design, photography. I even tried to make video games in Flash,
Starting point is 00:40:24 motion graphics in After Effects, digital sculpture in ZBrush, 3D modeling for architecture visualization. And I myself felt the fear of, wait, what's the point if this thing can do everything, right? But I don't think that's the case. What I think is happening is, yeah, we're just giving so much power
Starting point is 00:40:46 to creatives that things that were like a job, in a way, now you don't even think about. That's what technology does, right? Like, one day it was a lifetime's work to move from the East Coast of the US to the West Coast, and people died
Starting point is 00:41:05 in the process, to now you're like, it took me like 20 minutes in line to get to the airport. And you don't even think about the fact that you flew, like a Greek god, over the plains. Instead, you're just thinking at a higher level. You're just, I don't know, flying between calls to make, like, bigger things, right? So I feel like the same is going to happen. Suddenly, like, coloring 3D models, texturing things, and all these repetitive things, like sketching and whatever,
Starting point is 00:41:32 like, you will save so much time of your life by not having to do that. You can focus on having even better and crazier ideas. So I'm really, really, really excited to see what the creatives are going to be able to do.
Starting point is 00:41:49 All right, that's all for now. The demos shared during the day were followed by a gallery party at night, showcasing many of the artists' work with the broader New York City creative community. So if you want to get up close and personal with these tools, head on over to a16z.com slash AI art
Starting point is 00:42:06 to check out their demos and more. We'll leave you with a little sneak peek. Ladies and gentlemen, I am thrilled to be here at the A16Z artist retreat. Yeah, it gets you pumped. So pumped. So pumped. You can generate whatever you want.
Starting point is 00:42:24 That was amazing. Yes. So it is a whole body swap. Wow, this is so good. We're also working on a new type of memes. Actually, I think that it's better if we see it in slow motion. Laura, you should come see this. This changes from being deterministic to being totally random.
Starting point is 00:42:42 Remix. If you liked this episode, if you made it this far, help us grow the show. Share it with a friend, or if you're feeling really ambitious, you can leave us a review at ratethispodcast.com slash a16z. You know, candidly, producing a podcast can sometimes feel like you're just talking into a void. And so if you did like this episode, if you liked any of our episodes, please let us know. I'll see you next time.
