Sharp Tech with Ben Thompson - The New York Times Sues OpenAI, Napster History vs. ChatGPT Future, Copyright Enforcement at the Aggregator Level

Starting point is 00:00:04 Hello and welcome back to another episode of Sharp Tech. I'm Andrew Sharp and on the other line, Ben Thompson. Ben, how you doing? Buzzing, Andrew. I'm buzzing. Oh, boy. It's great to be back, number one. But unfortunately, I don't think that's why you're buzzing. Congrats on the Green Bay Packers making the NFL playoffs. Do you believe in miracles?

Starting point is 00:00:30 They salvaged the season out of nowhere here. Do you have any thoughts for the people? Well, I mean, you know, it's a hard road to have the best quarterback in the conference for, you know, 30 years, looking forward to adding another decade. But someone has to bear the cross. You know, I'm willing to do it. You know, there's, of course the best experience as a fan is winning the championship, right? You know, good event, being able to experience that with the Packers, experienced it with the Bucks a couple years ago. But the second best and the more sustainable and sort of long running bit, I mean, the problem with the championship is the moment is awesome,

Starting point is 00:01:03 but everything up to that is just massive stress. And then if your team is good enough to win the championship, the actual seasons aren't that fun because the expectations are so high. So a win is what should happen. Every loss is just sort of a big problem. But when you're in that brief window where your team is better than you expect and you have players that are turning into stars before your eyes, it's truly the best.

Starting point is 00:01:29 And that's where the Packers are right now. And we had to delay the start of our podcast because I was high as a kite coming off that game. I might crash halfway through this episode, but we'll do what we can. Oh, man. It's a house money season. That's the way to explain it. And yes, it's terrific. I'm happy for you. And I can confirm for everyone out there, Ben is beaming right now. He's bright-eyed and bushy-tailed, ready to rock. I also wholeheartedly endorse your take on No Tech, Ben. I'm going to read it here. You wrote, During that Packers' fourth quarter, it is very important that the NFL never put a chip inside the football because the total imprecision of spotting is actually hilarious and tremendously entertaining.

Starting point is 00:02:16 I don't know whether you have anything to add there, but I think it's downstream from the touchgrass movement, the opposition to replay review, and I appreciated it. Oh, I think it was actually during the first half, not to correct you. But the reason it was timely was at the same time, you were, you know, hilariously watching an NBA game instead of, you know, being a good American. It's great. Which literally had the clock did not move in the time an entire Packers possession took place at the end of the half. It was like a lot of stuff happened. And it's so bad as entertainment product. I actually do think it's a really interesting, you know, we talk about sports and TV.

Starting point is 00:02:57 And at the end of the day, you're selling an entertainment product. And there is a bit where, you know, the pursuit of perfection, right? In this case, ruling out the human component, whether that be replay review in the NBA, which is out of control, whether it be the ball and strike box in baseball, it like, number one, you're never going to fully get it right, particularly in the NBA. But number two, like the human component, that's part of sports. That's what makes it compelling. And in that case, when I wrote that tweet, the Packers had just gotten screwed.

Starting point is 00:03:27 There's no way the Bears, it was a fourth down play. There's no way they actually got the first down. It was a terrible spot. But that is part of what makes sports compelling, because of the human component. No one wants to watch two robot teams go against each other. So why do we want robot refs? And the reason why I think this is super interesting is we've talked about like analytics in basketball, for example. And it's very frustrating.

Starting point is 00:03:52 And the reason why basketball is actually so great is it ends up so many teams that are analytically perfect end up losing because there is such a human. It's really satisfying in the end. Right. And how a team forms a hierarchy and how they organize themselves and how they push through difficult times. And I am a firm believer that clutch is a thing that actually matters. The playoffs are different.

Starting point is 00:04:13 Yes, it's a small sample size, but that doesn't mean you dismiss it, right? Everyone dismisses stuff because it's small sample size. And you see this in tech. You see this in business where something that can be measured becomes elevated in importance just because it can be measured. But just because it can be measured doesn't mean it's the most important. And this is a broad principle. I think replay is a very good manifestation of it.

Starting point is 00:04:37 I do think replay works in the NFL, by the way, just because the nature of the game involves lots of stop, but it is just a replay, everyone, whatever? No, this specific aspect of the NFL entertainment and experience should never be ruined by a microchip in the football. But, yeah, replay has worked well in the NFL context. replay in basketball is a disaster. It's a disaster. And people always get upset when I tweet this because they understandably, they're like,

Starting point is 00:05:02 well, if you can get it right, why won't you? But number one, there's lots of stuff that you can never get fully right. There's lots of calls that are judgment calls. It's very easy to point to the ones that are sort of super clear. But number two, you're sitting down to watch a compelling game. It's an entertainment product. It takes 12, it took 12 minutes for the last minute of game time to happen in that game you were watching. Our friend Kevin O'Connor is watching it.

Starting point is 00:05:26 he timed it. I am completely immune to it at this point. So I didn't even notice. I was just like, man, what a fourth quarter for Anfernee Simons. This is incredible. And I didn't even really know to complain. But yes, any stranger you sit down and they're like, this is a tie game with 10 seconds left. And it's the most tedious exercise in the world watching these two teams actually get to overtime. So that's not optimal. But what I loved about your NFL tweet is it is an entertainment product. And you identified an aspect of the fandom experience that I really enjoy. I really enjoy watching two 55-year-old guys try to figure out where to spot the football.

Starting point is 00:06:09 It was a quarterback sneak. A billion-dollar business. A billion-dollar business. Right. Yeah. Everybody's just sort of guessing and eyeballing it. And honestly, given the setup, they get it right more often than you would think.

Starting point is 00:06:22 So credit to the officials. Same thing with baseball umpires, to be honest. The degree to which they get balls and strikes correct is amazing. And you know what's way more fun, when you're sure that the ump missed the call, but you also know, you're like, when you're at a game, right? And they miss a call. You're like, oh, come on. That was a terrible call.

Starting point is 00:06:41 But you're not totally sure because you're eyeballing it. On TV you have that stupid box there. And it's like it's not even fun. You're like you're just mad because clearly he missed it. Whereas if it's on the edge, you can like, I don't know, there's, there's just a humanity to it. And I think there's a humanity to the NFL, beyond all the advantages the NFL has relative to the other sports, that makes it so much better. But, I mean, we could have it worse. As I noted in a follow-up tweet, in my estimation, there's no sport that's been ruined more, you know, here's our international leaf, than soccer.

Starting point is 00:07:10 I mean, it's like, like, it's a game that is 90 minutes long. It sometimes ends in zero-zero ties. Like, nothing much happens. I mean, I guess stuff's happening if you're in it. But then a goal happens. It's like the most euphoric experience in sport. But now you have to always wait and see if VAR is going to turn it over. It's terrible.

Starting point is 00:07:29 It's, it's, it's, it's, this pursuit. It's just funny about making the most important thing, the most important thing. The most important thing is that it's entertaining. And in so many of these sports, this pursuit of getting it right misses the point. And anyhow, that's my, you know. There you go. When I'm in a good mood, I will rant. So you got a good.

Starting point is 00:07:48 No microchips in football and no VAR. Please, no replay in basketball. Yeah. The EU regulators have ruined soccer. As for the intersection of business and technology, it is time to sail the sea of takes once again. I'll read this note we got from Matthew.

Starting point is 00:08:09 He says, Ben and Andrew, happy New Year. I truly hope you both enjoyed the holidays and a well-deserved break. At the same time, I am craving the first Sharp Tech episode to hear your thoughts on the NYT OpenAI Microsoft lawsuit. Now, Matthew was one of several listeners who expressed that sentiment and reached out basically demanding coverage of this issue.

Starting point is 00:08:35 Are you ready to dive in, Ben? Are you prepared? I'm fairly prepared. I need to dig deeper, so I reserve the right to change some of my takes. But I think we have enough here to go with it, particularly given we have a lawyer on the podcast. So we should be set. We'll see where the road takes us here. I'll provide some context at the top. So for anyone who's unaware, on December 27th, while the entire world was sleepwalking through the week between Christmas and New Year's, the New York Times sued Microsoft and OpenAI for copyright infringement.

Starting point is 00:09:12 I'll read the news from the New York Times itself. They write, The Times is the first major American media organization to sue the companies, the creators of ChatGPT and other popular AI platforms, over copyright issues associated with its written works. The lawsuit, filed in federal district court in Manhattan, contends that millions of articles published by the Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information. The suit does not include an exact monetary demand, but it says the defendants should be held responsible for, quote, billions of dollars in statutory and actual damages related to the,

Starting point is 00:09:52 quote, unlawful copying and use of the Times' uniquely valuable works, end quote. It also calls for the companies to destroy any chatbot models and training data that use copyrighted material from the Times. And then, in terms of the actual claims, I'll read this summary from the Washington Post. They write, there are two main prongs to the New York Times case against OpenAI and Microsoft. First, like other recent AI copyright lawsuits, the Times argues that its rights were infringed when its articles were scraped or digitally scanned and copied for inclusion in the giant datasets that GPT-4 and other AI models were trained on.

Starting point is 00:10:32 That's sometimes called the input side. Second, the Times lawsuit cites examples in which OpenAI's GPT-4 language model, versions of which power both ChatGPT and Bing, appeared to cough up either detailed summaries of paywalled articles like the company's Wirecutter product reviews or entire sections of specific Times articles. In other words, the Times alleges the tools violated its copyright with their output, too. So, Ben, where would you like to start?

Starting point is 00:11:05 What's most interesting to you about this lawsuit? So I think probably the most, I think the distinction that the Post highlights probably is the more important one. And one kind of precedes two. If it's illegal to scrape, then obviously the output is also illegal. So I think that's probably the more important one. And my, I mean, again, just sort of as a layperson's perspective, it's hard to see how this would be illegal. So, you know, so, you know, there are two, I think, cases that are interesting in this regard,

Starting point is 00:11:43 of sort of tech history. Number one is Napster, obviously, where Napster was held liable, and number two is Google Books, where Google was not held liable for scanning all these books and putting them into their repository. And both of them came down to being a question of fair use. And fair use, I mean, did you, I mentioned before I had my intellectual property class. That was one of my favorites at business school. But I imagine you got into it even more in law school. Yes, I'm familiar. I can read the factors for everybody here. Fair use. I will read the factors. The court considers, one, the purpose and character of the use, including whether such use is of a commercial nature or is for non-profit educational purposes.

Starting point is 00:12:28 Two, the nature of the copyrighted work. Three, the amount and substantiality of the portion used in relation to the copyrighted work as a whole. And four, the effect of the use upon the potential market for or value of the copyrighted work. So that's a balancing test and courts can sort of weight those factors as they evaluate whether the fair use exception to copyright applies. Right. And it's super fuzzy, right? And I think the fair use was a big, you know, this came up in the Oracle, the Google v. Oracle case about sort of APIs. But one thing that is interesting about copyright is that unlike antitrust, copyright is explicitly in the Constitution. So it says, you know, to promote the, I'm going to quote the Constitution, Article 1, Section 8, Clause 8,

Starting point is 00:13:14 to promote the progress of science and useful arts by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries. Sorry, Congress shall have the power to do that sort of thing. And so it's interesting that that was sort of reserved up front, this idea that there ought to be the capability to protect this because there is a desire and a need to incentivize sort of innovation. But the innovation point, I think, is a really interesting one because it runs on both sides. This is why it ended up being a balancing test, like coming down from the Supreme Court, and it's still kind of fuzzy in every case.

Starting point is 00:13:50 On one side, this sort of innovation and creation is on the side of the New York Times actually creating these articles. On the other side, you can't dispute that like ChatGPT or these models are innovative. They're obviously tremendously innovative. That's precisely why they're so threatening to the Times and to all these sort of other folks. So I just think that, so in the context of this being important, but the reason why it was important

Starting point is 00:14:15 doesn't actually clearly come down on either side. Now, when you get to these fair use factors, I mean, the purpose and character of the use, obviously it's of a commercial nature. So that one is on the New York Times side. Like you pay for ChatGPT. It's a for, you know, I was going to say it's a for-profit company.

Starting point is 00:14:29 It's a complicated company with a for, for profit sort of aspect. Number two, the copyrighted work, you know, commercial stuff is less protected than like novel creations, right? In this case, I would say that probably tilts to the New York Times. It is relatively novel. It's not like ad copy or whatever, which is much sort of less protected.

Starting point is 00:14:51 Number three, the amount and substantiality of the portion used in relation to the copyrighted work. To me, this is the one where these models win in a landslide. This is sort of the transformative test. And to my mind, it's clearly transformative. You're going in, you're absorbing information from the entire internet, and you're spitting out something new. If that is a crime, then Stratechery is a crime. Like, that's what I do. I go around and I absorb lots of information and I output something new. Now, I would argue that, you know, Stratechery is further protected because I think the ideas I'm coming up with are innovative and

Starting point is 00:15:27 sort of new to the world. And are LLMs doing that? Or are they ultimately constrained by sort of like what's in the, I mean, whatever. This is like a philosophical debate about do you believe that humans are just a biological process at the end of the day, at which point the computers will surpass it, or do you think that there is some sort of like something unique or like the, you know, the divine spark or what have you. But even if you think it's the former one, I think most people by and large accept that LLMs as they exist today, as in the one that is being sued by the New York Times, are not yet at human levels of creativity, you know.

Starting point is 00:16:00 Right. So that leaves number four, which is I think probably the interesting one, which is the effect of the use upon the potential market, and the Times saying, look, this is going to screw us. Like, this is, you know, replacing our business. And I'm, I would think I fall on the sort of large language model side there. In this specific case, even if broad picture you want to back up and say that, well, yeah, of course, this is a threat. You know, the entire internet is a threat, right? But you have to draw the line sort of somewhere. So that's sort of my initial read of the factors. I would fall on this particular part. I would fall on the, as far as the scraping goes.

Starting point is 00:16:39 I think it's fair use. I do think there's a broader debate to be had about, you know, is there a step change when a computer is scraping everything as opposed to a human reading everything? Right. I think that's the ultimate question. The causal mechanism to me is pretty straightforward. Yeah. You know, and it's, I am confident about certain aspects of my reaction to what's happening here and less confident about my answer to that central question that's going to underlie this lawsuit and will underlie other lawsuits that are brought in the next couple of years as all this gets sorted out. There are lawsuits happening in the image space, by the way.

Starting point is 00:17:16 Like, Stable Diffusion and stuff like that. What's new here is that this is about, like, text? Yeah. And the aspect of the Times lawsuit that is interesting is, I believe, some book authors headlined by Sarah Silverman brought a suit against Meta last year that was dismissed. And the judge in that case basically said, you need to show infringement from the output if you want to carry the day in court under copyright law.

Starting point is 00:17:44 And so the fact that these models were training on your data is not persuasive as far as an infringement claim. And so the Times lawsuit does cite examples where OpenAI, as the Post says, appeared to cough up either detailed summaries of paywalled articles like the company's Wirecutter product reviews or entire sections of specific Times articles. And I think the Times wants to be there. And also OpenAI is going to want to argue that point as well because they have already come out and said what the New York Times did to generate those articles was inorganic, not how anyone would ever use our product.

Starting point is 00:18:23 And this just goes to show how manipulative and disingenuous the New York Times is being in this litigation. We were in the middle of negotiations over a licensing fee, whereas the Times is going to say, look, all these other courts have said, you need to show verbatim infringement. And we're showing that. And then we can also show that the training data was used and is clearly valuable to OpenAI. And so let me jump in on that one. I think this is, you know, you are jumping ahead to the output, but I think it is an important point where, you know, so to generate this output, like you're putting in a very specific prompt.

Starting point is 00:18:57 Like, this article published on blah blah blah starts out saying XYZ, what's the next paragraph? Right. And so it's like a, to me, this, I don't. It's sort of a red herring in my opinion. Yeah. I mean, the fact you have to go to that extent to get it to do it kind of makes the point, as you're sort of noting. But also if that is illegal, then like copy machines are illegal. Right?

Starting point is 00:19:22 Because look. Well, isn't it also pretty easy for OpenAI to go in and fix anything that is offending in that way? Which they're going to do or probably already have done. But at the end of the day, there's this sort of broader question that I think actually applies to lots of stuff, which is, is capability criminal, right? Or is the one actually, is the person actually leveraging the capability, the one sort of committing a crime, right?

Starting point is 00:19:50 And this could go like, you could do this about guns, for example, right? This is like a classic, you know, sort of question. But you see this about like, you know, questions of like internet censorship, right? Is not exercising a capability to endorse something, right? Does Substack not kicking Nazis off the platform mean they endorse Nazis, right? Like there's, you know, there's in this case, does the fact that you can force the, you know, ChatGPT to output specific text from an article, is that evidence of ChatGPT violating copyright? Or is it evidence of the user

Starting point is 00:20:27 violating copyright by leveraging a tool that lets them do it. I could sit down and on Stratechery, I could type verbatim a New York Times article. And the New York Times will sue me or file a complaint. They're not going to sue Apple because I typed it on a MacBook. They're not going to sue WordPress because it appeared on WordPress, right? They're going to sue me because I'm the one that sat down and typed out the article. And in this case, like, if you have to go to these lengths to spur this sort of verbatim output, where is the line between criminalizing the tool versus criminalizing

Starting point is 00:21:05 the one using the tool in a very specific way to achieve an illegal outcome? Right. Well, and it is fuzzy or at least subjective. Like there's not necessarily a bright line test here because that's what Napster would have argued 20 years ago saying we are not the ones uploading any of this material or downloading it. And so we're not the ones infringing. Ultimately, they went down for contributory copyright infringement. Right. And that's a, well, maybe. That's why there's a fuzzy case, right?

Starting point is 00:21:34 Because there were legitimate uses for peer-to-peer file sharing. But Napster was nailed by the fact that everyone knew that's what it was used for. The court basically said, you know what this is being used for. I don't know if they explicitly advertised it or not. They didn't make money. They also tried that, we don't make any money from this. Like, well, you're creating this very valuable sort of network business, which is, which I think was

Starting point is 00:21:56 an accurate assessment. Right. Yeah. I mean, so I think that was, I think that was correctly decided. But it is the right callback about this specific question, which is Napster wasn't committing the crimes. I was committing the crime as the sort of person. Me too.

Starting point is 00:22:11 I was the first person on my floor to find Napster and introduce it to the entire dorm. I think the statute of limitations has passed. So I will admit that now. It was such a step up from going to random FTP servers with addresses you found on the internet, which I can't believe you see. A centralized resource that we can all use. That's right. That's right. And so it does highlight this sort of point. I mean, and to my mind, the fact that you have to go to such lengths to make ChatGPT sort of violate copyright, relative to Napster where it was built around making that easy. And not only that, like the default output of Napster was not transformative. You were getting the exact same song. That was a very cut and dry copyright violation. And so that's number one, the difference between that work and the bulk of what's being generated by any of the generative AI tools, or in this case, ChatGPT. The vast, vast, 99.9% of content is not a cut and dry copyright violation.

Starting point is 00:23:12 And so that's sort of an obvious distinction right off the bat when you have people saying, this is ChatGPT's Napster moment, which was a popular reaction when the New York Times filed this suit. I think there are some obvious differences that sort of distinguish this case and the legal questions from what we were dealing with with Napster. Right. And this is why I brought up the innovation point to start and where the Google Books one is sort of important. In that case, like, again, that was about input, not necessarily about output. And Google has been careful about what is allowed for you to actually surface books, which, by the way, is a tremendous shame. I mean, on one hand, the court is like, look, this is transformative.

Starting point is 00:23:51 It is very innovative. It does add value. And honestly, it's copyright that has restricted. There's so many books that are very hard to get access to. That, you know, particularly as copyright's been extended forever and ever and ever. I mean, only today can we make fun of Mickey Mouse or whatever it is, right? Because it kept getting extended. The fact that we can't use Google Books, I feel this a lot because I often want to cite or get into sort of old books.

Starting point is 00:24:17 When I write my more historically oriented articles, I will go through, and I will in a morning, just plow through a bunch of books about that specific company or acid, make sure I, you know, refresh my memory, I'm up to speed, all those sorts of things. And of course, if they're available to buy in electronic form, I will get it. But if it's like paperback only, it's so frustrating to have that excerpt come up on Google Books and no, I can't go deeper. I can't even buy it, right? There's no sort of option. That's interesting, though, because I actually think it's working exactly as intended, where Google Books will surface certain materials, and now I'm drawing on my legal career where I would be doing legal research and come across different books that were written 10, 15, 20 years ago and discover excerpts that looked helpful.

Starting point is 00:25:05 And then obviously, that would be a limited excerpt that I would find on Google. And then I would just go to Amazon or wherever and buy a physical copy of the book. Right. It's just harder to get a physical copy or something. Especially if you're writing the article that day. This is my own fault for procrastinating. That's true. I had more time than you, so I could go back and read like books about the Russian oligarchy in the 1990s

Starting point is 00:25:29 and they were available on the used book market. And so I think that's theoretically the way the court was hoping the system would work by allowing Google to retain some of the samples. Right. And at the end of it, though, even with your point, the court got it right. Google Books is good for books. It's good for society. It's an innovative product. Right. The balancing test came down on the right side in that regard. And, and, you know, and this is why I go back to the innovation point. Like, it's part of the question. Of course, we want to incentivize authors and we want to incentivize publications, but the goal is the public good. And this is an issue with sort of patents in general. It's like, it's very easy to get stuck in the details and stuck in the, you know, like what

Starting point is 00:26:13 should be patent and what should be, all that sort of stuff. What's the goal? The goal is sort of like the broader good of society. And the point being is, what is a patent? What is a copyright? A copyright is a government-granted monopoly. And it has all the problems that monopolies have as far as limiting competition, as far as like all those sorts of bits, right? Now, we do that because we want that to happen.

Starting point is 00:26:38 I benefit from copyright. I'm very grateful for copyright. All of the content that we create is copyrighted. Now, it's obviously hard, can be hard to enforce. and there are people that do violate it. But by and large, the fact that, number one, it exists, and it's generally accepted that, you know, by good folks that you ought to sort of follow it and then they will pay to get access to content, like, that makes my business possible.

Starting point is 00:27:04 I'm certainly not a copyright absolutist by any means. And on the other hand, though, it's hard to, like, again, it's going to be pretty easy, I think, for OpenAI to make the case that what we are doing is innovative. And it is to the broader benefit. And that is going to, that I think is going to make it hard for the Times to win, particularly when their main point is the sort of potential market.

Starting point is 00:27:30 Because the irony of this sort of point number four, the effect of the use upon the potential market for or value of the copyrighted work, is that I actually think that AI is good for the New York Times. Interesting. Explain. When the world becomes awash in content, to an even greater extent than it is today, the value of brand is only going to increase.

Starting point is 00:27:53 The value of sort of reputation. Now, of course, there's some people who are listening who are like, oh, the New York Times brand is terrible. But like, your distaste for that brand is offset by the deepening trust and embrace of that brand by sort of other folks that sort of, you know, agree with you. And everyone sort of knows, yeah, the New York Times, you know, what is, you know, maybe they have a particular political point of view, but they also have like 1,500 reporters and they spend a lot of resources and there's a lot of facts in

Starting point is 00:28:18 their articles, right? That was part of the lawsuit that I saw a number of people having fun with. Like there were four or five pages of puffery talking about how amazing the New York Times is and this ambitious journalism they do all over the world. And some people were rolling their eyes there. But that is ultimately part of their case, which is that this takes money. We spend the money. And the Constitution has incentivized people to spend the money by imbuing them with intellectual property rights and copyright, and that's being violated. And that's what's ultimately at stake here, is that model. So it's just a rejoinder to some of the people who are snarking at some of the New York Times stuff

Starting point is 00:29:00 and basically looking at it as the New York Times sniffing its own farts. That's not what was happening in the lawsuit there. Well, maybe a little bit. But that's why I put that out there, right? Like the fact of the matter is the New York Times has earned the right to have everyone making fun of them for sniffing their farts, right? Everyone cares about the New York Times, even if they hate the New York Times. They do spend the money. Yeah. Right. Exactly. And even if you hate like half of the New York Times, they do, they do good ambitious journalism that only they have the resources to do. So the New York Times is not a monolith, I think, is the correct take.

Starting point is 00:29:32 The New York Times tech section versus tech is. We've talked about it. I've written about it. You know, like it is a thing for sure. But again, it's a thing also because they invest in it being a thing, right? Like the, you know, I wrote that article years ago in defense of the New York Times. It's true. And so that is something valuable that is sort of worth protecting. And I think that value is only going to increase in this world of generic content, right? The fact that you go to, you can look at the URL, it's NYTimes.com, or you can open their app. That is actually going to become more valuable, not less, as this sort of, as this world sort of proliferates. So it is ironic that the point that is arguably most in their favor in this sort of test, I would argue, actually goes another direction. Yeah. Well, that's certainly the point that all of these companies are going to have to argue as they try to sue different AI companies, the effect of the use upon the potential market for or value of the copyrighted work. And it's pretty speculative right now because ChatGPT is certainly a

Starting point is 00:30:42 phenomenon, but I wouldn't say that it's replaced reading the New York Times. Well, here's an example. I have some group chats where everyone knows that I have a subscription to everything, right? And so some random article will come off to be like, oh, can I get a PDF of this, right? And I, you know, maybe I should scratch this, but there is a capability in iOS to sort of print an article and then you can share it. And that is work. I have to go through some tedious steps. But it's actually easier than trying to prompt chat GPT to get an article.

Starting point is 00:31:12 from like five years ago, should the New York Times sue Apple because of that capability, right? Should they sue any smartphone that can take screenshots? Because that is like, that's the sort of level of thing that's happening here. And, you know, again, for the specific sort of in this case. Now, again, there's this broader point about this distaste and discomfort with the fact these LLMs scrape the whole internet, right? And there are companies like Reddit, for example, that I'm sure are watching this case very closely, or Stack Overflow or whatever it might be, where their content went into, you know,

Starting point is 00:31:46 is an important factor of these being sort of compelling. But, uh, and so, so it's not, it's not just sort of the specific. I think that that's, to your point, it's a red herring. That's actually not what's at, what's at issue here. There's a bit where we've been spending money on content or in the case of like a Reddit, we've been spending money to run a service for years and years and years. And suddenly there's this huge opportunity of which we are an essential input, and we ought to have access to that.

Starting point is 00:32:17 And you can understand, like, you can understand exactly why they would make this case. Yeah, no, exactly. And I think, honestly, it's downstream of two themes that recur over and over again in the modern tech conversation, which is, number one, you have regulators and media today trying to be extra vigilant after basically ignoring tech and not taking tech seriously for the first 15 years of this century. To society's benefit. Well, they've now realized you can't put the toothpaste back in the tube with like the

Starting point is 00:32:49 WhatsApp acquisition for Facebook or the Instagram acquisition or Google and Android or you can go right down the list, stuff that just nobody cared about at the time. And now people are like, wow, okay. So why did we set it up this way? And I think the worry is that, okay, if we don't take action now, what will happen to us? Is ChatGPT just going to eat the internet and become the internet for everybody? I don't really think that's a genuine concern. I mean, like, you go back to Napster. Another difference between this and Napster is that Napster was

Starting point is 00:33:22 incredibly useful, whereas ChatGPT, you've been over this, a bit of a parlor trick that gets old after 20 minutes or so. So I don't personally take it that seriously. Although both prized by college students. That's right. ChatGPT, very important to college students who are just trying to get a B minus or a B and make it through. That said, though, I was thinking about the Napster stuff because I dismissed it out of hand when people were like, this is ChatGPT's Napster moment. And then I thought about it a little more. There are some broad similarities where, like, obviously, number one, Napster was this massively

Starting point is 00:34:02 successful cultural phenomenon that then had this glaring Achilles heel, which is essentially, don't ask about copyright law that may or may not apply to what's happening here. That has been the story with ChatGPT for a while now. But I think the real problem with Napster in the end was that the scale of the infringement that the platform was facilitating was just massive. And so the simple act of downloading a song from me to you doesn't necessarily create like urgent copyright concerns. But if you're suddenly going to have 50,

Starting point is 00:34:38 million or 100 million people acquiring music that way and then eventually acquiring movies that way, then that poses risks to the entire entertainment business. And so with demonstrable risks, right? You can look at the revenue. Right. Obvious risk. Yes, exactly. By the time Napster went crashing down, you were already seeing the impact, whereas

Starting point is 00:34:58 it's more speculative. But with OpenAI, it's not that any one news article is important to the LLM. It's not even necessarily that any one news organization is important to the LLM, but if technology makes it possible to train on every piece of media in existence and eventually create a competing media product, then that's going to raise understandable copyright concerns. And it does go to the heart of why this exists. I was with you all the way to the very end.

Starting point is 00:35:31 It's going to raise understandable what-is-the-nature-of-my-business concerns. But that's not copyright concerns. And I think that is the important distinction. At the end of the day, there is a broader discussion about the business impact of AI, just like there was about the business impact of the internet, right,

Starting point is 00:35:53 on all these sorts of things. That's distinct from, is there a crime being committed here? Is there a copyright violation happening? And now, to be fair, this does get to, point number four, which I think you're referring to, the effect of use upon the potential

Starting point is 00:36:09 market. But I do think it's important to draw this distinction. This gets back to some of our antitrust debates, which is, yeah, it really sucks to be a supplier for Amazon or it really sucks to be a publisher wanting to get links on Facebook or Google, but whether it really sucks is distinct from, is there an actual violation of law happening here? And that's part of what's so interesting. Thinking through all this, it actually reminded me of Amazon, not the antitrust concerns per se, but the pricing algorithm, the Project Nessie stuff. This is the question of all this. It's the scale.

Starting point is 00:36:46 The scale makes it feel different. Yeah, it changes the nature of what's happening so that you have a business practice that has existed in Amazon's case, like the idea of monitoring your competitors' prices has existed for hundreds and hundreds of years. But if you're monitoring like thousands and thousands of categories 24 hours a day constantly, it sort of changes the nature of what's happening. It feels bad. At least arguably, yeah. So whether it's illegal or unethical is sort of an open question. I'm laughing at you, but I actually, I completely agree. Like they're the, you know, and this is, you know, this is my frustration around the, around the Amazon thing,

Starting point is 00:37:29 which is I actually completely agree that it is different, right? Being able to track every price in the world at a moment's notice and adjust yours on the fly, that is fundamentally different than I have a shop and I walk down the street and I see the other shop has a lower price. So I walk back and I get on my label printer and I make a new price. Right. It's just fundamentally different. The problem is that under the law, there is no distinction for scale. Now, again, the laws can be written and laws can be changed, which has always been my position.

Starting point is 00:38:03 If you do think there's a change, then the answer is to make a law for this situation. And I think that applies here. My view on this is I don't think the New York Times has a case here. I think they, under all precedents, should lose in court. I think it's fundamentally transformative. I think the innovation angle tips in the favor of OpenAI and against the New York Times. And I will even set aside what I think is true, which is that I think this is actually good for the New York Times business. It's terrible for like all the random aggregators in the world, but it's great for the New York Times sort of specifically.

Starting point is 00:38:36 But set that one aside. I think on the merits of this case, I think they should lose. That said, I will be sympathetic to, I'm not sure I will support, but I will understand if they want to propose a law about this, about scraping, that there is something fundamentally different here. And I think this, to the point, applies to lots of the internet. Scale does change things. Yes, you can zoom out and it's all a spectrum. But at some point, when you're dealing with the difference between a human capability and the capability of computers, it becomes something different.

Starting point is 00:39:10 And I think that's totally valid. And not accurate behavior becomes a concern. Right. But that transition, that shift, that calls for new laws. It doesn't call for perverting the laws that we have, which were written for another era. Right. And this is the area of the case where I am not terribly confident in what should happen and or what will happen. I do think there's room with that fourth factor for a court who's sympathetic to the New York Times to weigh the evidence and say, all right, this is

Starting point is 00:39:42 infringement and we're going to rule in favor of the New York Times. I think it's easier, certainly, given the precedent and given what's at issue now to weigh the evidence and say what's happening to the New York Times is ultimately sort of incidental and side with OpenAI. I think that's probably the most likely outcome. But it'll be interesting to see how it shakes out. And I just want to read just to just to zoom out. If I go back to that number that last bit about I think this is actually good for the New York Times. I think there's a good argument.

Starting point is 00:40:14 This is good in general. Like does anyone actually like the current environment where like where it's actually hard to is Sports Illustrated using AI? Or are they just paying some crappy contractor to turn out 57 articles a day to rewrite some other, some other source? Right? Like there's a bit where everything has gone rotten in many respects on the internet because there's just so much crap out there.

Starting point is 00:40:39 And this chase, you know, a business model that depends on advertising where you have no targeting capability. All you can do is just try to get more inventory, turn out more stuff. It's not good. This is sort of a narrow articulation of the sort of accelerationist point of view, which is like, look, where we're at actually sucks. We can't go backwards. So can we get through it already? Can we actually get rid of, like, I love Sports Illustrated.

Starting point is 00:41:07 I was subscribed to it from like, I started with Sports Illustrated Kids. I subscribed to Sports Illustrated all the way through my childhood. It holds a special place in my heart. And it holds a special place in your heart and in your pocketbook. Maybe you still have like retirement accounts from Sports Illustrated. Right. But like, but at the end of the day, that model's done and broken. And I'm sad about it, but we're not going to resuscitate it, given the current business model. We're not going to resuscitate it by killing ChatGPT. It was of its era. And that era is over. And the sooner we get to a new one where we have, you know, I wrote about this in terms of publications early on in Stratechery and just saying like, look, the way we are going to get to a future is the sooner all these die is actually going to be better. Not because it's sad to die. It's sad that people get laid off. It's sad for all these things. But what's also sad is going to almost any publication and trying to read anything. And there's five gazillion articles and it's all garbage and it's all

Starting point is 00:42:00 crap. And once that's cleared away like a forest fire going through, there's an opportunity for new growth for building new businesses that are predicated on the internet, that have internet cost structures, that have business models that make sense. And, you know, frankly, so much of Are you sure about that? My worry is that the forest might just burn down and never regenerate? It's totally possible. What I am sure about is we can't go back to, like we're not going backwards. So it's rather do I want to go backwards?

Starting point is 00:42:32 Do I want to try to go backwards or try to preserve, which I am sure is guaranteed failure, or do I want to roll the dice and go to a future where it might still suck, but maybe it'll be good again? And that's, I think that that's all this, you know, that's the sort of accelerationist viewpoint, I think broadly, at least the way I sort of interpret it, which is we can't go back. Like, like, and so let's let's get to the future. I mean, right now we're in the middle of a forest fire here on the internet together. Everything sucks. And truthfully, we have been for like the last decade, right? Like, it's just been like five years. Yeah. It's getting worse and worse. There's just not much digital media worth consuming beyond like podcasts and subscription publications that are run independently. There's not a whole lot of establishment digital media. But to that point, your little rant there, it dovetails perfectly with this tweet from

Starting point is 00:43:28 Walter Isaacson. He says, these will be the most important cases for journalism and publishing in our lifetime. If AI companies have to cut deals with news organizations and publishers to license their content feed for use as AI training data, that could save local journalism as well as magazines and publishing. It would provide a business model that supports people who report things, and it would place a financial premium on accurate, high-value journalism. AI systems will compete for which has the most valuable, reliable training data. Kudos to Axel Springer, Mathias Döpfner, and the AP for leading the way, and the New York Times for making the legal case.

Starting point is 00:44:09 So, Ben, I really want to believe in this vision from Walter Isaacson. I'm in the middle of his book on Steve Jobs. I actually had it. But yeah, I think I preemptively took that one out. What local journalism? Have you been to your local journalist newspaper? That's the issue. This is where the toothpaste is already out of the tube and the issue wasn't AI.

Starting point is 00:44:30 It was, you know, Google and Facebook serving ads much more effectively than any local journalism or most national newspapers. And so I think we could sort of dispense with that argument right off the top with all due respect to Walter Isaacson. Again, really enjoying the Steve Jobs book. But the other question that this raised is, how profitable are any of these chatbots right now? Like a lot of the conversation surrounding this New York Times lawsuit presumes that there's this huge pot of money or this huge pie of revenue that needs to be divided to include media, and I'm just not sure that's really accurate right now.

Starting point is 00:45:15 And I look at like the structure and I'm a little fuzzy on how they ever become like crazy profitable. And so I'm wondering whether you have any thoughts on that aspect of it. Well, I mean, for sure, I mean, ChatGPT by all accounts makes very high revenue, but there's no way it's profitable. This stuff is astronomically expensive to sort of run and to do. And so as I will say frequently, you have to look where things are going. I think it's reasonable to say that, you know, at some point this stuff is going to be very profitable. You know, just as like Google or Facebook or whatever it might be, that's a reasonable position to take.

Starting point is 00:45:50 What's not reasonable to me is that you ought to be given money just because you used to exist. Like, this goes to some of these like the Australian law or the Canadian law about that Google and Facebook have to pay news publishers. And it's like, why? Right? Like there's this sort of like, just because you were there, you deserve to be propped up. This is the propping up of incumbents by government. It's the sort of stuff that I thought we were like we should not be very happy about. If you want to make money, how about you have a business model that makes money?

Starting point is 00:46:19 Or if you want to prop up a company, then levy a tax and and do it, right? Like the UK with the BBC, right? Now, you can have a philosophical debate about should there be a television tax or whatever to sort of fund, fund news. But at least it's honest. Right. At least it's honest. These Canadian and Australian laws where we're going to make up an economic case that these companies are only valuable because of our news journalistic organizations and therefore they have to pay them, which defies all economic logic and is shown by the fact that they have to force them to serve the country because Facebook, Facebook in particular is the most absurd one. Like Google, you can make the case.

Starting point is 00:47:03 Like if Google didn't have any local Australian news, like it's a less valuable service. Facebook, they don't, people, the news services themselves go to Facebook and put their links on there because they want people to click through. Like Facebook's not, Facebook can cut them all off with zero harm to their business. So how like, and so the part of this tweet that actually bugs me the most is not the sort of toothpaste out of the tube aspect. It's making up crap about these companies and their economic value. Their economic value doesn't exist. If you, now, like I said, there's lots of stuff in society that does not have economic value that we think is important and therefore we collectively, via taxes, decide to fund. If you want to do that, fine, we can debate about this, but don't make up economic facts. Well, wait, they're presupposing. I mean, I guess Walter Isaacson is not presupposing, he's just saying that they do have economic value to OpenAI because the training data is particularly valuable. And that's not

Starting point is 00:48:06 entirely wrong. I mean, it may not be that they're essential to OpenAI. It's tough, right, right? Right. This gets that would be his argument. Like Google's economic value comes from all, all the sort of content of the internet. If the internet went away, Google would be worthless. It's like, yes, that's correct. But like any individual publication is not valuable, right? And there is sort of a collective sort of action problem. But the issue is your, you're, the collective action is for all content on the internet where anyone can post, right? My content is important to Google. I can choose to go online and to post it. If you want to collectively organize against Google, are you going to include me? Are you going to include media authors?

Starting point is 00:48:51 Are you going to include social media? Are you going to include all, like, forum commentators? All that stuff is super important, right? And so, again, if you want to have, you want, I think it is philosophically coherent to, to say that these aggregators benefit from the collective action of everyone. Therefore, the way you solve that is through collectively monetize, like, again, it's taxes. It's government.

Starting point is 00:49:20 Government is a collective action problem. That's why it exists. What really bugs me is this, what's implicit in this sort of point of view is that there's only some content that's valuable. And it just happens to be the organizations that pay a lot of money to Walter Isaacson, right? Like, or whoever it might be or maybe not him specifically, but his cohort, right? Of like, you know, journalists and authors in their 40s and 50s.

Starting point is 00:49:47 They all strongly believe that their work is more valuable than mine. I can promise you that. Well, yeah. And for me, I just do a double take because at least Google and Facebook in the case of those laws. Number one, those countries are being more explicit in saying this has social value. So we'd like to redistribute some of the. There's an aspect of what they're actually doing. I'm somewhat okay with.

Starting point is 00:50:12 It's the lying that bothers me, right? What you're doing is you're levying of money there. There's piles of money with Google and Facebook. They are doing extremely well. No, no, I'm just saying. What happens to be a pro-America podcast. As a distinction, as a distinction from the current setup where OpenAI is valued at an estimated 53 times its 2023 sales,

Starting point is 00:50:34 the Times is valued at three times its 2023 sales. A lot of the value in AI is speculative and market-based as opposed to just generating piles and piles of money. It's a bet on what it might be. It is an expected value bet. It includes the possibility that's going to zero, right? Like that's the... But so the Times in its lawsuit is also throwing these numbers around.

Starting point is 00:51:01 Like Microsoft is $90 billion. I mean, the Times misrepresenting, you know, tech is like, we've already addressed that. I just want to state for the record, none of these things are making any money yet. So the idea that the New York Times and all these other organizations are being completely screwed rings a little bit hollow to me. So a couple of other questions. Yeah, I should, yeah, you're, you're, I'm with you. I'm with you. I should not just let you cook.

Starting point is 00:51:28 There you go. I'm American. God damn it. Andrew, not me, emails in and says, Ben, in the past, you and Andrew have briefly remarked on the potential for automated enforcement of copyright infringement. Wait, did we do that? Do you really?

Starting point is 00:51:42 I don't know. I'm sure somewhere along the way we have. Do you really think such a model is possible? Any such system would have to be automated. Real artists may release novel IP one morning and be infringed upon by the afternoon. So someone will need to create a highly scalable and available registry of copyrighted material, the government will have to recognize it, and then all models everywhere will need to train and validate against this system. Then there will be entire data centers of copyright trolls

Starting point is 00:52:13 generating and registering as much novel IP as they can get away with. There will need to be an appeals process from there. All of this seems highly impractical to do at a global scale in a way that makes everyone happy, and there will still be pirate LLMs that people can run on their iPhone that will evade these restrictions, or maybe Apple and all other hardware vendors have to ensure that only approved models could run? I don't know. This looks like a classic case where nobody has a good idea of what will happen when this friction is removed, and the resulting landscape is unlikely to look anything like the current one. The New York Times position in its lawsuit seems reasonable from its perspective, but I just don't see how a solution would be workable.

Starting point is 00:53:00 Ben, what do you think of that? I think I know what he's referring to. I talked about this in the context of whenever there's that Usher song that was made by AI and then there's like, you know, YouTube takedowns and all those sorts of things. That's exactly when it came up. Yep. Yeah. And so my point there actually, I think, I'm glad he brought this up because it was a good sort of reminder was a, so there's a, there's this aspect of aggregators, right? YouTube is an aggregator. YouTube, an aggregator, is also a choke point. And so we see this in the context, again, to go back to the internet censorship point,

Starting point is 00:53:32 which is this is a throat to choke. It's not the government, right? There's no First Amendment concerns. And so there's a lot of pressure on a YouTube, on a Facebook, on a Twitter or X, whatever, to sort of limit certain content because they're capable of doing it. And you can have, again, a philosophical debate about, like, the law of free speech versus the spirit of free speech and all these sorts of things. But just from a technical perspective, that is,

Starting point is 00:53:54 like sort of a throat to choke. Now, set that aside as point number one. Point number two is I have this idea of a framework for moderation on the internet, which is I actually think it's reasonable for aggregators, particularly ones that are algorithmically serving content, to have more aggressive sort of moderation schemes than the First Amendment, right? Like at the end of the day, if Facebook is promoting posts to you or the YouTube algorithm is pushing stuff in your face,

Starting point is 00:54:21 I think it's incumbent on them to actually be careful about what they are pushing on. You move down the stack to something that does not have an algorithm. That is just a platform that lets you post stuff. There, I'm very hesitant. I think they should be much closer to what is legally permissible, but, you know, I can understand if they go a little further. This sort of touches on the Substack one, too, a little bit.

Starting point is 00:54:43 Then you go down to infrastructure. I am resolutely, absolutely opposed to any sort of censorship at the infrastructure layer. Your ISP should not be reaching in to examine the packets going over their connection and cutting some off and not cutting off others. To me, that like, I think as you step down the stack, algorithms stuff that's highest, most moderation. I think that's reasonable. Platforms, very limited moderation, infrastructure, no moderation. And I think that's the way to think about this very, very tricky problem in context. That's point number two.

Starting point is 00:55:17 Point number three is what I think is super important on the internet. We've talked about, in the context of LLMs, the difference between deterministic computing, where you put a number in, you expect a number out, and if it's not that, it's a bug somewhere in the program, versus an LLM that's probabilistically giving you what is probably the right answer, but it's not necessarily right, and it's incredible how accurate it sort of often is. And there's sort of a distinction there I think that applies to the internet versus the analog world. In the analog world, you go and print something, right? There's actually, and there's like, you actually went to a printing press and there was this sort of point of friction that I think lends itself to a more sort of deterministic enforcement. Like, did someone actually copyright that book? Did they actually print an illegal version or did they not? It's very black and white. Once you get on the internet, it gets a lot fuzzier because it's trivial to duplicate something.

Starting point is 00:56:10 It's trivial to sort of spread something. And so I think you need a more probabilistic sort of approach where you want to knock off the big stuff, but you don't want to go too far because you're going to, it's going to be impractical, as Andrew notes, and you're also likely to, to chill sort of expression. And so in this case, so, so let me do point number four, which is Stratechery. Okay. Stratechery, it's copyrighted.

Starting point is 00:56:33 I, you have to, and you have to pay to get access to my content. Now, can you get access to my content without paying? Sure, right? There's companies that will, like, set up one email address and they're spreading it all internally, right? And it's like, it's ridiculous that they're doing this, but it does. Don't do that. Please don't do that. Taking food off my table.

Starting point is 00:56:52 It's just a crappy thing to do, right? But like, you know, it gives me insight into, I guess, the culture of that organization. But what I don't want to do is there's lots and lots of people who don't do that, who come in and pay. And I want to give them a great experience. I want them to get the emails, no problem. I want them to be able to click in the email and they're logged in. They don't go through a paywall. They get to it super easy.

Starting point is 00:57:12 They, you can go to the show notes of this episode. You can click something and log right in. I spent a lot of money to make that possible because I want there to be a great experience for my readers. And if the cost of that is some people being jerks, then on the internet, that's just a cost you have to be willing to pay. That's just how the internet works. If you want to have a large audience, you're going to have bad actors, and you need to

Starting point is 00:57:34 sort of accept that and deal with it. Particularly when I first started, there's people that always come to me for advice about doing this, right? And there's some people that would get so worked up about people pirating their stuff. And it's like, you're just not going to make it. On the internet, you have to have an abundance mindset. You have to think I'm going to make up for the cheaters by getting that many more good people because you're never going to sort of limit it.

Starting point is 00:57:54 And correct me if I'm wrong, but in order to police the potential cheaters, you would have to make the user experience just a lot more inconvenient. Netflix ran into this with the password sharing thing where there's a version of their enforcement where you have to sign in every single time. I have to do that with the NBA app, which is not an enforcement strategy. that's just them having a fucking horrible app. But it makes me want to delete it and just never use it again. And if Netflix, if it was just really inconvenient to use Netflix anytime I wanted to,

Starting point is 00:58:27 eventually I would unsubscribe from Netflix. So there's a business rationale also in Stratechery's case for not being psychotic about it because ultimately you can grow the pie. I grow by word of mouth, right? So I want it to be easy to share my stuff. Like people quote unquote, I put it in the bottom of every email. I'm like, hey, if you want to share a couple, that's fine. Right. I'm like, if you want to sign up for a corporate account, please contact me.

Starting point is 00:58:49 But at the end of the day, like, and you just have to be, you have to be okay with. There is benefits to that. And so I tie this all together. I think what Andrew was referring to, which is I think the way to think about this, set aside the ChatGPT stuff for a moment, but where I talked about this automated enforcement was I think automated enforcement is a valid approach at the aggregator level. And this is what YouTube does, by the way. And I think YouTube's approach is actually brilliant. So there was this big issue with copyrighted music on YouTube. And people would just, you know, of course you don't want someone like Napster uploading a random song and getting credit from it. But the problem was you have situations like people would

Starting point is 00:59:31 upload footage from a birthday party and there was a song playing in the background and the video would get taken down because of copyright violation. That's no good, right? So what YouTube actually did is it cut a deal with the music labels where they automatically scan and any sort of video that has copyrighted music in it, any monetization that happens in that video, the record labels get their cut. So it's just sort of all automatic. And you as a user, as an uploader, don't need to worry about the music. Like, now, if you're obsessed about making money with your video, then sure, you better worry because a cut's going to get taken from you.

Starting point is 01:00:07 But by and large, I think this is a great solution and a great compromise. and it only works with an aggregator, only works with something like YouTube. I don't think if you go back to my framework for moderation, as you go down to, should this happen at a platform level? I don't think so. Should it happen at an infrastructure level?

Starting point is 01:00:25 Absolutely not. Like just the chilling effect, I think would be massive there in addition to all the logistical effects that Andrew has. And so when I talked about this automated enforcement, I'm like, I think this is the answer. Look, you're going to get 95% or 99% of the value.

Starting point is 01:00:40 It's going to be good enough, effective enough so that you should be in good shape. Don't worry about what was true in music. Don't worry about the dark edges of the internet, right? Like, like, you don't want to go there. Just take my word for it, right? And so, yeah, that was the sort of bit I was getting at. Now, I will grant when it comes to the whole point of these LLMs is they do scrape the

Starting point is 01:01:01 dark corners of the internet. They do sort of get access to everything. And so, you know, in this narrow sense, sure, that raises new issues. But to me, if I could be like my sort of judge in this case, they're immaterial, because they already lost earlier in the case, so I'm not going to even have to decide it. Yeah, well, and I think just in general, to the extent any holding finds that the effect of the use upon the potential market for or value of the copyrighted work is concerning enough to rule in favor of incumbent copyright holders, the way to police any of this stuff is going to be going to a Spotify, going to a YouTube, going to a Google, and saying this can't exist on your site. And that's good enough. You have to lean into the reality of the internet, right?

Starting point is 01:01:46 The whole thing with the music labels is the reality is what Spotify did is it sold convenience, right? You can still pirate music. It's just so much easier. Just pay Spotify. And I don't because they've made it so exhausting to pirate music because you can't find it on Google anymore. You can't find it on. It's a carrot and stick approach, right? But Spotify makes it easy. And so what made the labels, the reason they've done so well, particularly relative to all their peers in media, is they have a business model that aligns with the internet. They're not fighting the internet.

Starting point is 01:02:17 Fighting the internet is very hard. Stratechery arguably fights the internet in some respects, right? Because I'm trying to artificially paywall a zero marginal cost product. Now, I also benefit from that. I can send out 10,000 or thousands of emails at a time. So it goes both ways. But go with the internet by and large. In this case, if you understand and realize that I'm right,

Starting point is 01:02:40 that the nature of the internet is towards aggregators. That's like when you have a frictionless sort of liquid market, you're going to end up with these very large entities that control demand. Lean into that. Let that be the sort of, like, let that be the axis of concern about this. And, you know, I think that if you can think about things in sort of relative scale and don't get obsessed about edge cases,

Starting point is 01:03:08 which are trivial to find on the internet, we have search, and computers are very good at that, then it becomes much clearer how to proceed with a lot of these issues. Right. And it would be clear in the case of some of the larger LLMs as well, because you may have pirate LLMs, but they're not the ones that are actually existential threats to media organizations. So if a judge decides that ChatGPT is, then ChatGPT will have to pay for data that it wants to train on going forward.

Starting point is 01:03:36 And it'll be pretty open and shut as far as the- There's a bit where, you know, there's a view from OpenAI. Maybe they wouldn't hate that because then they get to pull up the ladder for whoever comes behind, right? Like there's. Well, they're going to be pulling up the ladder in any number of ways going forward. But yes, this would just be one more arrow in their quiver. Before we close out, while we're talking AI, we got some other questions that we can get to later in the week. But it did occur to me over the weekend that when we first talked about AI on this show like a year and a half ago, I was very excited to tell you

Starting point is 01:04:09 that the one application where AI had been really useful for me was in law with legal research and specifically with citation checking. And wouldn't you know it, while we were gone over the break, citation checking with AI is having a real cultural moment right now. So I just wanted to recognize that as a community, we've come a long way. You get some being right points. The associates of the world just living on Westlaw, suddenly we were ahead of the curve a couple years ago and now here we are, so

Starting point is 01:04:44 I'll just continue to enjoy this cultural moment. But for now, Ben, it is great to see you. Great energy from you. I think the Packers brought out the best version of Ben Thompson. Or that, or two weeks of not working. All right, column A, column B, what can you say? Anyways, we will be back later in the week. I look forward to keeping it rolling. email at sharptech.fm is the email address. If you've got comments, questions, takes, we'll keep it rolling. It's going to be a great 2024 here on Sharp Tech. And Ben, I will talk to you in a few days.

Starting point is 01:05:23 Talk to you later. Happy New Year.


A word about the Packers and microchips, and then reactions to the New York Times suing OpenAI and Microsoft for copyright infringement, the similarities and differences between past copyright cases w...ith Google and Napster, and a question about automated copyright enforcement.
