Everyday AI Podcast – An AI and ChatGPT Podcast - EP 618: RSL vs. the AI Scrape: Can LLM licensing save the open web?
Episode Date: September 25, 2025

AI scraping vs. the open web. Who wins? 🥊

Let's say the quiet part out loud: AI companies have trained their models for years on your company's website data, regardless of if you want them... to. Fast forward to today: many publishers have lost up to 70% of website traffic (and huge chunks of revenue) because of this. So what happens if some of these publications.... die? Then the AI companies have fewer, human-created source material to train future models on. Rut roh. Maybe the Really Simple Licensing protocol can save us all. We talk with RSL Collective Co-founder Doug Leeds on if a new licensing structure can be a win-win-win for publishers, AI companies and end users. You don't wanna miss this one. 👇

EP 618: RSL vs. the AI Scrape: Can LLM licensing save the open web?

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion on LinkedIn: Thoughts on this? Join the convo on LinkedIn and connect with other AI leaders.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
The Open Web Threat from AI Overviews
Decline in Website Traffic Due to AI
Problems with Current Web Content Licensing
Doug Leeds’ Publishing Industry Background
Introduction to Really Simple Licensing (RSL)
How RSL Web Standard Enables Licensing
RSL Collective Blanket License Model
Impact of RSL on AI Search Results
Big Publishers’ Participation in RSL Collective
Benefits for AI Companies Licensing Content
Addressing AI Hallucinations with Licensed Data
Collective Enforcement Against Content Scraping
Legal and Economic Sustainability for Publishers

Timestamps:
00:00 "Threat to the Open Web?"
06:08 "Edgar Walter's Web Innovations"
09:10 "RSL Collective: Unified Web Licensing"
12:37 "Join RSL Collective Initiative"
15:56 RSL Collective: Monetizing Content Access
19:47 "RSL Collective: AI Publishing Solution?"
20:34 Human-Led Training Enhances AI Quality
23:52 "Sustainable AI: A Hopeful Future"

Keywords:
Really Simple Licensing, RSL, open web, AI licensing, content licensing, online publishing, search engines, AI overviews, large language models, content rights, machine readable licenses, robots.txt, AI web crawlers, content monetization, publisher ecosystem, website traffic, ASCAP model, collective rights organization, AI abstracts, blanket license, RSL Collective, creative commons

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio.
Just describe what you want to create and the assistant handles the rest,
orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface.
You direct the outcome.
The assistant accelerates execution.
Is the open web in danger?
Sometimes, I think it is.
I mean, think of your own use.
Are you still going to dozens or hundreds of websites a day?
Or are you like me?
And you're just using, you know, maybe Google's AI Overviews
or you're using ChatGPT or Perplexity.
So I think part of it is, well, there's great benefits
to us humans who can maybe spend less time with 300 browser tabs open and maybe get better,
more accurate answers faster.
But what about on the other side?
What about all of those clicks and all of those people maybe that aren't going to those
websites that are actually feeding these large language models?
And this is, if you've been listening to the show for a week or a month or a year,
this is something I talk about a lot and I care about as a former journalist and as someone
that just wants good information, good accurate information in not just today's large language
models, but the AI of the future. So I think one company that we're going to talk with today
is doing something that I've been asking someone to do. So I love that they're doing it. And we're
going to be talking today specifically about Really Simple Licensing, or RSL, and kind of how
they might hopefully save the open web. All right. I'm excited for today's conversation. I hope you
are too. If you're new here, welcome. What's going on? My name's
Jordan Wilson and welcome to Everyday AI.
This is your daily live stream podcast and free daily newsletter helping everyday
business leaders like you and me, not just keep up with AI, but how we can make sense
of it, get ahead to grow our companies and our careers.
If that's what you're trying to do, it starts here with the unedited, unscripted,
live stream podcast, but to take it to the next level, make sure you go to our website at
YourEverydayAI.com.
Sign up for that free daily newsletter.
We're going to be recapping the highlights of today's show as well as all the other AI
news from today that you need to know.
All right, enough chit-chat.
Let's bring the actual expert on.
I'm excited for today's show.
So please help me welcome to the stage
Doug Leeds, the co-founder of Really Simple Licensing.
Doug, thank you so much for joining the Everyday AI show.
Thank you, Jordan.
Thanks so much for having me.
It's great.
All right.
So first, let's start with a little bit of your background, right?
Because you didn't just start this out of left field.
Can you tell everyone a little bit of your background, specifically on the publishing side?
Yeah.
So, well, I started really in search.
I ran a search engine called Ask.com, or Ask Jeeves, one of the original search engines.
Really, if you think back about it, it was trying to do what AI is doing now, but with humans,
answering questions instead of sending you to links.
But over the years, Google took more and more of the market share, and we started to look in the early
2010s for a new business model.
We were investigating where all the traffic was going from search,
and it was going to content and to media.
And given that we had that foothold in search, at least for a little while longer,
I started to acquire companies to build a media company.
We acquired about $600 million worth of companies,
Investopedia, About.com, Dictionary.com,
a bunch of things like that, and rolled them all into one group under the IAC
holding company. And that is now called People Inc. because they bought Meredith. And so that thing that
we started is now the largest publisher of print and online in the U.S.
Yeah. So obviously, if you don't know Doug, and I know many of you probably will, he's
top of the class when it comes to publishing on the web. So let's just skip straight to it.
What is Really Simple Licensing, or RSL?
And what is the problem that you all are trying to solve?
Yeah.
Well, let me start with the problem because then I can tell you what the solution is.
And it's very simple.
Look, AI is great.
I use it every single day.
And it is a better product than I had when we were giving you search and a bunch of links
because it gives you the answer right away.
The problem with that is it hurts the ecosystem that the whole web,
the open web, is built on, which is: take some of my content, index it, and then give me a link.
So when someone looks for something that I have, you send them to me and I get the traffic as the
publisher, as the creator.
Then it's up to me to monetize that traffic with ads or subscriptions or what have you.
But with AI, I just get the answer.
No need to go to the site.
Great user experience.
Terrible for the ecosystem, because without that traffic as a publisher, I can't
pay for the content that I'm producing.
So let me pause there and make sure it makes sense.
Yeah, no, that makes perfect sense.
So, you know, it's something I talk about all the time, right?
Because at a certain point, you need all of the open Internet to keep publishing, right,
if you want AI to be smarter and smarter.
You know, I'm curious.
What actually led you to say, all right, we need to do something about this?
Was there kind of like, you know, an epiphany
that you had, you know, or how did you land here with RSL?
Yeah, well, my partner, by the way, is Eckart Walther. We worked together 20 years ago at Yahoo,
and even before that, back in the 1990s at Netscape,
he had been one of the co-creators of RSS, which stands for Really Simple Syndication.
It's a web standard that underpinned a billion websites, I think, at one point.
We were talking, I teach a class at UC Berkeley, and I asked him to come talk to the class about his journey as a professional.
And at that time, he shared with me what he was working on, which was the updated idea, Really Simple Licensing, which is a machine-readable way to just declare your license terms.
So, very simple:
robots.txt is a way to say, yes, you can crawl me, or no, you can't crawl me.
But that's all it can do.
It can't do more than that.
And we've seen companies like Google say, hey, if you want to be in search, you have to be in our AI abstracts.
You can't say yes to search and no to AI abstracts.
It's the same thing.
We decided we could do something better.
And really, Eckart was the one who started with the standard, saying, look, we can create
an open standard that allows anybody to articulate their license terms for any piece of content.
We can put that right in robots.txt, and crawlers can see it.
And they can say, oh, I now know what the licensing terms are.
And I can then go away if I don't want to do it or I can agree to those licensing terms.
So that's the standard.
And then we built a collective rights organization on top of that.
But let me get to that in a second because I want to make sure I'm covering.
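As a rough illustration of the contrast described here, the following is a minimal Python sketch. The first function uses the standard library's robots.txt parser, which can only answer yes or no for a given crawler and URL. The second scans the same file for a `License:` line that points at license terms. That directive name, the `ExampleAIBot` agent, and the example.com URLs are assumptions made for this sketch, not syntax quoted from the RSL specification.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The "License:" line and its URL are illustrative
# assumptions, not quoted from the RSL specification.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /private/

User-agent: *
Allow: /

License: https://example.com/license.xml
"""


def crawl_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Classic robots.txt check: the answer is only ever yes or no."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def license_urls(robots_txt: str) -> list[str]:
    """Collect the values of the assumed 'License:' lines, if any."""
    urls = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "license":
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    print(crawl_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/private/page"))
    print(crawl_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/articles/1"))
    print(license_urls(ROBOTS_TXT))
```

The standard parser silently ignores lines it does not recognize, which is why extra information can ride along in the same file without breaking existing crawlers.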
All right.
So, Doug, maybe if there's a business owner out there, someone that works at a big media publication.
And I know a lot of companies, maybe a year or two ago, just blocked
all crawlers when AI started to come out because they're like, we don't want to lose traffic.
And maybe now they're saying, I might want to do something else. So tell us,
how does RSL work? How do companies enable it? And also, what options do they have,
you know, to communicate to these AI companies how they want them to crawl
or not crawl the site?
Yep. So, look, blocking was a technique that made a lot of sense
when you could still get traffic.
But when something like Google says, you can't block
our AI abstracts and be in search,
that means you're basically blocking your ability
to show up in search if you do it this way
and simply block.
So RSL was created to solve this problem.
If you go to the website, RSLstandard.org,
you will see all the information about how to use
the web standard to declare your rights,
and those rights can be
put in a
machine-readable form using the standard language in your
robots.txt, or in your HTTP headers, or in any piece of content.
So anywhere a crawler comes across your content,
it will now know what your license terms are.
And you can put any license terms you want.
And there are descriptions of some, like Creative Commons ones, up there on the
website, but also, you know, pay me, or you can use it for this and not for that.
All those examples exist on the website.
And you can check it out and implement
it now. But that said, it's not enough to solve the problem. Because, let's say there are
millions of websites out there and each one puts a different license term on. The AI companies
really will face a choice there. Either I'm not going to negotiate, if I'm the AI company,
I'm not going to negotiate a million times with a million different publishers just because I might
want their content. So I'll either steal it, which is what they're doing now and just take
without paying attention to what your rule set is,
what your license terms are,
or I'll just not take it,
and I'll go to something else
because the web is filled with close competitors of content, right?
So that's why we are building the RSL Collective,
which is a collective rights organization.
And that's modeled on, like, ASCAP,
which has been around for 110 years in the music business,
and basically says,
okay, let's take all the content,
really our role is all the content out there,
and make it available under a single blanket license
that the AI companies can then say,
oh, this is the RSL collective license.
I know about that.
I will ingest that content.
And then they will pay for it
and we will distribute that payment to the content owners.
So it becomes really easy to use.
It's in our name, really simple to use
and really simple to license,
solving problems on both sides.
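The choice described above, from the AI company's side, can be sketched in a few lines of Python. This is a hypothetical illustration only: the `DeclaredTerms` record, the license identifiers, and the `ACCEPTED_LICENSES` set are all made-up names, and nothing here reflects the actual RSL Collective agreement. The point is simply that a crawler that has signed one blanket license can accept every site declaring it, and walks away from anything it would otherwise have to negotiate individually.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    INGEST = "ingest"  # terms are ones the operator has already agreed to
    SKIP = "skip"      # unknown terms: walk away rather than negotiate or scrape


@dataclass(frozen=True)
class DeclaredTerms:
    site: str
    license_id: str  # hypothetical label for whatever terms the site declares


# Hypothetical: the single blanket license this crawler operator has signed.
ACCEPTED_LICENSES = frozenset({"example-collective-blanket"})


def decide(terms: DeclaredTerms) -> Action:
    """One signed agreement covers every site that declares the same license."""
    if terms.license_id in ACCEPTED_LICENSES:
        return Action.INGEST
    return Action.SKIP


declared = [
    DeclaredTerms("news.example", "example-collective-blanket"),
    DeclaredTerms("blog.example", "example-collective-blanket"),
    DeclaredTerms("niche.example", "custom-terms-v7"),
]
plan = {t.site: decide(t) for t in declared}

if __name__ == "__main__":
    for site, action in plan.items():
        print(site, action.value)
```

With a million distinct license identifiers the accepted set would need a million entries, each one a separate negotiation; with a shared blanket license it needs one.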
All right.
So I do want to follow up on the collective
side. But before we do, let's take a quick 30-second break for a word from our partners.
This podcast is supported by Google. Hey folks, Steven Johnson here, co-founder of NotebookLM.
As an author, I've always been obsessed with how software could help organize ideas and make connections.
So we built NotebookLM as an AI-first tool for anyone trying to make sense of complex information.
Upload your documents and NotebookLM instantly becomes your personal expert, uncovering
insights and helping you brainstorm. Try it at notebooklm.com.
All right. So, Doug, can you walk us through a little bit on the difference maybe between someone,
you know, signing up and implementing RSL and putting it on their website with their own terms
versus this collective, right? Like, are there certain criteria or, you know, is it paid content
only to be a part of this collective? Because that does sound, you know, pretty powerful to be a part
of that group and maybe it also increases your likelihood of ultimately showing up in AI search.
Yeah, look, all that is true. So what we are doing right now is we're saying, please
let us know you exist. Go to the site, RSLcollective.org, a different website for the
collective, and tell us you exist and that you're interested in this.
We will be releasing a standard license agreement. So think of RSL standard as the language,
and you can use that to describe anything you want,
and then think of RSL Collective
as one particular articulation of that language
for one particular agreement that you can join.
And joining is free.
It doesn't cost anything to join.
You can opt out at any time.
It's non-exclusive.
But basically what you're telling us is,
hey, we exist and please go negotiate on our behalf
for compensation for our content.
And then you will get paid when someone uses the content.
So when that shows
up in an AI answer, then you get paid.
Ultimately.
So I do want to talk a little bit about some of your featured supporters, right?
Yeah.
When I first looked at this, I'm like, okay, these are the big names out there, right?
I mean, you have people, you brought up people earlier, you know, Yahoo, Quora, Ziff Davis,
Medium, Stack Overflow, and also Reddit, right?
Because we saw a recent study a couple of weeks ago that said Reddit was actually the most
cited or sourced, kind of,
out of large language models.
I saw 40%.
Yeah, which is such a high amount.
So like how might this work, right?
Because we also, I've heard that some of these, you know,
companies have their own deals, right?
So like, is that like doubling up or, you know,
how does it work with some of these bigger publishers?
And does that mean we're going to see them more in AI search results,
less or the same?
Yeah.
So a few questions in there
to unpack. So yes, we have these big companies and they're joining us for two reasons. One,
for themselves, which makes sense because they're businesses and they're looking out for themselves,
but two, for the open web. They really are leaders in looking forward and saying we need an ecosystem
that's going to survive AI and allow creators of content to get paid. Otherwise, there
will not be content to train AI on and there won't be AI. And we all sort of agree
that AI is a better product. So how does that work, given that, as you said, some
of them have separate deals? Absolutely. And there's nothing, like I said, nothing exclusive
about us. You can go do your own deals. What we believe will happen once we get this critical
mass, which these guys, these big guys have helped us get so far, is a better deal with the AI
companies. And the reason why is because
it's so much easier for them. Instead of having to do 20 deals, even with the big guys or 50 deals
or 100 million deals, you do one deal and you pay only when that content is used. Very similar
to how Spotify works when they license music rights from ASCAP or other collective rights
organizations and they pay out when someone plays a song. But if they don't play the song,
you don't get paid. And it's up to them to determine what's the best use of their product. And they can
use the content that makes their product the best, but just pay for it. That's the idea.
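To make the pay-when-used idea concrete, here is a small Python sketch of one way a collective could split a payment across publishers based on a usage log. The conversation only says that payment follows use and is distributed to content owners; the pro rata formula, the usage log format, and the publisher names are assumptions for illustration, not the RSL Collective's actual terms.

```python
from collections import Counter


def split_pool(pool_cents: int, usage_log: list[str]) -> dict[str, int]:
    """Split a payment pool across publishers in proportion to logged uses.

    Assumed scheme: each log entry is one use of one publisher's content.
    Amounts are whole cents; leftover cents from rounding down go to the
    publishers with the largest fractional remainders.
    """
    uses = Counter(usage_log)
    total = sum(uses.values())
    if total == 0:
        return {}
    # Floor share for each publisher, computed in integer cents.
    payouts = {pub: pool_cents * n // total for pub, n in uses.items()}
    # Hand out leftover cents one at a time, largest remainder first
    # (ties broken alphabetically so the result is deterministic).
    leftover = pool_cents - sum(payouts.values())
    by_remainder = sorted(uses, key=lambda pub: (-(pool_cents * uses[pub] % total), pub))
    for pub in by_remainder[:leftover]:
        payouts[pub] += 1
    return payouts


if __name__ == "__main__":
    log = ["a.example", "b.example", "a.example", "c.example"]
    print(split_pool(1000, log))
```

A publisher whose content never appears in the log receives nothing, which mirrors the "if they don't play the song, you don't get paid" point.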
So you obviously have a very long history of, you know, being in the internet publishing space.
I'm not asking you to get out your crystal ball here, but it seems like fewer and fewer people
are actually visiting websites, right? So I get how RSL and the RSL collective can help
in general because maybe companies and publishing companies can
still make money without people ever going to their website, you know, if they're getting paid,
you know, per crawl, per citation. If there's anything on the agentic side, I think you mentioned that,
right? But what else may still have to happen, right, for even AI models of the future to still
have high-quality human-written content to even train on?
Yeah. Look, first of all, RSL works
with the agentic web. It works with RAG. It works very well with
MCP and NLWeb. The architect of NLWeb was also a co-creator of our standard.
So it works in all these formats.
It doesn't have to be a human readable site.
So you can put it into DRM content.
You can put it into otherwise secure content.
So whatever you're publishing, you can put an RSL license on it.
And that makes sure that you get paid by the big AI companies.
They want this.
Now, they may not be saying that in those words, but they kind of are.
They've said, we need a new protocol.
Sam Altman said, we need a new protocol so this can happen.
In some of the litigation that they're involved in, they've made a statement like,
well, the reason we can't license everything is because it's technically impossible.
And that, to me, makes me think of the early days of music on the web.
Napster was taking everybody's music.
And the entire music industry was very worried because it was a better product.
You didn't have to buy a CD for 20 bucks and 12 songs you didn't care about.
You could make a playlist.
You could have just the song that you wanted.
But it was not going to be sustainable in that model because they weren't paying for anything.
And to Apple's credit, they went to ASCAP and said, we need a license structure where we can actually license all the music and pay for it.
And they created the extension of ASCAP that works for streaming, for digital streaming platforms.
We're doing the same thing for internet content.
It's exactly the same thing.
Hey guys, you're getting sued. Anthropic just lost, you know, settled for $1.5 billion.
I'm sure you covered that.
Yep.
Yep.
So that's not the path to go down.
The path to go down is license everything you want.
You build your product the way you think it makes sense for your consumers,
but then when you use content, pay for that content, just for the ones you've used.
That's how it would work.
So obviously a lot of momentum on the publishing partner side.
It seems, like you said, right, if the
big tech companies are saying we want something like this, it'll make it easier.
Can you share with us maybe a little bit?
Has there been any reception or momentum on the other side yet?
And what might that eventually look like?
Yeah.
Well, I mean, we're, as I said, we're a nonprofit.
And we are governed by what we call our publisher steering committee.
So we are, that group is determining what the license terms are going to be.
We have received interest,
and I can't say more than that.
We have received interest from some of the major LLM companies,
but we're putting together what those license terms are
with the perspective of our publisher steering committee
because we're a nonprofit.
So I expect that in not too long, we will have our first deal.
But again, if this is interesting to anybody out there,
please come to RSLcollective.org and just click join
and we'll make sure to include you when we have those
deals.
Is, if this works out, right, because in my mind, I'm like, this seems perfect.
I've been saying for, for multiple years, I'm like, publishing companies have three choices.
You can either try to sue the big AI labs. You can partner or, you know, something now that this
exists, the RSL collective, or you can maybe just go out of business, especially if, if you
rely on people going to your website, you know, signing up for your email, you know, buying things, right?
But is this, if it works out exactly how you're envisioning this?
Is this something that can both protect the open web?
It can protect the humans writing for all of these organizations and still, on the other
end, guarantee higher quality human-led training data?
Absolutely.
And let's talk about that on the other side.
So everything you said, I completely agree with.
But there's one point that I think hasn't been covered as much, which is the benefits for
the AI companies, not just for licensing, which is
great, so they have legal access to this content. But what happens when you have legal access
to the content? You can start using it in a way you can't use it now. So they're spending a lot,
billions of dollars, to mash up different sources of content that they've scraped so that they can
claim that they're not copying it and reusing it. So what happens when you do that? Billions of
dollars in compute costs and subpar answers that are prone to hallucinations.
Instead, if they have access to content that actually answers a question, just give that content.
You don't have to mash it up, spend billions of dollars, hurt the planet.
You don't have to, right?
You don't have to worry about hallucinations.
You're like, oh, we have the perfect article for you.
It already exists.
It's written by a very knowledgeable human.
And here it is because we have the rights to give you that.
So the quality is going to improve when they start licensing this.
The cost will go down, not just the transaction cost, but the actual compute
costs will go down. And so the ecosystem benefits on both sides. Creators get paid and AI companies
get a lower cost of higher quality content.
So it's one of those things that almost sounds too good to be
true, right? Like, but what are the challenges still, right? Because again, when you
lay this out for me, Doug, I'm like, this sounds great. It's a win for
publishers, a win for AI companies. In the long run, it's a win for them, right? And it's a win for
consumers using all the large language models. Where's the short straw, right? Like,
where's the downside to all of this?
I really don't, I think it's, we're in the initial phases.
We just launched, like, less than two weeks ago. And we're just getting an incredible momentum.
And it's fantastic. We have to build that group up so that the AI companies will say, this makes
sense for us to talk to you about this, because we're trying to save our costs of reaching everybody.
And if you can do that, that's fantastic. So the big
hurdle we have to go through is getting people signed up. But again, it's free. It's non-exclusive.
And one other point, by the way, you talked about legal enforcement and it's like sue us if you
don't like it has been their approach. Well, again, modeled after ASCAP, collective enforcement
is part of this. So if you join us and you start getting infringed or scraped, the whole
collective can battle in court for you instead of just you. So this is again,
a model that ASCAP and other collective rights organizations have used for over 100 years.
But instead of just saying, oh, you don't like it, we have billions of dollars and you have
whatever you made last week, lots of luck in court,
you now have the entire industry behind you saying no.
If you hurt them, you hurt all of us.
All right.
So, Doug, we've been over a lot on today's conversation.
But as we wrap up, what's maybe the one most important takeaway that you think it's important for our audience to hear when it comes to just licensed
content and the future of large language models?
What do people need to leave this conversation with?
I think they need to leave it with hope because that's where I am.
The LLM companies and the AI companies are understanding that in order for this to be sustainable,
they have to start paying for it.
And they need a mechanism to do that in an efficient way and we provide that.
So there's an exciting moment here where we get a better product than we've ever had before.
it's more usable and it supports the people that help create it.
Love it. Love it. Can't wait.
And Doug, you know, when you're ready to announce that first big partnership, you know, come back and talk with us.
We'd love to have you back.
But Doug, thank you so much for taking time out of your day to join the Everyday AI show.
We really appreciate it.
Thank you.
All right.
If you missed anything, y'all, don't worry.
It's all going to be in our newsletter.
A lot of URLs, a lot of information.
It's all going to be in the newsletter.
So if you haven't already, please go to YourEverydayAI.com.
Sign up for that free daily newsletter.
We'll see you back tomorrow and every day for more everyday AI.
Thanks, y'all.
Meet Firefly AI Assistant.
Now live in Adobe Firefly, the All In One Creative AI Studio.
Just describe what you want to create in your own words and the assistant handles the rest,
orchestrating multi-step workflows across Adobe Creative Cloud apps,
including Photoshop, Premiere Express, and more in one conversational interface.
You direct the outcome while the assistant accelerates execution.
Stay in control with the ability to step in and refine at any time.
See it today at firefly.adobe.com.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com
and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
