The AI Daily Brief: Artificial Intelligence News and Analysis - Can AI Detect AI? And Other Frequently Asked Questions
Episode Date: October 14, 2023
NLW excerpts Professor Ethan Mollick's recent essay "What people ask me most." https://www.oneusefulthing.org/p/what-people-ask-me-most-also-some Today's Sponsors: Listen to the chart-topping podcast ...'web3 with a16z crypto' wherever you get your podcasts or here: https://link.chtbl.com/xz5kFVEK?sid=AIBreakdown TAKE OUR SURVEY ON EDUCATIONAL AND LEARNING RESOURCE CONTENT: https://bit.ly/aibreakdownsurvey ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're reading what is actually a very useful set of frequently asked questions about artificial intelligence.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to breakdown.network for more information about our Discord, our YouTube channel, and our newsletter.
Hey, hello, friends. Welcome back to the AI Breakdown.
It is Saturday, which means it's time for another long read.
Today we are once again in Professor Ethan Mollick's world.
His blog is called One Useful Thing. You can find it at oneusefulthing.org. And this week he published
a piece called What People Ask Me Most, also some answers, an FAQ of sorts. Here's the setup.
Ethan writes, I've been talking to a lot of people about generative AI, from teachers to business
executives to artists to people actually building LLMs. In these conversations, a few key questions
and themes keep coming up over and over again. Many of those questions are more informed by viral news
articles about AI than by the real thing. I can't blame people for asking because, for whatever
reason, the companies actually building and releasing large language models often seem allergic
to providing any sort of documentation or tutorial besides technical notes. I was given much better
documentation for the generic garden hose I bought on Amazon than for the immensely powerful
AI tools being released by the world's largest companies. So it's no surprise that rumor has been
the way that people learn about AI capabilities. Now, Ethan caveats that of course he doesn't
have perfect information, that he makes mistakes, that things are changing fast, but that he will do
his best to answer these most common questions. So what we are going to do is we're going to go
through the essay section by section, question by question, and in some places I'll give Ethan's
exact answer word for word, in other places I'll sum it up, and where I think it's relevant,
I'll add my own little notes. His first section is about detecting AI. And interestingly,
in his disclaimer, although he said take all of his other answers with a grain of salt,
he said on this point about AI detectors, he is absolutely sure that he is correct. So, with that
in mind, first question, can you detect AI writing? Ethan gives simply a one-word answer. No. Question.
But what about AI writing detectors that claim to do that? Ethan writes, AI detectors don't work.
To the extent that they work at all, they can be defeated by making slight changes to text.
And what might be worse, they have high false positive rates, and they tend to accuse people of using
AI when they don't use AI, especially students for whom English is a second language.
The falsely accused have no recourse because they can't prove they didn't use AI.
You can't detect AI writing automatically.
Even OpenAI says you can't.
The question continues, but I am sure I am really good at detecting AI writing myself.
Look, I am going to cut you off here.
You might think you are good at detecting AI writing, but you are just okay at detecting
bad AI writing, and you combine that with your own biases and heuristics about who
might be using AI.
I'm sure the teachers who know their students well can guess at who might be cheating
as they always could, but you are going to miss a lot of cheaters
who are doing it more subtly, which is a problem of fairness in and of itself. I hate to say it,
but homework as we know it is over. We educators are going to have to adjust. There are plenty of
paths forward, but it is not going to include cheat-proof homework. Now, I want to hang on this point
for a moment because I share Ethan's conviction here, on this idea that homework as we know it
is over. Now, yes, I know there are certain domains of knowledge where, of course, AI can't
solve everything, but I think that the broader idea is that there is an absolute, unescapable
paradigm shift here, in which these tools now exist and the genie is not going back in the bottle.
The entire impulse to have AI detection software, specifically for writing and especially
in the educational use case, is frankly a clinging to an old world which simply doesn't
exist anymore. The good news is that homework the way that traditional schools have done it
isn't some perfect invention that's inextricable from human learning. It's just
a convention that we happen to have adopted. But I think it's an interesting example of how there
are all these things that have come to be totally accepted as normal, that are just basically
null and void in the era of artificial intelligence. But there's still a little bit more in Ethan's
section on detecting AI, so let's move back to his essay. Question, what about AI generated images?
Ethan writes, while there are more techniques to detect AI images, they are already very hard
to identify just by looking, and in the long term, likely impossible. All the hints you think you know,
like bad hands or fingers, are no longer true. So once again, Ethan is making the point that in a world
where we can't easily identify what has been created by AI, we have to change our assumptions
to begin with the fact that anything that we're consuming might have been created with AI.
Ethan's next section is called using AI. His first question is, who knows how to best use AI to
help me with my work? Ethan writes, I have good news and bad news. The answer is probably nobody.
That is bad news because there is no instruction manual out there that will tell you how to best apply
AI to your job or school, so there is really no one to help you get the most out of this tool,
or to teach you to avoid its specific pitfalls in your area of expertise. This can be challenging
because AI has a jagged frontier. It is good at some tasks and bad at others in ways that are
difficult to predict if you haven't used AI a lot. The good news is that by using it a lot,
you can figure out the best way to use AI. Now, once again, this is an area where I agree wholeheartedly
with Ethan, but I also personally think that this is likely to change fairly quickly. I know,
for example, that I am churning on ways to better help people figure out how to use AI to help them
with their school and their work. I think it's a fascinating frontier, an incredibly important thing
to spend time on, but I think that Ethan's answer that right now there's not really anyone that
you can easily point to is correct. So what then does Ethan recommend in terms of ways to get good
at using AI? Simply put, he recommends using it. Get your hands on GPT-4 or Google Bard or Anthropic's
Claude. Then he says, use it to do everything you are legally and ethically allowed to use it for.
Generating ideas? Ask the AI for suggestions. In a meeting, record the transcript and ask the
AI to summarize action items. Writing an email, work on drafting it with AI help.
Now, one really important thing that Ethan notes here, on top of what is already good advice,
is that this is a never-ending process. Just when you think you understand how a particular modality
of artificial intelligence works and you figured out what you can use it for,
something new comes out that changes that dramatically. The example that he gives is the difference
between something like Midjourney, where you have to learn over time with lots of trial and
error the perfect prompts to get what you want, versus the new DALL-E 3 tool that's embedded
in ChatGPT. As Ethan describes, DALL-E 3 works very differently than other previous AI image
tools because you tell ChatGPT4 what you want and the AI decides what to create. For example,
he fed it the entire article that I'm excerpting now and asked it to come up with its own
ideas for what illustration would be good cover art.
It came up with an image of a stage that had on one side AI myths and on the other AI reality.
Last in the using AI section is the question, I found something AI can't do.
Does that mean that it is outside the jagged frontier?
Ethan writes, maybe, but I wouldn't feel too certain that a capability is outside the realm of
AI until you have spent some time with different approaches.
And if it truly is impossible for AI to do, wait a few months and try it again when a new
model comes out.
And now a word from today's sponsor.
Are you interested in how two top-of-mind trends, AI and crypto, can work together?
If so, I have the perfect podcast recommendation for you.
web3 with a16z crypto, the chart-topping show brought to you by venture firm Andreessen Horowitz.
web3 with a16z crypto is your definitive resource for the future of the internet,
whether you're already building in these spaces or simply curious about what's next.
If you need a place to start, they recently released an excellent episode with Stanford cryptography professor Dan Boneh
and former Google Xer Ali Yahya in conversation with host Sonal Chokshi about the intersection of AI and
crypto. From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all.
I highly recommend checking it out, especially if you'd like to learn more about how AI and
crypto will impact our everyday lives. Beyond crypto and AI, this show is for creators seeking
more ways to truly own their work, for business leaders trying to prepare for the future today,
and for innovators exploring trending tech topics. So go ahead, listen to web3 with
a16z crypto, wherever you get your podcasts.
All right, so far we've been through detecting AI and using AI, and I think that you can
probably see why I decided to share some of this piece. It's got a lot of utility packed in
word for word. And so now we move on to the policy stuff, to use Ethan's phrase. Here we have
an additional disclaimer and caveat that Ethan is not a lawyer, and that the perspective is coming
from someone who just watches these issues quite a bit. Question one, our company won't let us use
AI because we don't want our data stolen. Is that right?
In response to this, Ethan agrees that it is a challenge right now that all of the big
LLMs haven't been particularly forthcoming about what materials their models were trained
on, in part because that might have been copyrighted material.
And yet he writes,
The privacy issue that many people talk to me about is likely less of a barrier than you
think.
As a default, AI companies say they may use your interactions with their chatbots to refine
their model, though it is extremely hard to extract any one piece of data from the AI,
making direct data leaks unlikely.
But it is relatively easy to get more privacy.
Individual users of ChatGPT can turn on a privacy mode where the company says they will not retrain or train AI on your data.
But large organizations have even more options, including HIPAA-compliant versions of the major AIs.
All of the big AI companies want organizations to work with them, so it is not surprising that all of them are eager to offer data guarantees.
Now, Ethan's conclusion is that the short answer is that data privacy is probably not as big a concern as it might seem at first glance,
but here I'm going to depart slightly.
First of all, when it comes to individual consumers, we unfortunately, I think,
have had far too many times where terms of service and expectations about data privacy
were broken or stretched to the limit to really be comfortable just believing that a little switch
on ChatGPT guarantees that what we're saying isn't going to get back to
humans on the other side of the system. A few years ago that may have sounded conspiratorial,
and ultimately there's not much more that a company can do than pledge that they're not going to
use data and say yes, this button turns it off, but that isn't necessarily going to increase trust
a priori. I think that there's going to be a big trust barrier that remains, in other words.
When it comes to companies, I think Ethan is absolutely right that all of the big AI model
companies are trying to give enterprises assurances that their data is safe, but we are certainly
seeing a situation in which many of those enterprises are choosing to either, A, customize their
own solutions because it's the most controllable environment, or B, work with vendors that they
already trust with their data rather than looking to younger, perhaps more risky startups.
It's a really interesting dynamic, and one that I think is going to continue to shape how the field develops.
The second question in this section is, what's the deal with copyright and AI?
Ethan writes, as I understand it, current U.S. copyright rules around AI material are sort of unclear and in flux.
However, large AI companies seem eager to assure their customers that using their AI output commercially is safe.
For example, Adobe and Microsoft offer legal guarantees that if you are sued over the output of their AIs,
they will protect you at least under some circumstances. But also remember that legal use isn't
always going to be ethical use, especially as we consider cases where AI work displaces human labor
or produces art in the style of a living artist. The thing that I like best about this answer
is that it really does break apart two separate things. One is the legality of a situation,
which is going to be fought out in cases that I'm sure will go all the way up to the Supreme Court,
and the other is societal expectations and norms, which are going to be a much different
and frankly messier fight that plays out over years in the Court of Public Opinion.
The final section in Professor Mollick's FAQ: The Future.
Question. Aren't AIs like GPT-4 getting worse with time? Anyone who was on Twitter over the summer
will have seen this idea quite a bit, and what's more, it seemed to be confirmed by an academic
paper. But Ethan says not so fast. He writes, no, this turned out to be an incorrect conclusion
from a working paper examining the performance of AI on certain math problems. Professor Arvind Narayanan
and Professor Sayash Kapoor found that AI models are not getting worse at these sorts of problems,
but they are changing, which alters the way you need to prompt the AI. You see,
what you call GPT-4 or Bard or Bing today is not the same thing as what Bard or Bing or GPT-4
was a few months ago. Models are continually getting additional training and tuning that improves
performance in some ways while also changing behavior in others. It is part of why it is so hard
to treat AIs like normal software, and sometimes easier to treat it like a person even though it isn't.
Next question. Won't AI development grind to a halt as the internet fills with AI data or as it
runs out of data to train on? Ethan writes, I hear this a lot. It may be true. Some papers argue that
we will be out of training data in the next decade or two, or even by 2026 if we restrict ourselves
to high-quality data, and another paper suggests that AI models will indeed start to struggle
as the web fills up with AI content. But many computer scientists argue that neither of these
are actually long-term problems, and offer various solutions, including ways of training AIs on data
that the AI makes up. Ultimately, these issues are unlikely to stop LLMs from improving over the
next couple years, which I think is what people are really asking when they ask me this question.
If you listen to my show on the State of AI report a couple days ago, you might have heard me talk about this particular issue and how I think it is going to be a bigger source of discussion in 2024.
I think Ethan is probably right that what people who are asking him that are asking about is the immediate term.
But I think that the longer term is a really fascinating question with a lot of disagreement among very smart people.
In that State of AI report episode, I referenced documents that came out when Llama 2 was launched that seemed to suggest that an unreleased version that had been trained
exclusively on AI synthetic data had actually outperformed all the other models.
If that's true, it obviously has huge implications for this discussion.
And so, at least for me, I am eagerly waiting to hear more about that, if and when Meta decides
to share it.
Finally, let's get to the last question in this essay.
How good does AI get?
Ethan writes the only reasonable answer, honestly, I have no idea.
He then continues, and I suspect no one else does either, given the debates among prominent
AI experts. Right now, models get better as they get larger, which requires more data and more
computers and more money. At some point, technical, economic, or regulatory limits are likely to
kick in and slow the advance of AI. But at the same time, there is a lot of experimentation on how to make
smaller models perform like bigger ones, and similar experiments on how to make larger models perform
even better. I suspect there is a lot of room left for rapid improvement. What all of this means
is absolutely unclear. Do we reach the feared slash longed-for level of artificial general intelligence,
where AIs are smarter than humans? Thus, depending on who you ask, creating a machine that will start
saving, killing, or ignoring humanity? Do we, quote-unquote, just get order of magnitude improvements
in AIs that are already performing at high human levels on many tasks? Do AIs stop improving quickly?
There is no clear consensus, which, uncomfortably, means that we should be thinking about
all three scenarios. The only thing I know for sure is that the AI you are using today is the
worst AI you are ever going to use, since we are in for at least one major round of AI advances
and likely many more.
I will leave on this note as well.
One of the reasons that you hear such diverse perspectives on the AI safety debate on this show
is that I, like Ethan, apparently, think that the only reasonable answer to a complete unknown
and frankly unknowable is to create space for all different possibilities, to treat them as
serious and as worthy of consideration and exploration.
Unfortunately, the debate around future AI outcomes, particularly AGI, has calcified almost
immediately into a brittle, caustic battle, in which both sides view each other with extraordinary
skepticism. And of course, in this, I'm even leaving out the people in the middle who want us to be
focusing on AI risks, but different ones than the extinction risk question. I think it is possible
to keep all of these different perspectives in mind and proceed accordingly. But it takes a lot of
will and consideration. And so hopefully having a lot of different perspectives on it, at least
helps give you guys the tools to make up your own minds. In any case, one more big thanks to Professor
Ethan Mollick for writing this essay. Again, check out his blog at
oneusefulthing.org.
And until next time, peace.
