Science Friday - How Are AI Chatbots Changing Scientific Publishing?
Episode Date: September 25, 2024
Since ChatGPT was released to the public almost three years ago, generative AI chatbots have had many impacts on our society: They played a large role in the recent Hollywood strikes, energy usage is spiking because of them, and they’re having a chilling effect on various writing-related industries.
But they’re also affecting the world of research papers and scientific publishing. They do offer some benefits, like making technical research papers easier to read, which could make research more accessible to the public and also greatly aid non-English speaking researchers.
But AI chatbots also raise a host of new issues. Researchers estimate that a significant amount of papers from the last couple years were at least partially written by AI, and others suspect that they are supercharging the production of fake research papers, which has led to thousands of paper retractions across major journals in recent years. Major scientific journals are struggling with how to set guidelines for generative AI use in research papers, given that so-called AI-writing detectors are not as accurate as they were once thought to be.
So what does the future of scientific publishing look like in a world where AI chatbots are a reality? And how does that affect the level of trust that the public has with science?
Ira Flatow sits down with Dr. Jessamy Bagenal, senior executive editor at The Lancet and adjunct professor at University of North Carolina at Chapel Hill, to talk about how generative AI is changing the way scientific papers are written, how it’s fueling the fake-paper industry, and how she thinks publishers should adjust their submission guidelines in response.
Transcripts for each segment will be available after the show airs on sciencefriday.com. Subscribe to this podcast. Plus, to stay updated on all things science, sign up for Science Friday's newsletters.
Transcript
Tools like ChatGPT are changing how scientific publishers judge research papers.
We're in the business of text. We're in the business of language. And so there's no part of our work that could not be disrupted by this new tool.
It's Wednesday, September 25th, and you're listening to Science Friday.
I'm SciFri producer D Peterschmidt. Since it came out almost three years ago, ChatGPT and other generative AI chatbots have changed how we think about the role artificial intelligence plays in all walks of life.
They played a huge role in the recent Hollywood strikes, energy usage is spiking because of them,
and they're having a chilling effect on various writing-related industries.
But their effects are coming into clearer view for another industry, scientific publishing.
So how are scientific journals navigating this new environment?
Here's Ira Flatow with an editor at a top scientific journal to answer these questions.
Here to talk about the effects these chatbots are having on scientific publishing is my guest, Dr. Jessamy Bagenal,
senior executive editor at The Lancet, physician, and adjunct professor at the University of North Carolina at Chapel Hill.
Welcome to Science Friday.
Hi, thanks so much for having me on.
You're welcome.
Okay, tell us, where does this start for you?
Tell us the first time you saw an example or an article that showed the power that AI chatbots could have on scientific publishing.
Well, it's been a couple of years now, but obviously it's a rapidly moving field,
and I've come at it from an editor's point of view, but also from a clinician's point of view
and how we think about evidence and how we think about knowledge. I followed the story very closely
after its initial launch. And I think some of the things that I found most striking were
those original small studies where, for example, trained researchers would look at abstracts
that had been generated by a researcher and abstracts that had been generated by a generative AI,
a large language model. And for the most part,
they couldn't tell the difference. I think that study came about perhaps, you know, within the
first six months of ChatGPT first being sort of released onto the market. And it was very clear
that if experienced researchers aren't able to tell the difference between a generative AI abstract
and one that has been written by their colleagues, then this is a really big problem for us.
Is that because they lack the effective tools to detect AI generated content?
I mean, that's right to an extent that we don't have any effective tools that will reliably and sensitively pick up when generative AI has been used.
But I was sitting on a panel recently, you know, this is a huge discussion within the field.
And in fact, my colleague, who's our deputy editor, made a joke the other day that we'd had an agenda item on a meeting for generative AI.
And she said, you know, we always know that we need at least 45 minutes to discuss anything about
generative AI because editors, researchers, you know, we're very alive to this topic and
thinking about the best way that we can use it all the time. But a colleague of mine on this
panel was saying that, you know, we are, we're in the business of text. We're in the business
of language. And now we have this amazing tool which can generate language. But there's
actually no part of our value chain that might not be disrupted by this innovation.
You know, peer review is done for the most part through the written word.
Articles are still written in a way that, you know, hasn't changed for a very long time.
All of this is based on text, on language. And so there's no part of our work that could not be
disrupted by this new tool.
Interesting. Now, when you say disruption, I hear you speaking about this as being negative.
No, I don't think it's negative.
No, you don't?
I don't think it's just negative,
but it has to be thoughtfully and sensitively implemented.
And that's challenging because it's a very rapidly moving field.
And we're all just getting up to speed with how people are using it and what appropriate use looks like.
So for example, at the Lancet, we implemented a new tick box about six months ago where we ask authors at the submission stage,
have you used generative AI in any part of this study?
And if they tick yes, then we ask them how.
And then in our editorial manager, we have a little sort of red A, which appears to alert
editors to the fact that generative AI has been used in some way in this manuscript.
And then we're able to follow, you know, sort of external widespread policies on how generative
AI should be acknowledged in a manuscript.
But these things are changing all the time.
And I think there's a huge opportunity for generative AI
to be a great positive influence on scientific publishing, but there are also dangers.
And so it has to be very carefully thought about.
I understand that.
You know, I have been reading research papers for decades, and I'm always struck by how
poorly some of them are written.
Not that the data is bad.
And I'm thinking, if you unleash AI on this, maybe you can get a better-written
narrative here going.
Would that be some positive?
Definitely, that's one positive.
And I think from an inclusion and diversity point of view, we still transmit so much knowledge through the English language, which excludes an enormous amount of people because English is not their first language.
We're very lucky at the Lancet to have a very large team of internal assistant editors who make everything into Lancet style.
But that's not the same across the scientific publishing field in many journals.
There isn't that internal expertise.
And so actually if you submit something which isn't written particularly well, then of course that will impact the likelihood of whether that editor decides to send it out or not because they might have problems understanding what the actual research is saying.
So I think there's an enormous opportunity there to make it more inclusive, make scientific publishing more inclusive and fairer.
And I think also when we're thinking about people who might be neurodivergent, I've got lots of clinician friends who are dyslexic and actually,
being able to use large language models to help them structure sentences and how to structure an article
is a very efficient way of them being able to articulate themselves and their ideas
in what most people consider a sort of socially acceptable manner.
Yeah, because most people think of AI as cheating, but you're not talking about that here,
and you're pointing out the positive aspects.
And when you talk to scientists who do use AI, generative AI, what do they say about using, like,
ChatGPT to help write their research papers?
I think scientists and clinicians across the world are for the most part doing amazing work
under incredibly stressful situations and they're often overloaded with work.
You know, their to-do lists are extraordinarily long.
And so having ChatGPT as an efficiency tool, which can allow them to, you know, put together an article very quickly or might allow them to write a cover letter in a more compelling manner.
And from our point of view, from a scientific publisher's point of view, generative AI that might be able to allow our submission process to be easier for authors and for us as editors to be able to interact
with them in a kind of more slick and easy fashion, that type of efficiency could have real benefits,
you know, to patients and to people's lives and to scientific progress.
I know that since ChatGPT came out, the major journals have provided some guidelines and
policies to researchers about the use of generative AI in papers.
But it also seems like a pretty messy landscape right now.
I mean, are these guidelines standard across all the journals and research papers?
Well, we obviously have external bodies which bring together a number of different journals.
So, for instance, the ICMJE, which sort of brings together lots of medical editors and journals.
They have published guidance.
And so any journals that are sort of signed up to them also tend to take on some of their guidance for generative AI.
And equally, organizations like COPE, which help
with editorial guidance for journals, also have their own sort of set of guidelines.
So I think it's right that there are these external sort of benchmarking places which are
releasing, you know, sort of loose guidance. But obviously each journal is different. Each journal has a
different topic. They have different article types. They have different things that they're trying
to do with those journals. And, you know, let's not forget that journals are actually very
human endeavors. They are how we as humans interpret scientific progress and how we put it into
context and what that means for either patients or for scientists and for that field. And so I also
think that it's right that each journal should be very clear on how they want generative AI to be
used. So for example, at the Lancet, we have a section which includes commentary,
correspondence, perspectives, art of medicine.
And this is an area of our journal
which really requires human interpretation.
And so we're in the process at the moment
of thinking about the fact that we would like to limit
the use of generative AI for this section
because we feel passionately that human ingenuity
and putting things into context
and being able to see what's new,
not just trawling through what's on the internet
and putting together what sounds
good about a particular topic, but actually expertise, experience, and vision.
We are thinking about limiting the use of generative AI in that section to just using it
for English grammar and spelling, so that we're not excluding people who don't speak English.
Of course, the elephant in the room here is paper mills, and I'm not talking about factories
that make paper. Can you explain what those are and why they're such a big issue?
Yeah, so paper mills are sort of nefarious organisations that essentially have understood the scientific publishing landscape and are gaming it and selling authorship for manuscripts that often are filled with nonsense.
And so you may all have heard of mass retractions from different publishers having to retract articles that essentially were not based in any scientific fact and were not really science, but often complete nonsense.
And they're a huge problem.
They're a problem for publishers.
But in the wider context, they're a problem for science because this, you know, really
breaks down the trust.
This is phony.
We're talking about phony papers here.
We're talking about phony papers.
So it could literally be a manuscript about complete nonsense where the results are fabricated,
the context is fabricated.
And authorship is sold to academics for these publishers.
And so they've kind of got into the editorial process by,
perhaps having guest editors, they've manipulated the peer review process. And they're an enormous
problem for the scientific publishing world. And so there's a real question there as to how,
in the context of generative AI, how as editors do we make sure that what we're reading is real?
Yeah. You know, this may sound like a crazy question, but why not use, if they're writing it
in ChatGPT or AI, why not use ChatGPT or AI to find them, you know, to weed out some of these
papers?
That's exactly right. This is a big data problem. And I think Elsevier, which is the
company that owns The Lancet, is putting in an enormous amount of resources and effort into thinking
about research misconduct and research integrity in the context of big data. You know, how can we
use some of these patterns across many, many different papers to be able to pick out what's real and
what's not real. But in the larger context, in some ways, generative AI will sort of turbocharge that,
right? Because you're able to very quickly put together a nonsense manuscript that looks and sounds
like it should be published, but actually might be about nothing. On the other hand,
you know, the paper mill business model is based on people paying for authorship. And actually,
if people at home on their own can put together this type of paper, why would they pay a paper mill
to do it for them? They might just do it themselves. So I think, you know, this is a huge problem and one
that I know a lot of people are thinking about very seriously.
And some of the solutions,
potential solutions here, can you offer any?
I mean, I think they lie in sort of big and small
changes. And for the most part, they're probably going to be pretty costly and difficult.
I think a major step is to recognize that over the past decade, two decades,
we've had a major trend towards the open science movement,
which nobody can disagree with from an ethical or moral standpoint.
We all want science to be accessible and available to everybody.
But in reality, what that's meant from a scientific publishing business model aspect,
it's meant that authors pay to get their articles published open access,
and so there has been a focus on quantity over quality.
And I think that that's rapidly adjusting and changing.
And many scientific publishers are changing the way that they're thinking about that.
So we need some better business models.
We need some other ways of thinking about open access in the context of generative AI.
And then I think, you know, another major step is thinking about the environment within which we all work.
You know, there is a serious problem with academic environments which often reward, you know, the quantity that an academic has published over the quality that they've published.
Publish or perish.
Yeah, exactly.
Publish or perish.
So there's an incentive there to publish, publish, publish, regardless of whether it might constitute a bit of research waste in terms of does this question really need to be answered.
Has it already been answered?
But then on the other end of the scale, there is an incentive there
to try and get things published, which aren't necessarily adding to human health or to scientific progress.
Dr. Bagenal, you write about the steps necessary to take and where you think this might be going.
How well is this progressing? How well are we getting toward the goals that you talk about?
I think when there's been any huge innovation in technology, there's always a bit of a policy gap
between people trying to catch up with what's happened and create policies and ways of
working which will adapt to this huge new innovation. And that's certainly what we're seeing now.
You know, it's been a couple of years since ChatGPT. And only really now, I think, are journals and
editors really getting up to speed with the types of things that we might need. But also, because
large language models are incredible tools with the ability to improve all the time. And we've
seen that with the versions that have already come out. Each time, there's an improvement in how
they are performing. We need to be very flexible and adaptable to change to those different things
because we might start seeing more hallucinations. There was a very interesting paper in Nature
a couple of months ago about the fact that large language models are, you know, when they come
to the end of what's already been published, what's already on the internet, how do they get new data
and what happens if you use synthetic data? And actually, for the most part,
it looked like those models almost completely fell apart.
They stopped being able to work.
So, you know, there are lots of issues that are going to become clear over the coming months
that we'll need to be very alive to and be able to adapt to.
But at the moment, I think we are certainly at the Lancet,
and I know many other journals, spending an awful lot of time thinking about this.
We are implementing practical, tangible policies,
which are meant to be able to improve the process for authors,
but also to make the content that we publish very high quality and very useful and usable for our readers.
So you have to keep up with it as ChatGPT gets better. You have to.
We have to get better. Exactly. Exactly. We must.
Very interesting stuff. We're going to keep track of all of this. Thank you very much for taking time to be with us today, Dr. Bagenal.
No problem. It was lovely to chat to you.
Dr. Jessamy Bagenal, senior executive editor at The Lancet and adjunct professor at the University of North Carolina at Chapel Hill.
And that's all the time we have for today.
Lots of people help make the show happen, including
Kathleen Davis, Diana Plasker, Beth Ramey, Emma Gometz.
On tomorrow's episode, former National Institutes of Health Director Francis Collins
talks about his new book and what it was like being on the forefront of scientific discovery
for 40 years.
I'm SciFri producer D Peterschmidt.
See you then.
