Science Friday - How Are AI Chatbots Changing Scientific Publishing?

Starting point is 00:00:03 Tools like ChatGPT are changing how scientific publishers judge research papers. We're in the business of text. We're in the business of language. And so there's no part of our work that could not be disrupted by this new tool. It's Wednesday, September 25th, and you're listening to Science Friday. I'm SciFri producer Dee Peterschmidt. Since it came out almost three years ago, ChatGPT and other generative AI chatbots have changed how we think about the role artificial intelligence plays in all walks of life. They played a huge role in the recent Hollywood strikes, energy usage is spiking because of them, and they're having a chilling effect on various writing-related industries. But their effects are coming into clearer view for another industry, scientific publishing. So how are scientific journals navigating this new environment?

Starting point is 00:00:53 Here's Ira Flatow with an editor at a top scientific journal to answer these questions. Here to talk about the effects these chatbots are having on scientific publishing is my guest, Dr. Jessamy Bagenal, senior executive editor at The Lancet, physician, and adjunct professor at the University of North Carolina at Chapel Hill. Welcome to Science Friday. Hi, thanks so much for having me on. You're welcome. Okay, tell us, where does this start for you? Tell us the first time you saw an example or an article that showed the power that AI chatbots could have on scientific publishing.

Starting point is 00:01:28 Well, it's been a couple of years now, but obviously it's a rapidly moving field, and I've come at it from an editor's point of view, but also from a clinician's point of view and how we think about evidence and how we think about knowledge. I followed the story very closely after its initial launch. And I think some of the things that I found most striking were those original small studies where, for example, trained researchers would look at abstracts that had been generated by a researcher and abstracts that had been generated by a generative AI, a large language model. And for the most part, they couldn't tell the difference. I think that study came about perhaps, you know, within the

Starting point is 00:02:07 first six months of ChatGPT first being sort of released onto the market. And it was very clear that if experienced researchers aren't able to tell the difference between a generative AI abstract and one that has been written by their colleagues, then this is a really big problem for us. Is that because they lack the effective tools to detect AI-generated content? I mean, that's right to an extent, that we don't have any effective tools that will reliably and sensitively pick up when generative AI has been used. But I was sitting on a panel recently, you know, this is a huge discussion within the field. And in fact, my colleague, who's our deputy editor, made a joke the other day that we'd had an agenda item on a meeting for generative AI. And she said, you know, we always know that we need at least 45 minutes to discuss anything about

Starting point is 00:03:00 generative AI because editors, researchers, you know, we're very alive to this topic and thinking about the best way that we can use it all the time. But a colleague of mine on this panel was saying that, you know, we are, we're in the business of text. We're in the business of language. And now we have this amazing tool which can generate language. But there's actually no part of our value chain that might not be disrupted by this innovation. You know, peer review is done for the most part through the written word. Articles are still written in a way that, you know, hasn't changed for a very long time. All of this is based on text, on language. And so there's no part of our work that could not be

Starting point is 00:03:44 disrupted by this new tool. Interesting. Now, when you say disruption, and I hear you speaking about this as being negative. No, I don't think it's negative. No, you don't. I don't think it's just negative, but it has to be thoughtfully and sensitively implemented. And that's challenging because it's a very rapidly moving field. And we're all just getting up to speed with how people are using it and what appropriate use looks like. So for example, at the Lancet, we implemented a new tick box about six months ago where we ask authors at the submission stage, have you used generative AI in any part of this study? And if they tick yes, then we ask them how.

Starting point is 00:04:28 And then in our editorial manager, we have a little sort of red A, which appears to alert editors to the fact that generative AI has been used in some way in this manuscript. And then we're able to follow, you know, sort of external widespread policies on how generative AI should be acknowledged in a manuscript. But these things are changing all the time. And I think there's a huge opportunity for generative AI to be a great positive influence on scientific publishing, but there are also dangers. And so it has to be very carefully thought about.

Starting point is 00:05:04 I understand that. You know, I have been reading research papers for decades, and I'm always struck by how poorly some of them are written. Not that the data is bad. And I'm thinking, if you unleash AI on this, maybe you can get a better-written narrative going here. Would that be a positive? Definitely, that's one positive.

Starting point is 00:05:21 And I think from an inclusion and diversity point of view, we still transmit so much knowledge through the English language, which excludes an enormous number of people because English is not their first language. We're very lucky at The Lancet to have a very large team of internal assistant editors who make everything into Lancet style. But that's not the same across the scientific publishing field. In many journals, there isn't that internal expertise. And so actually if you submit something which isn't written particularly well, then of course that will impact the likelihood of whether that editor decides to send it out or not, because they might have problems understanding what the actual research is saying. So I think there's an enormous opportunity there to make it more inclusive, make science more inclusive and fairer. And I think also when we're thinking about people who might be neurodivergent, I've got lots of clinician friends who are dyslexic, and actually being able to use large language models to help them structure sentences and how to structure an article

Starting point is 00:06:24 is a very efficient way of them being able to articulate themselves and their ideas in what most people consider a sort of socially acceptable manner. Yeah, because most people think of AI as cheating, but you're not talking about that here, and you're pointing out the positive aspects. And when you talk to scientists who do use AI, generative AI, what do they say about using, like, ChatGPT to help write their research papers? I think scientists and clinicians across the world are for the most part doing amazing work under incredibly stressful situations and they're often overloaded with work.

Starting point is 00:07:09 You know, their to-do lists are extraordinarily long. And so having ChatGPT as an efficiency tool, which can allow them to, you know, put together an article very quickly or might allow them to write a cover letter in a more compelling manner. And from our point of view, from a scientific publisher's point of view, generative AI that might be able to allow our submission process to be easier for authors and for us as editors to be able to interact with them in a kind of more slick and easy fashion, that type of efficiency could have real benefits, you know, to patients and to people's lives and to scientific progress. I know that since ChatGPT came out, the major journals have provided some guidelines and policies to researchers about the use of generative AI in papers. But it also seems like a pretty messy landscape right now.

Starting point is 00:08:08 I mean, are these guidelines standard across all the journals and research papers? Well, we obviously have external bodies which bring together a number of different journals. So, for instance, the ICMJE, which is the sort of bringing together lots of medical editors and journals. They have published guidance. And so any journals that sort of consigned to them also tend to take some of their guidance on for generative AI. And equally, organizations like COPE, which help with editorial guidance for journals, also have their own sort of set of guidelines. So I think it's right that there are these external sort of benchmarking places which are

Starting point is 00:08:49 releasing, you know, sort of loose guidance. But obviously each journal is different. Each journal has a different topic. They have different article types. They have different things that they're trying to do with those journals. And, you know, let's not forget that journals are actually very human endeavors. They are how we as humans interpret scientific progress and how we put it into context and what that means for either patients or for scientists and for that field. And so I also think that it's right that each journal should be very clear on how they want generative AI to be used. So for example, at the Lancet, we have a section which includes commentary, correspondence, perspectives, art of medicine.

Starting point is 00:09:39 And this is an area of our journal which really requires human interpretation. And so we're in the process at the moment of thinking about the fact that we would like to limit the use of generative AI for this section because we feel passionately that human ingenuity and putting things into context and being able to see what's new,

Starting point is 00:10:00 not just trawling through what's on the internet and putting together what sounds good about a particular topic, but actually expertise, experience, and vision. We are thinking about limiting the use of generative AI in that section to just using it for English grammar and spelling, so that we're not excluding people who don't speak English. Of course, the elephant in the room here are paper mills, and I'm not talking about factories that make paper. Can you explain what those are and why they're such a big issue? Yeah, so paper mills are sort of nefarious organisations that essentially have understood the scientific publishing landscape and are gaming it and selling authorship for manuscripts that often are filled with nonsense.

Starting point is 00:11:01 And so you may all have heard of mass retractions from different publishers having to retract articles that essentially were not based in any scientific fact and were not really science, but often complete nonsense. And they're a huge problem. They're a problem for publishers. But in the wider context, they're a problem for science because this, you know, really breaks down the trust. This is phony. We're talking about phony papers here. We're talking about phony papers.

Starting point is 00:11:28 So it could literally be a manuscript about complete nonsense where the results are fabricated, the context is fabricated. And authorship is sold to academics for these publishers. And so they've kind of got into the editorial process by perhaps having guest editors, they've manipulated the peer review process. And they're an enormous problem for the scientific publishing world. And so there's a real question there as to how, in the context of generative AI, how as editors do we make sure that what we're reading is real? Yeah. You know, this may sound like a crazy question, but why not use, if they're writing it

Starting point is 00:12:08 in ChatGPT or AI, why not use ChatGPT or AI to find them, you know, to weed out some of these papers? That's exactly right. This is a big data problem. And I think Elsevier, which is the company that owns The Lancet, is putting in an enormous amount of resources and effort into thinking about research misconduct and research integrity in the context of big data. You know, how can we use some of these patterns across many, many different papers to be able to pick out what's real and what's not real. But in the larger context, in some ways, generative AI will sort of turbocharge that, right? Because you're able to very quickly put together a nonsense manuscript that looks and sounds like it should be published, but actually might be about nothing. On the other hand,

Starting point is 00:12:59 you know, paper mill business model is based on people paying for authorship. And actually, if people at home on their own can put together this type of paper, why would they pay a paper mill to do it for them. They might just do it themselves. So I think, you know, this is a huge problem and one that I know a lot of people are thinking about very seriously. And some of the solutions, potential solutions here, can you offer any? I mean, I think they lie in sort of big and small changes. And for the most part, they're probably going to be pretty costly and difficult. I think a major step is to recognize that over the past decade, two decades, we've had a major trend towards the open science movement,

Starting point is 00:13:43 which nobody can disagree with from an ethical or moral way. We all want science to be accessible and available to everybody. But in reality, what that's meant from a scientific publishing business model aspect, it's meant that authors pay to get their articles published open access, and so there has been a focus on quantity over quality. And I think that that's rapidly adjusting and changing. And many scientific publishers are changing their way that they're thinking about that. So we need some better business models.

Starting point is 00:14:19 We need some other ways of thinking about open access in the context of generative AI. And then I think, you know, another major step is thinking about the environment within which we all work. You know, there is a serious problem with academic environments which often reward, you know, the quantity that an academic has published over the quality that they've published. Publish or perish. Yeah, exactly. Publish or perish. So there's an incentive there to publish, publish, publish, regardless of whether it might constitute a bit of research waste in terms of does this question really need to be answered. Has it already been answered?

Starting point is 00:14:55 But then on the other end of the scale, there is an incentive there to try and get things published, which aren't necessarily adding to human health or to scientific progress. Dr. Bagenal, you write about the steps necessary to take and where you think this might be going. How well is this progressing? How well are we getting toward the goals that you talk about? I think when there's been any huge innovation in technology, there's always a bit of a policy gap between people trying to catch up with what's happened and create policies and ways of working which will adapt to this huge new innovation. And that's certainly what we're seeing now. You know, it's been a couple of years since ChatGPT. And only really now, I think, are journals and

Starting point is 00:15:40 editors really getting up to speed with the types of things that we might need. But also, because large language models are incredible tools with the ability to improve all the time. And we've seen that with the versions that have already come out. Each time, there's an improvement in how they are performing. We need to be very flexible and adaptable to change to those different things because we might start seeing more hallucinations. There was a very interesting paper in Nature a couple of months ago about the fact that large language models are, you know, when they come to the end of what's already been published, what's already on the internet, how do they get new data and what happens if you use synthetic data? And actually, for the most part, it's,

Starting point is 00:16:28 it looked like those models almost completely fell apart. They stopped being able to work. So, you know, there are lots of issues that are going to become clear over the coming months that we'll need to be very alive to and be able to adapt to. But at the moment, I think we are certainly at The Lancet, and I know many other journals, spending an awful lot of time thinking about this. We are implementing practical, tangible policies, which are meant to be able to improve the process for authors,

Starting point is 00:16:58 but also to make the content that we publish very high quality and very useful and usable for our readers. So you have to keep up with it as ChatGPT gets better. You have to. We have to get better. Exactly. Exactly. We must. Very interesting stuff. We're going to keep track of all of this. Thank you very much for taking time to be with us today, Dr. Bagenal. No problem. It was lovely to chat to you. Dr. Jessamy Bagenal, senior executive editor at The Lancet and adjunct professor at the University of North Carolina at Chapel Hill. And that's all the time we have for today. Lots of people help make the show happen, including

Starting point is 00:17:32 Kathleen Davis, Diana Plasker, Beth Ramney, Emma Gomez. On tomorrow's episode, former National Institutes of Health Director Francis Collins talks about his new book and what it was like being on the forefront of scientific discovery for 40 years. I'm SciFri producer Dee Peterschmidt. See you then.
