Decoding the Gurus - Decoding Academia 34: Empathetic AIs? (Patreon Series)
Episode Date: February 12, 2026

In this Decoding Academia episode, we take a look at a 2025 paper by Daria Ovsyannikova, Victoria Oldemburgo de Mello, and Mickey Inzlicht, asking a question that might make some people uncomfortable/angry, specifically: are AI-generated responses perceived as more empathetic than those written by actual humans?

We walk through the design in detail (including why this is a genuinely severe test), hand out deserved open-science brownie points, and discuss why AI seems to excel particularly when responding to negative or distress-laden prompts. Along the way, Chris reflects on his unsettlingly intense relationship with Google's semi-sentient customer-service agent "Bubbles," and we ask whether infinite patience, maximal effort, and zero social awkwardness might be doing most of the work here.

This is not a paper about replacing therapists, outsourcing friendship, or mass-producing compassion at scale. It is a careful demonstration that fluent, effortful, emotionally calibrated text is often enough to convince people they are being understood, which might explain some of the appeal of the Gurus.

Source: Ovsyannikova, D., de Mello, V. O., & Inzlicht, M. (2025). Third-party evaluators perceive AI as more compassionate than expert humans. Communications Psychology, 3(1), 4.

Decoding Academia 34: Empathetic AIs?
01:40 Introducing the Paper
10:29 Study Methodology
14:21 Chris's meaningful relationship with YouTube AI agent Bubbles
16:23 Open Science Brownie Points
17:50 Empathetic Prompt Engineering: Humans and AIs
21:17 Study 1 and 2
31:35 Study 3 and 4
37:00 Study Conclusions
42:27 Severe Hypothesis Testing
45:11 Seeking out Disconfirming Evidence
47:06 Why do AIs do better on negative prompts?
54:48 Final Thoughts
Transcript
Discussion (0)
Hello and welcome to Decoding the Gurus' Decoding Academia, 2026 edition.
It is January 26th, and here we are, back in the library, the study room, the boudoir, the smoking room, the men's club.
We're back in black, back in black and white.
Yeah, yeah, the gentleman's club.
No women allowed.
No girls allowed.
No, it's academia.
No, that's not true.
Come on.
Come on.
The manosphere's influenced our joke.
Yeah, it's playing with our minds.
Yeah.
Yeah, yeah.
First decoding academia of the year.
First of many.
First of many.
Gets me reading papers I wouldn't otherwise read, which is good.
That's right.
You know, that's the peril of academia.
What you do is you specialize and there's so many papers.
You can't read them all.
So you just read the ones that are ultra, ultra specific to your
particular investigation.
And, you know, you don't read more broadly.
And sometimes...
We get siloed, Matt.
We all live in our information silos.
Yeah. Fuck those silos.
No.
The paper that we're looking at today is a recent paper,
from 2025 last year.
Okay.
And it is by Daria Ovsyannikova,
Victoria Oldemburgo de Mello,
and Mickey Inzlicht.
All three of them have the most
interesting names.
I know.
I know.
It's hard to say which is the most interesting.
It is.
Yes.
And this is in Communications Psychology.
Third party evaluators perceive AI as more compassionate than expert humans.
That's the title.
So just to mention here, this is about AI.
It's a kind of experimental psychology paper.
I find it interesting, provocative, well-conducted.
Already I see a problem, Chris.
Already I see a problem.
AIs can't have feelings.
This is clearly, I don't see you.
Listen, Matt, you know, read the title carefully.
Third party evaluators perceive AI as there's no claim there that they are more compassionate.
It's about perceptions.
You got it?
I got it.
And now this is a short paper, by the way, relatively speaking.
If you want to go and hunt it out, it's just nine pages long,
although it's double column, so that's misleading a little bit.
But there's nice illustrations.
And you know, what we normally do here at the start
is that we go through the abstract.
We just let people hear how the authors have presented the paper.
Would you like to do that or shall I?
I have it here, but...
We shouldn't paraphrase when the authors have already done it.
I'll do it.
I'll do it.
You do it.
I have the better reading voice.
True.
So I should do it.
Empathy connects us but strains under demanding settings.
This study explored how third parties evaluated AI-generated empathetic responses
versus human responses in terms of compassion, responsiveness, and overall preference across four pre-registered experiments.
Participants (N = 556) read empathy prompts describing valenced personal experiences and
compared the AI responses to those of select non-expert or expert humans.
Results revealed that AI responses were preferred and rated as more compassionate compared to
select human responders (Study 1).
This pattern of results remained when the author identity was made transparent (Study 2),
when AI was compared to expert crisis responders (Study 3), and when author
identity was disclosed to all participants (Study 4).
Third parties perceived AI as being more responsive, conveying understanding,
validation, and care, which partially explained AI's higher compassion ratings in Study 4.
These findings suggest that AI has robust utility in contexts requiring empathetic interaction,
with the potential to address the increasing need for empathy in supportive communication
contexts.
Okay, Chris, so I've read enough.
This is basically all I need in order to cite this paper.
This is generally where I stop reading.
Ventr one and cite it.
Who are you, Brett Weinstein?
Well, to be fair, that is what a lot of people do.
It is, and people don't have unlimited time, but that's not what we do here.
That's not how we roll.
Not how we do it here.
In Decoding Academia.
But I will also say that the thing that I tell people, and I'll repeat it here for our lovely listeners, right, is that, you know, when you're reading a paper, especially when you're reading an abstract, which is a condensed, you know, summary of a paper, it's worth bearing in mind that this is the authors' presentation of what their paper is, what it shows, and what the, like, key results and so on are.
That does not mean that you have to agree with that in order to, you know, think the paper was useful or so on.
So this is, this is a mistake, Matt, that a lot of undergraduate students make, where they think that, you know, because the authors describe something as important or meaningful or robust, that that means that is indeed what it is.
So this is the danger of trusting abstracts, but abstracts give you a lot of information.
It does.
That's right.
They can be a little bit like an advertisement for the paper.
Yes.
Now, Mickey, as we know, is very responsible.
He wouldn't do that.
He wouldn't do that.
And he's not the first author.
Or would he?
Yeah. Yeah. So the controversy here, this is a genuine controversy. We've heard people discuss this. One of the things that people say AI is not capable of doing is engaging in activities that are typically seen as specifically activities that humans are good at, right? Which would be things like empathy and creativity and so on, right? Like these kinds of things. Yes, your AI can summarize an article. Yes, it can tidy up your grammar or whatever. It's good for this. It can generate images. But expressing sentiments where people actually, you know, feel genuine emotional engagement or whatever, less good. This is the general thing that people have argued. And there are people arguing alternatively about this, right? But I would say there's a lot of skepticism as regards AI being good at providing empathy.
Yeah, I mean, although, you know, I think a lot of people recognize one of the earliest things that AI turned out to be pretty damn good at was, like, writing poetry, for instance.
And, you know, it's...
Oh, I mean, yes, I know what you're saying, Matt. It can do that, but you will also have heard a lot of people say, but that poetry is like formulaic and it's not actually got any soul to it.
Right? That's the thing.
Yeah, and actually, that's what I was going to say next, which is that this is a kind of fun topic because it
which is that AI dislikers will often endorse some version of the idea that whatever it is,
it could be art, it could be empathy, it could be poetry, whatever it is,
what that activity is really is the communication between one sentient entity and another.
Right, so emphasis on the sentient entity. So even if an AI produced a product that looked and felt and appeared to be good, because it's coming from a non-sentient entity, then by definition it cannot be good, right? Some version of that is a genuine philosophical stance that you will see a lot, either explicitly or implicitly, around the place. You know, the alternative point of view is that the product is the product, right? And so, yeah, it's interesting. And it's got to do with other things
where people, you know, remember the old discourse around, you know, whatever, canceling,
maybe some movie maker like Woody Allen or who's the other guy. Anyway, dodgy people, bad people
or people that, people that some people think are bad. So now it changes how you interact
with their work. You know, now it's not good anymore because you know more about the character of the person that produced it.
And again, you could also go, well, I don't care if Salvador Dalí was a nice guy or not, I really like that painting.
So, yeah, I mean, I'm not taking any position there, really.
I just think, yeah, you know, this is a fun paper because,
because I think empathetic communication, the kind of thing that a counsellor,
a clinician, or a good friend or whatever, you know,
I think we're all pretty used to thinking about that as being an authentic kind
of connection between two people.
And, you know, this is interesting because it's creating the product without the sentient
person on the other side of it.
Yeah, yeah.
And so, you know, for all these kinds of reasons, and, motivated like all good people are by previous studies that have been run on this, they wanted to investigate this topic and look at how people rated responses provided by AI.
In this case, it's ChatGPT.
I think it's GPT-4.
Is it 4?
Yeah, GPT-4.
Yeah.
So this is the last generation of AI models for those who are not into it.
Yeah.
Yeah.
And what they did, the third-party thing that it mentions, is that these are not people interacting directly with AIs versus humans, right? These are people assessing an initial prompt and a reaction to the prompt provided by an AI or a human, right? So they give an example. A negative prompt example is: I'm having difficulties with my family relationships. My mother disrespects my boundaries, don't they all, and doesn't seem to understand that her intrusion into my daily activities is suffocating.
Did Stefan Molyneux write this?
My brother will drop his kids off for 12 hour days, and while I love my nieces and nephews,
I'm starting to feel like my life is not really my own.
So this is someone expressing negative affect.
There are positive affect messages as well.
And then they have short responses generated by humans or generated by AI.
So let me give you an illustration, a human,
response to that. I'm sorry that your family has been making you feel this way. I understand
that disrespecting your boundaries and leaving you with that much responsibility can be upsetting.
You deserve to be treated with more respect and consideration. That's the human response.
No, you should, you should read the AI response in, like, a robotic tone, in the kind of...
Yeah. No, you don't have to. You don't. It sounds like you're in a really...
Read it like a Dalek.
You're a Dalek.
It sounds like you're in a really tough spot, feeling overwhelmed by the demands placed on you and struggling with...
Oh, come on, stop now.
Stop now? That's just...
You and your fucking directions. I've done half of it, I'm going to go on. So, like, remember:
Recognizing your need for personal space and autonomy amidst family obligations is a sign of self-awareness and care for your own well-being.
Right, but it's written down, so you don't get that delivery.
That's obviously incredibly unfair to the AI.
Is it?
Well, and they're asked to rate those responses on a variety of things.
But the one that is the kind of headline takeaway,
and you can in a way tell this by the visualizations that are in the paper
because they often show you what the key outcome is,
is the compassion ratings, right,
how compassionate people judge the responses to be.
Now, one thing that you might imagine is, well, wouldn't it be very important that people
don't know which kind of response is human or AI generated?
And indeed, in some of the studies, they are blinded to that.
They don't know what the source is.
But in other ones, they reveal it.
And look, does this make a difference? Do people change their judgments, you know, when they see the source?
And they have four studies using this basic design.
Yeah, and I'd just say
obviously it's important to do it both ways
because you want the blinded version
so you get an unbiased
I guess pure rating
of whatever the measure is
because people's preconceptions
and stuff are obviously going to influence it.
But it's also very useful to have the unblinded
one because in practical
applied use of this kind of thing
if it were to be used, you're generally not going to lie to people and pretend that it's a human on the other side, you know, you're going to be honest with them. So it is important to know whether or not it feels good to people even when they know that it is a robot.
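To make that design concrete, here is a minimal sketch in Python of the comparison being described: paired compassion ratings for an AI response versus a human response, plus a between-subjects disclosure factor for the blinded versus revealed conditions. The data, the 1-7 scale, and the variable names are all illustrative assumptions; this is not the authors' analysis code.

```python
# Minimal sketch (simulated data, not the paper's actual analysis).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200  # hypothetical number of third-party raters

# Hypothetical 1-7 compassion ratings for the paired AI and human responses.
ai_ratings = rng.integers(4, 8, size=n).astype(float)
human_ratings = rng.integers(3, 7, size=n).astype(float)

# Within-rater (paired) comparison: is the AI response rated more compassionate?
t, p = stats.ttest_rel(ai_ratings, human_ratings)
print(f"mean AI - human = {(ai_ratings - human_ratings).mean():.2f}, t = {t:.2f}, p = {p:.4g}")

# The blinded vs. disclosed manipulation is a between-subjects factor:
# does telling raters which response came from the AI shrink its advantage?
disclosed = rng.integers(0, 2, size=n).astype(bool)
advantage = ai_ratings - human_ratings
t2, p2 = stats.ttest_ind(advantage[disclosed], advantage[~disclosed])
print(f"disclosure effect on AI advantage: t = {t2:.2f}, p = {p2:.4g}")
```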
Actually, you had this experience dealing with Google and their helpful chat assistants. You had a long relationship with someone who may or may not have been an AI helper.
I'm pretty sure they were an AI.
Called Bubbles.
Bubbles?
Bubbles, Love, Aya, Orange.
There were several.
But here's the thing:
Google didn't tell you, did it?
Like, it definitely gave you the feeling that it was a person, and it didn't have a little disclaimer.
Oh, no, it did have a disclaimer.
It did have a disclaimer?
Oh, we told you it was a chatbot.
It didn't explicitly say it was a chatbot.
It said that we use artificial intelligence as part of our blah, blah, blah, blah, blah.
So there were disclaimers in it.
But there were points where it said, you know,
I'm handing you over to my colleague,
who is now going to look into this.
That's right.
The colleague being an AI agent or a human was unclear.
Unclear.
Yeah.
And in some cases, they had names like Bubbles and Aya and Love, right?
Which is not typically the names of humans.
And they were very empathetic, weren't they?
I think one of them told you that they loved you at one point.
No, they didn't profess love, but they did profess that their attempts to solve my queries and, you know, the communications that we had had been deeply moving to them, and that they would never forget how kindly I had treated them in the interactions and stuff. So it felt like the empathy dial was set too high.
Just dial it down a little bit.
Yeah, yeah. It was just a technical issue we were experiencing with the account, so it wasn't that moving.
Yeah, tone it down, you know. Come on, you need to turn it down to be more human.
Like if I ever got a text message from you telling me that you were deeply moved by our last
conversation, I would know that you'd been replaced.
Exactly, yeah.
This is, this is one of the telltale signs.
Or an American.
Either or.
Either or.
So, you know, this is the basic protocol, right: they're going to show people these little paired prompts and responses and then ask them to rate them on a variety of different things.
And now, just to mention, Matt, being good open-science people, these are all pre-registered. There were a priori power analyses and so on, right? So you can go and look. And even better, Matt, they do diverge from the pre-registration, but as we talked about with Julia Rohrer, this can be a problem, right? But it's not a problem when you are transparent about it and explain what you've done. So they do, in fact, illustrate that you can pre-register and you can change things. In particular, they're highlighting that they're running slightly different analyses than what they initially pre-registered, because they discovered in the meantime that it was better, right? And that's perfectly fine. The whole point is that it's transparent, right? And the main thing is they didn't change, you know, the key outcomes or how things were measured or this kind of thing. And the sample sizes were exactly what they said they would be.
Really? Like, to the exact N?
Yeah. Yeah. That's it. So that's... I was impressed. Good job.
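For readers curious what an a priori power analysis of this kind looks like in practice, here is a minimal sketch in Python using statsmodels. The effect size, alpha, and power below are illustrative placeholders, not the values the authors actually pre-registered.

```python
# Minimal sketch of an a priori power analysis for a paired/one-sample design.
# effect_size, alpha, and power are assumed values, not the paper's.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()
n_required = analysis.solve_power(effect_size=0.3, alpha=0.05, power=0.9,
                                  alternative="two-sided")
print(f"required sample size: ~{n_required:.0f} participants")
```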
So what they do, as they say, is they show these responses and they get people to read them. But you have to ask then, okay, so for the AI, we're giving ChatGPT the prompt and asking it to respond. I can't remember the exact wording, but, you know, they gave it a generic instruction about how to respond. But what about the human responses? How do you generate the human responses? And did you get the details of that, Matt, of how they made the prompt material? Sorry, the response material.
Yeah, so I guess it depends on which experiment, right?
Study 1 and 2.
Let's start with study 1 and 2.
All right.
If I'm remembering the right thing,
they got a bunch, like 100 or something, students, was it?
Was it?
And then they selected the best ones,
and then they asked the best ones to go ahead and do it for all the rest.
Is that approximately right?
Close, Matt, close.
Let me get it right for you.
We're relying on my memory here.
I'm going by memory here.
Come on.
You got most of it right.
They got 10 participants.
Ten participants read the empathy prompts and generated the compassionate written responses.
So in total there were 100 responses, but they were from 10 participants, right?
And then they had a separate three graduate students and four research assistants rank-order the top five responders based on overall compassion rankings, right, and quality: emotional salience, relatability, level of detail.
The five responders who were ranked in the top five most often had their responses selected for use in the study.
So they didn't take 100 people.
They took 10, but they kind of selected from those the most highly...
Okay, so they selected five from 10.
Is that right?
Yes, that seems to be.
They mentioned: we consider this a select group of empathetic responders, as they were first screened and selected based on their overall empathic quality.
Yeah. Yeah. Yeah. Yeah. So they removed all of the Northern Irish responders.
Correct. Yes. Now, the reason this is important is because, like, there's a way you could design this experiment which kind of puts your thumb on the scale for the AI, right? Like, for example, if you just said, oh, I want to compare human and AI responses, and you just asked a random
selection of people to generate responses.
And you took them.
You could have recruited from Northern Ireland.
Yes, that could have been a problem.
But they didn't, right?
So they did a quality check.
And they explicitly tried to target like high quality, you know, more empathetically
rated responses for inclusion.
And I like that, because that is what Deborah Mayo would refer to, Matt, as a more severe hypothesis test. You're making your test more severe, which is what we want in science, right?
Yeah, yeah, yeah, yeah.
We like the severity.
That's good.
That's good.
Okay, so the top 50%: they're roughly grabbing the upper 50% of empathetic people, right?
The good ones, the ones that don't just sort of do the kind of Alan Partridge shrug
when you say something to them.
Okay, good.
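To make the selection step concrete, here is a minimal sketch in Python of the rank-ordering procedure just described: several raters each rank the ten responders, and the five who land in a rater's top five most often are kept. The rankings below are invented for illustration; the paper used three graduate students and four research assistants as raters.

```python
# Minimal sketch of the top-five responder selection (invented rankings).
from collections import Counter

# Each rater's ranking of the 10 responders, best first (hypothetical data).
rankings = [
    ["R3", "R7", "R1", "R9", "R5", "R2", "R8", "R4", "R10", "R6"],
    ["R7", "R3", "R9", "R1", "R2", "R5", "R4", "R8", "R6", "R10"],
    ["R3", "R1", "R7", "R5", "R9", "R8", "R2", "R10", "R4", "R6"],
]

# Count how often each responder appears in a rater's top five...
top5_counts = Counter(r for ranking in rankings for r in ranking[:5])

# ...and keep the five responders who appear there most often.
selected = [responder for responder, _ in top5_counts.most_common(5)]
print("selected responders:", selected)
```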
Yeah, and then the other two studies, there's something that changes.
But let's stick with study one and two first.
So, participants: basically, all of these participants are online participants.
There's this online participant recruitment system called Prolific, which a lot of academics use, which gives you access to people who will complete surveys in return for money.
And the samples tend to be slightly better than student samples in terms of being more representative of the general population.
You can pay extra and get samples that attempt to match the demographics of particular countries and so on.
But generally, this is the source they're using, and Prolific takes steps to ensure that the respondents who are working there are providing higher-quality responses.
If you'd like to continue listening to this conversation, you'll need to subscribe at patreon.com slash decoding the gurus.
Once you do, you'll get access to full-length episodes of the Decoding the Gurus podcast.
including bonus shows,
gurometer episodes,
and decoding academia.
The Decoding the Gurus podcast is ad-free
and relies entirely on listener support.
And for as little as $5 a month,
you can discover the real and secret academic insights
the Ivory Tower elites won't tell you.
This forbidden knowledge is more valuable
than a top-tier university diploma,
minus the accreditation.
Your donations bring us closer
to saving Western civilization.
So subscribe now at patreon.com slash decoding the gurus.
