Speaking of Psychology - Can a personality test determine if you’re a good fit for a job? With Fred Oswald, PhD
Episode Date: July 14, 2021
These days, many companies use assessments such as personality tests as part of the hiring process or in career development programs. Fred Oswald, PhD, director of the Organization and Workforce Labor...atory at Rice University, discusses why companies use these tests, what employers and workers can learn from them, and how new technologies, including artificial intelligence, are changing workplace assessments.
Listener Survey - https://www.apa.org/podcastsurvey
Transcript
Have you applied for a job lately? If so, you may have been surprised to find that a resume and cover letter weren't enough to get you in the door. In addition to these traditional methods of screening job applicants, many companies now also use some form of pre-employment testing, including personality assessments, to help determine whether a candidate will be a good fit for the job. In fact, one 2018 survey of human resources professionals found that 79% of them use some kind of testing when making external hiring decisions,
and 72% use testing for internal hires.
So even if you've been in your job for a while,
you may find yourself taking an assessment at work.
The same survey found that 79% of respondents
used assessments in their company's career development programs.
So why do employers use these tests?
Are the assessments that they're using reliable and valid?
Can a person's personality or other characteristics
help predict whether they'll succeed at work?
And how are new technologies, including artificial intelligence,
changing workplace assessments? What can employers, employees, and job applicants expect to see next
on this front? Welcome to Speaking of Psychology, the flagship podcast of the American Psychological
Association that examines the links between psychological science and everyday life. I'm Kim Mills.
Our guest today is Dr. Fred Oswald, a professor of industrial-organizational psychology
and director of the Organization and Workforce Laboratory at Rice University.
Dr. Oswald studies the factors that contribute to workplace success, including how to understand,
define, and measure the individual differences that affect employees' performance.
He's also interested more broadly in the future of work.
His 2019 book, Workforce Readiness and the Future of Work,
examines how technology, education, and policy will shape the future of work.
He is a past president of the Society for Industrial and Organizational Psychology,
which is a division of the American Psychological Association.
Thank you for joining us, Dr. Oswald.
Kim, thank you for having me.
This is a real pleasure and look forward to the conversation today.
Great.
Well, let's start with a broad question.
Why do employers use pre-employment testing?
What do they hope to learn about job applicants that they can't get from a resume,
a cover letter, or a job interview?
Well, testing has a long history and a wide range of purposes,
whether we're talking about employment or we all have taken plenty of tests
in school. There are tests for certification. For example, airline pilots take tests to ultimately
guard public safety. And employers are hoping to understand who is coming into the workplace.
What are their background characteristics? What are the knowledge, skills, and abilities they have
coming in or what are their qualities. I realize we'll talk about personality today. And so
testing is not perfect, but perfect should not be the enemy of the good in terms of using tools
that can help make good decisions, both for the organization and for the job applicant,
who obviously seeks to find employment that's fitting for them,
not merely get an invitation to join the organization,
but actually succeed in it.
So a moment ago I mentioned that employers are using testing
to make hiring decisions and companies use them in leadership
and career development programs.
What kinds of tests are they using?
What are the questions that employees and prospective employees are being asked?
Yeah, so my own research and experience deals more
with personnel selection on the point of initial hire.
So think about characteristics like personality or job knowledge or sometimes logical reasoning skills.
If they don't have prior background indicated in their resume, that they have prior technical skills,
there might be a test for some general reasoning.
Also interests and motivation.
Those obviously differ between people and people have different profiles of interests and types of motivation and goals.
And so tests attempt to get some of those characteristics of job applicants, hopefully in a reliable and valid way.
And we can talk about that further.
Yeah, I did want to talk about that.
And what's the difference between a good workplace
assessment tool and a bad one?
A useful framework to go off of is to think of a three-legged stool
of reliability, validity, and fairness. Reliability deals with whether what you're measuring is
stable over time in the case of job applicants. In other words, you don't want an applicant
to be taking a test that is essentially the roll of the dice or is measuring something like
mood that fluctuates. Instead, what you want are characteristics that are likely to appear on the job
upon the point of hire. And that's where the employee and the organization starts out with the
employee to move them forward through training and development. And as you mentioned, leadership and
career progression as well. And so reliability is a cornerstone
of measurement to make sure that a test that claims to measure personality, for instance,
actually is doing so.
We could put labels on any test and make the claim that it's measuring what we say.
But how do we know?
And we need data to inform that.
And so there are data-based approaches to ensure that scores are measuring what they should,
you know, personality, job knowledge, motivation,
etc. And also turning to the second leg of the stool, validity. So do these tests predict outcomes
in organizations that we think they should? So we know that, for example, conscientiousness is a
factor of personality known to predict job performance pretty consistently across jobs,
over time, across cultures.
And of course, that relationship makes sense to some degree.
You need to show up on time for your job.
You need to follow the rules.
You need to set goals and achieve them and so on.
But like most things in psychology, the devil is in the details.
Can you support that common sense with well-developed measures, with evidence,
and with good measures behind the data that you're collecting?
So not just on the personality side in the case of this example,
but how are we measuring performance in the organization?
What does it mean to be a successful worker?
It's as much philosophical as it is a measurement and a statistical issue
to try and figure out what problem we're trying to solve through testing.
And then the third leg of the stool
is fairness, to make sure that anyone who takes the test has the same opportunities to reflect
who they are or what they know, and that job-irrelevant characteristics are suppressed. So, you know,
you wouldn't want a test that had too high of a vocabulary level if you're not measuring
vocabulary. That would be unfair. That kind of thing. Or, you know,
accommodating people who have a vision impairment, for example.
If visual acuity is not required for the job,
then you shouldn't place visual demands on the test
for people that need that accommodation.
So being sensitive to the people who are taking tests
is just as important as what you do with the scores that come out.
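The first two legs of the stool can be made concrete with a small calculation. The sketch below is an illustration, not a method described in the episode: it assumes test-retest reliability is summarized as the Pearson correlation between two administrations of the same test, and validity as the correlation between test scores and a later performance rating. All scores are invented, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented conscientiousness scores for six applicants, tested on two occasions
time_1 = [32, 41, 25, 38, 29, 45]
time_2 = [30, 43, 27, 36, 31, 44]

# Invented supervisor ratings collected after hire
performance = [3.5, 3.2, 3.0, 4.1, 3.6, 3.6]

# Reliability: do people keep roughly the same standing across occasions?
reliability = pearson_r(time_1, time_2)

# Validity: do test scores relate to the outcome the employer cares about?
validity = pearson_r(time_1, performance)

print(f"test-retest reliability: {reliability:.2f}")
print(f"criterion validity:      {validity:.2f}")
```

In this made-up data the two administrations agree closely while the link to performance is much more modest, which is the usual shape of the problem: a stable score is necessary, but it does not by itself show that the score predicts anything.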
Is there any independent body that validates these tests?
I mean, how would an employer choose one and know that it's a good one?
So there are various professional documents that are used by folks who develop tests.
So in my world, for example, as you mentioned, I'm a member of the Society for Industrial and Organizational Psychology, which is Division 14 within APA.
We have a document called the Principles for the Validation and Use of Personnel Selection Procedures.
It's in its fifth edition.
It's actually free to download through the APA website.
And it contains some of the principles that I've reflected on in terms of reliability, validity, and fairness.
Because those are the principles that go into any test, no matter what you're measuring, how you're measuring it.
You know, as we move forward into the world of AI and new technologies for measurement,
these principles still hold to make sure we're measuring
job-relevant characteristics and administering tests in a reliable, fair, valid, secure,
ethical, legal manner.
And so that's a major resource in the area of employment testing.
But there's also the Standards for Educational and Psychological Testing, which APA is also
involved in as a co-editor with two other organizations.
And similar principles apply, but it's broadened into the domains, as the title suggests, into education, whether that's K-12 or placement issues into different educational programs.
And it's more about that context.
So hopefully those who are listening understand that, you know, we've been thinking about tests for a long time.
The SIOP document I mentioned, it's in its fifth edition.
It's maybe as many decades old or close to it.
And so serious attention has been paid to help ensure that tests are measuring what they should.
And anyone seeking to purchase a test should be thinking about
some of these issues, or at least they would benefit from doing so and not merely trusting what
sounds good, but actually seeing whether there's evidence behind the claims of a test being
reliable, valid, and fair.
One of the reasons that I wanted to talk to you today was because of a
kerfuffle that ensued among some psychologists a couple of months ago when HBO Max aired a documentary
film called Persona: The Dark Truth Behind Personality Tests. The film focused mostly on the Myers-Briggs
personality test, which I'm sure a lot of our listeners have heard of and maybe have even taken.
What's your view of this test in particular? Is it a valid tool for employers? And why is it so
popular?
I'm hesitating here because I don't want to put any particular company test on the mat
or as a target, but we all understand that the Myers-Briggs is a popular test.
It's used widely in organizations for a variety of purposes.
And what I'll say about that test really applies more generally to tests that are like it.
The early versions of the Myers-Briggs, and I realize the test has been revised, perhaps in ways I'm unaware of,
but at least the fundamentals of the test in its early development had to do with profiles.
So a person would say they are an INTJ.
And I honestly forget what that stands for, but it's something like introverted and thinking and judgmental, things like that.
And so the profile is used to... intuitive.
Sorry, I'm going back to the N there:
introverted, intuitive, thinking, and judging.
But the point of that is to say that
profiles are a complex way of describing a person.
And what you could ask before turning to profiles
is whether the single variables that are part of that profile,
how much, where do you stand on those characteristics
rather than jumping forward to that complex INTJ combination?
Because science prefers simplicity before you jump into complexity.
So in other words, a researcher would look at INTJ
and break those variables apart and see how those
variables, intuitive, introverted, thinking, judging, how those correlate with each other and how they
correlate with organizational outcomes. See what those relationships are and then see if a profile or
something complicated like that predicts above and beyond the more straightforward relationships.
So I think that's where the general arguments lie around the test is that it's not clear
whether the profiles add anything above and beyond a traditional test that is more transparent
and straightforward in terms of how it's measuring personality traits rather than combining those
traits into profiles. I will say that the profiles do lead to interesting conversations and
perhaps yield interesting and useful insights.
But whether those conversations and insights actually have an impact in terms of, you know, if you took this test for leader development, would it help you in terms of your job success?
I think the evidence is not as strong as it is for tests in the more traditional vein that I mentioned, measuring one trait at a time.
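One common way to ask whether something predicts "above and beyond" simpler measures is to compare how much variance in an outcome is explained with and without it, often called incremental validity. The sketch below is an illustration under stated assumptions, not an analysis from the episode: it uses a single continuous trait score, a two-category "type" label made by splitting that same score at a cutoff, and invented outcome ratings. The two-predictor R-squared formula is standard; the names `pearson_r` and `r2_two_predictors`, the cutoff value, and all numbers are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def r2_two_predictors(r_y1, r_y2, r_12):
    """R-squared for an outcome regressed on two predictors, from correlations.

    r_y1, r_y2: correlation of each predictor with the outcome.
    r_12: correlation between the two predictors (must not be +/-1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Invented trait scores for eight people
trait = [12, 18, 22, 27, 31, 36, 40, 45]

# Hypothetical cutoff that turns the continuous score into a two-category "type"
CUTOFF = 29
type_label = [1 if score >= CUTOFF else 0 for score in trait]

# Invented outcome ratings (for example, a supervisor's rating)
outcome = [2.9, 3.4, 3.1, 3.6, 3.5, 4.0, 3.8, 4.3]

r_trait = pearson_r(trait, outcome)
r_type = pearson_r(type_label, outcome)
r_trait_type = pearson_r(trait, type_label)

r2_trait = r_trait ** 2                                      # trait score alone
r2_full = r2_two_predictors(r_trait, r_type, r_trait_type)   # trait + type label
delta_r2 = r2_full - r2_trait                                # "above and beyond"

print(f"R^2, trait only:        {r2_trait:.3f}")
print(f"R^2, trait plus type:   {r2_full:.3f}")
print(f"incremental R^2 (type): {delta_r2:.3f}")
```

With these made-up numbers the type label adds very little once the trait score is already in the model. Real data could come out differently; the point is only that the question has a direct empirical check.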
So many of the validated tests that are being used in the workplace are based on what's called the Big Five personality traits.
Some of our listeners may know what they are, but could you explain what the Big Five are and the history of that model of personality?
Sure. So briefly, and this reflects on the point of measuring one trait at a time, the Big Five measures, as the name implies, five traits.
So conscientiousness is one that I had mentioned already. There's extroversion, agreeableness,
openness, and neuroticism, which is an unfortunate term because all the traits are intended to measure
normal personality. It doesn't mean you're neurotic if you score highly, but it does mean that
there's a distribution that people have in terms of their anxiety and worry, things like that.
Again, within the normal range. Same thing for all the other traits. If you're low on conscientiousness,
that doesn't mean you're abysmally low. It just means you're low relative to other people.
And same for the other traits that are being measured. These are all in the normal range.
These are not clinical measures of personality. And so how those factors were derived is actually
from the dictionary. So some folks early in the history of personality psychology scoured the
dictionary for these descriptive terms for people and essentially gave those terms as a test asking
people, how much do these terms apply to you? And there were various rules for which words got included
and excluded and so on. But the point is they gave this test and then they looked at the themes. They
used statistics to find these factors, these underlying themes that basically cluster the types of
responses people were making. And there were five clusters. Hence the Big Five came out of that.
And so psychologists will talk about the lexical hypothesis. And that refers to this dictionary,
the lexicon, dictionary-based approach to deriving personality. There are variations on the Big
Five. For example, the HEXACO model; there are
nuances that make it more different from the Big Five than simply adding a sixth factor.
But I will say that one distinction, one big distinction, is a sixth factor called honesty and
humility, the HH factor, honesty, humility.
And that factor proves to be useful in measuring personality and understanding employees.
There are organizations that will give so-called integrity tests that they find from test vendors.
And those tests are asking questions about the typical behaviors someone exhibits that are, again, these are in the normal range.
But, you know, what are described as your workplace behaviors in terms of honesty and integrity.
So it sounds like those five or six terms that we were talking about, that those can be fairly reliably predicted then based on testing?
Yeah, that people can write, the folks who are developing personality tests can write items under each of the Big Five that reflect reliability and validity and can write them in a way that is accessible to people and fair in that
sense. They do vary in terms of their validity. So conscientiousness is probably the most valid of
those five traits, just focusing within personality testing. But there's continued research
on facets of personality. So facets are the kind of sub-factors that underlie the Big Five.
They focus on narrower aspects of personality.
So, for example, under the factor of conscientiousness,
you might think about whether somebody is being more kind of achievement striving,
like it's kind of a moving forward motion of setting your goals and following through
versus a rule following notion being dutiful or maintaining order or paying attention to detail,
that's more of a kind of a tediousness to it, right?
It's more kind of inward focus as opposed to outward focus.
So there are these facets that can be measured reliably and that perhaps have different relationships
with organizational outcomes than those broader factors.
Is it possible for applicants to game these tests? I mean, I imagine that some of these questions
are not so obvious and not going to ask you, are you an honest person? But can you, can applicants
figure out a way to give the employer an answer that would get them in the door for the
interview, but it may not even be true?
Yeah, there's no doubt that people can lie on
personality tests. And so if everybody did so, then we would throw the tests out the window that
there's nothing that will help in terms of being reliable or valid. But it turns out people,
you know, while some people do lie, and as I mentioned, the perfect shouldn't be the enemy of the
good because there's enough variation in responses that is reliable that can be used for
understanding the employee. And again, you know, these questions are trying to get at normal
personality. The framing of the test is to indicate who you are and that, you know, it doesn't
necessarily serve you well to lie on the test. There are some, some tests actually attempt
forms of lie detection to see if you're just, you know, kind of responding on the fly without
thinking carefully through the items. There are various methods for that. Those aren't perfect either.
But again, the goal is to try and find a good fit between the job applicant and the employer.
And so, you know, the straightforward answer to your question is people certainly can lie. But in the
context of real-world personality testing, we don't see that to the extent that we would throw away
a personality test as part of the employment kind of battery of tests we would administer.
I think you mentioned integrity test items. You know, an item might be something like "I would turn in a
fellow worker I saw stealing money" or "An employee should be fired if the employer finds out the
employee lied on their job application."
And, you know, those items may have right or wrong answers, but there are degrees of how much
you might endorse it.
So the format might be on, say, a one to five scale rather than true or false.
And that provides enough variation to be able to get some reliable differences between
people and their standing on a personality trait.
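The one-to-five format described above is typically scored by simple addition. The sketch below is a minimal illustration, assuming each item is answered on a 1 (strongly disagree) to 5 (strongly agree) scale and that some items are reverse-keyed so that agreeing lowers the score. The first two items paraphrase the examples from the conversation; the third item, the item keys, and the function name `score_scale` are hypothetical and not taken from any real test.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def score_scale(responses, reverse_keyed=()):
    """Sum 1-to-5 responses, flipping any reverse-keyed items first.

    responses: dict mapping an item key to the chosen response (1..5).
    reverse_keyed: item keys where agreement indicates LESS of the trait.
    """
    total = 0
    for item, answer in responses.items():
        if not SCALE_MIN <= answer <= SCALE_MAX:
            raise ValueError(f"response for {item!r} must be {SCALE_MIN}..{SCALE_MAX}")
        if item in reverse_keyed:
            # On a 1..5 scale this maps 1<->5, 2<->4, and leaves 3 unchanged
            answer = SCALE_MIN + SCALE_MAX - answer
        total += answer
    return total


# Hypothetical item keys and one applicant's responses
responses = {
    "turn_in_coworker_stealing": 4,    # "I would turn in a fellow worker I saw stealing money"
    "fire_for_lying_on_application": 5,
    "small_theft_is_harmless": 2,      # invented reverse-keyed item
}

integrity_score = score_scale(responses, reverse_keyed={"small_theft_is_harmless"})
print(integrity_score)
```

Three items on a five-point scale can yield any total from 3 to 15, whereas three true/false items yield only four possible totals. That wider spread is the "variation" that makes reliable differences between people easier to detect.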
Right. So one of your research interests is new ways of using technology to measure personality, including big data and artificial intelligence. Can you talk about that? I think some of our listeners may have read news stories about companies using AI to interpret applicants' facial expressions during job interviews, for example. So what are the potential advantages of these new technologies and what might be potential pitfalls?
Right. So we're talking about the Wild West here,
in that these AI technologies for selection are developing at the same time that we're thinking about them
and safeguarding applicant privacy and judging whether these tests will provide benefit in general
and above and beyond traditional tests, perhaps.
And I think it's important for listeners to appreciate at some level this distinction,
several distinctions that are part of AI testing.
So the technology is one part of testing.
So how are you gathering the data?
So it could be video,
although state legislation
has rightfully come up to express concerns
about whether gathering video images of applicants
is really a good way of
understanding the applicant. And we can talk about that further, if you like. But there's a technology
for gathering the job applicants' characteristics. So those could be audio, video. The audio could be
processing tone, which again might be problematic, but the language, you know, trying to extract
themes out of what people say and maybe supplement the human interviewer in terms
of reminding the interviewer of the things that were said and the themes that emerged.
In other words, it could be a hybrid approach rather than pure AI potentially.
Other technologies are game-based assessment, so we all have probably heard ads like,
hey, get online, play this game, and you can see what jobs are right for you.
And it sounds engaging and fun and data-driven and so on.
But again, the principles of whether the test is reliable, valid, and fair apply there.
But they are games.
They're fast to take.
They might be fun.
So we're operating in this new world of selection where the standard fill-in-the-bubble tests are maybe increasingly being left behind
in light of these new technologies.
So how do we make those technologies better is where my work kind of stands is I think there's some
criticisms to be made, but at the same time balancing criticisms with opportunities.
What can we do better?
How can we move these technologies forward?
Another part of the distinctions that I mentioned to be made is the algorithms.
So there are the technologies, and then the algorithms that are
applied to the data that are collected.
So how do algorithms provide scores for people that are taking these tests?
So it's one thing for somebody to take a personality test and you add up their score with
simple addition.
But it's quite another thing to use machine learning, you know, these sophisticated algorithms
onto somebody's behavior as they play a game and try and
figure out what themes are in there. So that distinction I think is really important. And related to this
is the type of data where some data may appear directly more psychological. A game might have you
literally answer questions about personality perhaps, and it would look almost like the test,
a traditional test. Or you might solve a cognitive puzzle. And
it looks exactly like a puzzle you might get on paper in a cognitive test or a knowledge test, right?
You could be asked about knowledge within a game.
So those are more straightforward and I would say are more intentionally measuring the psychological characteristics of job applicants you're interested in.
And that's to be contrasted with more incidental data.
So incidental means, you know, as you mentioned, the video capture or, you know, learning about maybe there's something in a resume that an AI engine is scraping from the web, getting people's resumes and extracting themes from them.
And they're not intentionally expressing the psychological features that might be inferred, but instead it's incidental.
It happens to be discovered.
And those are the promises of AI where we're hoping to discover psychological characteristics from data.
But again, this opportunity also has risk in terms of issues of privacy and ethics, which we all care about,
not to mention litigation, which organizations care about.
So some of these games that are intended to tell you whether your personality is suited for a particular
job? Are those really valid? And what happens, say, if your employer gives you one of these to play
when you're already on the job, and it turns out that you don't have the right makeup for the job that
you have, maybe it requires creativity, and the game says, you're a brick, man, you're not creative at all.
What are you doing in this job?
Yeah, well, tests are one part of a larger picture of decision-making,
or at least they should be treated as such.
And, you know, how much weight can you put into a game is a good question,
or how much weight can you put into any test?
You know, speaking to the job applicant situation,
if somebody is new to the world of work,
they just got out of high school and they're applying for jobs
and don't have much work experience.
In that case, tests might be more useful,
because it's supplying information in a systematic way that doesn't show up in any resume.
And sort of the idea that there are personality features and knowledge features that
suggest the person will be a promising young employee.
By contrast, if somebody has an extensive resume with expertise being reflected,
they're a neurosurgeon, for example,
perhaps a game attempting to measure job knowledge is not necessary.
And then there are cases where the test isn't perfect,
but it's trying to measure these characteristics that are honestly hard to get at
through tests or through any other way, like creativity.
And so maybe those tests are used for developmental purposes and not for selection necessarily,
not for saying you will have a problem in this job,
but rather to say, here's what this game is indicating.
And as a developmental conversation, how do you see that informing what you should be doing next
to work on your creativity?
And so to the extent those scores have some reliability and validity, then that's a worthwhile
conversation because the test is actually diagnosing something about creativity.
Creativity is a hard thing to measure, by the way.
It sometimes depends on your prior knowledge.
So you need to, you know, it's not necessarily open-ended, in other words.
Some types of creativity are where you just let your mind go and think about whatever pops into mind.
But in other cases, it's where, you know, luck favors the well-prepared in a sense, that
you have a lot of knowledge in your head and then you're asked to create and you have all
these knowledge frameworks that assist in your creativity.
So I will say it from the get-go that that's a tough construct to measure with tests or otherwise.
That's a topic that we're going to be exploring in a future podcast, so listeners, stay tuned because we will come back and talk to some experts about creativity, which is really fascinating.
So what would you say to a job applicant who might be uncomfortable with the idea of an algorithm instead of a person evaluating their job application?
Yeah, that's a great question. I think, you know, AI is as much a cultural feature as it is technology, and I
think, you know, we're trying to figure out as researchers and as people who are
subjected to these technologies, how is that going to change the world of work? You know,
will it be better for employees? Will it be better for organizations? How will it work when
AI is more, even more infused into organizations and perhaps becomes a given
in how selection happens. That'll be a different world than what we're experiencing today,
where we don't see, we see the entry point. We see AI entering the area of selection and
continuing to grow. And as this is happening, we do see, you know, those going through those
experiences as applicants are clearly talking and probably tweeting and getting a sense
of how it works. You know, state legislation keeps arising in response to these technologies to
try and ensure the tests are measuring reliable characteristics of applicants, job relevant
characteristics that are important at the point of selection, not just relevant for the job,
but relevant at the point of selection. I think that's important to emphasize. And not invasive
and, you know, legal and ethical and those things.
But the devil is in the details there.
Reading these state bills or, you know, regulations,
the language is open-ended because the technology is still being developed
and we need to figure out what are appropriate uses
and what are inappropriate uses.
You know, video capture seems to be most
problematic, to try and infer employee characteristics from their features, you know, how they look or
what the emotions on their face are and so on. That seems very problematic and it is.
It's not like, it's not like the algorithm did detect highly relevant job features and, and we should be
using it. There doesn't seem to be evidence for that. So the evidence, lack of evidence,
is also a concern in that area.
But, you know, I think we will wait and see in terms of how people will, it will evolve,
I guess would be my answer, just like self-driving cars, right, have barely happened yet.
There are some very isolated instances of self-driving cars being in the world, but once they
become more commonplace, we'll have an entirely different phenomenon on our hands.
I think there's also the applicant perceptions of the hiring process that will evolve as much as the decisions that are made from AI will evolve.
So in other words, as you get more used to AI as a tool for selection, maybe people will see that humans supplementing selection becomes the add-on, the part that's valued, which would kind of turn the
tables on the whole thing.
So it'll be interesting to see how humans merge with AI as AI permeates selection.
An applicant might feel that the organization is not paying much attention to them by
farming it out to an AI to do selection on a broad scale.
That would be a negative.
But a positive could be that AI is treating
applicants fairly. I mean, imagine AI tools that are making every attempt to be fair and accurate
in ways that humans try to be. That would be a positive aspect of AI in the future as we
continue to develop tools that are effective.
So it won't do things like humans do, like prejudge people
based on their names, whether they're unusual or foreign-sounding. Or maybe you make a judgment
because, oh, this person went to my alma mater. I'm going to give her an interview.
Those sorts of things.
Yeah, those sorts of things that, right, we're supposed to be selecting on job relevant characteristics
and not irrelevancies like your name or your background.
And another aspect, I think, to consider is sometimes people think, well, tests are not any better
because they're deterministic.
And that's not true either.
Tests are a way of letting employers know what an applicant
looks like, what are their strengths and areas in need of development?
And that then gets considered with other pieces of information.
And again, the tenets of fairness apply to all aspects of the application.
So it's not like a test score determines your fate in these systems,
as much as we might put weight on these scores as we sometimes hear about
in the news, whether it's personality, the Persona series that you mentioned, or standardized testing
in college admissions. No admissions office is relying solely on test scores, I don't imagine.
Right. Less and less. Yeah. Some of them aren't relying on them at all anymore.
Yeah, exactly. And that's another conversation to have, I guess.
So the last question is kind of a big one. What are the major research questions that you think
still need to be answered and what are you studying now?
Wow, that is a big question.
Currently, I've been researching what products are out there in the world of AI and personnel
selection, what are vendors providing and what evidence is there for their effectiveness.
And we are not pretending to engage in a fully comprehensive survey of what's out there,
because it keeps changing even by the day.
But at least getting a sense of what is being offered,
what types of things are being offered
and what evidence is out there to support the technologies
that are being offered and sold, sometimes at high dollar.
And what do organizations get in return for their investment?
What do employers hope to get
and how does that align with what they might get
from these technologies?
So our research is looking there. Also, I've been interested in school-to-work transitions. So the question there is whether we can use some forms of detailed data collection or perhaps machine learning to look at the transitions that people make from their education into the workplace. And it might be a four-year
college, it might be a community college, it may or may not be a degree, it may be a certification,
or people are just, people are taking courses relevant to their interests and then getting jobs.
So what do those transitions look like and how do algorithms capture those regularities so we can
understand them? But then also, how can we intervene on that to make sure that people are
seeking meaningful employment and steady employment in the world of work in a way that fits
their capabilities and also their desires to grow in their field. Easier said than done,
that's a tall order, but we're using data to inform those efforts, but then also connecting with
educators and policymakers to see what they do with those findings. So it's one
thing to reflect on the way things have been through the data, but it's another thing then to take
that information and say, well, where are we going? What are the new directions that we should
be headed in that the data alone don't tell us? So that's been pretty interesting to me and
hopefully impactful to a range of folks who perhaps like me didn't know what they were going to do
when they grew up and they kind of fumble their way through the world.
You know, I've always been an advocate of even giving a little information can go a long way
to open people's eyes to the possibilities.
Well, this has been very interesting.
Thank you for joining us, Dr. Oswald.
I've enjoyed speaking with you today.
Yeah, thank you very much, Kim.
This has been really great.
And more importantly, I hope listeners find it interesting
and we keep our gears turning on these issues.
You can find previous episodes of Speaking of Psychology at speakingofpsychology.org
or on Apple, Stitcher, or wherever you get your podcasts.
If you have comments or ideas for future podcasts,
you can email us at speakingofpsychology@apa.org.
That's Speaking of Psychology, all one word, at APA.org.
Speaking of Psychology is produced by Lea Winerman.
Our sound editor is Chris Condayan.
Thank you for listening.
For the American Psychological Association, I'm Kim Mills.
