Speaking of Psychology - Big Data (SOP59)
Episode Date: June 20, 2018
Social physics is the idea of using statistics to quantify and manage change in culture. This idea inspired the modern national census, but the difficulty of acquiring data limited what could be accomplished. Today’s technology produces a continuous trail of digital breadcrumbs that allows human behavior to be examined even in complex natural environments. Alexander “Sandy” Pentland, PhD, discusses how large-scale studies can be used to predict and shape a wide range of important common behaviors.
Transcript
Hello and welcome to Speaking of Psychology, a podcast produced by the American Psychological Association.
I'm your host, Kim Mills.
Speaking of Psychology is a podcast for anyone with an interest in the science of psychology.
We talk to psychological researchers, practitioners, and educators about any and every aspect of psychology and its application to the world around us.
Dr. Alex Pentland, also known as Sandy, is a professor of media arts and sciences at MIT,
where he directs the MIT Connection Science and Human Dynamics Labs.
He is one of the most cited scientists in the world,
and Forbes magazine recently named him one of the seven most powerful data scientists in the world.
Dr. Pentland's research focuses on social physics, big data, and privacy.
His research helps people better understand the physics
of their social environment and helps individuals, companies, and communities reinvent
themselves to be safer, more productive, and more creative.
Thank you for speaking with APA today.
My pleasure.
So let's start with a definition.
Social physics, what do you mean by that?
Well, social physics is a word or phrase that's two centuries old.
In the beginning of the 1800s, alchemy turned into chemistry in an attempt to clean up
its act, and natural science turned into physics.
And there was this dream of doing something similar with understanding culture.
And in France in particular, there was the idea of using data and statistics to really understand
the progression of culture.
And that spread to England very early on.
And it's why we have the modern census, which asks all sorts of questions.
But until just recently, it's been very difficult to
do very much because it's been very expensive to get data. You had to have surveys and so forth.
And statistics were really not that powerful. In the last decade or so, though, you've seen
this sort of floodgate of data becoming available from cell phones and credit cards and government
and everything's digital. And then very powerful machine learning methods that let you do statistics
that were sort of unimaginable before. And that's social physics. It's the coming together
of these new technologies to better understand ourselves.
So one of the things that I think you have looked into
is electronic interaction among people
and what that does for creativity.
Can you talk about what you've found in that area?
Well, we've looked at many different sort of situations,
you know, in schools and companies,
just sort of out in the wild, as it were.
And we do something unusual,
which is we measure physical
interaction along with the electronic interaction. So we actually build little name
badges that keep account of, you know, who talked to who, where, things like that.
And sometimes we ask people to put a little software on their phone so that we can
keep track of, you know, who actually talked to each other on the telephone or who
was nearby each other. And the sort of first thing to notice is that the electronic
stuff is not as powerful as the face-to-face stuff. If you're talking about patterns that
are predictive of life outcomes, of happiness, of social health.
It's the face-to-face stuff that just beats it hands down.
That's not to say it's not important.
It can sort of bias things in various ways and get people hopped up in various things.
But it's the face-to-face that reinforces and makes it catch.
A way to think about that is that you can have a rumor on the Internet,
but it's when you check it out with your neighbors and they go, yeah, I heard that too,
that it really begins to sort of dig into your mind and you begin to take it seriously.
Are you finding that because of all of this electronic communication capability that we have today,
that there's less face-to-face interaction?
Well, it's certainly true among younger people.
You know, I have kids and they spend an inordinate amount of time in their bedrooms, you know,
doing various sort of stuff.
And then there's a whole sort of culture of keeping kids really,
really, really safe. So there's the free-range kids movement in reaction to that.
Whereas when you and I were young, we sort of went out and did all sorts of stuff and maybe for good and maybe for bad.
Good thing our parents don't know what we were doing.
Yeah, well, I'm sure they didn't. But it's not actually clear which is right.
Certainly there are things that we did in our generation that weren't good for us or for society.
We got ourselves in trouble in various ways.
And my kids, kids at this age, don't experience that.
Of course, they substitute these online things, which are of questionable quality,
and it tends to be a lot more vicious, actually, in various ways,
because it has this sense of anonymity.
So that's a trade-off.
On the other hand, I think teenage pregnancies have gone down.
So there is an upside here.
Yeah, okay.
So, you know, the thing I worry about is something I used to joke about with my wife: she'd want the kids to be clean and everything to be clean.
And I said, no, no, let them eat dirt.
Because it's when you have these negative experiences in a sort of controlled way that you build robustness and the ability to deal with these things.
And so I worry that what they call snowflakeness, right, comes from the lack of experience in uncomfortable situations.
They're not able to deal with it as well as you could, and you can fall apart.
And that means that when you get out in the real world as an adult, you're going to have some trauma that you might have toughened yourself up against when you were a kid.
It's a trade-off.
So you spoke earlier about the availability of vast amounts of data right now.
What kinds of data sets are there out there that we haven't really looked at yet?
What can we do with them?
Well, everybody focuses on Twitter and Facebook and things like that.
But those are your public expression.
They're not what you really do.
And in fact, what you express, the face you
put on, is often only distantly related to what you actually feel and what you actually
do.
The sort of data sets that are also out there increasingly are things like location from
your cell phone.
Where are you spending time?
Who else is around?
Or what do you buy from your credit card?
Or what sort of public transportation do you use, or Uber or things like that?
And then there's all the government data about you.
Everything's digital these days.
The result is you can get a very, very detailed picture of somebody's actual behavior.
Not what they say, not their public face, but what they actually do in a way that was
just inconceivable a decade ago.
Most of that data is legally protected.
It's siloed away fairly safely, actually.
You don't hear a lot about it.
But inevitably it will begin to come together in various ways.
One of the ways that's important is in research.
We don't really have this sort of picture of human existence of day-to-day life,
of our social health, of our communities that we'd like to have.
Most of our data comes from little laboratory experiments or questionnaires,
and it's actually very biased, very limited, very expensive.
And so you can imagine that you could make some of this data available
in an anonymous, you know, contractually controlled sort of way,
where you can make meters, you know,
of happiness, of risk, of stress for communities
pretty much in real time.
And in fact, the experiments we've done show that that's not only possible,
but it's practical,
enough so that in the last couple of years,
I and some of my compatriots were able to convince the United Nations
to put into the Sustainable Development
Goals, their 15-year goals, the notion of actually measuring things like happiness and
inequality and sustainability using all of those sort of data sources.
Nobody knows quite what they're going to have to do to do that because most of those
data are held by companies.
But there's ideas like making a data tax.
Why doesn't the telephone company report sort of aggregate statistics to the government
where everybody can see it, that would let us see,
is this neighborhood a ghetto,
or is this neighborhood nicely integrated with the rest of society?
Turns out integration of a community with society
is enormously predictive of outcomes,
of whether the kids grow up happy or grow up at all.
But how could you tell that from phone records?
Don't we have other data that tell us those things?
Other than phone records, mobility data,
like you might get from transportation,
or if everybody's using credit cards,
if you're in a place where that's true,
those are the sources of what people actually do,
whether they actually spend time with other people
or whether they talk to each other within the community.
You can go do surveys all you want.
What you're going to get is, oh, yes, we talk to everybody.
Sure, sure.
There's essentially no information at all.
But when you look at things like,
well, what's the pattern
of calls within the neighborhood and outside of the neighborhood?
Not anybody, not any individual, just a neighborhood.
You can predict things like crime rate very accurately.
You can predict GDP very accurately.
You can predict life expectancy.
You can even predict infant mortality with great accuracy
by the sort of social connectedness of the community.
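As a rough illustration of the kind of neighborhood-level call pattern described above, the following Python sketch computes, for each neighborhood, the share of calls that go outside it. This is not the method used in Dr. Pentland's studies; the record layout, the neighborhood names, the counts, and the function name `outside_call_share` are all assumptions made up for the example. It works only on aggregate totals, never on individual callers.

```python
from collections import defaultdict

# Hypothetical aggregate records: (caller_neighborhood, callee_neighborhood, call_count).
# These are made-up neighborhood-level totals; no individual-level data is involved.
CALLS = [
    ("north", "north", 800),
    ("north", "south", 150),
    ("north", "east", 50),
    ("south", "south", 950),
    ("south", "north", 50),
]


def outside_call_share(records):
    """Return, per originating neighborhood, the fraction of calls that leave it."""
    inside = defaultdict(int)  # calls that stay within the neighborhood
    total = defaultdict(int)   # all calls originating in the neighborhood
    for origin, destination, count in records:
        total[origin] += count
        if origin == destination:
            inside[origin] += count
    # Skip neighborhoods with no calls to avoid dividing by zero.
    return {n: (total[n] - inside[n]) / total[n] for n in total if total[n] > 0}


shares = outside_call_share(CALLS)
for neighborhood, share in sorted(shares.items()):
    print(f"{neighborhood}: {share:.2f} of calls go outside the neighborhood")
```

A low share would suggest a more isolated neighborhood and a high share a more integrated one. Turning such a number into a prediction of crime rate or life expectancy would require a statistical model fitted to real outcome data, which this sketch does not attempt.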
So our government has a lot of data about us
that they are not doing anything with?
Yeah, that's, just recently there's been a lot of hoo-ha
about Facebook and all that.
Okay, yeah, that's all valid.
But the other side of the coin is
the government has an enormous amount of data about us
that's not being used to understand where there are problems.
So an example would be sort of the broken windows policy
of, you know, 15 years ago, I guess now.
There was a sort of initial assessment that looked good, so people implemented it everywhere in the country.
Now, the government had lots of data that could have shown that it wasn't working the way people thought.
But nobody thought to look.
They didn't want to release it.
It wasn't their job, something.
And the result is we got a decade of broken windows policy where we never looked at whether it was working.
And you can see today, there's a lot of distress about that.
You know, people are mad.
And the question to me is, why doesn't the government make data that it has available
for research, you know, in a controlled and safe way, so there's no questions about privacy,
but in a way that allows public debate and scientific evaluation of whether the government
and its policies are working?
So I think that the sort of core thing is the government's very happy
to not have people know whether the government is working or not.
They're not so much into this transparency and accountability thing.
So they just don't really put much effort into releasing the data.
Have you tried asking for some of this data?
I mean, how would you unlock it from the government?
Do you have to do a Freedom of Information Act request,
or are there other avenues that you could use?
No, the thing that happens, and this happens in health also,
is they make a defense that's based on
privacy.
Right.
Okay.
But that's ridiculous.
You know, those same people that say, oh, I can't share this with you because of privacy,
take that same data and they give it to commercial subcontractors to optimize their
processes, to make more money, to do whatever they do.
And it works just fine in the sense that you don't get data breaches, you don't get
violations of privacy.
You take the data, you strip out the identifiers, you give it to somebody for a particular
purpose under a contract that makes them liable for screwing up, unlike, say, Equifax or somebody.
Right?
So make them liable so they're on the hook, and then you monitor them.
Works fine.
Why don't we do that so that we can check on the government?
Yeah?
It would be sort of interesting to know whether some of these policies are really working or not.
Recently there were some very brave people within the IRS that made data available to researchers like that.
And it's remarkable, 30 years of data from the IRS about the outcomes of different minorities across the entire country
and what conditions lead to intergenerational mobility.
And that data just absolutely puts the nail in a lot of theories from both the left and the right.
And it's interesting because, you know, they've had that data for generations.
So you stand back and you say, okay, so they have this data that could have, you know, kept millions,
literally millions, maybe tens of millions of lives from jail, other sorts of suffering.
And they didn't let anybody look at it.
Doesn't that seem wrong?
I think it's very wrong.
So what this is is sort of,
what I'm saying is we need to bring science
to the data that we already have.
We need to do it in a controlled way
that preserves privacy and so forth, of course.
But we need a lot more accountability and transparency
using the data that already exists.
And I think that people are spending too much
time yelling about, oh, you know, Facebook is doing whatever, and ignoring the big thing,
which is the government's not letting us figure out what's actually going on.
So what are you working on now? What's next for you in this arena?
Well, I run a group that does a couple things. We do, actually we build software and things
to help do this sort of release of data. Interestingly,
we're funded by, like, the government of France, not the government of the United States.
That's a little interesting.
They build in systems to let them have better accountability for their data.
And then we do analytics.
And the typical thing that we're focused on is what are the conditions that help communities grow and become more innovative and more successful?
And those are things, it turns out, some of the biggest effects are, not surprisingly, diversity, access and integration with the rest of the community.
One of the biggest things that's happened in this country, but also in Europe and elsewhere, is a mass segregation based on income.
Now, there's racial segregation, too.
I'm not saying there isn't.
But the new thing on the block is by income,
by SES. So the poor folks live over there and the rich. And our data show that they do not mix at all,
except maybe in Walmart. Seriously, Walmart, places like that, are one of the few places
where rich people and poor people actually rub shoulders occasionally. The consequence, of course,
is that they don't talk to each other, they don't understand each other,
and a lot of the things we do can be very hurtful to poor people
because there's no understanding of the context of what's happening.
So those are the sorts of things we do.
Again, it's interesting who funds that.
So it's people like European governments
and some of the big financial firms
are interested in that sort of thing.
What's different about the European governments that they're interested in this and ours is not?
Well, the thing that's really different in Europe is that they've taken steps to enforce privacy law.
And this is something that I was integral to at the very beginning.
I ran the discussion in Davos, or co-led the discussion in Davos that led to creation of the privacy laws there.
So they're wrestling with the fact that a lot of business as usual has to stop.
It has to change.
And they have to understand how it is that you can be more respectful about people's control
of their own data and still have a working society.
And the hope is that by sort of getting ahead of this curve having to do with privacy and so
forth that they can set the standard for the world and, you know, their companies will do well.
And that, you know, their hope is that companies like Facebook
will retreat because of their poor handling of private data.
And some of their companies in Europe will advance
because they're more friendly to people's actual interests.
This is a little out of left field,
but I think I read that some years back you were involved
in the development of Google Glass.
Yeah.
And I just wanted to ask you a little bit about that
because it kind of came and went.
Everybody thought it was going to be the new, new thing,
and it didn't work out too well.
Why not?
Well, almost 25 years ago now,
we really broke into the area of wearable computing
and tried to build our own Google Glass and stuff.
And one of my students, Thad Starner,
who's now at Georgia Tech,
went on from that experience
and actually was one of the technical leads for Google Glass.
It wasn't me. I like to call myself the godfather
and not the father.
I think what it was is it was early to market, and it was a good example of the geeks thinking that everybody was a geek.
It's a good idea.
It's really cool for, you know, like surgeons and people building things where you need to see the plans while your hands are busy.
Right, right.
The thing that they did that was probably the biggest mistake is stick a camera in it at the beginning before people were used to it.
Without that, it might have been very successful.
People will keep trying it.
Eventually, we're going to have little displays in our glasses.
Yeah, it's coming.
And it'll be good because it's going to result in not forgetting people's names,
not getting lost.
All of us somewhat older people are going to have better memories than we might otherwise have had, etc.
Sounds great.
Well, thank you, Dr. Pentland.
Thank you for joining us today.
Speaking of Psychology is part of the APA podcast network, which includes other great podcasts such as APA Journals Dialogue, about the latest and most exciting psychological research, and Progress Notes, which discusses the practice of psychology.
You can find all APA podcasts on iTunes, Stitcher, or wherever you get your podcasts.
You can also go to our website, SpeakingofPsychology.org, to listen to more episodes and see more resources on the topics we discuss.
I'm Kim Mills with the American Psychological Association, and this is Speaking of Psychology.
