Front Burner - Millions exposed by 23andMe breach
Episode Date: December 8, 2023
Genetic testing company 23andMe says attackers were able to gain access to the profiles of nearly 7 million of its users. What kind of information was exposed? How did hackers try to sell the info? ... What broader and future concerns do experts have about sending DNA to services like 23andMe? Jason Koebler is a co-founder of the independent tech website, 404Media.co. For transcripts of Front Burner, please visit: https://www.cbc.ca/radio/frontburner/transcripts Transcripts of each episode will be made available by the next workday.
Transcript
In the Dragon's Den, a simple pitch can lead to a life-changing connection.
Watch new episodes of Dragon's Den free on CBC Gem. Brought to you in part by National
Angel Capital Organization, empowering Canada's entrepreneurs through angel
investment and industry connections. This is a CBC Podcast.
Hi, I'm Damon Fairless.
This Christmas, some Canadians will be wrapping up colorful boxes filled with spit tubes to put under their trees.
They're DNA test kits from 23andMe.
And depending on which one you get, the company says it can tell you which part of the world your ancestors come from, who your possible relatives are, and even flag genetic traits and
potential health conditions. Kenzie's test and me being able to find out that I was BRCA positive
was life-saving. The holidays wouldn't be worth celebrating without my family.
It's personal stuff. So it wasn't exactly comforting to 23andMe customers in October when the company said a breach had
exposed the accounts of a small portion of its users. Well, we're now learning that that leak
was much bigger than originally reported. This week, the company said that information from almost 7 million people was exposed. News of the breach has even prompted
a proposed Canadian class action lawsuit against the company. Jason Koebler is a co-founder of the
independent, journalist-owned tech website 404media.co, and he's going to tell us about
the breach at 23andMe, and also about the broader risks of giving our genetic info to services like this.
Hey, Jason, thanks so much for coming on Front Burner.
Hey, thanks for having me.
Okay, so I want to start at the beginning of October when these posts about 23andMe
appear on a website called
BreachForums. So let's just start off with what is BreachForums? What kind of stuff gets posted
there?
Yeah, BreachForums, like the name suggests, is where hackers talk about data breaches that
they have either perpetrated or that they are interested in getting in on the action for.
And BreachForums is one of the larger ones.
So in October, we start seeing on BreachForums posts about 23andMe.
What's going on there?
Right.
So this user named Golem goes on BreachForums and says that they have collected, quote, one million data points about Ashkenazi Jews and people of Chinese descent. And then they later,
meaning a day or two later, start advertising 23andMe profiles for between $1 and $10 per account.
So they started offering to sell these accounts en masse.
So, I mean, I know that there are certain genetic disorders associated with Ashkenazi genetic heritage,
but why was this person interested in Ashkenazi Jews and their genetic information?
Yeah, I mean, I think very broadly the post repeated some anti-Semitic tropes about Jewish people controlling the world,
and the post said, you know, I have data about very powerful people
and that sort of thing.
So, I mean, it was a pretty unpleasant situation.
A lot of the things that happen on hacking forums
are pretty gnarly.
But, you know, in this case,
it seemed like a targeted attack
on people of Jewish background.
So after that, am I right that this poster, Golem, dumped even more apparent
profile data on the forum?
Yeah. So later they started posting entries, like individual entries
of data that they had allegedly stolen for people like Mark Zuckerberg, Elon Musk, Sergey Brin,
who was one of the co-founders of Google. And the information contained had like name, sex, birth year, broad location.
And then it had these fields that are called Y-DNA and N-DNA.
To be totally honest, I don't know exactly what those mean,
but that had something to do with the apparent genetic background of the people whose data had been
stolen. It's kind of astounding to think that you've got people like Zuckerberg and Elon Musk.
Did the data seem legit? Did this seem like it was really their profiles?
So that's the thing. I mean, there are a lot of scammers on hacking forums. That's sort of another
core feature of a lot of hacking forums where hackers will take already public info from
previous data breaches and sort of repackage it and say that they have new data. Sometimes the
data is completely made up. There was some initial reporting about it, but no one knew for sure at
first whether it was real information. And I think more importantly, no one knew whether 23andMe itself had been breached.
So but leaving some of these individual celebrity profiles aside, it did become clear after
some of these initial data dumps that at least some of that data came from 23andMe.
Is that right?
Yes.
Pretty quickly, it was confirmed that this information had come from 23andMe, but 23andMe said that it had not been hacked.
People started theorizing that the data had been stolen using this method known as credential stuffing, which is very, very common.
It is basically when a hacker takes usernames, emails, and passwords from previous data breaches of other companies and tries those same usernames
and passwords on other websites. And this is why security experts say, you know, always use unique
passwords for different services. It became clear pretty quickly that that's what was happening
here, that there was a credential stuffing attack targeted at 23andMe users specifically.
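The password-reuse problem described above can be illustrated with a small defensive audit. This is only a toy sketch: the function names, the email addresses, and the passwords are all hypothetical, and a real service would not hold plaintext passwords the way this example does. It shows that an account is at risk only when the exact email and password pair already leaked somewhere else.

```python
import hashlib


def fingerprint(email: str, password: str) -> str:
    """Hash an email/password pair so the audit compares digests, not plaintext."""
    normalized = email.strip().lower() + ":" + password
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def accounts_at_risk(site_accounts: dict[str, str], breached_pairs: set[str]) -> list[str]:
    """Return emails whose exact email/password pair appears in an earlier breach."""
    return sorted(
        email
        for email, password in site_accounts.items()
        if fingerprint(email, password) in breached_pairs
    )


# Hypothetical credentials leaked from an unrelated site.
breached_pairs = {
    fingerprint("alice@example.com", "hunter2"),
    fingerprint("bob@example.com", "letmein"),
}

# Hypothetical accounts on the targeted site (plaintext only for illustration).
site_accounts = {
    "alice@example.com": "hunter2",          # reused the breached password
    "bob@example.com": "x9!Tq#unique-pass",  # unique password on this site
    "carol@example.com": "hunter2",          # this email/password pair never leaked
}

print(accounts_at_risk(site_accounts, breached_pairs))  # ['alice@example.com']
```

Only the account that reused a leaked pair is flagged, which is why a unique password per service blunts this kind of attack.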
Tonight, growing questions after a first-of-its-kind data breach targeting genealogy site 23andMe.
23andMe confirmed profile information was taken that includes usernames, passwords, gender, photo,
relatives in common, and the percentage of
DNA you share with them. The hackers could gain access to anyone who opted to share information
with potential relatives, even if those people did have secure passwords and two-factor authentication.
So it looks like the data breach is basically this kind of brute force hack where they're recycling old passwords that have been leaked out in other leaks, as I understand it.
But part of that information is this opt-in feature on 23andMe that apparently allows much more data to be accessed.
It's called DNA relatives.
So what is the DNA relatives option and what was it designed to let
users do? Yeah, so one of the core reasons that people use sites like 23andMe is because they
want to see who they might be related to. You know, there's a lot of sort of heartwarming stories of
people who have been able to find their adopted parents, parents who gave a child up for adoption have been able to find their children.
And so there's this opt-in feature that many, many users do opt into on 23andMe called DNA Relatives, where it will show you people who you might be related to.
And it sort of tells you how likely it is that you are related to these people.
And so the hacker started releasing information about DNA relatives. And it wasn't clear at first how many accounts they
had actually broken into, but they started posting evidence that they had millions of records from
23andMe. So last Friday, 23andMe updated their filing with the U.S.
Securities and Exchange Commission, laying out that 14,000 accounts had been directly accessed.
And as I understand, that's just a small fraction of its accounts, something like 0.1%.
But then after that, 23andMe gave a different number, right? What are they saying now about
how many people's personal data has been shared? Just this week, 23andMe said that there were 6.9 million other users who had data stolen in the breach, but whose accounts were not hacked, if that makes sense.
So what happened is of these 14,000 customers who had their accounts hacked, the hackers then stayed in those people's accounts and started
scraping all of this information from this DNA relatives feature. And there's evidence that they
were sort of in many of these accounts for months. I don't know how long it necessarily took to
compile this database, but they were able to make a database of 6.9 million other users who had
data stolen, even though their own accounts
were not actually hacked. So let's dig into exactly what the data we're talking about here
is. What are the different levels of information that various users have had exposed? So as I
understand, you know, if you are able to hack into someone's 23andMe account, you have access to anything that they've given 23andMe. So for those 14,000 people who were reusing passwords and had their accounts hacked
into, anything that they gave 23andMe was accessible. For these other users, these users
who had their data stolen through the DNA relatives feature, they had things like their name, their birth year,
something that 23andMe calls relationship labels,
and then the percentage of DNA that they shared with different relatives,
as well as something called ancestry reports and their self-reported location.
There's this other feature called Family Tree that's on 23andMe, and 1.4 million
people also had their data stolen through that.
And this includes very similar information.
So I think that's why this hack is so interesting.
It's because even though specific genetic data was not stolen, like people's entire
genomes were not downloaded. Because of the nature
of the product, people use this to form connections and DNA is specifically what relates us together
as humans. It created this sort of chain reaction situation where from only 14,000 users, they were
then able to get information on, you know, almost 7 million total individuals.
And that is pretty novel for a hack.
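The chain reaction described here can be sketched as a lookup over a relatives graph. The account names and the graph below are invented for illustration, and the code makes no claim about how 23andMe stores relationships or how the attackers actually scraped them. The only real figures are the two quoted in the episode: 14,000 directly accessed accounts and 6.9 million other users exposed.

```python
def exposed_profiles(relatives: dict[str, set[str]], compromised: set[str]) -> set[str]:
    """Profiles visible from compromised accounts that were not themselves compromised."""
    visible: set[str] = set()
    for account in compromised:
        visible |= relatives.get(account, set())
    return visible - compromised


# Hypothetical opt-in relatives lists: each account can see its matches.
relatives = {
    "acct_a": {"p1", "p2", "p3"},
    "acct_b": {"p3", "p4"},
    "p1": {"acct_a"},
}
compromised = {"acct_a", "acct_b"}

print(sorted(exposed_profiles(relatives, compromised)))  # ['p1', 'p2', 'p3', 'p4']

# Figures quoted in the episode.
directly_accessed = 14_000
exposed_via_relatives = 6_900_000
average_exposed_per_account = round(exposed_via_relatives / directly_accessed)
print(average_exposed_per_account)  # 493
```

Two compromised accounts expose four other profiles in the toy graph. With the episode's numbers, each breached account exposed roughly 493 other users on average, though the real distribution per account is not something the episode reports.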
Normally, if you break into someone's email or account, you can maybe find out information about other people depending on what sort of account it is.
But normally you can't
sort of use that access to then get family relationships, location, ethnicity, things like that. So 23andMe says this information was exposed by this credential stuffing tactic,
but some 23andMe users are chafing at this, right? So there's a lot we don't know,
but I'm thinking in particular here of a U.S. cybersecurity official who's been posting on X. Can you tell me what he's been saying?
Yeah. So Rob Joyce is a former NSA official who says that some of his data was exposed in this
breach. And he says, you know, I'm not sure how this happened because I used a completely unique
email address for 23andMe, and yet my data was still exposed somehow.
And it turns out his information was part of another breach of this company called MyHeritage
back in 2018. And 23andMe partnered with MyHeritage at some point to provide this family tree capability. And so in this case, users who took every possible step to protect their information
also had data stolen because of these sort of like internal connections between these
genetic database companies.
Hi, it's Ramit Sethi here. You may have seen my money show on Netflix.
I've been talking about money for 20 years. I've talked to millions of people and I have some
startling numbers to share with you. Did you know that of the people I speak to, 50% of them do not
know their own household income? That's not a typo. 50 percent. That's
because money is confusing. In my new book and podcast, Money for Couples, I help you and your
partner create a financial vision together. To listen to this podcast, just search for Money
for Couples.
Beyond just this case, I want to talk a little more broadly about what we're dealing with when it comes to 23andMe and other genetic services.
So, you know, we should be fair to 23andMe.
They are offering a service that a lot of people are, you know, crazy about.
So maybe we can kind of talk about what is the service it's providing and why do people like it?
Yeah, so I think that a lot of people started using 23andMe because they were interested to learn like, what is my heritage? You know, what percentage of my DNA comes from different geographic parts of the world? And
a lot of people are very interested in that just from a curiosity standpoint. But then there is
very real potential for personalized medicine based on people's DNA, developing pharmaceuticals that work specifically
for people who have specific genetic markers.
23andMe has started doing some of that,
which is, I find it to be a little bit problematic
because they've partnered with Big Pharma,
but that concept in general shows a lot of promise.
And then a lot of people have used it
to find their adopted,
either their distant relatives or their adopted parents, their children that they gave up for adoption.
And it's frankly changed a lot of people's lives. You know, I wrote an article about 23andMe and a
lot of people said, you know, I would make this trade where hackers steal my data every day of
the week because I was able to find my adopted parent.
And that's very powerful. At the same time, I will also say that 23andMe, it's not great that
they got hacked, or rather that they sort of allowed this hack to occur because there were
steps that 23andMe could have taken to prevent credential stuffing at this scale. Things like requiring two-factor authentication,
things like rate-limiting logins
where you're not able to steal this many email addresses
without some sort of automated system
that's spamming passwords into the system.
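One way to picture the rate limiting mentioned here is a sliding-window counter on login attempts. This is a minimal sketch under stated assumptions, not a description of 23andMe's systems: the class name, the limit of three attempts per 60 seconds, and the client identifier (a documentation-range IP address) are all hypothetical.

```python
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most max_attempts login attempts per client within a sliding window."""

    def __init__(self, max_attempts: int, window_seconds: float):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # client_id -> timestamps of allowed attempts

    def allow(self, client_id: str, now: float) -> bool:
        attempts = self._attempts[client_id]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # too many recent attempts: reject this one
        attempts.append(now)
        return True


limiter = LoginRateLimiter(max_attempts=3, window_seconds=60.0)
results = [limiter.allow("203.0.113.7", now=t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
print(results)  # [True, True, True, False, True]
```

The fourth attempt is rejected because three attempts already fall inside the window, and the client is allowed again once older attempts expire. A limiter keyed on a single client is only a partial defence, since automated attacks can spread attempts across many addresses, which is one reason it is usually paired with measures like two-factor authentication.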
That said, they have relatively good privacy practices
where they do encrypt the genome
that they ultimately sequence.
And so no genetic data was stolen in this case. They've also pushed back against law enforcement
requests for data about their customers, which I think is really important. That said, I'm 35.
My DNA has been the same since the day I was born, and it's going to be the same on the day that I
die. I don't know what 23andMe is going to look like in five years, in 10 years, in 20 years.
And they're changing their policies all the time.
They changed their privacy policy and their terms of service this week as a response to the hack.
And so there are very real concerns, I think, with giving your genetic data to a company like 23andMe.
And I should just point out that 23andMe is now requiring two-factor authentication after this breach. Now, and I'm asking this in no way making a comparison with 23andMe, but you've actually
written about a fairly disconcerting example of just what can happen with genetic databases.
And I'm thinking here about a database called GEDmatch. So can you take me through that? Yeah, I find this really disconcerting and really compelling,
actually. So GEDmatch is a genetic database that is open source, or was founded to be open source,
meaning it was public. People can go in there and sort of grab genetic data sort of willy-nilly.
It was founded out of a Florida basement.
It was kind of like a DIY operation.
There were two people running it.
It was all volunteers. And when it launched in 2010, it pitched itself as being for, quote, amateur and professional researchers and genealogists.
Its privacy policy said,
we will never sell your information.
Well, eight years later,
it helped cops crack the Golden State Killer
serial killer case, which is huge news.
Tonight, hiding in plain sight.
Police say one of the most elusive serial killers
in American history has been captured
outside his suburban home.
But DeAngelo, not a suspect until days ago when they got a break.
They say cutting edge DNA testing allowed them to make a match.
We were able to get some discarded DNA and we were able to confirm what we thought we already knew.
And this was particularly interesting because the killer himself had not uploaded his DNA into GEDmatch, but ... been leading the way there. There's been cases where they have broken their own terms of service
to help police in certain cases. And the use of this sort of genetic forensics is very common now.
I think my argument is once you put your DNA into a database like this, you are not able to easily get it out.
Your DNA does not change, and you can also be sort of implicating any of your relatives, your
descendants, etc. Which I guess brings me to what happened with GEDmatch next, which is that it was sold in 2019 to a for-profit
company called Verogen that works very closely with the FBI and a lot of other police departments
throughout North America. And then Verogen itself was sold last year to a Dutch conglomerate called
Qiagen, which is sort of a larger conglomerate that has forensics, healthcare,
biotech, and pharma arms. So in the last four years, there have been three different owners of
this company. Let's say a service did allow your genetics to get leaked,
your raw genetics are sold off.
What's your fear about how it could be used in the future?
And I guess I'm thinking here about something
that the Edmonton Police did here in Canada.
The Edmonton Police Service is admitting
the use of a new-to-Edmonton technology
to identify a sexual assault suspect was a mistake.
The result was a generic photo of a suspect that was criticized as stereotypical.
The photo, which CityNews has blurred, was created for Edmonton and shows a black male face
whose features were constructed by phenotyping.
So there's this company called Parabon, and they do this thing called DNA portraits.
And they work very closely with a variety of different law enforcement agencies.
And what they do is they take someone's genome and they generate what they believe to be an image of them.
So using only their DNA sequence, they then generate an image.
And that image is used for police sketches, essentially like a high-tech
version of a police sketch. The big issue with this is that there's not great research
that shows that someone's DNA correlates very closely with their modern-day appearance. I mean,
you know, some things are sort of genetic, like genes express themselves in a variety of
different ways.
Yeah, there's a huge range of possible expressions, right? That's the issue.
Of course, yes. And the concern here is they're going to generate images that look like, you know, someone who is innocent and that is going to be used, you know, they'll be arrested for a crime that they didn't commit.
I think that there are other potential dystopian uses of someone's raw
genetic code. I think that we do need to take sort of a long picture approach here because,
like I said, people's DNA doesn't change and it's also very similar to your relatives and
your descendants. So right now there are pretty strong protections, at least in the United States,
and I believe in Canada as well, that prevent health insurance companies from sort of like
raising someone's premiums because they have access to their genetic data. However, it's very,
very easy to imagine that regulation changing at some point and health insurance companies
raising someone's premium because they are more likely to
have cancer, more likely to have some other chronic illness. Jason, thank you so much.
It's been great talking to you. Yeah. And thank you so much for having me.
Before we go today, we need to make a correction to an error we made in an episode earlier this week.
In our story about the U.S. Supreme Court case involving Purdue Pharma
and the settlement for its role in the opioid crisis,
we made a reference to the Netflix movie Pain Hustlers, saying Purdue was depicted in that film.
Pain Hustlers was inspired by a different company that sold an opioid-based pain medication.
We regret the error.
That's all for today.
Front Burner was produced this week by Rafferty Baker, Shannon Higgins,
Joyita Sengupta, Lauren Donnelly, and Derek Vanderwyk.
Sound design was by Mackenzie Cameron, Sam McNulty, and Will Yar.
Music is by Joseph Shabason.
Our senior producer is Elaine Chau.
Our executive producer is Nick McCabe-Lokos.
And I'm Damon Fairless.
Thanks for listening.
Front Burner will be back on Monday.
For more CBC Podcasts, go to cbc.ca slash podcasts.