Front Burner - Millions exposed by 23andMe breach

Episode Date: December 8, 2023

Genetic testing company 23andMe says attackers were able to gain access to the profiles of nearly 7 million of its users. What kind of information was exposed? How did hackers try to sell the info? ... What broader and future concerns do experts have about sending DNA to services like 23andMe? Jason Koebler is a co-founder of the independent tech website, 404Media.co. For transcripts of Front Burner, please visit: https://www.cbc.ca/radio/frontburner/transcripts Transcripts of each episode will be made available by the next workday.

Transcript
Starting point is 00:00:00 In the Dragon's Den, a simple pitch can lead to a life-changing connection. Watch new episodes of Dragon's Den free on CBC Gem. Brought to you in part by National Angel Capital Organization, empowering Canada's entrepreneurs through angel investment and industry connections. This is a CBC Podcast. Hi, I'm Damon Fairless. This Christmas, some Canadians will be wrapping up colorful boxes filled with spit tubes to put under their trees. They're DNA test kits from 23andMe. And depending on which one you get, the company says it can tell you which part of the world your ancestors come from, who your possible relatives are, and even flag genetic traits and
Starting point is 00:00:51 potential health conditions. Kenzie's test and me being able to find out that I was BRCA positive was life-saving. The holidays wouldn't be worth celebrating without my family. It's personal stuff. So it wasn't exactly comforting to 23andMe customers in October when the company said a breach had exposed the accounts of a small portion of its users. Well, we're now learning that that leak was much bigger than originally reported. This week, the company said that information from almost 7 million people was exposed. News of the breach has even prompted a proposed Canadian class action lawsuit against the company. Jason Koebler is a co-founder of the independent, journalist-owned tech website 404media.co, and he's going to tell us about the breach at 23andMe, and also about the broader risks of giving our genetic info to services like this.
Starting point is 00:01:54 Hey, Jason, thanks so much for coming on Front Burner. Hey, thanks for having me. Okay, so I want to start at the beginning of October when these posts about 23andMe appear on a website called Breach Forums. So let's just start off with what is Breach Forums? What kind of stuff gets posted there? Yeah, Breach Forums, like the name suggests, is where hackers talk about data breaches that they have either perpetrated or that they are interested in getting in on the action for. And Breach Forums is one of the larger ones.
Starting point is 00:02:29 So in October, we start seeing on Breach Forums posts about 23andMe. What's going on there? Right. So this user named Golem goes on Breach Forums and says that they have collected, quote, one million data points about Ashkenazi Jews and people of Chinese descent. And then they later, meaning a day or two later, start advertising 23andMe profiles for between $1 and $10 per account. So they started offering to sell these accounts en masse. So, I mean, I know that there are certain genetic disorders associated with Ashkenazi genetic heritage, but why was this person interested in Ashkenazi Jews and their genetic information?
Starting point is 00:03:11 Yeah, I mean, I think very broadly the post repeated some anti-Semitic tropes about Jewish people controlling the world, and the post said, you know, I have data about very powerful people and that sort of thing. So, I mean, it was a pretty unpleasant situation. A lot of the things that happen on hacking forums are pretty gnarly. But, you know, in this case, it seemed like a targeted attack
Starting point is 00:03:38 on people of Jewish background. So after that, am I right that this poster, Golem, dumped even more apparent profile data on the forum? Yeah. So later they started posting entries, like individual entries of data that they had allegedly stolen for people like Mark Zuckerberg, Elon Musk, Sergey Brin, who was one of the co-founders of Google. And the information contained had like name, sex, birth year, broad location. And then it had these fields that are called Y-DNA and N-DNA. To be totally honest, I don't know exactly what those mean, but that had something to do with the apparent genetic background of the people whose data had been
Starting point is 00:04:25 stolen. It's kind of astounding to think that you've got people like Zuckerberg and Elon Musk. Did the data seem legit? Did this seem like it was really their profiles? So that's the thing. I mean, there are a lot of scammers on hacking forums. That's sort of another core feature of a lot of hacking forums where hackers will take already public info from previous data breaches and sort of repackage it and say that they have new data. Sometimes the data is completely made up. There was some initial reporting about it, but no one knew for sure at first whether it was real information. And I think more importantly, no one knew whether 23andMe itself had been breached. So but leaving some of these individual celebrity profiles aside, it did become clear after
Starting point is 00:05:13 some of these initial data dumps that at least some of that data came from 23andMe. Is that right? Yes. Pretty quickly, it was confirmed that this information had come from 23andMe, but 23andMe said that it had not been hacked. People started theorizing that the data had been stolen using this method known as credential stuffing, which is very, very common. It is basically when a hacker takes usernames, emails, and passwords from previous data breaches of other companies and tries those same usernames and passwords on other websites. And this is why security experts say, you know, always use unique passwords for different services. It became clear pretty quickly that that's what was happening
Starting point is 00:05:59 here, that there was a credential stuffing attack targeted at 23andMe users specifically. Tonight, drawing questions after a first-of-its-kind data breach targeting genealogy site 23andMe. 23andMe confirmed profile information was taken that includes usernames, passwords, gender, photo, relatives in common, and the percentage of DNA you share with them. The hackers could gain access to anyone who opted to share information with potential relatives, even if those people did have secure passwords and two-factor authentication. So it looks like the data breach is basically this kind of brute force hack where they're recycling old passwords that have been leaked out in other leaks, as I understand it. But part of that information is this opt-in feature on 23andMe that apparently allows much more data to be accessed.
Starting point is 00:07:00 It's called DNA relatives. So what is the DNA relatives option and what was it designed to let users do? Yeah, so one of the core reasons that people use sites like 23andMe is because they want to see who they might be related to. You know, there's a lot of sort of heartwarming stories of people who have been able to find their adopted parents, parents who gave a child up for adoption have been able to find their children. And so there's this opt-in feature that many, many users do opt into on 23andMe called DNA Relatives, where it will show you people who you might be related to. And it sort of tells you how likely it is that you are related to these people. And so the hacker started releasing information about DNA relatives. And it wasn't clear at first how many accounts they
Starting point is 00:07:55 had actually broken into, but they started posting evidence that they had millions of records from 23andMe. So last Friday, 23andMe updated their filing with the U.S. Securities and Exchange Commission, laying out that 14,000 accounts had been directly accessed. And as I understand, that's just a small fraction of its accounts, something like 0.1%. But then after that, 23andMe gave a different number, right? What are they saying now about how many people's personal data has been shared? Just this week, 23andMe said that there were 6.9 million other users who had data stolen in the breach, but whose accounts were not hacked, if that makes sense. So what happened is of these 14,000 customers who had their accounts hacked, the hackers then stayed in those people's accounts and started scraping all of this information from this DNA relatives feature. And there's evidence that they
Starting point is 00:08:52 were sort of in many of these accounts for months. I don't know how long it necessarily took to compile this database, but they were able to make a database of 6.9 million other users who had data stolen, even though their own accounts were not actually hacked. So let's dig into exactly what the data we're talking about here is. What are the different levels of information that various users have had exposed? So as I understand, you know, if you are able to hack into someone's 23andMe account, you have access to anything that they've given 23andMe. So for those 14,000 people who were reusing passwords and had their accounts hacked into, anything that they gave 23andMe was accessible. For these other users, these users who had their data stolen through the DNA relatives feature, they had things like their name, their birth year,
Starting point is 00:09:46 something that 23andMe calls relationship labels, and then the percentage of DNA that they shared with different relatives, as well as something called ancestry reports and their self-reported location. There's this other feature called Family Tree that's on 23andMe, and that also 1.4 million people had their data stolen through that. And this includes very similar information. So I think that's why this hack is so interesting. It's because even though specific genetic data was not stolen, like people's entire
Starting point is 00:10:23 genomes were not downloaded. Because of the nature of the product, people use this to form connections and DNA is specifically what relates us together as humans. It created this sort of chain reaction situation where from only 14,000 users, they were then able to get information on, you know, almost 7 million total individuals. And that is pretty novel for a hack. Normally, if you break into someone's email or account, you can maybe find out information about other people depending on what sort of account it is. But normally you can't sort of use that access to then get family relationships, location, ethnicity, things like that. So 23andMe says this information was exposed by this credential stuffing tactic,
Starting point is 00:11:17 but some 23andMe users are chafing at this, right? So there's a lot we don't know, but I'm thinking in particular here of a U.S. cybersecurity official who's been posting on X. Can you tell me what he's been saying? Yeah. So Rob Joyce is a former NSA official who says that some of his data was exposed in this breach. And he says, you know, I'm not sure how this happened because I used a completely unique email address for 23andMe, and yet my data was still exposed somehow. And it turns out his information was part of another breach of this company called MyHeritage back in 2018. And 23andMe partnered with MyHeritage at some point to provide this family tree capability. And so in this case, users who took every possible step to protect their information also had data stolen because of these sort of like internal connections between these
Starting point is 00:12:18 genetic database companies.
Starting point is 00:13:18 Beyond just this case, I want to talk a little more broadly about what we're dealing with when it comes to 23andMe and other genetic services. So, you know, we should be fair to 23andMe. They are offering a service that a lot of people are, you know, crazy about. So maybe we can kind of talk about what is the service it's providing and why do people like it? Yeah, so I think that a lot of people started using 23andMe because they were interested to learn like, what is my heritage? You know, what percentage of my DNA comes from different geographic parts of the world? And a lot of people are very interested in that just from a curiosity standpoint. But then there is very real potential for personalized medicine based on people's DNA, developing pharmaceuticals that work specifically for people who have specific genetic markers.
Starting point is 00:14:10 23andMe has started doing some of that, which is, I find it to be a little bit problematic because they've partnered with Big Pharma, but that concept in general shows a lot of promise. And then a lot of people have used it to find their adopted, either their distant relatives or their adopted parents, their children that they gave up for adoption. And it's frankly changed a lot of people's lives. You know, I wrote an article about 23andMe and a
Starting point is 00:14:36 lot of people said, you know, I would make this trade where hackers steal my data every day of the week because I was able to find my adopted parent. And that's very powerful. At the same time, I will also say that 23andMe, it's not great that they got hacked, or rather that they sort of allowed this hack to occur because there were steps that 23andMe could have taken to prevent credential stuffing at this scale. Things like requiring two-factor authentication, things like rate-limiting logins where you're not able to steal this many email addresses without some sort of automated system
Starting point is 00:15:15 that's spamming passwords into the system. That said, they have relatively good privacy practices where they do encrypt the genome that they ultimately sequence. And so no genetic data was stolen in this case. They've also pushed back against law enforcement requests for data about their customers, which I think is really important. That said, I'm 35. My DNA has been the same since the day I was born, and it's going to be the same on the day that I die. I don't know what 23andMe is going to look like in five years, in 10 years, in 20 years.
Starting point is 00:15:50 And they're changing their policies all the time. They changed their privacy policy and their terms of service this week as a response to the hack. And so there are very real concerns, I think, with giving your genetic data to a company like 23andMe. And I should just point out that 23andMe is now requiring two-factor authentication after this breach. Now, and I'm asking this in no way making a comparison with 23andMe, but you've actually written about a fairly disconcerting example of just what can happen with genetic databases. And I'm thinking here about a database called GEDmatch. So can you take me through that? Yeah, I find this really disconcerting and really compelling, actually. So GEDmatch is a genetic database that is open source or is founded to be open source,
Starting point is 00:16:38 meaning it was public. Like, people can go in there and sort of grab genetic data sort of willy-nilly. It was founded out of a Florida basement. It was kind of like a DIY operation. There were two people running it. It was all volunteers. And when it launched in 2010, it pitched itself as being for, quote, amateur and professional researchers and genealogists. Its privacy policy said, we will never sell your information.
Starting point is 00:17:08 Well, eight years later, it helped cops crack the Golden State Killer serial killer case, which is huge news. Tonight, hiding in plain sight. Police say one of the most elusive serial killers in American history has been captured outside his suburban home.
Starting point is 00:17:28 But DeAngelo, not a suspect until days ago when they got a break. They say cutting edge DNA testing allowed them to make a match. We were able to get some discarded DNA and we were able to confirm what we thought we already knew. And this was particularly interesting because the killer himself had not uploaded his DNA into GEDmatch, but of been leading the way there. There's been cases where they have broken their own terms of service to help police in certain cases. And the use of this sort of genetic forensics is very common now. I think my argument is, once you put your DNA into a database like this, you are not able to easily get it out. Your DNA does not change, and you can also be sort of implicating any of your relatives, your descendants, etc. Which, I guess, brings me to what happened with GEDmatch next, is that it was sold in 2019 to a for-profit
Starting point is 00:18:46 company called Verogen that works very closely with the FBI and a lot of other police departments throughout North America. And then Verogen itself was sold last year to a Dutch conglomerate called Qiagen, which is sort of a larger conglomerate that has forensics, healthcare, biotech, and pharma arms. So in the last four years, there have been three different owners of this company. Let's say a service did allow your genetics to get leaked, your raw genetics are sold off. What's your fear about how it could be used in the future? And I guess I'm thinking here about something
Starting point is 00:19:35 that the Edmonton Police did here in Canada. The Edmonton Police Service is admitting the use of a new-to-Edmonton technology to identify a sexual assault suspect was a mistake. The result was a generic photo of a new-to-Edmonton technology to identify a sexual assault suspect was a mistake. The result was a generic photo of a suspect that was criticized as stereotypical. The photo, which City News has blurred, was created for Edmonton and shows a black male face whose features were constructed by phenotyping. So there's this company called Parabon, and they do this thing called DNA portraits.
Starting point is 00:20:05 And they work very closely with a variety of different law enforcement agencies. And what they do is they take someone's genome and they generate what they believe to be an image of them. So using only their DNA sequence, they then generate an image. And that image is used for police sketches, essentially like a high-tech version of a police sketch. The big issue with this is that there's not great research that shows that someone's DNA correlates very closely with their modern-day appearance. I mean, you know, some things are sort of genetic, like, genes express themselves in a variety of different ways. Yeah, there's a huge range of possible expressions, right? That's the issue.
Starting point is 00:20:48 Of course, yes. And the concern here is they're going to generate images that look like, you know, someone who is innocent and that is going to be used, you know, they'll be arrested for a crime that they didn't commit. I think that there are other potential dystopian uses of someone's raw genetic code. I think that we do need to take sort of a long picture approach here because, like I said, people's DNA doesn't change and it's also very similar to your relatives and your descendants. So right now there are pretty strong protections, at least in the United States, and I believe in Canada as well, that prevent health insurance companies from sort of like raising someone's premiums because they have access to their genetic data. However, it's very, very easy to imagine that regulation changing at some point and health insurance companies
Starting point is 00:21:42 raising someone's premium because they are more likely to have cancer, more likely to have some other chronic illness. Jason, thank you so much. It's been great talking to you. Yeah. And thank you so much for having me. Before we go today, we need to make a correction to an error we made in an episode earlier this week. In our story about the U.S. Supreme Court case involving Purdue Pharma and the settlement for its role in the opioid crisis, we made a reference to the Netflix movie Pain Hustlers, saying Purdue was depicted in that film. Pain Hustlers was inspired by a different company that sold an opioid-based pain medication.
Starting point is 00:22:24 We regret the error. That's all for today. Front Burner was produced this week by Rafferty Baker, Shannon Higgins, Joyita Sengupta, Lauren Donnelly, and Derek Vanderwyk. Sound design was by Mackenzie Cameron, Sam McNulty, and Will Yar. Music is by Joseph Shabason. Our senior producer is Elaine Chau. Our executive producer is Nick McCabe-Lokos.
Starting point is 00:22:50 And I'm Damon Fairless. Thanks for listening. Front Burner will be back on Monday. For more CBC Podcasts, go to cbc.ca slash podcasts.
