The Current - Why did AI chatbot DeepSeek stun the tech world?
Episode Date: January 29, 2025
DeepSeek, an AI chatbot from China, rattled the stock market this week when its sudden rise caught the tech industry off guard. Tech reporter Rashi Shrivastava explains what makes it different — and... why she’s been hesitant to test it out herself.
Transcript
When a body is discovered 10 miles out to sea, it sparks a mind-blowing police investigation.
There's a man living in this address in the name of a deceased.
He's one of the most wanted men in the world.
This isn't really happening.
Officers are finding large sums of money.
It's a tale of murder, skullduggery and international intrigue.
So who really is he?
I'm Sam Mullins and this is Sea of Lies from CBC's Uncovered, available now.
This is a CBC Podcast.
Hello, I'm Matt Galloway and this is The Current Podcast.
This DeepSeek announcement is a wake-up call, and I would expect a bunch more surprises
to come in an industry as volatile and innovative as AI. A wake-up call for the industry and beyond. That's Matt Calkins of the US AI firm Appian,
one of the tech CEOs reacting to this week's big news about DeepSeek, an AI
company from China. DeepSeek claims that its artificial intelligence model can perform
as well as industry leaders like ChatGPT but was built at a fraction of the cost.
The shockwaves were felt well beyond the tech sector, including throughout the stock market,
and it is upending conventional wisdom about who is leading the way in artificial intelligence
and what that future might look like. Rashi Shrivastava is a tech reporter with Forbes
who has been looking into this and is here to explain what's going on. Rashi, good morning.
Good morning.
Thank you so much for having me.
Thanks for being here.
So people know about ChatGPT.
They might know about the artificial intelligence chatbots on Meta's products.
You see it on Instagram now, perhaps through Google.
Tell me about DeepSeek and what it can do compared to those AI models.
Yeah, absolutely.
So DeepSeek is essentially a small Chinese lab
based in Hangzhou.
It spun out of High-Flyer,
which was a quantitative hedge fund
that used AI to make trades.
They've launched two models so far
that have caused a lot of excitement in this space.
Just last week, on Monday, they launched their R1 model, which is a model that's able to
do higher-performance tasks like reasoning, coding, math and science problems, things
like that.
And they've also launched V3, which is a previous version of that model and is a language model.
I'd said in the introduction that it was comparable to, or to some people's eyes perhaps better than, what
we've seen in those other models, but was built at a fraction of the cost of things like ChatGPT.
How do they manage to build something that apparently is so much more efficient?
Yeah, absolutely. So the cost that they've estimated is about $5.6 million. And what they've done is
essentially built on top of the open-source models that are already out there, like Llama.
And they've used a technique called distillation, where they use the outputs from
an existing AI model and then use a technique called reinforcement learning to make that
better and achieve results that are just as good as some of the closed-source models that
are locked away behind this $200-a-month price that OpenAI has.
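To make the distillation idea concrete, here is a rough, hypothetical sketch in Python of the general technique described here: a smaller "student" model is trained to match the outputs of an existing "teacher" model. The function and variable names (student, teacher, batch) are illustrative placeholders, not DeepSeek's actual code.

    # Hypothetical illustration of knowledge distillation in PyTorch;
    # not DeepSeek's actual training pipeline.
    import torch
    import torch.nn.functional as F

    def distillation_step(student, teacher, batch, optimizer, temperature=2.0):
        # The existing "teacher" model's outputs serve as the training target.
        with torch.no_grad():
            teacher_logits = teacher(batch)
        student_logits = student(batch)
        # Train the student to match the teacher's softened output distribution.
        loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()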
So tell me more, explain what that means.
I mean, that they're building on existing technology or building on existing frames
of knowledge?
Yeah, absolutely.
So take the Llama model, which is what they've claimed
that they've built on top of.
They've used the outputs from that model
and they've used reinforcement learning,
which is essentially a way to give positive feedback
to a model when it gets the answer right
and negative feedback when it gets the answer wrong.
They've used pure reinforcement learning without humans in the loop, which is what we've
been hearing from some of the American AI labs, and that's how they've been able
to get the six-million-dollar price tag. It's also important to take this
number with a grain of salt, because that's what they've claimed:
that they've used only that much compute
and they've been able to get more out of existing compute.
But I think the matter is a bit more nuanced than that, in that this number does not include
the previous training that they're building on top of.
And just today we're getting details that OpenAI alleges they may have trained on some of its
outputs as well, which are proprietary,
and that this violates its terms of service.
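As a hedged, toy-level sketch of what "reinforcement learning without humans in the loop" can look like: the reward comes from an automatic check of whether the model's answer is verifiably correct, and a REINFORCE-style loss pushes up the probability of rewarded answers. The names automated_reward and reinforce_loss are hypothetical; this is an illustration of the general idea, not DeepSeek's actual method.

    # Toy, hypothetical sketch of RL with an automated rule-based reward;
    # not DeepSeek's actual pipeline.
    import torch

    def automated_reward(model_answer: str, reference_answer: str) -> float:
        # No human rater: +1 if the generated answer matches the known-correct
        # one (e.g. a math solution that can be checked), -1 otherwise.
        return 1.0 if model_answer.strip() == reference_answer.strip() else -1.0

    def reinforce_loss(log_prob_of_answer: torch.Tensor, reward: float) -> torch.Tensor:
        # REINFORCE-style objective: increase the probability of answers that
        # earned a positive reward, decrease it for penalised ones.
        return -reward * log_prob_of_answer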
So this thing, I mean the most recent incarnation
of it, appears and billions of dollars in the value
of tech stocks evaporate on Monday.
People are rattled by this. You have Marc Andreessen, who is a tech venture capitalist
and was, you know, an early investor in many, many companies,
who said this is AI's Sputnik moment.
Why are people so rattled by this?
Yeah, it definitely caused a major reaction
in the stock market as well as in the AI world and beyond,
because it just came out of nowhere.
No one really knew about this company, and all of a sudden
they're able to launch this model that's just as good as OpenAI's model.
How is that possible?
How is it that people would be surprised by something like this?
Yeah, because they hadn't anticipated this.
It's funny; I think within the labs themselves,
they're scrambling to figure out: why hadn't we thought
about this? How can we make our models better than this? And all these big labs, they've
raised billions in funding. And now to hear that you're able to just do that at such
a fraction of the cost, it's going to cause panic and frenzy in the AI world.
And I think that's what we're seeing.
That's what we saw.
But it's also important to note that the next day, that was
yesterday, Nvidia got back almost 40% of how much it
had lost.
So we are seeing it bounce back now as people are realizing
that the cheaper these models get, the more competition there will be. And there's Jevons paradox,
which is something that people talk about:
the cheaper these models are, the more we can use them
and the more people are going to build on top of them.
You mentioned Nvidia.
I mean, they're the big chip maker that everybody relies on.
This is an incredibly powerful and incredibly wealthy company,
but now they lost a lot of money as well,
at least in terms of value.
Does something like this undercut the idea
that these AI firms need a huge investment of cash
to be able to make progress?
That's what everybody's thinking,
that you have to raise an extraordinary amount of money
to be able to get to that next level.
But are you suggesting that perhaps this undercuts that?
Yeah, absolutely.
I think a lot of AI companies are going to be
rethinking their strategies and we're going to see
more players in the space as well.
And like you said, it is an impressive feat,
because they used something that was, I think,
in plain sight, but people hadn't really used
that technique before.
And Sam Altman, who's
the OpenAI CEO, said it was impressive. It's gotten a lot of accolades from leaders,
American leaders as well, in the AI space. And so we're going to start seeing this question
being raised: do we really need to spend all this money just to train the models themselves?
There's a separate conversation about running
the models, which is expensive for sure.
And this is just, we're just talking about
the training costs here.
Does it reset the competition between these companies?
I mean, they've been fighting against each other.
Is it now, you know, the United States
versus China, for example?
Yeah, that has always been a huge narrative, right? Right from the beginning.
President Donald Trump has said that this is a wake-up call
for Silicon Valley. And Alexandr Wang, the CEO of Scale AI, which is a billion-dollar company
that does training for AI models,
said that this is an AI war
and this is an earth-shattering model,
and it really means we need to push ahead
in the race to build the leading AI model.
We have to let you go, but I mean,
one of the interesting things is people
have been downloading this and kind of plugging in phrases
to see what it does or doesn't do.
People were perhaps surprised, or not, to
learn that if you asked it about Tiananmen
Square and the massacre at Tiananmen Square,
the super-powerful AI model would say, well, I
can't actually answer anything about that.
Yeah.
What should that tell us, then? If China is a big
player in the AI race, should we be concerned
about national security, about
censorship, but also just about what it does and does not want to talk about?
Yeah, of course.
That is definitely a big concern.
This is, at the end of the day, a Chinese model.
It's censored.
It doesn't talk about things like you mentioned.
And there are concerns about, if you use it, where is that data going to be stored?
DeepSeek has said it's going to be stored on servers that are in China.
There are some companies that are now trying to come out and say,
okay, you can use it on our platform and we'll store your data in the US.
But still, definitely there are a lot of concerns around data privacy,
around censorship, all that stuff.
Have you used it?
Have I used it? Have I used it?
I have not used it, but I've spoken to a lot of folks
who have used it.
And I guess I'm just a little concerned
about my data as well.
As a tech reporter, you would know these things better
than the average person, I suppose.
Yeah, I guess that's maybe why I've not used it yet.
Rashi, thank you very much for this.
Of course, thank you so much.
I appreciate it.
Rashi Shrivastava is a tech reporter.
She is with Forbes.
For more CBC podcasts, go to cbc.ca slash podcasts.