The a16z Show - The Race for AI—Search, National Infrastructure, & On-Device AI
Episode Date: December 25, 2024

The AI race is on, and 2025 could be its most transformative year yet.

In this episode, a16z General Partners Anjney Midha and Jennifer Li, and Partner Alex Immerman dive into the trends reshaping AI and its impact on search, infrastructure, and devices.

We explore:
How AI-native tools like ChatGPT and Perplexity are challenging Google’s search dominance.
The rise of infrastructure independence as governments prioritize compute, energy, and data.
The future of smaller, on-device AI models driving privacy, performance, and new consumer applications.

With insights from a16z’s Growth and Infrastructure teams, this episode unpacks the forces driving AI innovation, and the opportunities founders and nations could seize to lead in the next wave of technology.

Stay tuned for more in this four-part series, and explore the full 50 Big Ideas for 2025 at a16z.com/bigideas.

Resources:
Find Alex on X: https://x.com/aleximm
Find Anjney on X: https://x.com/AnjneyMidha
Find Jennifer on X: https://x.com/JenniferHli

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
In 2024 alone, I think there were more than 700 pieces of state-level legislation that were AI-specific.
The average query on Perplexity is 10 to 11 words.
The average search on Google is two to three keywords.
You've had an enormous amount of Nvidia's purchasing orders come from the balance sheet of governments.
You can actually reimagine the AR experience.
If you're a founder like that who has the guts, then your impact on humanity ends up being quite generational.
Here we are again,
inching even closer to the end of 2024.
And as we near 2025,
here are a few dates to give you some perspective.
We've had 24 incredible years of Wikipedia,
18 years of the iPhone,
and 16 years since the Bitcoin white paper release.
So as we look to 2025,
and the speed of innovation is only increasing,
we continue our coverage of a16z's big ideas,
together with the dozens of partners
who are meeting daily with the people building our future.
Last year, we predicted:
A new age of maritime exploration.
Programming medicine's final frontier.
AI-first games that never end.
Democratizing miracle drugs.
On deck this year, closing the hardware software chasm.
Game tech powers tomorrow's businesses.
Super staffing for healthcare.
And throughout our four-part series,
you'll hear from all over a16z,
including American Dynamism, Healthcare, FinTech, games, and more.
However, if you'd like to see the full
list of 50 big ideas, head on over to a16z.com/bigideas. And of course, if you missed it,
check out part one, all about the intersection of hardware and software. As a reminder, the content
here is for informational purposes only, should not be taken as legal, business, tax, or investment
advice, or be used to evaluate any investment or security and is not directed at any investors or
potential investors in any a16z fund. Please note that a16z and its affiliates may also maintain
investments in the companies discussed in this podcast. For more details, including a link to our investments,
please see a16z.com/disclosures. Today in part two, we'll be talking about the topic of the day,
the month and quite frankly the year, artificial intelligence. There's certainly AI, AI, AI,
and the race is on, whether it's across companies like Google and the disruptors chasing away
10 blue links, or sovereign countries trying to capitalize on the next frontier, or even
device companies, figuring out their role as AI meets the edge. That is all on deck today.
There's a bit of an innovator's dilemma here, and I'm excited to watch it play out. That was
Alex Immerman, and I'm a partner here on the growth fund. Here's his big idea. The search
monopoly ends in 2025. Google controls 90% of U.S. search, but its grip is slipping. Its recent
U.S. antitrust ruling encourages Apple and other phone manufacturers to empower alternative search
providers. More than just legal pressure, gen AI is coming for search. ChatGPT has 250 million
weekly active users. Answer engine Perplexity is gaining share, growing 25% month-on-month,
and changing the form of search engagement. Its queries average 10 words,
three times longer than traditional search,
and nearly half lead to follow-up questions.
Claude, Grok, Meta AI, Poe, and other chatbots
are also carving off portions of search.
60% of U.S. consumers used a chatbot
to research or decide on a purchase in the last 30 days.
For deep work, professionals are leveraging domain-specific providers,
like Causaly, Consensus, Harvey, and Hebbia.
Ads and links historically aligned with Google's mission:
organize the world's information and make it universally accessible and useful.
But Google has become so cluttered and gamed that users need to dig through the results.
Users want answers and depth.
Google itself can offer its own AI results, but at the cost of short-term profits.
Google, as a verb, is under siege. The race is on for its replacement.
Maybe start by setting the stage. So how big is the search market today?
The search market is enormous. Anyone listening to this uses search. Virtually anyone
with the internet uses search. But to put some numbers around it, Google that I just mentioned,
they're the biggest game in town. They're approaching 200 billion of revenue annually.
They're still growing double digits, highly profitable. Microsoft
Bing, which has been the number two player for a long time, has mid-single-digit market share, so pretty small.
They have 12 billion of revenue. This is a massive, massive market, one of the largest out there.
And it does feel like there are forces reshaping this industry, so tell me about those.
So there's certainly AI, AI, AI, AI, but before we jump into that, we can level set with
the legal pressure that's mounting on Google. So earlier this year, Google was declared a monopoly.
The court ruled that their paying billions of dollars to phone manufacturers (in the case of
Apple, tens of billions of dollars) is monopolistic. It's anti-competitive, and it's preventing
their competitors from gaining share in the marketplace. They are the default search engine for all
these phone manufacturers, and it basically makes it impossible for any of the others to gain share.
So the exciting technological change, of course, is around AI. The
gen AI search providers are fantastic. And the market has been dominated by Google for close to
25 years at this point. And as a monopoly, they're not really innovating. They have no incentive
to until now. If you think about the Google experience as it is today, it's just a long
list of links. And the first few are sponsored. They're ads. And I as a user, when I make a query
on Google, I then need to make the decision. I have to sift through the information on that
site and find the answer that I'm looking for. That's actually a pretty long process. Instead,
with gen AI, I can just get the answer. So on ChatGPT, Perplexity, Claude, or when I'm
chatting with my AI friends on Character AI or on Poe, I get an answer immediately. And that's just
a much, much better experience. And I hear tons of people saying they're trying these new tools,
but maybe you can give us a sense,
is this shift really stark?
Are people really moving over?
People are definitely moving over,
and I'd say there's four main reasons.
One is the one we just talked about,
the shift from links to answers.
A second one would be how they are personalized,
how the answers feel interactive,
how they are conversational.
The average query on one of these new services
has a follow-up question.
So it's not just about that
initial engagement. It's about the ongoing conversation. The third difference is about how these
new services engage with complex queries. So the average query on Perplexity is 10 to 11 words.
The average search on Google is two to three keywords. As you can imagine, their ability to leverage
that information and get you what you want is much higher. On Google, you're going to get a list
of links. One of those links might have one side of the debate. Another link
may have the other side. You may never make it to that second side. On Perplexity, on
ChatGPT, they're going to synthesize both sides. They're going to give me both perspectives.
And then the fourth, they're just not cluttered with ads. That may change over time. But putting
it all together, it should be no surprise that these AI-native services are gaining share. 10%
of consumers now refer to ChatGPT as their search engine of choice. Perplexity
queries are growing 25 to 30% month over month. And I saw a survey last week that 60% of consumers
for purchase decisions in the last 30 days used a chatbot. All of these services are growing
massively. So if we look at some of the tools that exist today that are coming for that
market share, you take a Perplexity or you take a ChatGPT, those are also broad-based search engines
or chatbots. But there are these other players you mentioned, Consensus or Harvey,
for example, that are verticalized.
Does it surprise you that the last wave of search engines weren't verticalized?
It does not surprise me that the last wave or the next wave
will result in a winner-take-most dynamic.
It's a pretty monopolistic market.
Search is very much a distribution game.
Google has an incredible brand.
They have incredible direct traffic.
They are the default search engine on most browsers.
They are the Kleenex, the Band-Aid of online
search. It's important to note that search engines definitely benefit from network effects.
More users coming to Google provides more data around preferences for Google. That means the
next search has more relevance and brings more users. It also means more users means more
advertisers, more profit dollars that you can invest back in the service or into ensuring
that Apple puts them as the default provider. Absolutely. And so as we
think about this next AI wave, it sounds like you think maybe a similar dynamic might be at play,
or how do you think about maybe these smaller players actually finding a wedge or differentiating?
One framework for thinking about how search could fragment or verticalize is just looking at
why does vertical software beat horizontal software in some cases. And it's typically the vertical
requires a specific user interface, proprietary data,
features, workflows, compliance, etc.
And that's why Veeva in life sciences and pharma can beat Salesforce in horizontal CRM.
For the average query on ChatGPT, Perplexity, or Google, they're pretty good.
You don't need a vertical-specific application.
But for deep domain research, I can imagine a world in which standalone apps thrive.
In the case of our portfolio company, Hebbia, they have
a unique interface. It looks like a spreadsheet, which is native to financial services and their
customers. It brings in public filings, earnings transcripts, but also private data. It can bring
in research. It can bring in survey results. You can query. And then the output can also be
specific to that industry. Not just look like a spreadsheet, but populate a meeting agenda.
so you can go in immediately prepared.
And so when I think about vertical search
or vertical apps,
it's not just about the search,
it's about everything around the search.
That's a good distinction.
And as we think about how this maybe continues,
a progression of this industry,
if we go back to the last wave of search,
we did see a bunch of search engines,
get traction to start,
maybe AltaVista or Ask Jeeves.
We all pick on Ask Jeeves.
You know, all the Gen Zers
will not know what the hell any of that is,
but they had some traction
and then Google exploded and took over.
Do you expect that same kind of consolidation, or is this time different?
I think consolidation should be expected in general purpose search because of distribution,
because of network effects.
Google won the last time around, in part because of their PageRank algorithm.
It produced superior results.
It had a minimalist UI.
It was really simple.
Ask Jeeves, AltaVista, they had tons of ads.
It was cluttered, links on the homepage, no one wanted to use it.
The irony is that today, what we're all complaining about with Google
is that their pages are cluttered with ads on the results,
and that creates the opportunity for these new search engines.
I would expect consolidation again.
And I think something else that's interesting is what you're pointing at
with some of these verticalized solutions is that it's not just for the everyday consumer,
but also for the lawyer or for the academic researcher.
Do you think that is also part of the fragmentation? Do you expect that to continue?
Search historically has been thought of as a consumer product, and for good reason.
I make a lot of searches that are related to my personal life, but I use, as do all the
professionals you named, search at work. I probably make more Google searches at work
than I do around personal matters. But there is a category of enterprise search in a market
that has existed. And you can think of that as querying Box, Dropbox, Salesforce, all in one.
But I think those two worlds are going to blend together. Consumer search should not just be limited
to what's on the web. And enterprise search should not be limited to just proprietary data.
It should have both. And a lot of these AI-native services are working on that.
Yeah, that's a great point. And as we think about those two models maybe blending together,
we think of consumer search for sure. It's always been
an ad-based model, or at least today. Google is this massive economic engine, but no one's
paying a subscription for that. The new entrants do seem to have a more subscription-based model.
Is this a temporary thing, or how do you see those two dynamics playing?
So the subscriptions could stick around, but I think this is going to continue to be a
digital advertising-focused market. And I think digital advertising is going to grow because
of these AI services. So these AI services today, they don't have ads. I imagine that's going to
change. It's been in the news that Perplexity is already talking to advertisers. They have a high
income, highly educated user base that should be attractive to these advertisers. But as we discussed
earlier, the queries on these services, they're longer, they're more complex, they're more detailed,
they're more personalized. And because of that, there's greater intent, which should be more
helpful to advertisers in producing the best results on their end and creating an even larger market.
But in the meantime, these subscription business models, they make a lot of sense.
They bootstrap the business.
They cover costs.
And from a personal perspective, I'm very happy to pay $20 a month for ChatGPT,
Perplexity, and Poe.
I mean, they provide so much value.
A lot more than $20 a month.
Yes, yes.
So obviously this was your 2025 big idea.
And I know this is going to be a many year, maybe even decade-long progression.
But what are you paying attention to in the next year?
What opportunities are still on deck?
As I look at 2025, Google may be on the decline, but I don't expect them to go down without a fight.
Google and Meta, I think they can be big players.
Google AI Overviews already has a billion monthly active users.
Meta's not far behind.
They have 500 million monthly active users.
Again, this shows the power of distribution for search.
And maybe to that end, Apple, if you could call them a dark horse, is a dark horse.
They are not investing the CAPEX like Meta and Google and Microsoft and Amazon,
but they control a central node to the consumer.
And if they wanted to build a search application,
the next day, they could have a billion users.
We just heard how quintessential winning AI is to multi-trillion dollar companies like Google and Meta.
But what about nation states?
In the race for AI dominance, compute has become critical national infrastructure,
but not every country is equipped to compete in that race.
That was Anjney Midha.
I'm a general partner here at a16z
where I focus on AI infrastructure.
And here's his big idea.
My big idea is infrastructure independence.
And it's the idea that a lot of countries and regions
are starting to realize that modern AI,
deep learning-based AI, generative models,
are a form of what have been called general-purpose technologies.
In the history of humanity,
we've only had maybe 20 or 22 or so general purpose technologies.
And these are usually types of technologies like electricity, the printing press,
that have very broad-based applications in society.
They end up being largely horizontal economic multipliers and progress multipliers
across a whole set of pillars and domains in society.
There are usually two moments in the adoption of a general-purpose technology
where first countries, nation-states, start asking,
are we going to welcome this technology
or are we going to be hostile
to its development?
And that's the first step
that becomes pretty important
in a country's progression
or in a nation state's progression
or region's progression is,
do we want to adopt it?
Do we allow this in?
Regardless of whether we own it.
Do we embrace it or not?
And then the second is,
do we build or buy?
Right.
Which is, can we trust
somebody else to provide it for us?
We're well past the stage of
do we embrace it or not?
We're already well into
billions of people
around the world now, having already embraced it. So the governments don't really have a
choice, so to speak. So in a sense, AI has already percolated throughout society at one of the
fastest diffusion rates of any general purpose technology, because it's piggybacked off of years and
years of digital infrastructure. And so now the question everybody's asking is, do we build or buy?
Yeah. It's the single largest probably purchasing decision that's going to happen in the next 24
months: do nation states start buying it? Do they build or buy?
Yeah. I love the parallel of companies because there are many companies that do choose to build, but also companies that choose to rent or buy. And so as you think about that, there are large nations around the world like the United States, which are clearly building. But talk about the argument for smaller nations or the 190 plus that should be thinking about buying.
The good news here is that we've got hundreds of years of human history to look at for clues about what happens next.
If you're a small country and you were in the early 1900s, you were watching the modern electrification of the developed world, the United States or Europe.
If you chart what happened with many of those countries, many of them decided to actually enter into what were called joint venture agreements.
It starts with a joint venture with a country that's at the frontier.
Right.
These countries at the frontier are what I call hypercenters.
These are countries that have the ability to develop, train, build, and host their own frontier models.
I call them hypercenters mostly as an homage to the word hyperscaler,
which is that there have been a handful of companies that have had the compute and the talent
to actually build frontier AI.
And now I think what we're seeing is a shift from just those companies driving a bunch of
frontier AI to countries and regions driving it.
And so if you're a small country and you're going, well, we certainly believe that
it's important to have our own AI infrastructure.
We want to be independent.
But we don't have all the compute required to train these
models or we don't have all the talent locally. Then what you enter into is a joint venture
with a country or an overseas partner that matches your values. And this is the really important
thing about AI and AI models and how they're different from infrastructure like electricity
is there's a fundamental encoding of human values in AI models because they're trained on data.
Yeah. And the data has these local norms and cultural values embedded. And so if you happen to train a model
on a bunch of internet data collected in the US,
the models are just generally...
American.
American. They're encoded with that.
Yeah.
And if you train the models on data in France,
they actually subtly have a bunch of different values encoded in the models
that reflect those cultural norms.
And so I think step number one, if you're a small country,
is actually being a little bit crystal clear about which value systems
you align with most out of the hypercenters.
Now, it's not lost on people that
the way the internet worked out was there essentially ended up being two internets, right?
The Chinese internet and the rest of the world.
AI may not end up looking that different.
And if you're a small country, what you really have to figure out is whose values align more with yours.
A good historical precedent to look at here is the technology of money.
Money is a pretty general purpose technology.
And what happened in the early 1900s with the modernization of finance is a number of countries started to ask the same question:
Do we build or buy our own currency? Do we rely on the dollar? Or do we have our own currency? And that led to the modern-day currency regime where the dollar is a single global reserve currency. And that happened through a bunch of allied cooperation, where a number of countries realized they did not have the local resources required to hold the peg of gold, right, to the dollar. And so I think what we're going to end up seeing is the emergence of something very similar to what happened with currency flows, where you have
a couple of large countries that control their own sovereign currencies.
You have the U.S., you have China, you have India,
and then you have a number of smaller countries that decided they wanted to be flow points.
So you have Singapore and Ireland, and you have Luxembourg and Zurich
that become massive global leaders in modern finance
because they decide they want to ally with one of those power centers.
Yeah.
Right?
So if you think about it in the AI world, let's call regions at the frontier hypercenters.
And then we have compute deserts, and these are places that have literally no installed base of compute capacity to even be relevant.
All the smaller folks have to figure out which of the hypercenters they want to align with.
And how do you become a modern-day Singapore, Ireland, Luxembourg, etc., for the world of AI infrastructure.
And so it starts with deciding whether you want to be a compute desert or not.
And if you're not, and you're going to actually embrace AI infrastructure as a government, then I think you've got to figure out which hypercenter you want to align with most.
Right.
And then it becomes actually quite easy to reason about how to be a valuable ally.
That's such a good parallel because a lot of people think about resources in terms of the farmland that you have, the people who are working in that economy.
But what you're pointing out is that countries for a long time have offered value or offered a resource in other ways.
And as we think about AI, there's a few things that you've pointed out that countries can invest in, whether it's the compute capacity that they have, the energy resources to power AI and forward-thinking policy.
So maybe we can break down each of those.
How do you think about each of those blocks and how countries should be maybe maneuvering or investing in those things?
The good news is that there's only three or four ingredients here that really matter.
The first is compute, which we've talked about.
The second is abundant and low-cost energy, which powers the data centers.
The third is data, just the availability of really high-quality tokens for these models to learn on.
And the fourth is regulation.
Yeah.
So that's the good news.
Now, the bad news is that the world is pretty unevenly split up, right?
Some countries just have dramatically more compute than others.
Yeah.
Others have dramatically more energy than others because of their natural reserves.
Yeah.
So if you are in the Middle East, you may not have massive data centers yet,
but what you do have is vast reserves of oil.
And how you translate that into becoming a hypercenter is quite simple.
It's the law of comparative advantage, right?
You've got energy.
You should use that to attract the world's best teams and companies and
foundation model labs and so on, by trading what you have with what they have. And so I'm
quite bullish on allied ties between countries that recognize what their strengths are and then
partner with other countries to fill that gap. And by countries, I mean private companies,
too, from other countries. One of the things we may end up seeing in the coming years is jointly
trained models between countries. Basically, I think for most countries, it's impossible to have
total infrastructure independence at all parts of the stack.
Sure.
What is much more feasible is to be great at one part of the stack and then collaborate
with another sovereign or another country or region to achieve joint independence from
a value system that you don't subscribe to, like the CCP.
And so I actually think what's more important is for countries, regions, and frankly,
some of the world's largest companies that operate at nation scale, to assess which
parts of the stack are critical to them that they must have independence in. And the answer there
is a function of what assets they already have, right? It's their strengths. And then if they've got a
critical gap to fill, it's to go and buy that. Now, in the long term, you might be able to build
things out. But with infrastructure, especially of this kind, it can often take years, if not a decade.
So as an example, lower down in the stack from the model layer, you have the chip layer.
And even below that, you have the lithography layer, right? There's a company in Holland called
ASML that builds literally the world's most important machines. How many machines do they make
per year? It's some very small number. Each machine costs about $200 million. Yeah. And I think they did
$23 billion in revenue this year. 40% of which came, by the way, from China, because China was
stockpiling ASML machines before a bunch of export restrictions kicked in. Yeah. And they're the only
company that can actually do EUV lithography at this precision.
Now, is it feasible for the U.S. to say we're going to build our own ASML like tomorrow?
No. I mean, it's just going to take 10 plus years, right?
EUV lithography just takes a really long time.
On the other hand, is it feasible for a smaller country to say,
we're going to train our own local models at the frontier?
That's a little bit easier to do over a time scale of quarters.
If you've got a leading research team.
If, and that's a big if, right?
There's only a handful, really, of research teams globally that are capable of this.
And so to answer your question, yes, I don't think sovereign AI or infrastructure independence means
you have 100% ownership over every part of the stack. That's infeasible over the short term.
It means that you don't rely on somebody for a critical part that you don't trust.
Right. Can we talk about private companies for a second? Because you've brought them up a few times.
How do you think about that dynamic where as a nation state you're saying, okay, you need this sovereignty?
Right. But at the same time, can you rely on that sovereignty through the companies that exist within your nation?
Just using America as an example.
Right.
Does the government really need to be involved or can they just let Anthropic or OpenAI kind of command that part of the stack?
Or how do you think about the difference between government versus private enterprise?
The line is pretty stark in a few countries and is more blurry in others.
So in China, the line is very clear.
There's a law called the PRC 2017 National Intelligence Law that says Chinese individuals and entities are required to support PRC
national intelligence work by law, which means if there's any technology that a PRC company has access to,
they are automatically obliged to make that available to the government. And that's not the case in the
United States. There are some covered types of technology, like dual-use technology, like classified
defense technology, where if you are developing it, particularly if you're funded under a defense
program, then you're required to make that available to the government because the government's
paying for the development of that technology. But by and large, the private
sector in the United States and most other allied countries is by default protected from having to make
its technology available to the government. That's not the case under the CCP. So I think the question
for most countries becomes: where on that spectrum do you want to exist? There's a general framework
through which most infrastructure is categorized. Every country approaches it slightly differently,
but the G5, the Five Eyes, the US, Canada, UK, Australia, and New Zealand, we generally have a joint
approach or framework to categorizing this infrastructure. And by and large, AI models have not
been categorized as being dual use or protected under national security. The short answer is
the history of technology has largely shown that if you'd like to win, then unlocking the best
talents of a country with as few bureaucratic slowdowns as possible usually ends up winning. Well, if we think
about wanting to keep America at the frontier. And we think about the different layers or ingredients
that we talked about earlier, are there any high-risk areas that we think, or that you think,
we're falling behind? I think we go back to the four ingredients we talked about earlier, of the frontier
of AI, which are compute, data, energy, and laws. Now, on the compute front, I think the private market in the
United States is doing a pretty good job. It's pretty responsive to market demand. And I think
there's no coincidence that the largest infrastructure businesses in the United States are chip companies
and computing companies. Because I think we've generally done a pretty good job of letting the market
feed that demand. I think on the data side, things are extraordinarily tough. Because, one,
the Biden executive order last year was a starting gun that said, oh, AI is important. Please
do something about it. And left it to the states to figure it out. And the states have all taken a complete patchwork of approaches to data regulation.
In 2024 alone, I think there were more than 700 pieces of state level legislation that were AI-specific.
And a bunch of those laws, if you look at it, are really well-intentioned, but atrociously
implemented ideas for data regulation. Right. And impossible to adhere to. Basically impossible to
adhere to. And so I think one area where we're just handicapping ourselves is that there's no unified
framework in the United States at the federal level yet for data, especially around training.
And I think we needed that yesterday. Overseas, in a number of countries where rule of law,
especially on copyright and IP and so on, is just less stringent, those labs are happy to just
race ahead, whereas our companies here are trying to figure out what they should even comply
with. And that hurts you more than a laissez-faire approach actually would.
Right. I think our companies would be totally fine. The best founders at the frontier
would be fine in the United States being compliant.
They just want to be told what to comply with.
Not across 50 different states with different regulations that are changing and unclear,
and in some cases, impossible.
Right.
And there's also the fundamental scientific problem that there are just very real data walls
that these models run into.
And I do think one of the things that hurts frontier research in the United States
and allied countries is a lack of government support in collaborating across borders
to make more data available to allied regions.
So that's number two.
On energy, I think we've obviously hamstrung ourselves in the United States with nuclear.
France's embrace of nuclear 20 years ago, for example, has positioned them to have extraordinarily efficient data centers today.
Whereas in the United States, I think we've basically shot ourselves in the foot around that.
And then lastly, I think around inference regulation, what we're not doing enough of is making it clear who the liability rests on.
I've seen a number of proposals ahead of legislative sessions next year that want to hold model developers liable
for the outputs of the inference,
even if the misuse is being done by somebody else.
And what does that do?
That drives those very important developers elsewhere.
It essentially forces most startups to lose much-needed ground to big tech companies,
and that entrenches incumbents more.
So as we think about 2025, whether it's in the U.S. or elsewhere,
because, I mean, this idea really is truly global.
What are you looking out for, or what should maybe, let's say, a legislator,
or let's say the head of a nation, what should they be thinking about?
And what are you looking out for in some of those decisions?
Are you looking for countries that are buying GPUs or building out new energy centers?
What are you paying attention to?
The leading indicator is definitely compute.
If you think about the AI supply chain, the first mile starts at the data center.
That's the new atomic unit of sovereignty, I would say, which is a new thing.
We've never actually had nation states think about atomic units of an AI data center
as a thing that countries should be purchasing.
And I think about 24 months ago, we started seeing nations reason about that first mile as being important.
So you've had an enormous amount of Nvidia's purchase orders come from the balance sheets of governments.
Just unprecedented demand they've been seeing from nation states realizing that they want to be hypercenters.
Yeah.
And that starts with them placing orders 12 to 36 months in advance to take delivery of GPUs.
Because if you don't get in front of that line, it's over.
You're getting it after everybody else.
That was step one. The second thing I look for is founders who are both deeply technical, who often come from deep research backgrounds, scientists who've led frontier model development already, often inside of large hyperscale labs. So an example is Arthur Mensch, who started Mistral and worked at DeepMind, or Guillaume Lample, who led the initial Llama family at Meta, who are deeply mission-led and believe that they can help solve a bunch of these infrastructure problems for the world's largest governments.
So there's a new class of founder who's both primarily technical and has their training in academia, but is motivated to solve all the really hard problems that come with having to deliver, to solve a bunch of these infrastructure problems, for really large nation states and regions.
But I think if you're a founder like that who has the guts, then your impact on humanity ends up being quite generational.
But now let us convince you that the future of AI may not be so straightforward.
Instead of models running in the cloud, perhaps we're bound for a future
where many more applications will run on-device.
I expect smaller on-device AI models to dominate in terms of volume and usage.
This trend will be driven by use cases as well as economic, practical, and privacy considerations.
That was Jennifer Li, a general partner on the infrastructure team.
Here's her big idea.
My big idea is on-device and smaller generative AI models will become more popular in the next year.
If you're a frequent user of the Uber, Instacart, Lyft, or Airbnb
applications, I'm sure there are many, many machine learning models already running on your device.
When you load up an Uber screen, there are easily 100 models coordinating routes and
giving you a real-time price. What I'm referring to more is that the generative models that
create images, voice, and video will become more prevalent, running on device and within
your applications in the same way as these other traditional machine learning models.
The models that we've seen in the last few years do take a lot of compute.
So can you square that with how much compute we can get from something like a smartphone and also these models, whether they're getting smaller or how this kind of comes together?
Yeah. First, never underestimate the compute power on a smartphone; it's probably as powerful as a computer from 10 or 20 years ago. That's thanks to Moore's Law.
At the same time, especially for smaller models of 2 billion or 8 billion parameters,
that's enough compute for them to run on device. And they can already generate and create very robust experiences,
be it text or image or audio.
And some of these models, if they are diffusion models,
can be intrinsically smaller than large text models
and still be very capable.
And there's another new set of tooling
and also technology developed around distillation:
if you have a very powerful large model,
it can be distilled to a smaller parameter size model
and still maintain a lot of the capabilities
that the large model contains.
So both on the infrastructure side
and also on the device-compute power side,
it's a perfect setup for the smaller models
to be more popular.
Totally.
So I'm hearing a few things.
I'm hearing that the smartphones are becoming more powerful.
Some of these models are becoming more efficient.
But that kind of brings us to the question of why.
So why would we want to run these models on device?
What are the advantages of that and also the disadvantages?
As consumers and day-to-day users,
we're already spoiled by real-time and very performant applications.
If you're talking to a chatbot, if you're talking to a conversational AI,
if you're adding filters to your video and images on Instagram or TikTok,
you don't want to wait for multiple seconds to load a new filter.
You don't want to wait for multiple seconds for the chatbot to respond to you.
Those are many real use cases that can really delight and improve user experience.
Also, optimization for compute.
There are a lot of harder, more complex questions or video processing tasks
that require going into the cloud.
But largely, if it's, again, changing user experiences and improving
the visual and sound effects of things, it doesn't have to route through multiple servers
going through a network. So both from a user experience and efficiency perspective, it's a much
better design to run some of the models on device. And then the last part is just privacy.
Users do care. If my meeting notes are taken locally, I probably will use this meeting
note taker much more often than if I know some of the data is being sent to a server and
they're processing a lot of my private conversations. So it depends on the use case, again,
for the application. I think that also improves adoption. Absolutely. And that has my wheels
spinning for sure in terms of maybe this unlocks new applications. So on that note, you mentioned
a few already, but where might we see applications pop up or where perhaps are we already seeing
applications with these on-device models? The first that comes to mind is real-time voice agents. It's a very
popular topic and it's something I'm very excited about. We invested in and work very closely with
this company called ElevenLabs, and that's one of the areas they're also spending a lot of
effort on: not just having a human-like synthetic voice, but being able to handle conversations
fluently with end users and to get the latency down. And when you think about what type of
real-time exchanges you want to have with your AI companion, your support agents, or any sort of life
coach, I think we do need to think about the modality and the latency in a much more, I guess,
improved fashion. So I won't be surprised if some of those inference workloads are running locally
in the next 12 to 18 months. Absolutely. And as we think about how maybe these different
models also interact with other parts of a smartphone, let's say the camera, do you expect this to
also maybe change user behavior in what we can do? A hundred percent. You can actually reimagine
the AI experience: if I point a camera at this room and I want to see new surfaces and
wallpaper and furniture, the technology is already there.
We can actually leverage both generative AI and the camera and also prompting interaction
to create new experiences already of how we interact with real physical life.
And that's where I also think a lot of on-device models will play a big role:
in how to interact with a 3D world,
how to interact with the physical world,
not just using the camera for capture,
but also using it as a projector.
Definitely. And let me ask you about economics then,
because a lot of the models that exist today
do rely on inference
and sending that inference up to the cloud
and that costs money.
Do the economics change
if you all of a sudden have these models
running on device on the smartphone compute
that already exists?
Do the economics actually shift
or can we come up with new ways of monetizing
in this new world?
Yeah, it's a great question, and I honestly don't really have the answer, because even for larger models, the inference price has been dropping really significantly thanks to the optimizations being done.
If it's very workload-intensive compute, let's say using your computer or phone, I think it will still have economic benefits.
But I don't think there's a very direct answer that it's going to substantially reduce infrastructure costs for some of these applications.
But architecting and structuring sort of the whole tool chain, it does change the economics of developer efficiency and iteration speed.
There are pros and cons. When shipping in the cloud, you can launch more continuously.
On-device has its own challenges because updates have to go out with the application
and with the hardware. So there's that side of the economics, which I think will have an impact on
how teams are structured when launching models in a hybrid mode.
So I would encourage teams who are thinking of leveraging these technologies to consider them more holistically.
Super interesting. And as we think about that world, are there any players that you think really succeed here? Like in one sense, I could see maybe the phone manufacturers. I could also see maybe, you know, the manufacturers of wearables being able to introduce all kinds of new applications. Think of wearing an Apple Watch, Fitbit, Whoop, things like that. Is it Nvidia that benefits in some way? Who do you think actually benefits from this idea of the models becoming more efficient and these on-device models becoming a thing?
Right now, I've seen more interest and enthusiasm from the hardware development side, whether it's chips or the phone makers.
I do think there's also a lot of interest from the model developers as well, in just proliferating model adoption across different setups and devices.
But I think over the long run it's probably going to impact the whole supply chain.
We talked about some of these macro trends throughout.
How do you specifically see those trends shaping up in 2025?
And is there anything in particular that you're putting your eye toward?
This will sound more like a consumer investor.
I've been a hardcore infra investor,
but I am very excited about mixed reality,
where generative models, 3D models, and video models
really, again, mix the reality of what we're seeing today
through the camera lens and through the microphones
into a much more creative world, even when sitting at home
or when going on a ride.
That's the type of experience I'm very much looking forward to.
I think the foundation model technology is pretty mature.
The infrastructure is getting ready.
So I'm personally very excited about sort of the new consumer experience.
All right.
I hope these big ideas got you geared up and ready for 2025.
Stay tuned for parts three and four.
And again, if you'd like to see the full list of 50 big ideas, head on over to a16z.com/bigideas.
It's time to build.
