The AI Daily Brief: Artificial Intelligence News and Analysis - What Comes Next In AI & Agents (According to Y Combinator)
Episode Date: February 5, 2025
Y Combinator has just released its latest Request for Startups, which covers nearly every theme related to AI and agents. From infrastructure to vertical applications, YC’s vision for the upcoming wave of startups provides insight into what they believe will shape the industry. This episode analyzes their predictions, the largest opportunities, and why managing AI agents rather than replacing jobs might be the true future of work. Brought to you by: KPMG – Go to www.kpmg.us/ai to learn more about how KPMG can help you drive value with our AI solutions. Vanta - Simplify compliance - https://vanta.com/nlw The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, what we can learn about the near future of AI and agents from
Y Combinator's most recent request for startups.
Before that in the headlines, SoftBank CEO is feeling the AGI.
The AI Daily Brief is a daily podcast and video about the most important news and discussions
in artificial intelligence.
To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around
five minutes.
SoftBank's Masayoshi Son says that AGI will arrive much earlier than he thought.
At this point, Son has to be just about the most enthusiastic AI investor on the planet.
In June, he said, SoftBank was founded for what purpose?
For what purpose was Masa Son born?
It may sound strange, but I think I was born to realize artificial superintelligence.
I'm super serious about it.
Just a few months ago, he said that AGI was still two to three years away.
Recent developments seem to have accelerated his thinking, with Son stating at an event on Monday that, quote,
I now realize that AGI would come much earlier.
The event was used to announce a joint venture between SoftBank and OpenAI
aimed at driving Japanese AI adoption. The venture will develop an AI agent platform called
Crystal Intelligence with a goal to, quote, help make every knowledge worker more effective and
solve even more complex problems. One thousand SoftBank employees will be assigned to kickstart sales
and engineering work, and the focus will initially be offering services to Japanese
businesses before establishing a plan for a global rollout. SoftBank Group will be using their own
organization as the test case, paying $3 billion annually to deploy OpenAI models across their
businesses. Notably, this includes chipmaker Arm, which will, quote, use Crystal Intelligence
to drive innovation and boost productivity across the company, strengthening its pivotal
role in advancing AI globally. Overall, SoftBank plans to automate 100 million workflows using
Crystal Intelligence. Now, at this point, every AI announcement is just a competition
for how close to a googolplex your numbers can be, and there is something of a perception
in Silicon Valley that taking money from SoftBank is akin to a financial death knell, but there
is definitely a lot going on here, and it's probably worth paying attention to.
Speaking of AGI and the brave new world we're moving into,
Meta has released a new policy document stating that they may not release models they
deem too risky.
The company's frontier AI framework details two categories of models that may not be suitable
for release, high risk and critical risk.
They consider these to include AI systems capable of aiding in cybersecurity, chemical
and biological attacks.
The difference with critical risk systems is their ability to bring about a, quote,
catastrophic outcome that cannot be mitigated in a proposed deployment context.
High-risk systems are still capable of making these kinds of attacks easier to carry out,
but not as reliably as a critical risk system.
Meta gives a few examples of their nightmare scenarios for AI risk,
including a, quote, automated end-to-end compromise of a best practice protected corporate-scale
environment, or the, quote, proliferation of high-impact biological weapons.
Now, this is the first safety policy update we've seen from a major AI lab since the Trump
inauguration and the accelerationist vibe shift.
To what extent then is this Meta saying,
you don't need to regulate us because we're taking it upon ourselves to implement safeguards?
That remains to be seen, but I think it's at least a reasonable interpretation.
When it comes to determining these risks, Meta does not seem to be using any particular test to classify
risk, but instead relying on the input of internal and external researchers with review from senior
level decision makers. They stated that, quote, the science of evaluation is not sufficiently
robust as to provide definitive quantitative metrics. If Meta determines that a system is high
risk, they will limit internal access and won't release it until mitigations are implemented.
Critical systems will be locked down to prevent exfiltration, and the company will stop development
until the system can be made less dangerous. In the policy document, Meta writes, quote, we believe that by
considering both benefits and risks and making decisions about how to develop and deploy advanced
AI, it is possible to deliver that technology to society in a way that preserves the benefits
of that technology to society while also maintaining an appropriate level of risk.
Now, in a positive development surrounding risk, Anthropic is challenging hackers to break into
their new AI security system. The company claims their newly developed method can block
95% of jailbreak attempts and is inviting red teamers to try to defeat it.
Jailbreaks are specifically designed prompts that circumvent restrictions on an LLM's output.
One example that was surprisingly successful on the previous generation of models was to tell the
LLM to, quote, do anything now. Another is the notorious God mode, which substituted letters with numbers
to sneak past safety filters. Jailbreaking is relatively easy to minimize, but methods usually involve
a lot of incorrectly refused prompts, or adding a ton of compute to run supervision models.
Anthropic is claiming that their method avoids those tradeoffs.
The company has launched a demo with eight different types of unsafe requests.
Red Teamers are invited to try to jailbreak the system by finding a prompt that unlocks them all.
The intention is to prove the system is resistant against universal jailbreaks that work for all unsafe requests.
Currently, no one has managed to get past more than three levels using a single prompt.
To construct this system, Anthropic trained a new constitutional classifier using 10,000 generated jailbreaking prompts.
This AI technique relies on training a model on a list of principles that define allowed and disallowed actions aligned with human values.
This is Anthropic's constitutional approach.
To minimize incorrectly refused prompts, the team also trained the model on benign queries that should be allowed.
Their baseline version of Claude had an 86% jailbreak success rate, but with constitutional
classifiers added that fell to just 4.4%.
Not perfect, but absolutely would be huge progress.
Lastly, today, AI has won its first Grammy, sort of.
The Beatles track Now and Then won the Grammy for Best Rock Performance, making it the first time
an AI-assisted song has taken home the award.
Now, you'll remember that this song didn't include a generated version of John Lennon,
but instead used AI techniques to clean up archive demo tracks.
Now and Then was first put together during the Beatles Anthology remastering project in 1995.
It was based around demos recorded by John Lennon in the late 1970s,
with Paul, Ringo, and George adding their parts in the 1990s.
The song was never released, with technological limits at the time
preventing John's vocals from being separated from the piano on the demo track.
In 2021, the surviving Beatles worked with filmmaker Peter Jackson and his sound team
to clean up the demo using modern machine learning techniques.
The tech is similar to that used in video calls to remove unwanted
background noise. When the song was rumored in 2023, there was a lot of anti-AI backlash.
Paul McCartney addressed the controversy stating, to be clear, nothing has been artificially
or synthetically created. It's all real and we all play on it. We cleaned up some existing
recordings, a process which has gone on for years. Whatever the case, the Grammy Committee has seen
past the backlash to award the Beatles their eighth award, thanks in this case to AI.
That's going to do it for today's AI Daily Brief Headlines edition. Next up, the main episode.
Today's episode is brought to you by Vanta.
Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned
security professional scaling your GRC program, proving your commitment to security has never been
more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust
by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralize security
workflows, complete questionnaires up to 5X faster, and proactively manage vendor risk. Vanta can help you
start or scale up your security program by connecting you with auditors and experts to conduct your
audit and set up your security program quickly. Plus, with automation and AI throughout the
platform, Vanta gives you time back, so you can focus on building your company. Join over 9,000
global companies like Atlassian, Quora, and Factory, who use Vanta to manage risk and improve security
in real time. For a limited time, this audience gets $1,000 off Vanta at vanta.com slash NLW. That's
V-A-N-T-A dot com slash NLW for $1,000 off.
If there is one thing that's clear about AI in 2025,
it's that the agents are coming.
Vertical agents by industry,
horizontal agent platforms,
agents per function.
If you are running a large enterprise,
you will be experimenting with agents next year.
And given how new this is,
all of us are going to be back in pilot mode.
That's why Superintelligent is offering a new product
for the beginning of this year.
It's an agent readiness and opportunity audit.
Over the course of a couple quick weeks, we dig in with your team to understand
what type of agents make sense for you to test, what type of infrastructure support you need to be ready,
and to ultimately come away with a set of actionable recommendations that get you prepared to figure
out how agents can transform your business. If you are interested in the agent readiness and
opportunity audit, reach out directly to me, NLW at besuper.ai. Put the word agent in the
subject line so I know what you're talking about, and let's have you be a leader in the most dynamic part
of the AI market.
Hello, AI Daily Brief listeners. Taking a quick break to share some very interesting findings from KPMG's latest AI quarterly pulse survey.
Did you know that 67% of business leaders expect AI to fundamentally transform their businesses within the next two years?
And yet, it's not all smooth sailing. The biggest challenges that they face include things like data quality, risk management, and employee adoption.
KPMG is at the forefront of helping organizations navigate these hurdles.
They're not just talking about AI. They're leading the charge with practical solutions
and real-world applications. For instance, over half of the organizations surveyed are exploring
AI agents to handle tasks like administrative duties and call center operations. So if you're looking
to stay ahead in the AI game, keep an eye on KPMG. They're not just a part of the conversation,
they're helping shape it. Learn more about how KPMG is driving AI innovation at KPMG.com
slash US.
Welcome back to the AI Daily Brief. Blissfully, we have a bit of a quiet day,
which means that we get to do something that I've been looking forward to. I basically had this
on deck for whenever we got a break from the blistering pace of news.
Y Combinator is, of course, the best-known startup accelerator in the world.
And every couple of cycles, they release what they call their request for startups.
The idea here is that the partners of Y Combinator get together and talk about what they think
the big themes of the future are going to be and where they'd like to see more entrepreneurs
taking a crack at particular problems.
Now, of course, these aren't the only companies that they'll accept.
As they say, the list will only be a small fraction of the ideas they actually fund.
but it's a way for them to give feedback to entrepreneurs who are looking for their next big idea
what they think some of the key themes are. Well, they just recently released their spring 2025 update,
and of the 14 ideas, 13 are AI or at least AI adjacent. Eight of the 14 touch agents in some way.
So what I want to do is use this as a way to preview the future that Y Combinator thinks is coming.
I'm going to discuss the big categories that I see across these startup areas and hone in on a couple
that I think personally are particularly interesting. So, I think broadly speaking, you could
categorize these 14 startup areas in roughly four buckets. The last one is other. There's one idea here
that's sort of more about a founder profile than it is about a particular idea. So I'm leaving that one
to the side. But outside of that, they all fit into one of three buckets, AI and agent infrastructure,
agent applications, or AI applications. For the sake of focus, once again, I'm not going to spend as much time
on the AI applications. I'll call them out briefly. One is compliance and audit, where partner
Tom Blomfield points out that LLMs excel at the tasks of traditional compliance, including reading
dense regulation, cross-checking internal policy, etc. This is a great example of one of those
very unsexy but still very significant problems that AI can solve and just take entirely off
the plate of humans. The other AI application is DocuSign 2.0. Here, Michael Seibel argues that with the
current crop of products, it's too hard to create a document template, avoid filling out duplicate
information, correct document errors, et cetera. And so the idea is to use AI to simplify this process.
All right, but now from there, let's move into the two categories that I think are really interesting.
AI and agent infrastructure and agent applications. Infrastructure is the biggest category here. It is very
clear that AI, and even more so agents, are coming down the pipeline, and Y Combinator is interested
in the things that are going to enable that transformation. Some of them are dead on what you'd expect,
like data centers. This one is not surprising. The world of the future
requires more data, more power infrastructure, more cooling, more material procurement, more project
management. And so anything around those themes is of interest. Where it starts to get even more
interesting is where you're seeing Y Combinator start to make bets on which type of agentic applications
are going to be ready for primetime in short order. One very catch-all area is called DevTools for
AI agents and basically is Y Combinator saying, we want people to keep making agents better.
They're interested in agent builders directly, i.e. companies that enable their customers to
easily create and deploy custom agents, as well as agent building blocks: tools, APIs,
or platforms that enhance agent capabilities, enabling them to perform more complex actions and achieve
greater impact. You could blow out that agent building block category into a million things.
For example, one of the things that agents will at some point need to be able to do is access
financial infrastructure. Plaid for agents feels to me to be one of those really obvious to
conceptualize but very difficult to build type of projects that will make someone very, very rich
over the next few years. You also get a sense of where we are in the agent development cycle.
Partner Jared Friedman talks about browser and computer automation, effectively arguing that
while we're starting to see agents be able to use computers in the form of OpenAI's Operator
as well as Anthropic's computer use, that doubling down on this and giving agents even more access
to the browser and using the computer is going to, quote, 10x the addressable use cases for
AI agents. So building out that infrastructure seems extremely important. Another area of infrastructure
is that we're starting to see Y Combinator adjust to a different scaling model.
One of the themes from partner Diana Hu is inference AI infrastructure in the world of test time compute.
If you're a regular listener, you'll have heard lots of discussion around how we've seen a shift
in thinking around scaling from a focus on pre-training to a focus on applying compute at inference time.
Diana points out that, quote, as AI apps 10x or even 100x the number of API calls to complex reasoning
models, the infrastructure costs will become a real problem.
And so what YC is interested in is better software at the inference layer:
tooling, cheaper ways to handle GPU workloads, and optimizations. This is in many ways a doubling
down on a key theme that has dominated conversation for the last couple of months. And the last two in this
infrastructure category that I find interesting, both relate in some way to how enterprises are going
to use agents. Partner Dalton Caldwell calls out AI commercial open source software. And effectively,
the idea here is that many enterprise AI deployments are going to be custom builds built
on top of open source software. However, when building with open source software, one of the things
that you give up for the flexibility and freedom is, of course, the support.
YC then is interested in companies that replicate some amount of the type of support you get
from a closed source vendor, but in the context of enterprise open source deployments.
Still, maybe the most interesting request for startups in this infrastructure section is B2A,
software where customers will all be agents.
The thesis here is pretty simple.
Right now, a huge amount of internet traffic is people looking for information.
Already, much of that is automated.
It's bots and non-humans that are scraping
and looking for information. However, what Y Combinator is interested in is software that
explicitly recognizes that a lot of purchasing decisions are going to be made explicitly by agents
in the future. And so rather than building services that support human internet use and human
commerce decision making, they think there's interesting ground for entrepreneurs to build services
that specifically aim at serving agents. This is one of those fundamental shifts that I think will
create just enormous opportunities. And so I'm really interested to see which startups take up
that particular call.
Now, moving on to the agent application section.
Some of these are well-trodden territory.
Not that YC shouldn't point them out because there's still a lot to build,
but one of the themes, for example, is vertical AI agents.
They define those as software that's built on top of LLMs
that's been carefully tuned to be able to automate some kind of real important work.
Now, what's interesting is that they argue that this opportunity is big enough
to mint another 100 unicorns.
For every category, they say, with a successful B2B SaaS company,
you could imagine an even larger vertical AI company being built.
And they argue that although this is a huge point of conversation, that we're still not thinking broadly enough,
and that much of the entrepreneurial energy so far has been to very obvious applications,
rather than the full expanse that the opportunity actually represents.
Another one that's sort of well-trodden territory is AI personal staff for everyone.
This is a classic Silicon Valley argument that a good way to guess at what the future of a consumer experience is going to be
is to look at what only rich people can afford now and then imagine how it could be brought to everyone.
The quintessential example is Uber and now Waymo giving people a private driver, which was never
accessible until those companies existed. They're interested in how agents bring things like
personal lawyers, money managers, personal trainers, private tutors, personal doctors to the realm
of the everyday person. But lastly, maybe the most interesting one to me of all, across both
the infrastructure and the agent application category, is one from Pete Koomen called the Future of Software
Engineering. I'm actually going to read a big chunk of this one. Pete writes,
language models can already write code better than most humans.
This is going to bring the cost of building software down to zero.
So will agents kill the job of software developer?
No, we'll need more human software engineers in the future
because software is going to run almost everything.
These humans won't write much code directly.
Instead, they'll manage teams of agents that build software for them.
In addition to writing code, agents will perform most of the other specialized tasks
required to build software, including QA, deployment, security,
and compliance audits, translations, operations, etc.
We'd like to fund startups that enable small groups of generalist software
developers to manage large teams of agents working together to build and ship lots of software.
So two things that are interesting about this. One, it's obviously planting a flag in how they think
economically this is going to play out, sort of a Jevons paradox, but instead of being about
resource usage, being about talent deployment, where effectively the greater availability of
intelligence and the reduction of the cost of intelligence will actually increase our utilization
of intelligence. Now, I think that it makes sense that software engineering is the area that they're
looking to first for this, but my bet is that this pattern, that the job doers of today will
become the managers of agents of tomorrow, I think is a pattern that we're likely to see played out
across lots and lots of different domains. Think social media managers. Instead of writing
tweets and creating posts, they're going to be able to manage entire armies of agents that do
that across multiple platforms at much greater scale. It is fascinating to think about how to actually
build tools to manage those armies of agents. I think that this is going to be critical infrastructure
for the future, and something like I said that goes far beyond just software engineering.
Anyways, like I said, I think this is a fun way to see how one influential Silicon Valley
institution sees the future of AI and agents. Hopefully this gives you some ideas for what you might
build. For now, that is going to do it for today's AI Daily Brief. Appreciate you listening
or watching as always. Until next time, peace.
