The AI Daily Brief: Artificial Intelligence News and Analysis - 5 Ways Companies Are Using AI Agents Today
Episode Date: January 8, 2025
Businesses are turning to AI agents in innovative ways this year. From refining drug discovery at Johnson & Johnson to advancing financial analysis at Moody's and streamlining customer service at Deutsche Telekom, these tools are redefining workflows and driving measurable outcomes. Discover how companies are deploying AI agents for growth and efficiency in 2025.
Brought to you by:
Vanta - Simplify compliance - https://vanta.com/nlw
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, five ways that companies are using AI agents right now.
Before that on the headlines, Google forms a new team to build world models.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines edition,
all the daily AI news you need in around five minutes.
Perhaps the biggest theme of Q4 of last year was this question around whether the pre-training model for scaling AI had started to run
into serious limits. We obviously got the rise of reasoning models like o1 and o3. We had CEOs
like Satya Nadella from Microsoft talking about how new architectures were needed, but we also got some
interesting alternatives. One of the approaches that some are interested in are models that can
simulate the physical world. Google is forming a new team within DeepMind to work on scaling these
types of models. The team will be led by Tim Brooks, one of the co-leads of OpenAI's
Sora video model, who left that company back in October. Yesterday, Brooks posted, DeepMind has
ambitious plans to make massive generative models that simulate the world. I'm hiring for a new
team with this mission. Come build with us. So far, what we've seen from labs are functional if
limited demos. Basically, these are AI models that have a better understanding of the physics
and appearance of the real world, understanding it in a similar way to how LLMs understand the
structure of language. So far, a lot of what we've seen from world model labs are based on training
data from video games or movies, and so are really only a proof of concept. One of the few projects to
move past this stage was Genesis, first shown off last month. That project was able to generate
groundbreaking video and extremely accurate robotics training modules using a 4D world simulation.
Genesis claimed they were able to train robots 430 times faster than the previous leading physics
simulator, cutting the time below a minute. Now, DeepMind is one of the labs that published a brief
demo of a model that understands video game physics last year. That model was called Genie 2,
and I actually think that the announcement went a little under the radar. Establishing this new
team suggests that they want to push the technology even harder. Job postings for the new team
invited applicants to, quote, join an ambitious project to build generative models that simulate the
physical world. We believe scaling pre-training on video and multimodal data is on the critical
path to artificial general intelligence. World models will power numerous domains, such as visual
reasoning and simulation, planning for embodied agents and real-time interactive entertainment.
The team will collaborate with and build on work from Gemini, Veo, and Genie teams and tackle
critical new problems to scale world models to the highest level of compute. One of the people who has
talked most explicitly about this view of the importance of these types of models for achieving AGI
is Meta chief AI scientist Yann LeCun. Indeed, he has gone so far as to hypothesize, loudly,
on Twitter, that standard GPT architecture has no pathway to AGI. This project sounds as though
it will be one of the first to attempt to build a world model using the full scale of the
training data and compute that can be mustered by a big tech firm. Nvidia, meanwhile, is also
pushing the frontier of world models, releasing a family of models called Cosmos. During his keynote
address at CES, which we will cover in more depth later in the week, Nvidia CEO Jensen Huang
announced, the ChatGPT moment for robotics is coming. Like large language models, world foundation
models are fundamental to advancing robot and AV development, yet not all developers have the
expertise and resources to train their own. He demonstrated the model being used to simulate warehouses
and roadways commenting, it's not about generating creative content, but teaching the AI to understand
the physical world. The models were trained on 20 million hours of video, with a particular focus on
human movements like walking, hand movements, and manipulating objects. They can be fine-tuned
for specific tasks and customized for external data. The family includes three models ranging
from 4 billion to 14 billion parameters. The smallest model is optimized for low latency in
real-time applications, while the largest model is intended to deliver high-fidelity outputs.
And what's more, the models are available as open source for commercial use, allowing robotics
and autonomous vehicle developers to use them in production. Diego Odd posted, this is huge for AI democratization,
a powerful open source video world model trained on 20 million hours.
Not just the model itself, but its application to synthetic data generation could be a game
changer for robotics training.
One more quick story before we close out the headlines.
One of the big questions surrounding the AI industry is whether it can actually make money.
You'll remember that this was a huge point of conversation last summer.
We had that Sequoia blog post AI $600 billion problem.
And now we've learned that ChatGPT Pro, the $200 per month tier, is not only not a
cash grab, but is actually not even paying for itself. A couple of days ago, Sam Altman tweeted,
insane thing. We are currently losing money on OpenAI pro subscriptions. People use it much more than we
expected. In the replies, he added, I personally chose the price and thought we would make money.
Now, of course, OpenAI is making a ton of money but losing more. The company reportedly expected
losses of around $5 billion last year on revenues of $3.7 billion. The pricing of all of this stuff
at any point has been pretty arbitrary. In a recent interview, Sam Altman said
that when it came to the main ChatGPT subscription, the company was tossing up between $20 and $42.
They eventually went with $20 because, quote, people thought $42 was a little too much.
They were happy to pay $20.
Altman continued, it was not a rigorous "hire someone and do a pricing study" thing.
Now, what makes this interesting isn't anything really about OpenAI itself.
It's much more about the question of the long-term profitability of AI.
Mojo Flynn writes, OpenAI losing money is no big surprise.
But when they're losing money on a $200 monthly subscription
should tell you there's no viable at-scale consumer business model.
Even Microsoft with a $30 Copilot subscription
is forced to offer discounted pricing.
I don't think it's an unreasonable concern.
However, I do have a very different take.
I think that we are extremely early in the life cycle of AI,
and the simple reality is,
the cost of delivering the service hasn't come down
as fast as the demand for using the service has increased.
That's an unsustainable state,
but unsustainable doesn't mean
inevitable failure. It means that there's going to need to be a recalibration.
Already, the cost of AI has come down spectacularly from where it was a few years ago,
at least in terms of what you can do with the same amount.
I would expect that to continue, and I think that we're going to figure out a lot more
use case by use case what sort of business models different performance levels of AI can
support.
Frankly, I think this is exactly what venture capital and risk capital is designed to do.
It's designed to allow incredibly promising innovations, the ability to build and get
through these complicated early stages before these markets get rationalized. I think the speed of
adoption of these tools has taken basically everyone by surprise and puts additional pressure on this
even relative to other industries. Anyways, still an interesting story to watch, one that we will keep track
of here. For now, though, that is going to do it for today's AI Daily Brief Headlines edition.
Next up, the main episode. Today's episode is brought to you by Vanta. Trust isn't just earned,
it's demanded. Whether you're a startup founder navigating your first audit or a seasoned
security professional scaling your GRC program, proving your commitment to security has
never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta
to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO
27001. Centralize security workflows, complete questionnaires up to 5x faster, and proactively
manage vendor risk. Vanta can help you start or scale up your security program by connecting you with
auditors and experts to conduct your audit and set up your security program quickly.
Plus, with automation and AI throughout the platform, Vanta gives you time back, so you can focus
on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory,
who use Vanta to manage risk and prove security in real time. For a limited time, this audience
gets $1,000 off Vanta at vanta.com slash NLW. That's V-A-N-T-A dot com slash N-L-W for $1,000 off.
There is one thing that's clear about AI in 2025. It's that the agents are coming.
Vertical agents by industry, horizontal agent platforms, agents per function. If you are running a large
enterprise, you will be experimenting with agents next year. And given how new this is, all of us
are going to be back in pilot mode. That's why Superintelligent is offering a new product for the beginning
of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks,
we dig in with your team to understand what type of agents make sense for you to test,
what type of infrastructure support you need to be ready,
and to ultimately come away with a set of actionable recommendations
that get you prepared to figure out how agents can transform your business.
If you are interested in the agent readiness and opportunity audit,
reach out directly to me, nlw@besuper.ai.
Put the word agent in the subject line so I know what you're talking about,
and let's have you be a leader in the most dynamic part of the AI market.
Welcome back to the AI Daily Brief.
Right now in Las Vegas, the annual CES Consumer Electronics Show is happening,
and I anticipate that there will be some interesting AI announcements from that event that we will cover
later in the week.
However, for today's episode, as we let those announcements come in a little bit more,
I noticed something interesting in the Wall Street Journal.
Yesterday, that publication published a piece in their CIO Journal called How Are Companies
Using AI Agents?
Here's a Look at Five Early Users of the Bots.
You can tell the language is a little bit stuck in the past.
But what's interesting to me is that in a year where we really are talking about 2025 being the time that companies start experimenting with agents,
mainstream media is already picking up on this as a major theme.
Part of why this matters is that most people in big companies, much to my chagrin, are not so up to speed that they're listening to something like the AI Daily Brief.
They're getting their news from sources like the Wall Street Journal.
And so when this style of publication starts taking this stuff seriously, it can have a pretty big impact.
So what we're going to do today is briefly look through these
five use cases that the WSJ covered, and I'm going to pair that with an overview of a recent paper
from Google that I think might be a pretty useful resource as well. The Wall Street Journal piece
basically points out that this is a big trend. They describe how many different companies have
officially announced their own agents, and they point out one of the biggest reasons, frankly,
that enterprises are so focused on agents. Quote, if these agents work as promised, they could
also provide businesses with the return on investment they've been looking for out of generative
AI. According to some corporate technology leaders, that means the ability to tie the technology
to a reduction in the number of hours employees work, or even how many new people they need to
hire. Basically, there is an ROI built in if agents actually work. Agents necessarily replace
certain amounts of human labor, and presumably do it at lower cost than the equivalent human time.
Now, it's important to note that how companies use those cost savings and that increased productivity
is going to dictate just how disruptive this is.
If companies reinvest that human time
into growing the business in other areas,
I tend to think that this will be a phenomenal development for everyone.
If, on the other hand, they just view it as a cost-cutting measure,
well, that's a whole different kettle of fish.
But the real thrust of this Wall Street Journal piece
is to try to figure out how agents are being used right now in reality.
The first example they gave is from pharmaceutical giant Johnson & Johnson,
which has been deploying drug discovery agents.
Honing in on what agents can and can't do,
the article points out that these agents aren't yet up to the task of coming up with new drugs all by
themselves. Instead, they're deployed to optimize key points in the drug synthesis process.
Traditionally, drug manufacturing is refined by running a multitude of experiments, which often
have multiple variables to adjust. Agents are able to take the data from a smaller number
of experiments and extrapolate it out to arrive at an optimal method. At this stage,
employees are still reviewing the output of agents, but they write, the company is still figuring
out how that oversight can be done more systematically. Next up, we move over to the world of
Finance, where financial analysis firm Moody's has developed a team of agents to research
public company filings and perform industry comparisons. In total, the firm has 35 different agent designs
all trained for different subtasks and linked up together in a multi-agent system. The system
even has agents as supervisors to check for hallucinations. The novel idea here is that each agent
has its own set of instructions, personality, and data access. This means the agents within the
system can come up with different conclusions in their analysis, which are then synthesized
together. For example, one agent might be building their analysis based on industry competition data,
while another might be focused on geopolitical risk. Nick Reed, the company's chief product officer,
said, it's almost a bit like your ability as an individual person. What we worked out is that an agent
is better at not multitasking. This is obviously a highly relevant conclusion, even if this just
represents the current state of things, in terms of how enterprises think about deploying agents.
Rather than trying to have one agent do multiple things, companies might get better results by assigning
multiple agents with narrow subtasks and finding ways to coordinate them, once again, possibly with agents.
The thinking is not ultimately dissimilar to the way you would construct a team of humans to carry out a multidisciplinary task.
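To make that pattern concrete, here is a minimal sketch of narrow specialist agents whose findings pass through a supervisor before being synthesized. It is only an illustration, not Moody's actual system: the agent functions, the `Finding` type, and the "must cite a source" check are all assumptions, and each specialist is a plain function standing in for what would really be an LLM call with its own instructions and data access.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    agent: str          # which specialist produced this
    claim: str          # the conclusion it reached
    sources: List[str]  # evidence the agent says it relied on


# Hypothetical specialists: each one handles a single narrow subtask.
def competition_agent(company: str) -> Finding:
    return Finding("competition", f"{company} faces rising price pressure", ["10-K filing"])


def geopolitical_agent(company: str) -> Finding:
    # Deliberately returns no sources so the supervisor has something to reject.
    return Finding("geopolitics", f"{company} has supply-chain exposure", [])


def supervisor(findings: List[Finding]) -> List[Finding]:
    """Crude stand-in for a hallucination check: keep only findings that cite a source."""
    return [f for f in findings if f.sources]


def synthesize(findings: List[Finding]) -> str:
    """Combine the surviving findings into one report line."""
    return "; ".join(f"[{f.agent}] {f.claim}" for f in findings)


def run(company: str, agents: List[Callable[[str], Finding]]) -> str:
    findings = [agent(company) for agent in agents]
    return synthesize(supervisor(findings))


report = run("ExampleCo", [competition_agent, geopolitical_agent])
print(report)
```

The design choice worth noticing is that coordination lives outside the specialists: adding a new perspective means adding one more function to the list, not rewriting the others.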
eBay is engaged in one of the most popular agent use cases, writing code.
Interestingly, eBay actually built its own agent framework that can take advantage of several different LLMs.
In addition to writing code, eBay's agents are also creating marketing campaigns, and they're
planning on rolling out another set of agents that can help buyers find items, as well as helping
sellers list goods.
The journal writes, eBay's agent framework functions as an orchestrator, dictating which
AI models will be used for certain tasks like translating code and suggesting code snippets.
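As a rough picture of what an orchestrator like that does, the sketch below routes each task type to a registered model and falls back to a default otherwise. This is not eBay's framework: the `Orchestrator` class, the task names, and the two stand-in model functions are hypothetical, and a real version would call actual LLM endpoints.

```python
from typing import Callable, Dict

ModelFn = Callable[[str], str]


# Hypothetical stand-ins for two different LLM backends.
def general_model(prompt: str) -> str:
    return f"general-model: {prompt}"


def code_model(prompt: str) -> str:
    return f"code-model: {prompt}"


class Orchestrator:
    """Decides which model handles which kind of task."""

    def __init__(self, default: ModelFn) -> None:
        self.default = default
        self.routes: Dict[str, ModelFn] = {}

    def register(self, task_type: str, model: ModelFn) -> None:
        self.routes[task_type] = model

    def dispatch(self, task_type: str, prompt: str) -> str:
        # Unknown task types fall back to the default model.
        model = self.routes.get(task_type, self.default)
        return model(prompt)


orchestrator = Orchestrator(default=general_model)
orchestrator.register("translate_code", code_model)
orchestrator.register("suggest_snippet", code_model)

print(orchestrator.dispatch("translate_code", "port this function to Java"))
print(orchestrator.dispatch("marketing_copy", "write a listing headline"))
```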
Next up is Deutsche Telekom, and rather than facing outward, their agents are facing inward.
The company employs roughly 80,000 workers across Germany.
They've trained agents now to answer
employee questions about internal policies and benefits. They also have an agent trained to assist
service staff with questions about the company's products and services. In this case, we might be
pushing the boundaries of the language of agent. This sort of sounds ultimately like a chatbot
that has access to internal databases. Still, call it what you want. It seems to be getting a lot of
traction. The company's chief product and digital officer, Jonathan Abrahamson, said that about 10,000
employees are using it each week. That is dramatically more efficient than having an HR specialist or
having employees search for policies on an internal website. Still, Deutsche Telekom is figuring out
how to go farther. The company's next step is allowing the agent to execute requests on behalf of
employees further automating basic HR. The example given was allowing the agent to complete a request
for leave and enter it into the HR system, all fully automated from a natural language text
prompt. The final example is, I believe, at this stage, the most commonly deployed agent example.
In this case, it came from Spanish company Cosentino, which manufactures countertops and other
stone materials for buildings. The company has brought on a team of agents to fill in gaps for their
customer service staff. They refer to the agents as a digital workforce and are thinking about them in a very
similar way to human workers. The agents are expected to have basic skills but receive training when
they begin work. Agents are given instructions to follow a strict process and supervisors are present
to ensure they don't go off the rails. The so-called digital staff have replaced the work of three to four
team members who were previously involved in clearing customer orders. Those people have now been
reassigned to more high-touch areas of customer service liberated from their data entry tasks.
Now, like I said, all of these are fairly basic use cases, but that I think represents where we are.
I do believe that 2025 is going to be a huge year for agent pilots, and many of them are going to
fall into some of these areas described and articulated in this piece.
Now, one useful resource for figuring out how to implement agents in your workforce is a white
paper published by Google last September, simply titled Agents. The paper explains what agents are
and what they require to function. But more importantly, it suggests that companies shouldn't think about
agents as an upgrade to existing technology. Instead, they should think about agents as a fundamental
shift in the way organizations operate in order to see maximum gains in efficiency and productivity.
Basically, the first big idea in the paper is that agents are more than just smarter LLMs.
The core agentic function is being able to access other systems. This could mean simply accessing
a database to inform an output, but the possibilities go so much deeper. It's possible, for example,
to integrate agents into real-time data feeds to inform autonomous
decision-making. Agents have much greater ability to process data than a human. We will likely
find agents are able to monitor and take actions based on multiple data sources that would
have required an entire team of people to carry out. Google's paper discusses another big difference
between LLMs and agents, the ability to reason through multi-step tasks. There are many different
architectures that can be used to achieve this. The agent could use chain of thought,
an iterative process of reassessing the task as it progresses based on new information revealed
at each step, or it could use a tree of thoughts, where multiple possible solutions are explored at the same
time. Ultimately, according to the paper, this makes agents capable of managing uncertainty and
complexity in ways that traditional models can't. There's a ton of really interesting information in here.
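The difference between following one line of reasoning and exploring several at once can be shown with a toy example. The sketch below is not taken from Google's paper: it swaps real LLM reasoning for a small arithmetic puzzle (reach a target number using "+3" and "*2" moves), and the function names, the scoring rule, and the beam width are all assumptions made for illustration. The single-path version commits to whatever looks best at each step, while the tree version keeps a few candidate paths alive.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[int], int]]
STEPS: List[Step] = [("+3", lambda x: x + 3), ("*2", lambda x: x * 2)]


def score(value: int, target: int) -> int:
    """Higher is better: negative distance to the target."""
    return -abs(target - value)


def single_path(start: int, target: int, max_steps: int) -> Tuple[int, List[str]]:
    """One line of reasoning: reassess after each step, but never backtrack."""
    value, path = start, []
    for _ in range(max_steps):
        if value == target:
            break
        name, fn = max(STEPS, key=lambda s: score(s[1](value), target))
        value = fn(value)
        path.append(name)
    return value, path


def tree_search(start: int, target: int, max_steps: int,
                beam_width: int = 3) -> Tuple[int, List[str]]:
    """Several lines of reasoning at once: keep the best few partial paths."""
    frontier: List[Tuple[int, List[str]]] = [(start, [])]
    for _ in range(max_steps):
        for value, path in frontier:
            if value == target:
                return value, path
        candidates = [(fn(v), p + [name]) for v, p in frontier for name, fn in STEPS]
        candidates.sort(key=lambda c: score(c[0], target), reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=lambda c: score(c[0], target))


print(single_path(1, 14, 3))  # ends at 16: the locally best move at step two was a dead end
print(tree_search(1, 14, 3))  # reaches 14 by keeping the second-best branch alive
```

The tree version costs more work per step, which is the trade-off: it buys robustness to early choices that look good but lead nowhere.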
I will link to it in the show notes. And of course, one quick shill here if you've made it this far.
You've probably been hearing this ad, but one of the things that we were doing at Super this year
is an agent readiness audit, where we are digging in with you to help you understand
what parts of your company or your workforce's activities are best suited for exploring agents.
And we're also helping scope and even support pilots in that area. If that's something you're
interested in, email me at nlw@besuper.ai. And join us in 2025, the year of agents.
For now, that is going to do it for today's AI Daily Brief. Until next time, peace.
