The AI Daily Brief: Artificial Intelligence News and Analysis - When Will AI Make Scientific Discoveries?
Episode Date: October 2, 2025
Today’s AI Daily Brief asks when artificial intelligence will begin making real scientific discoveries. We look at Periodic Labs, which just raised more than $300 million to build AI scientists and autonomous labs for physics and chemistry, and Thinking Machines, which is creating tools to democratize custom model training. These efforts highlight a shift from consumer apps toward AI as a scientific instrument, arriving alongside early reports that models like GPT-5 are already generating small but novel breakthroughs. In headlines, the U.S. government blasts China’s DeepSeek models, Apple pivots from Vision Pro to smart glasses, Amazon refreshes Alexa devices with custom AI chips, and Meta plans to target ads based on chatbot interactions.
Brought to you by:
Is your enterprise ready for the future of agentic AI? Visit AGNTCY.org
Visit Outshift Internet of Agents
Try Notion AI today with Notion 3.0 https://ntn.so/nlw
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
Insightwise - AI for the entire consulting lifecycle https://www.insightwise.ai/
Vanta - Simplify compliance - https://vanta.com/nlw
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? nlw@aidailybrief.ai
Transcript
Today on the AI Daily Brief, when will AI start making novel scientific discoveries?
Before that in the headlines, the U.S. government says actually DeepSeek stinks.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, quick announcements before we dive in.
First of all, thank you to today's sponsors, Notion, Blitzy, Insightwise, and Robots and Pencils.
To get an ad-free version of the show, go to patreon.com slash AI Daily Brief,
or you can now subscribe to the audio-only version ad-free in
Apple Podcasts. So if you just listen and you get it through Apple Podcasts, you can now subscribe
directly there. I'm working on having Spotify's ad-free subscription coming as well. So soon you will
have multiple choices, Patreon, Apple, and Spotify. But with that, let's dive in. Welcome back to the
AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. We kick off today with a
new report from the National Institute of Standards and Technology that finds DeepSeek lacking in both
performance and security.
Announcing the report, Commerce Secretary Howard Lutnick wrote,
Today, thanks to President Trump's AI Action Plan,
the Commerce Department and NIST's Center for AI Standards and Innovation
have released a groundbreaking evaluation of American versus adversary AI.
Result, American AI models dominate.
Our systems outperform DeepSeek across nearly every benchmark.
The report is clear.
DeepSeek lags far behind, especially in cyber and software engineering.
These weaknesses aren't just technical.
They demonstrate why relying on foreign
AI is dangerous and short-sighted. Allowing our adversaries to control AI poses serious risks
to our security. By setting the standards, driving innovation, and keeping America secure,
the Department of Commerce is helping ensure continued U.S. leadership in AI.
Now, in addition to lack of performance, NIST found that DeepSeek was more expensive than
comparable U.S. models.
They claimed that one of the U.S. models used as a reference cost 35% less on average to complete
the 13 performance benchmarks that were tested.
NIST also found that DeepSeek's models were 12 times more likely than U.S. frontier models to
follow malicious instructions designed to derail them from the desired task.
NIST wrote, hijacked agents sent phishing emails, downloaded and ran malware, and exfiltrated
user login credentials, all in a simulated environment. DeepSeek's models were also far more
susceptible to jailbreaking, responding to 94% of malicious requests when a common jailbreaking
technique was used, compared to just 8% for U.S. reference models.
NIST found that DeepSeek's models echoed four times as many inaccurate and misleading CCP
narratives as U.S. models, and finally and perhaps most concerningly, they found,
adoption of Chinese models has greatly increased since DeepSeek R1 was released.
They anchored this claim on the statistic that downloads of DeepSeek models are up
1,000% since January of this year.
Now, some people were a little skeptical of this,
but mostly it was just a Rorschach test for wherever your politics happened to already be.
How much it changes any sort of AI policy, I'm a little bit more skeptical of.
Next up, a little in the AI hardware realm,
Apple has scrapped plans to iterate on the Vision Pro and will shift focus to developing AI smart glasses.
It seems that Meta's Ray-Bans were just
too compelling, forcing Apple to scuttle their existing plans. Bloomberg's Apple
specialist Mark Gurman reports that Apple had plans to develop a cheaper, lightweight version of
the Vision Pro, which was on track for release in 2027.
Sources said that last week, however, Apple announced internally that staff would be pulled
off that project to accelerate work on smart glasses instead.
Plans are reportedly to work on two different versions of the glasses product.
A cheaper version dubbed the N50 will compete with the original Meta Ray-Bans, and a higher
spec version will include a display to go up against the newly released Meta Ray-Ban Display.
The N50 is expected to be ready for unveiling next year ahead of a 2027 release,
while the display version isn't expected until 2028.
The design for Apple's glasses seems to stick close so far to Meta's product line.
Apple plans to use voice controls and integrated AI as the core interface.
The glasses will also feature speakers for music and cameras for media recording,
and Apple is also reportedly exploring integrating health tracking capabilities for the device as well.
The reporting also seems to suggest that Apple has come to consider it
a mistake to have tried to sell a $3,500 consumer device that's not comfortable enough to use for long
periods of time. Writes Gurman, Apple executives have acknowledged the product's shortcomings in private,
viewing it as an over-engineered piece of technology. The game's not over, but it certainly
puts more evidence in the column that smart glasses, for the moment, are at the very tippy top of
the format war for AI devices. Speaking of devices, Amazon has released a new line of smart
devices built specifically for their AI assistant Alexa Plus. The entire range of Echo smart
speakers has been upgraded with new custom silicon featuring an AI accelerator. This will allow
the new devices to provide local inference for Amazon's AI models, and the devices will also include
a new custom sensor platform designed to make ambient AI feel more natural. The sensors include
cameras, audio, ultrasound, Wi-Fi radar, and an accelerometer. Basically, everything imaginable
to make the AI model aware of its surroundings. The goal is to make interactions with the ambient
AI feel more natural and responsive. Some of the devices enable new AI features across Amazon's
smart home systems. The latest Ring cameras now have facial recognition technology that can keep
track of friends and family and distinguish them from lurking strangers. Another feature called
Search Party allows users to send out an alert across networked cameras in a neighborhood to search
for a lost pet. This is the first big product refresh since Amazon hired product chief
Panos Panay away from Microsoft back in 2023. Panay told Bloomberg, my belief is that our
job is to make devices the next big business at Amazon. And AI is very clearly right at the core of the
strategy, which Panay articulated as great products made even better through ambient AI.
Lastly today, a relevant one given all the dust-up around Sora 2 and the presumption that
all of this is leading to ads in ChatGPT: Meta has crossed the Rubicon and will start to target
ads based on users' AI chats. Meta announced a change to their recommendation system on Wednesday
that will see AI interactions used to personalize content and advertising delivery across their apps.
Meta gave the example of a user asking their chatbot for nearby hiking recommendations.
The user might then be served hiking-related content and ads for hiking boots or other gear.
Users won't be able to opt out, but sensitive topics will be automatically excluded.
These include politics, religion, sexual orientation, health, and racial origin.
The change will go live in December, and the policy won't apply in the UK,
Europe and South Korea due to their stricter tech privacy rules,
although Meta plans a compliant rollout at a later date.
Christy Harris, the privacy policy manager at Meta said,
People's interactions simply are going to be another piece of the input
that will inform the personalization of feeds and ads.
We're still in the process of building the first offerings
that will make use of this data.
Look, it feels pretty inevitable that ads are going to be a part of the AI landscape.
I think to some extent the question will be,
for people who are already paying for subscriptions
are they going to have to deal with ads as well?
What sort of privacy controls will there be?
All of these questions remain to be seen,
but it is not surprising that we are getting to this point.
Frankly, it's just surprising that it's taken this long.
For now, that's going to do it for today's headlines.
Next up, the main episode.
Chatbots are great, but they can only take you so far.
I've recently been testing Notion's new AI agents,
and they are a very different type of experience.
These are agents that actually complete entire workflows for you in your style,
and best of all, they work in a channel that you already know and love
because they are purpose-built Notion super users.
Notion's new AI agents completely expand the range of what Notion can do.
It can now build documents from your entire company's knowledge base,
organize scattered information into organized reports,
basically do tasks that used to take days and get them complete in minutes.
These agents don't just help with work, they finish it.
Getting started with building on Notion is easier than ever.
Notion agents are now your very own super user to help you onboard in minutes.
Your AI teammates are ready to work.
Try Notion AI for free at the link in our show notes.
This episode is brought to you by Blitzy,
the enterprise autonomous software development platform with infinite code
context. Blitzy uses thousands of specialized AI agents that think for hours to understand
enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every
development sprint with the Blitzy platform bringing in their development requirements. The
Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers
80% plus of the development work autonomously while providing a guide for the final 20% of human
development work required to complete the sprint. Public companies are achieving a 5x engineering
velocity increase when incorporating Blitzy as their pre-IDE development tool,
pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org.
Blitzy is providing a limited time, 30-day free proof of concept for qualifying enterprises.
The team will provide a 5x velocity increase on a real development project in your org.
Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI-assisted
to AI-native. That's BLITZY.com.
As a consultant, responding to proposals can often feel like playing tennis against a wall.
You're serving against yourself trying to guess what the client really wants.
That all changes with the Insightwise proposals platform.
Now you've got an AI coach that thinks just like your client.
It returns to the brief time and time again, identifying opportunities, showcasing your track record,
and making recommendations to improve your pitch.
Suddenly you're on center court, but this time you've got a secret weapon.
Insightwise does away with all the time-consuming manual work,
so you can focus on winning more business more often.
Generate reports, pull insights from your own data,
build competitive advantage, and go to sleep before 2 a.m.
When it comes to proposals, you only get one shot.
With Insightwise, make yours an ace.
AI isn't a one-off project.
It's a partnership that has to evolve as the technology does.
Robots and Pencils works side-by-side with clients
to bring practical AI into every phase:
automation, personalization, decision support, and optimization.
They prove what works through applied experimentation and build systems that amplify human potential.
As an AWS-certified partner with global delivery centers, Robots and Pencils combines reach with
high-touch service. Where others hand off, they stay engaged, because partnership isn't a project
plan. It's a commitment. As AI advances, so will their solutions. That's long-term value.
Progress starts with the right partner. Start with Robots and Pencils at robotsandpencils.com
slash AI Daily Brief.
Welcome back to the AI Daily Brief.
Today we are discussing AI and scientific discovery.
And of course, at least part of the context has to be the launch of Sora 2.
We talked extensively about this in yesterday's episode, but if you had to take just one meme
that best summed up the enfranchised critique, let's say, of OpenAI's announcement, not just of
the Sora 2 model, but of the Sora app to go with it, it was really a critique of what
OpenAI is choosing to spend its time on. One example of this was Rudder Tushar who said,
Sam Altman two weeks ago, we need $7 trillion and 10 gigawatts to cure cancer. Sam Altman today,
we are launching AI slop videos marketed as personalized ads. Now, there was a lot of that going
around, enough that it clearly got under Sam Altman's skin. In fact, he responded to that one
saying, I get the vibe here, but we do mostly need the capital for building AI that can do
science. And for sure, we are focused on AGI with almost all of our research effort.
It is also nice to show people new cool tech products along the way, make them smile, and hopefully make some money given all that compute need.
When we launched ChatGPT, there was a lot of who needs this and where is AGI.
Reality is nuanced when it comes to optimal trajectories for a company.
So let's hold aside Sam's response and whether you think it's legitimate or not.
The point is that a lot of people are saying, we were promised AGI and scientific discovery and we got another social media app.
It harkens back to the famous Founders Fund manifesto where they wrote, we wanted flying cars,
instead we got 140 characters. It's this sense that there is some deeper value to big, massive machines
and inventions and discoveries, as opposed to just social media attention sucks.
This question has actually been lurking around the open AI space for a lot of 2025.
One of the big stories of this year, of course, was Mark Zuckerberg parading around the valley,
trying to poach engineers to stock his superintelligence lab.
And while a lot of that effort was very successful and a lot of incredibly talented people came over to Meta,
there were some who turned down even extremely generous compensation packages,
and a lot of the scuttlebutt and buzz around AI circles
was that those folks just weren't willing to risk
that ultimately what they were going to have to spend all of their brain power
and all of Meta's compute on was improving ad click-through rates.
Now, one of the things that was interesting about this conversation yesterday
as it happened alongside Sora
was that we didn't just have to speculate
that some number of researchers and top minds
weren't going to be interested in that sort of work.
We actually had an example of researchers who had left OpenAI and Google and Meta
to build something that had a much heavier scientific discovery type of focus.
The company is called Periodic Labs.
Its goal is to use and build AI that can actually accelerate discovery
in fields beyond computer science, such as physics and chemistry.
And its announcement was very explicitly positioned
as being about AI researchers getting sick of working on consumer AI
and moving on to a higher purpose. Indeed, the New York Times article about Periodic Labs starts with one of
these stories. They write, this summer Mark Zuckerberg invited Rishabh Agarwal to join the company's new
AI lab, offering him millions of dollars in stock and salary. With the new lab, Zuckerberg said he wanted
to build superintelligence, a technology that could eclipse the power of the human brain. Though no one
knew how to create superintelligence, he urged Dr. Agarwal to make a leap of faith. In a world that is changing
fast, Zuckerberg told him, the biggest risk you can take is not taking any risk. But although
Dr. Agarwal was already a Meta employee, he turned down the offer to join another company.
That company is, of course, Periodic Labs.
In discussing Periodic's goals, founder Liam Fedus said, the main objective of AI is not to
automate white-collar work. The main objective is to accelerate science. Now, we talk in this
show all the time about the difference between efficiency AI and opportunity AI, and this is,
of course, in the context of enterprises deploying AI at work. The idea of efficiency AI is thinking
about AI simply as a way to do what is currently done, but faster, cheaper, or maybe better,
but still doing the same thing. There's nothing wrong with efficiency AI. People should leverage
those gains. They're going to become table stakes. But the real opportunity, and what will differentiate
companies, I have always said and believed, is thinking about AI as a new opportunity
technology, a technology, in other words, that opens up things that weren't possible before.
The founders of Periodic Labs are taking a similar view and have built their company around
the premise of answering how to actually make that real
and figure out a gap in the market that opens up that possibility.
So what does Periodic Labs do?
Simply put, their goal is to accelerate science.
In their announcement post, they wrote,
Our goal is to create an AI scientist.
Science works by conjecturing how the world might be,
running experiments and learning from the results.
Intelligence is necessary but not sufficient.
New knowledge is created when ideas are found to be consistent with reality.
And so at Periodic, we are building AI scientists
and the autonomous laboratories for them to operate.
And really, at the core of what they think is missing is that while, yes, current models have read
everything that's available, ultimately to make new discoveries you need practical application
and experimentation.
As they put it, as any scientist knows, though rereading a textbook may give new insights,
they eventually need to try their idea to see if it holds.
So basically what they want to do is connect the dots between human researchers, AI agent
experiment designers, and autonomous and robotic labs where those experiments can be conducted.
The way The Neuron framed it was this.
Human evaluators initiate, an AI agent designs experiments, the robotic lab executes them,
and nature itself provides a reward signal: did the experiment work? From there, the data improves the models.
Basically taking the scientific method and AIifying it.
The company is starting with physical sciences, because they say physics is a verifiable environment.
They note that AI has progressed fastest in domains with data and verifiable results.
And this is what they mean when they say that nature is the reinforcement learning environment.
Now, part of the strategy is to collaborate with industry right from the get-go.
For example, they are already working with a semiconductor manufacturer on issues around heat
dissipation on their chips.
Now, as part of the coming out party this week, Periodic Labs announced that it had raised
over $300 million in seed funding from a who's who of investors including Andreessen Horowitz,
Accel, Nvidia, Jeff Bezos, and many, many more.
One of their investors, Bain Capital Ventures, wrote the rare investment announcement post
that uses historical analogy well.
They begin with an exploration of the difference in science before
the telescope and after the telescope. They write,
The history of science is full of similar examples in which technological progress
enables the invention of new scientific instruments, which in turn leads to new scientific discovery.
Basically, that there is a relationship between the discoveries of science and the technology
of science, that they are mutually reinforcing in a positive feedback cycle.
The point is this. Galileo, they write, had the newly invented telescope.
We have newly developed AI systems. What can we see now that we couldn't before?
To the extent there was concern around the sloppification of AI with Sora and before it the Meta Vibes app,
the Periodic announcement saw almost the inverse excitement, with just an incredible amount of enthusiasm,
not just from the AI industry, but from many different parts of academia, science, research, and beyond.
Now, one of the other comparison points in that August article about turning down Zuckerberg's offers
was all of the people who were leaving to work with former OpenAI CTO Mira Murati at her new startup Thinking Machines Lab.
And while Thinking Machines isn't as aggressively self-styled as a place for physical scientific
research as Periodic Labs, they are very self-consciously trying to take a different approach
to spreading and expanding knowledge around AI as opposed to the other labs.
And right as OpenAI was announcing Sora 2, we also got the first product release from them.
The product is called Tinker, and it's an API for training and fine-tuning custom models.
Essentially, it's an AI infrastructure as a service, and it's meant to reduce the barrier to
entry for model training substantially. Thinking Machines Lab provides the GPU cluster and the software
stack, leaving customers to focus on training data and model design. Murati posted, Tinker brings
frontier tools to researchers offering clean abstractions for writing experiments and training pipelines
while handling distributed training complexity. It enables novel research, custom models, and solid
baselines. In comments to Wired, she added, we believe Tinker will help empower researchers and
developers to experiment with models and will make frontier capabilities much more accessible to all
people. We're making what is otherwise a frontier capability accessible to all, and that is
completely game-changing. There are a ton of smart people out there, and we need as many smart people
as possible to do frontier AI research. Wrote Thinking Machines scientist John Schulman, Tinker provides
an abstraction layer that is the right one for post-training R&D. It's the infrastructure
I've always wanted. So the goal here is very much a democratization of frontier AI research.
They're looking to help speed up innovation by enabling researchers or startups to test their ideas
in days instead of weeks or months. They're trying to level the playing field, making it possible
for smaller labs, universities, or even individuals to do meaningful AI work without billion-dollar budgets.
They're looking for a path to make AI models more useful while keeping costs down. And certainly,
while there are powerful benefits from a broad research and democratization standpoint, there is also
a ton to be excited about here for enterprises. In the first wave, right after ChatGPT,
a lot of enterprises tried to train their own models, mostly to discover the bitter lesson,
that generalist models simply outperformed their limited amounts of data, even if it was data
that was contextual to their firm. Bloomberg was one big example of this. However, there's still
a lot of interest in custom-trained layers for models that take advantage of proprietary and non-public
data, and Tinker opens up some great possibilities on that front. Basically, if it works as promised,
they will be able to fine-tune models on their corpus of data without big infrastructure and
an extensive AI engineering team. What's more, it opens up the possibility that individual teams
within an enterprise could actually think about this type of custom training rather than just having
to wait for what the central AI group does. There's no reason, for example, that a marketing
analytics team, especially if they had the right support, couldn't actually go explore that
kind of customization, again, without having to get in line and wait for whatever other types of big
AI infrastructure projects are going on across the company. Now, how much Thinking Machines is going
to care about that enterprise use case remains to be seen, but it's certainly valuable.
The first reactions are extremely positive.
Teknium, the co-founder of distributed training collective Nous Research, wrote,
I had the privilege of being part of the beta for Tinker.
It's a really nice project. Simple APIs for training models can make it a lot easier
to properly leverage resources to get results.
UC Berkeley PhD student Tyler Griggs writes very hackable and lifts a lot of the LLM training
burden, a great fit for researchers who want to focus on algs and data, not infra.
OpenAI co-founder and vibe coding terminology coiner Andrej Karpathy writes,
Tinker is cool.
If you're a researcher or developer, Tinker dramatically
simplifies LLM post-training. You retain 90% of algorithmic creative control, while Tinker handles
the hard parts that you usually want to touch much less often, meaning you can do these at well below 10%
of typical complexity involved. Compared to the more common and existing paradigm of upload your data,
we'll post-train your LLM, this is, in my opinion, a more clever place to slice up the complexity
of post-training, both delegating the heavy lifting, but also keeping majority of the data and
algorithmic creative control. GDP at Amazon actually drew the connection between Thinking Machines and Periodic Labs.
They write, Thinking Machines' vision: the whole world as an AI/RL-powered lab.
In that sense, it's similar to that of Periodic Labs, but possibly much more expansive.
Periodic Labs will be the equivalent of Bell Labs of today's world,
and they will conduct reinforcement learning on the basis of real-world feedback,
e.g. experiments in material science. We need that, and I am rooting for their success.
Thinking Machines aims to treat the whole world as an AI/RL-powered lab.
Currently, AI training, and particularly RL, is out of reach of most engineers and organizations.
It is considered achievable only for a handful of labs with big egos.
Also, big labs cannot collect real-world feedback data beyond a point.
The real world provides the best reinforcement learning rollout data.
For example, the kind of feedback Cursor gets when users either accept or reject suggestions
by the Cursor Tab model.
Customer support conversations and action trajectories that leave users delighted or disappointed.
Factory floor decision-making data with outcomes.
With such diverse data not accessible to the big labs as it is not on the internet,
the models can be made much more intelligent.
There is actually an entire show that I have had half formed on the back burner about the whole
world as a reinforcement learning lab and how much the labs are orienting towards that being the future
of model development.
But you kind of get a little bit of a taste of it here.
And what's exciting is that even if it isn't making as many headlines, this AI-driven science
is not just something for the future.
It seems to be moving forward right now.
Over the summer, there were multiple reports of frontier models doing novel mathematics,
for example.
OpenAI chief product officer Kevin Weil was so enthused by what he saw from
scientists working alongside GPT-5 that he's now incubating a division called OpenAI for Science.
Earlier this week, an MIT student going by Asher posted, if I had a nickel for every
MIT professor who told me GPT-5 made a novel research discovery in the past week, I'd have two
nickels, which isn't a lot, but it's strange that it happened twice. He elaborated that one of the
breakthroughs was in biology and the other in math. Now one can be skeptical of Twitter hype
posting, but it does seem to be reflected in many people's experience. Sam Altman even reposted
it saying does feel like this is really starting to happen in tiny ways.
Pryn summed up, probably the most surprising development in AI for me over the past six months
is that GPT-5 Pro and even GPT-5 Thinking can make very small novel scientific discoveries.
As a reminder, these are models that think for under 40 minutes and are not nearly as advanced
as OpenAI's unreleased multi-agentic models, which we know can work autonomously for hours.
It's quite easy to see why OpenAI decided to launch the OpenAI for Science Initiative now,
presumably just a few weeks before the model that won gold on the IMO, IOI, and the ICPC becomes
available to the public.
Exciting times.
Now look, I do think that OpenAI has a bit of a communication problem right now, where if Sam is
sincere, when he basically implies that the revenue from social media style applications
is a relevant part of their plan to get to AGI, they've got to connect those dots explicitly
because otherwise people are just going to do it for them.
If they articulate why ads matter, people can disagree, but at least they won't be speculating.
But still, for those of you who are disappointed by the social media orientation of the AI labs,
just know that there is a heck of a lot out there happening that is about insanely ambitious,
world-changing scientific research.
And it's not far off in the future.
It's happening right now.
I'll continue to try to cover as much of that as I can as it comes up.
For now, that's going to do it for today's AI Daily Brief.
Appreciate you listening or watching.
As always, and until next time, peace.
