Y Combinator Startup Podcast - François Chollet: The ARC Prize & How We Get to AGI
Episode Date: July 3, 2025
François Chollet on June 16, 2025 at AI Startup School in San Francisco.
François Chollet is a leading voice in AI. He's the creator of the Keras library, author of Deep Learning with Python, and the founder of the ARC Prize, a global competition aimed at measuring true general intelligence.
He's spent years thinking deeply about what intelligence actually is, and why scaling up today’s AI models isn’t enough to reach it.
In this talk, he walks through the limits of pretraining and memorized skills, and lays out a path toward true general intelligence: AI that can adapt on the fly, reason in new situations, and invent novel solutions. He explains why abstraction and compositionality matter, how ARC became the benchmark for progress, and what his team at a new research lab called Ndea is building next.
Transcript
Hi everyone, I'm Francois.
I'm super excited to share with you some of my ideas about AGI and how we're going to get there.
This chart right there is one of the most important facts about the world.
The cost of compute has been consistently falling by two orders of magnitude every decade since 1940.
There's no sign that this is stopping anytime soon.
And in AI, compute and data have long been the primary bottleneck to what we could achieve.
And in the 2010s, as you all know,
with the abundance of GPU-based compute and large datasets,
deep learning really started to work.
And all of a sudden, we're making fast progress
on problems that had long seemed intractable
across computer vision and natural language processing.
And in particular, self-supervised text modeling started to work.
And the dominant paradigm of AI became scaling up
LLM pre-training.
And this approach was crushing almost all benchmarks.
And remarkably, it was getting predictably better benchmark results
as we scaled up model size and training data size
with the exact same architecture and the exact same training process.
That's the scaling laws that Jared told you about a few minutes ago.
So it really seemed like we had it all figured out.
And many people extrapolated that more scale was all that was needed
to solve everything and get to AGI.
Our field became obsessed with the idea that
general intelligence would spontaneously emerge by cramming more and more data into bigger and bigger models.
But there was one problem.
We were confused about what these benchmarks really meant.
There's a big difference between memorized skills, which are static and task-specific,
and fluid general intelligence, the ability to understand something you've never seen before on the fly.
And back in 2019, before the rise of LLMs,
I released an AI benchmark to highlight this difference.
It's called the Abstraction and Reasoning Corpus, or ARC-1.
And from that time back in 2019 to now, with a model like GPT-4.5 for instance, there's
been a roughly 50,000x scale-up of base LLMs.
And we went from zero percent accuracy on that benchmark to roughly 10 percent, which is
not a lot. It's very close to zero,
if you take into account the fact that any one of you
in this room would score well above 95%.
So to crack general fluid intelligence,
it turns out we need new ideas
beyond just scaling up pre-training
and doing static inference.
This benchmark was not about regurgitating memorized skills.
It was really about making sense of a new problem
that you've never seen before on the fly.
But then last year, in 2024, everything changed.
The AI research community started pivoting
to a new and very different paradigm:
test-time adaptation, creating models
that could change their own state at test time
to adapt to something new.
So this wasn't about carrying preloaded knowledge anymore.
It was really about the ability to learn and adapt
at inference time.
And suddenly, we started seeing significant progress on ARC.
So finally, we had AI that was showing genuine signs
of fluid intelligence.
So in particular, in December last year, OpenAI previewed its o3 model, and they used a version
of it that was fine-tuned specifically on ARC, and that showed human-level performance on
that benchmark for the first time.
And today in 2025, we have solidly moved on from the pre-training scaling paradigm,
and we are now fully in the era of test-time adaptation.
So test-time adaptation is all about the ability of a model to modify its own behavior dynamically,
based on the specific data it encounters during inference.
So that covers techniques like test-time training,
program synthesis, chain-of-thought synthesis,
where the model tries to reprogram itself for the task at hand.
And today, every single AI approach that performs well on ARC
is using one of these techniques.
So today, I want to answer the following questions.
First, why did the pre-training scaling paradigm
not get us to AGI?
If you look back just two years ago,
this was the standard dogma,
everybody was saying this.
And today, almost no one believes this anymore.
So what happened?
And next, does test-time adaptation get us to AGI this time?
And if that's the case, maybe AGI is already here.
Some people believe so.
And finally, besides test-time adaptation,
what else might be next for AI?
And to answer these questions, we have to go back to a more fundamental question.
What is even intelligence?
What do we mean when we say we're trying to build AI?
If you look back over the past decades, there have been two lines of thought to define intelligence and to define the goals of AI.
There's the Minsky-style view:
AI is about making machines that are capable of performing tasks that would normally be done by humans.
And this echoes very closely the current mainstream corporate view that AGI would be a model that could perform most economically valuable tasks.
Like 80% is often quoted as the number.
But then there's the McCarthy view that AI is about getting machines to handle problems they have not been prepared for.
It's about getting AI to deal with something new.
And my view is more like the McCarthy view.
Intelligence is a process, and skill is the output of that process.
So skill itself is not intelligence, and displaying skill at any number of tasks does not show intelligence.
This is like the difference between a road network and a road building company.
If you have a road network, then you can go from A to B for a specific, predefined set of A's and B's.
But if you have a road building company, then you can start connecting new A's, new B's,
on the fly as your needs evolve.
So intelligence is the ability to deal with new situations.
It's the ability to blaze fresh trails and build new roads.
So attributing intelligence to a crystallized behavior program,
a skill program, is a category error.
You are confusing the process and its output.
So don't confuse the road and the process that created the road.
So to formalize this a bit,
I see intelligence as the conversion ratio between the information you have, mostly your past experience, but also any developer-imparted priors that the system might have, and your operational area over the space of potential future situations that you might encounter.
And that's going to feature high novelty and uncertainty.
So intelligence is the efficiency with which you operationalize past information in order to deal with the future.
It's an efficiency ratio.
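One loose way to write this down is shown below. It is an editorial paraphrase of the spoken definition, not a formula given in the talk, and the labels in the fraction are assumptions introduced here.

```latex
\[
\text{intelligence} \;\sim\;
\frac{\text{operational area over future situations}}
     {\text{priors} + \text{experience}}
\]
```

Under this reading, for a fixed operational area, a system that needed fewer priors and less experience to get there comes out as more intelligent.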
And that's the reason why using exam-like benchmarks to evaluate AI models is a bad idea.
They're not going to tell you how close we are to AGI.
Because human exams weren't designed to measure intelligence.
They were designed to measure task-specific skill and knowledge.
They were designed according to assumptions that are sensible for humans, but not for machines.
Like, for instance, most exams assume that you haven't read and memorized all
the exam questions and the answers beforehand.
So if you want to rigorously define and measure intelligence,
here are some key concepts that you have to take into account.
The first is the distinction between static skills
and fluid intelligence.
So between having access to a collection of static programs
to solve known problems versus being able to synthesize
brand new programs on the fly to face a problem
you've never seen before.
And of course, it's not a binary, it's not one or the other,
there's a spectrum between the two.
The second concept is operational area for a given skill.
There's a big difference between being skilled
only in situations that are very close to what you've seen before
and being skilled in any situation within a very broad scope.
For instance, if you know how to drive,
you should be able to drive in any city,
not just in a specific geo-fenced area.
You can learn to drive in San Jose,
then move to Sacramento, and you can still drive.
Right?
So again, there's a spectrum there.
It's not binary.
And lastly, you should look at information efficiency.
For a given skill, how much information,
how much data, how much practice did you need
to acquire that skill?
And of course, higher information efficiency
means higher intelligence.
And the reason these definitions matter a lot
is that as engineers, we can only build what we measure.
So the way we define
and measure intelligence is not a technical detail.
It really reflects our understanding of the problem of cognition.
It scopes out the questions we're going to be asking,
and so it determines the answers that we're going to be getting.
It's the feedback signal that drives us towards our goals.
And a phenomenon you see constantly in engineering is the shortcut rule.
It's the fact that when you focus on achieving a single measure of success,
you may succeed, but you will do that at the expense of everything else
that was not captured by your measure.
So you hit the target, but you miss the point.
And you see this all the time on Kaggle, for instance.
We saw it with the Netflix Prize,
where the winning system was extremely accurate,
but it was way too complex to ever be used in production.
So it ended up never being used.
It was effectively pointless.
We also saw it in AI with chess-playing AI.
The reason the AI community set out to create programs that could play chess back in the 70s
was because people expected this would teach us about human intelligence.
And then a couple of decades later, we achieved the goal when Deep Blue beat Kasparov, the world champion.
And in the process, we had really learned nothing about intelligence.
So you hit the target, but you miss the point.
And for decades, AI has chased task-specific skill,
because that was our definition of intelligence.
But this definition only leads to automation,
which is exactly the kind of system that we have today.
But we actually want AI that's capable of autonomous invention.
We don't want to stop at automating known tasks.
We want AI that could tackle humanity's most difficult challenges
and accelerate scientific progress.
That's what AGI is meant to be.
And to achieve that, we need a new target.
We need to start targeting fluid intelligence itself,
the ability to adapt and invent.
So one definition of AGI only unlocks automation.
So it increases economic productivity.
Obviously, it's extremely valuable.
Maybe it also increases unemployment.
But the other definition unlocks invention
and the acceleration of the timeline of science.
And it's by measuring what you really care about
that we'll be able to make progress.
So we need a better target, we need a better feedback signal.
What does that look like?
My first attempt at creating a way to measure intelligence in AI systems was the ARC-AGI
benchmark.
So I released ARC-1 back in 2019.
It's like an IQ test for machines and also humans.
So ARC-1 contains 1,000 tasks like this one here.
And each task is unique.
That means that you cannot cram for ARC.
You have to figure out each task on the fly by using your general intelligence rather
than your memorized knowledge.
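Since the slide with the example task is not part of the transcript, here is a small editorial sketch of what an ARC-style task looks like as data: a few demonstration input/output grid pairs plus a test input, where the solver has to infer the transformation from the demonstrations alone. The grids, the `flip_horizontal` rule, and the helper names are made up for illustration; the `train`/`test` layout is loosely modeled on the public ARC task files.

```python
# Grids are lists of rows; each cell is an integer color code (0 = background).
task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[0, 0], [2, 0]], "output": [[0, 0], [0, 2]]},
    ],
    "test": [{"input": [[3, 0], [0, 4]]}],
}

def flip_horizontal(grid):
    """Candidate program: mirror every row left to right."""
    return [list(reversed(row)) for row in grid]

def fits_all_demos(program, task):
    """A candidate is accepted only if it reproduces every demonstration exactly."""
    return all(program(pair["input"]) == pair["output"] for pair in task["train"])

solved = fits_all_demos(flip_horizontal, task)
prediction = flip_horizontal(task["test"][0]["input"])   # [[0, 3], [4, 0]]
```

Each real task uses a different hidden rule, which is why a rule memorized from one task does not transfer to the next.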
And of course, solving any problem always requires some knowledge.
And in the case of most benchmarks, the knowledge priors that you need are typically left
implicit.
In the case of ARC, we made them explicit.
So all ARC tasks are built entirely on top of core knowledge priors, which are things
like objectness, elementary physics, basic geometry, topology, counting.
So concepts that any four-year-old child has already mastered.
And solving ARC requires very little knowledge, and it's knowledge that is very much not specialized.
So you don't need to prepare for ARC in order to solve it.
What makes ARC unique is that you cannot solve it purely by memorizing patterns.
It really requires you to demonstrate fluid intelligence.
And meanwhile, pretty much every other benchmark out there is targeting fixed, known tasks,
so they can actually be solved or hacked via memorization alone.
That's what makes ARC fairly easy for humans, but very challenging for AI.
And when you see a problem like this, where a human child can perform really well, but the
most advanced, the most sophisticated AI models struggle, that's like a big red flashing
light telling you that we're missing something, that new ideas are needed.
One thing I want you to keep in mind is that ARC is not going to tell you whether a system is already
AGI or not. That's not its purpose.
ARC is really a tool to direct the attention of the research community towards what we see
as the most important unsolved bottlenecks on the way to AGI.
So ARC is not the destination.
And solving ARC is not the goal.
ARC is really just an arrow pointing in the right direction.
And ARC has completely resisted the pre-training scaling paradigm.
Even after a 50,000x scale-up of pre-trained base LLMs, their performance
on ARC stayed near zero.
So we can decisively conclude that fluid intelligence does not emerge from scaling up pre-training.
You absolutely need test-time adaptation in order to demonstrate genuine fluid intelligence.
And importantly, when the arrival of test-time adaptation happened last year, ARC was really
the only benchmark at the time that provided a clear signal about the profound shift that
was happening.
Other benchmarks were saturated.
So they could not distinguish between a true IQ increase
and just brute force scaling.
So now you see this graph and you're probably asking,
well, clearly at this point, ARC-1 is also saturating.
So does that mean we have human level AI now?
Well, not yet.
What you see on this graph is that ARC-1 was a binary test.
It was a minimal reproduction of fluid intelligence.
So it only really gives you two possible modes.
Either you have no fluid intelligence,
in which case you will score near zero,
like base LLMs,
or you have non-zero fluid intelligence,
in which case you will instantly score very high,
like the o3 model from OpenAI, for instance.
And of course, every one of you in this room
would score within noise distance of 100%.
So ARC-1 saturates
way below human-level fluid intelligence.
And so now we are
in need of a better tool, a more sensitive tool,
that would provide more useful bandwidth
and better comparison with human intelligence.
And that tool is ARC-AGI-2, which we
released in March this year.
So back in 2019, ARC-1 was meant to challenge
the deep learning paradigm, where models are big parametric curves
used for static inference.
And today, ARC-2 challenges reasoning systems.
It challenges the test-time adaptation paradigm.
The benchmark format is still the same,
but there's a much greater focus on probing compositional generalization.
So the tasks are still very feasible for humans,
but they're much more sophisticated.
And as a result, ARC-2 is not easily brute-forceable.
In practice, what this means is that in ARC-1, for many tasks,
you could just look at it and instantly see the solution,
without having to think too much about it.
With ARC-2, all tasks require some level
of deliberate thinking.
But they still remain very feasible for humans.
And we know this because we tested 400 people firsthand, in person, in San Diego over several days.
And we are not talking about people who have physics PhDs here.
We recruited random folks: Uber drivers, UCSD students, people who are unemployed.
So basically anyone trying to make some money on the side.
And all tasks in ARC-2 were solved by at least two of the people that saw
them, and each task was seen on average by about seven people.
And so what that tells you is that a group of 10 random people with majority voting
would score 100% on ARC-2.
So we know these tasks are completely doable by regular folks with no prior training.
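For readers unfamiliar with the aggregation being described, here is a minimal editorial sketch of majority voting over a panel's answers. The answer labels and the panel data are hypothetical placeholders (real ARC answers are output grids), and nothing here reproduces the actual human study.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer; ties go to the answer that appeared first."""
    return Counter(answers).most_common(1)[0][0]

def panel_accuracy(answers_per_task, solutions):
    """Fraction of tasks where the panel's majority answer matches the solution."""
    hits = sum(majority_vote(a) == s for a, s in zip(answers_per_task, solutions))
    return hits / len(solutions)

# Hypothetical answers from five people on three tasks.
answers_per_task = [
    ["A", "A", "B", "A", "C"],
    ["D", "E", "D", "D", "D"],
    ["F", "G", "F", "H", "F"],
]
solutions = ["A", "D", "F"]

accuracy = panel_accuracy(answers_per_task, solutions)   # 1.0 for this made-up panel
```

Individual panel members make mistakes on every task here, yet the aggregated answer is correct each time, which is the effect the majority-voting figure relies on.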
So how well do AI models do?
Well, if you take base LLMs, models like GPT-4.5 or Llama 4, it's simple:
they get 0%.
There is simply no way to do these tasks
purely via memorization.
Next, if you look at static reasoning systems,
so systems that use a single chain of thought
that they generate for the task,
they don't do much better.
They do on the order of 1 to 2%.
So very much within noise distance of 0.
So what it tells you is that to solve ARC-2,
you really need test-time adaptation.
All systems that do meaningfully above 0
are using TTA.
But even then, they're still far below human level.
So compared to ARC-1, ARC-2 enables much more granular evaluation of TTA systems, systems
like o3 for instance.
And that's where you see that o3 and other systems like it are still not yet quite human
level.
And in my view, as long as it's easy to come up with tasks that any one of you can do, that
are easy for humans, but that AI cannot figure out, no matter how much compute you
throw at it, we don't have AGI yet.
And you will know that we are close to having AGI
when it becomes increasingly difficult to come up with such tasks.
We are clearly not there yet.
And to be clear, I don't think ARC-2 is the final test.
We're not going to stop at ARC-2.
We've started development on ARC-AGI-3.
And ARC-3 is a significant departure
from the input-output pair format of ARC-1 and 2.
We are assessing agency, the ability to explore,
to learn interactively, to set goals, achieve goals
autonomously.
So your AI is dropped into a brand new environment
where it doesn't know what the controls do.
It doesn't know what the goal is.
It doesn't know what the gameplay mechanics are.
It has to figure out everything on the fly,
starting with what it is even supposed to do in the game.
And every single game is entirely unique.
They're all built on top of core knowledge priors only,
just like in ARC-1 and 2.
So we'll have hundreds of interactive reasoning tasks
like this one.
And efficiency is central to the design of ARC-3.
So models won't just be graded on whether they can solve a task,
but on how efficiently they solve it.
And we are establishing a strict limit on the number of actions
that a model can take.
And we are targeting the same level of action efficiency
as we observe in humans.
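To make the idea of grading on efficiency concrete, here is a purely hypothetical scoring sketch. The talk does not specify how ARC-3 will be scored, so the function, the ratio it computes, the human baseline, and the action budget below are all assumptions made for illustration.

```python
def action_efficiency(solved, model_actions, human_actions, max_actions):
    """Hypothetical score in [0, 1].

    Returns 0 if the task is unsolved or the action budget is exceeded;
    otherwise the ratio of human to model action counts, capped at 1.
    """
    if not solved or model_actions <= 0 or model_actions > max_actions:
        return 0.0
    return min(1.0, human_actions / model_actions)

human_baseline = 40   # assumed typical human action count for one game
budget = 200          # assumed hard cap on actions

scores = [
    action_efficiency(True, 50, human_baseline, budget),    # solved, slightly less efficient than humans
    action_efficiency(True, 400, human_baseline, budget),   # solved, but far over budget
    action_efficiency(False, 30, human_baseline, budget),   # gave up without solving
]
```

A scheme of this shape rewards solving a game in roughly as few actions as a person would, rather than rewarding brute-force exploration that eventually stumbles on the goal.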
So we're going to launch this in
early 2026, and next month, in July,
we're gonna release a developer preview
so you can start playing with it.
What's it gonna take to solve ARC-2,
and we're still very far from it today,
then solve ARC-3, and we're even further away from that,
maybe in the future solve ARC-4, and eventually get to AGI?
What are we still missing?
So I've said that intelligence is the efficiency
with which you operationalize the past
to face a constantly changing future.
But of course, if the future you face
had really nothing in common with the past,
no common ground with anything you've seen before,
you could not make sense of it,
no matter how intelligent you were.
But here's the thing.
Nothing is ever truly novel.
The universe around you is made of many different things
that are all similar to each other,
like one tree is similar to another tree,
which is also similar to a neuron,
or electromagnetism is
similar to hydrodynamics, which is also similar to gravity.
So we are surrounded by isomorphisms.
I call this the kaleidoscope hypothesis.
Our experience of the world seems to feature a never-ending novelty and complexity, but the
number of unique atoms of meaning that you need to describe it is actually very small, and
everything around you is a recombination of these atoms.
And intelligence is the ability to mine your experience
to identify these atoms of meaning that can be reused across many different situations, across
many different tasks.
And this involves identifying invariants, structure, principles that seem to be repeated.
And these building blocks, these atoms, are called abstractions.
And whenever you encounter a new situation, you're going to make sense of it by recombining
on the fly abstractions from your collection to create a brand new model
that's adapted to the situation.
So implementing intelligence is going to have two key parts.
First, there's abstraction acquisition.
You want to be able to efficiently extract reusable abstractions
from your past experience, from a feed of data, for instance.
And then there's on-the-fly recombination.
You want to be able to efficiently select and recombine
these building blocks into models that are fit
for the current situation.
And the emphasis on efficiency here is crucial.
How intelligent you are is not just determined by whether you can do something;
it's determined by how efficiently you can acquire good abstractions from real experience,
and how efficiently you can recombine them to navigate novelty.
So if you need hundreds of thousands of hours to acquire a simple skill,
you're not very intelligent.
Or if you need to enumerate every single move on the chessboard to
find the best move, you're not very intelligent.
So intelligence is not just demonstrating high skill,
it's really the efficiency with which you acquire
and deploy these skills.
It's both data efficiency and compute efficiency.
And at this point, you start to see why
simply making our AI models bigger
and training them on more data
didn't automatically lead to AGI.
We are missing a couple of things.
First, these models lacked the ability
to do on-the-fly recombination.
So at training time, they were learning a lot.
They were acquiring many useful abstractions.
But then at test time, they were completely static.
You could only use them to fetch and apply pre-recorded templates.
And that is a critical problem that test-time adaptation is addressing.
TTA adds on-the-fly recombination capabilities to our AI.
And that's a huge step forward that gets us much, much closer to AGI.
But that's not the only problem.
Recombination is not the only thing missing.
The other problem is that these models are still incredibly inefficient.
If you take gradient descent, for instance,
gradient descent requires vast amounts of data
to distill simple abstractions,
many orders of magnitude more data than what humans need,
roughly three to four orders of magnitude more.
And if you look at recombination efficiency,
even the latest state-of-the-art TTA techniques
still need thousands of dollars of compute
to solve ARC-1 at human level.
And that doesn't even scale to ARC-2.
And the fundamental issue here is that deep learning models
are missing compositional generalization.
And that's the thing that ARC-2 is trying to measure.
And the reason why is that there's more than one kind of abstraction.
And this is really important.
I said that intelligence is about mining abstractions
from data and then recombining them.
There's really two kinds of abstraction.
There's type one and type two.
They're pretty similar to each other.
They mirror each other.
So both are about comparing things, comparing instances,
and merging individual instances into common templates
by eliminating certain details about the instances.
So basically, you take a bunch of things,
you compare them, you drop the details that don't matter,
and what you're left with is an abstraction.
And the key difference between the two
is that one operates over a continuous domain
and the other operates over a discrete domain.
So type 1, or value-centric abstraction,
is about comparing things via a continuous distance function.
And that's the kind of abstraction
that's behind perception, pattern recognition,
intuition, and also of course modern machine learning.
And type 2, or program-centric abstraction,
is about comparing discrete programs,
which is to say graphs.
And instead of trying to compute distances
between them, you're going to be looking for exact structure matching.
You're going to be looking for exact isomorphisms, subgraph isomorphisms.
And this is what underlies much of human reasoning.
It's also what software engineers do when they are refactoring some code.
So if you hear a software engineer talk about abstraction, they mean this kind of abstraction.
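The contrast between the two kinds of comparison can be sketched in a few lines. This is an editorial illustration, not code from the talk: `distance` stands in for type 1 (a continuous distance between vectors), and `same_structure` stands in for type 2 (an exact structural match between two small expression trees, up to a consistent renaming of variables). The tuple encoding of programs and both function names are assumptions.

```python
import math

def distance(a, b):
    """Type 1 style comparison: a continuous distance between two vectors."""
    return math.dist(a, b)

def same_structure(p, q, fwd=None, bwd=None):
    """Type 2 style comparison: exact match of two expression trees.

    A tree is either a variable name (str) or a tuple (op, child, child, ...).
    Two trees match if they have the same shape and ops, and their variables
    correspond one to one.
    """
    fwd = {} if fwd is None else fwd
    bwd = {} if bwd is None else bwd
    if isinstance(p, tuple) and isinstance(q, tuple):
        return (len(p) == len(q) and p[0] == q[0]
                and all(same_structure(a, b, fwd, bwd) for a, b in zip(p[1:], q[1:])))
    if isinstance(p, str) and isinstance(q, str):
        return fwd.setdefault(p, q) == q and bwd.setdefault(q, p) == p
    return False

# Type 1: similarity is a matter of degree.
near = distance((0.0, 0.0), (3.0, 4.0))          # 5.0

# Type 2: either the structures line up exactly or they do not.
p = ("mul", ("add", "x", "y"), "x")
q = ("mul", ("add", "a", "b"), "a")              # same program, renamed variables
r = ("mul", ("add", "a", "b"), "b")              # reuses the other variable
match_pq = same_structure(p, q)                  # True
match_pr = same_structure(p, r)                  # False
```

Recognizing that `p` and `q` are the same program under renaming is the kind of exact matching a refactoring step depends on, and no amount of "closeness" between `p` and `r` makes them interchangeable.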
So two kinds of abstraction, both driven by analogy-making, either value analogy or program analogy.
And all cognition arises from a combination of these two forms of abstraction.
You can remember them via the left brain versus right brain metaphor: one half for perception and
intuition, and the other half for reasoning, planning, rigor.
And transformers are great at type 1 abstraction.
They can do everything that type 1 is effective for.
Perception, intuition, pattern recognition, they all work well.
So in that sense, transformers are a major breakthrough in AI,
but they're still not a good fit for type 2.
And this is why you will struggle to train one of these models
to do very simple type 2 things like sorting a list
or adding digits provided as a sequence of tokens.
So how are we going to get to type 2?
You have to leverage discrete program search
as opposed to purely manipulating
continuous interpolative embedding spaces
learned with gradient descent.
Search is what unlocks invention beyond
just automation. All known AI systems today that are capable of some kind of invention,
some kind of creativity, they rely on discrete search. Even back in the 90s, we were already
using genetic search to come up with new antenna designs. Or you can take AlphaGo with move 37.
That was discrete search. Or more recently, the AlphaEvolve system from DeepMind. All discrete search
systems. So deep learning doesn't invent, but search does.
So what's discrete program search?
It's basically combinatorial search over graphs of operators taken from some language, some
DSL.
And to better understand it, you can try to draw an analogy between program synthesis and the
machine learning techniques you already know about.
In machine learning, your model is a differentiable parametric function, so it's a curve.
In program synthesis, it's going to be a discrete graph, a graph of
ops, symbolic ops from some language.
In ML, your learning engine, the way you create models,
is gradient descent, which is very compute efficient,
by the way.
Gradient descent will let you find a model
that fits the data very quickly, very efficiently.
In program synthesis, the learning engine
is search, it's combinatorial search,
which is extremely compute-inefficient, obviously.
In machine learning, the key obstacle that you run into
is data density.
In order to fit a model, you need a dense sampling
of the data manifold.
You need a lot of data.
And program synthesis is the exact reverse.
Program synthesis is extremely data efficient.
You can fit a program using only two or three examples.
But in order to find that program,
you have to sift through a vast space of potential programs.
And the size of that space grows combinatorially
with problem complexity.
You run into this combinatorial explosion wall.
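The trade-off can be seen in a toy version of discrete program search. This is an editorial sketch under stated assumptions: the four-operator DSL, the brute-force enumeration strategy, and all names are invented for illustration and are not the method described in the talk.

```python
from itertools import product

# A tiny, made-up DSL of unary integer operators.
DSL = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}

def run(program, x):
    """Apply a sequence of DSL ops to x, left to right."""
    for op in program:
        x = DSL[op](x)
    return x

def search(examples, max_len=4):
    """Enumerate op sequences by increasing length; return the first that fits all examples."""
    tried = 0
    for length in range(1, max_len + 1):
        for program in product(DSL, repeat=length):
            tried += 1
            if all(run(program, x) == y for x, y in examples):
                return program, tried
    return None, tried

# Three examples are enough to pin down the hidden rule (2x + 1)^2.
examples = [(1, 9), (2, 25), (3, 49)]
program, tried = search(examples)

# The number of candidate programs of each length grows as len(DSL) ** length.
space_sizes = [len(DSL) ** n for n in range(1, 7)]   # [4, 16, 64, 256, 1024, 4096]
```

Three input/output pairs suffice to recover `("double", "inc", "square")`, which generalizes exactly to new inputs, but the candidate count multiplies by the DSL size with every extra step, and a realistic DSL has far more than four operators.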
I said earlier that intelligence is a combination
of two forms of abstraction, type 1 and type 2.
And I really don't think that you're going to go very far
if you go all in on just one of them,
like all in on type 1 or all in on type 2,
I think that if you want to really unlock their potential,
you have to combine them together.
And that's what human intelligence is really good at.
That's really what makes us special.
We combine perception and intuition together
with explicit step-by-step reasoning.
We combine both forms of abstraction
in all our thoughts, all our actions, everywhere.
For instance, when you're playing chess,
you're using type 2 when you calculate,
when you unfold some potential moves
step by step in your mind.
But you're not going to do this for every possible move, of course,
because there are too many of them, right?
You're only going to be doing it for a couple of different options,
right? Like here you're going to look at the knight, the queen.
And the way you narrow down these options
is via intuition, via pattern recognition on the board.
And you build that up very much through experience, right?
You've mined your past experience unconsciously to extract these patterns.
And that's very much type 1.
So you're using type 1 intuition to make type 2 calculation tractable.
So how is the merger between type 1 and type 2 going to work?
Well, the key System 2 technique is discrete search over a space of programs.
And the blocker that you run into is combinatorial explosion.
And meanwhile, the key System 1 technique is curve-fitting and interpolation on the curve.
So you take a lot of data, you embed it on some kind of interpolative manifold that enables
fast but approximate judgment calls about the target space.
And the big idea is going to be to leverage these fast but approximate judgment calls to fight
combinatorial explosion and make program search tractable.
A simple analogy to understand this would be drawing a map.
So you take a space of discrete objects with discrete relationships that would normally require
combinatorial search, like pathfinding on a subway system,
for instance, and you embed these objects into a latent space
where you can use a continuous distance function
to make fast but approximate guesses about discrete relationships.
And this enables you to keep combinatorial explosion in check
while doing search.
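The map analogy can be made concrete with a small editorial sketch. The station graph, the 2D coordinates standing in for a learned embedding, and the choice of greedy best-first search are all assumptions for illustration. Both searches explore the same discrete graph, but one of them uses a continuous distance in the embedding to decide what to expand next.

```python
import heapq
import math
from collections import deque

# Hypothetical stations laid out on a 5x5 grid; the coordinates play the role
# of an embedding, and each station links to its horizontal/vertical neighbours.
coords = {(x, y): (float(x), float(y)) for x in range(5) for y in range(5)}

def neighbours(n):
    x, y = n
    return [m for m in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)) if m in coords]

def bfs_expansions(start, goal):
    """Uninformed search: count how many stations are expanded before reaching the goal."""
    seen, queue, expanded = {start}, deque([start]), 0
    while queue:
        node = queue.popleft()
        expanded += 1
        if node == goal:
            return expanded
        for m in neighbours(node):
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return expanded

def guided_expansions(start, goal):
    """Greedy best-first search: expand the station whose embedding is closest to the goal's."""
    def h(n):
        return math.dist(coords[n], coords[goal])
    seen, frontier, expanded = {start}, [(h(start), start)], 0
    while frontier:
        _, node = heapq.heappop(frontier)
        expanded += 1
        if node == goal:
            return expanded
        for m in neighbours(node):
            if m not in seen:
                seen.add(m)
                heapq.heappush(frontier, (h(m), m))
    return expanded

blind = bfs_expansions((0, 0), (4, 4))       # expands all 25 stations
guided = guided_expansions((0, 0), (4, 4))   # expands 9 stations
```

The distance guess is only approximate in general (on a real network, the geometrically closest station is not always on the shortest route), but it is cheap and it prunes most of the discrete search.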
And this is what the full picture looks like.
This is the system that we are currently working on.
AI is going to move towards systems that are
more like programmers that approach a new task
by writing software for it.
And when faced with a new task,
your programmer-like meta-learner
will synthesize on the fly a program or model
that is adapted to the task.
And this program will blend deep learning submodules
for type 1 sub-problems, like perception, for instance,
and algorithmic modules for type 2 sub-problems.
And these models are going to be assembled
by a discrete program search system
that is guided by deep learning-based intuition
about the structure of program space.
And this search process isn't done from scratch.
It's going to leverage a global library
of reusable building blocks, of abstractions.
And that library is constantly evolving
as it's learning from incoming tasks.
So when a new problem appears,
the system is going to search through this library
for relevant building blocks.
And whenever, in the course of solving a new problem,
you're synthesizing a new building block,
you're going to be uploading it back to the library,
much like as a software engineer,
if you develop a useful library for your own work,
you're going to put it on GitHub so that other people can reuse it.
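Here is a toy editorial sketch of that loop: solve a task by search, store the solution in the library under a name, and let later searches reuse it as a single building block. The two starting primitives, the task names, and the brute-force search are assumptions for illustration, not a description of the actual system.

```python
from itertools import product

# The library starts with two primitive building blocks.
library = {"inc": lambda x: x + 1, "double": lambda x: x * 2}

def compose(names):
    """Build a program that applies the named library blocks left to right."""
    def program(x):
        for n in names:
            x = library[n](x)
        return x
    return program

def solve(examples, max_len=3):
    """Search compositions of current library blocks, shortest first."""
    for length in range(1, max_len + 1):
        for names in product(list(library), repeat=length):
            if all(compose(names)(x) == y for x, y in examples):
                return names
    return None

def solve_and_store(task_name, examples):
    """Solve a task and upload the solution to the library as a new block."""
    names = solve(examples)
    if names is not None:
        library[task_name] = compose(names)
    return names

# Task 2 (4x + 3) needs four primitive steps, so it is out of reach at max_len=3.
before = solve([(0, 3), (1, 7), (2, 11)])

# Task 1 (2x + 1) is solved from primitives and stored as a new block.
first = solve_and_store("double_plus_one", [(1, 3), (2, 5), (3, 7)])

# With the new block in the library, task 2 becomes a two-step program.
second = solve_and_store("quad_plus_three", [(0, 3), (1, 7), (2, 11)])
```

The search depth never changed; the second task became solvable only because an earlier solution was added to the library and reused as one step.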
And the ultimate goal here is to have an AI
that can face a completely new situation,
and it's going to use its rich abstraction library
to quickly assemble a working model, much like a human software engineer
can quickly create a piece of software to solve a new problem
by leveraging existing tools, existing libraries.
And this AI is going to keep improving itself over time,
both by expanding its library of abstractions
and also by refining its intuition about the structure of program space.
This system is what we are building at Ndea,
our new research lab.
We started Ndea because we believe that in order to dramatically accelerate scientific progress,
we need AI that's capable of independent invention and discovery.
We need AI that could expand the
frontiers of knowledge, not just operate within them.
And we really believe that a new form of AI is going to be key to this acceleration.
Deep learning is great at automation.
It's incredibly powerful for automation, but scientific discovery requires something more.
And our approach at Ndea is to leverage deep learning guided program search to build this
programmer-like meta-learner.
And to test our progress, our first milestone is
going to be to solve ARC-AGI using a system that starts out knowing nothing at all about
ARC-AGI. And we ultimately want to leverage our system for science, to empower human
researchers and help accelerate the timeline of science.
