SemiWiki.com - Podcast EP296: How Agentic and Autonomous Systems Make Scientists More Productive with SandboxAQ's Tiffany Callahan
Episode Date: July 9, 2025. Dan is joined by Dr. Tiffany Callahan from SandboxAQ. As one of the early movers in the evolving sciences of computational biology, machine learning and artificial intelligence, Tiffany serves as the technical lead for agentic and autonomous systems at SandboxAQ. She has authored over 50 peer-reviewed publications, launched several high-impact open source projects, and holds multiple patents.
Transcript
Hello, my name is Daniel Nenni, founder of SemiWiki, the open forum for semiconductor
professionals. Welcome to the Semiconductor Insiders podcast series.
My guest today is Dr. Tiffany Callahan from SandboxAQ. As one of the early movers in the
evolving sciences of computational biology, machine
learning and artificial intelligence, Tiffany serves as the technical lead for agentic and
autonomous systems at SandboxAQ.
She has authored over 50 peer-reviewed publications, launched several high-impact open source projects
and holds multiple patents.
Welcome to the podcast, Tiffany.
Thank you so much.
It's really nice to be here.
Yeah, so the first question I'd like to ask is
what first got you started in the sciences?
That's a great question.
I found that I've always been really curious
how things work.
So not just what motivates someone or something to act, but why they end up doing it, and what that response really means in the context of a larger goal. And so
for me, I started my journey in psychology where I thought I could
really put my skills to test to understand humans and human motivation
and behavior and quickly learned that humans are really complicated and maybe
it's more about the data that excited me. So how do we use data and math to understand
people? Well, that led me further to say, okay, well, if I can analyze data using math and put it to the test, then I probably need really good computational chops to allow me to apply that math to really big data. And so
going from psychology to statistics to eventually computational biology, after graduating I found myself at IBM Research, where I was working on ways to help chemists understand how things work using computers. They're very good at using chemistry-related tools to dissect complicated problems, but using computers is a newer thing for most chemists. So my job was helping them do that, which has finally led to this
amazing position at Sandbox where I kind of get to combine all of these aspects
together. So how do we help chemists, how do we help biologists solve complicated
problems by building autonomous
systems that help them do that, not replace them, but empower them to do it better?
Yeah, that's a great story.
So why did SandboxAQ develop an agentic AI system when others are already in that space?
So that is also an excellent question.
And I'll focus on one particular aspect
we think we can use these sort of tools to solve.
And that's around drug development, which today
is incredibly slow, risky, and really expensive.
Despite massive investment, most candidates
still fail in late stage trials due to unexpected toxicity
or just a lack of efficacy.
So one aspect of this problem we find particularly motivating is that many of these failures stem
from an inability to predict how a drug will behave across the full complexity of human biology.
So that means across different cell types, genetic backgrounds, and tissue environments.
And this happens before ever reaching the clinic.
So what if it were possible to actually develop
an autonomous system that could think like a scientist?
Not just run isolated models,
but actually reason across these diverse data sets,
help us to ask the right questions
and plan the right experiments,
and then explain their decisions
by working in tandem with a human researcher. And that's where we think agentic AI can come in. So we're trying to approach the
problem from multiple directions, but the one I find the most intriguing is to build a system
that's modular, interoperable, and allows agents to work together, test hypotheses, and ideally
reason about the complex problems in a way that they
can help us accelerate drug discovery timelines, but doing it by making the process smarter,
safer, and hopefully more explainable.
So wrapping that up, our ultimate motivation is really twofold, scientific and strategic.
Scientifically, we want to give researchers tools that can model chemistry and biology
at a deeper causal level.
Strategically, we see agentic AI as this powerful tool to support scientists, help them navigate
complex data, generate hypotheses, and ideally make more informed decisions throughout the
process.
Okay, so let's talk a little bit more about this.
So SandboxAQ calls its agentic AI an AI chemist. So what makes your system deserve that title?
Absolutely. Our approach to agentic AI is fundamentally human-centered. So we're
not building tools to replace scientists, we're building tools to extend their
capabilities. So the name really comes from the idea of how do we give researchers an intelligent,
dependable partner that helps them explore, reason, and decide faster without
sacrificing transparency or control? And what we think could really set us apart
in this space is the philosophy behind it. So we're not just building systems
for scientists,
we're actually trying to build them with scientists.
And that means listening to not only what customers ask for,
but also trying to sense what they don't yet know
they might need.
So we treat that as a design challenge
and as a responsibility.
Because in drug discovery,
which is the primary area that we are focusing in,
safety and explainability aren't just features, they really should be non-negotiable.
And getting it right really matters.
So we're trying to invest the time to both get it right and keep scientists in the driver's
seat while letting the system really handle the heavy lifting behind the scenes with full
transparency. Okay, so what makes SandboxAQ's large quantitative
models capable of solving problems that other AI systems might not be able to?
I'm so glad you asked this question. So at Sandbox, large quantitative models or
LQMs are at the core of how we're tackling some of the world's hardest
scientific challenges,
especially in drug discovery and material science.
Where traditional LLMs are trained on the internet to reason about text, LQMs are trained
on the first principles of physics, chemistry, and engineering.
We think of this as physical AI, AI that reasons about the physical world to create real-world
impact, not just digital
insight.
Unlike LLMs, which are increasingly hitting a data wall, having already consumed most
of the internet, LQMs get stronger as we generate more structured, high-fidelity scientific
data ourselves.
That allows us to simulate, at massive scale, things like 100 million variations of a potential drug for cancer,
or hundreds of millions of new material combinations to lighten next generation aircraft.
In our agentic systems, LQMs are leveraged to enable agents to act like scientific engines.
They don't just autocomplete answers, they simulate and predict grounded physical outcomes.
That means we would like our AI agents to do more than retrieve and generate information.
They should work alongside us and our scientists to generate new hypotheses, explore chemical spaces,
and test them virtually before ever running a single lab experiment.
And that's a huge leap in how we can accelerate discovery.
Ultimately, the role of LQMs is to bridge
deep physical modeling with flexible agentic reasoning,
giving scientists tools that are both
grounded and generative.
Okay, and what evidence do you have
that your agentic AI really works?
So that's a great question,
and one we're excited to answer
a little bit further in the future. So right now we're still in the early
stages of co-developing our system with our internal scientific partners, meaning
we're focusing heavily on thoughtful design, safety, and usability rather than
trying to be the first to do it because we want to be the first to do it right.
So we're being very intentional about testing in real world context, but it's too soon to
share any concrete breakthroughs responsibly just yet.
But that said, we're seeing really positive signs that the system can reduce the load
that we're hoping to reduce and is getting people excited about the potential applications
internally. Well, good.
But you know, there's a lot of concern about AI replacing people and scientists specifically,
but your system seems built to empower researchers, right?
So can you explain the philosophy behind the human-AI partnerships at SandboxAQ and how you ensure scientists remain at the center?
Absolutely, and that initial concern is valid, and one that I, and we as a company, take very seriously.
And our belief is that agentic AI should amplify the human experience and expertise, not replace
it.
So we're building systems designed to assist
scientists by handling the repetitive data-heavy tasks so that researchers
can focus their energy on what matters most: interpreting results, making decisions, and advancing discovery. So there are places where agentic AI can be
incredibly helpful. Things like generating hypotheses, connecting dots
across datasets, or even running virtual experiments. But there are also places
where it probably shouldn't take the lead. So things like high-stakes safety
critical decisions, especially in fields like chemistry or pharmacology, where
ethical judgment and human accountability are just irreplaceable.
And in those cases, our agents support this process
by providing insight,
but not ultimately making final decisions.
We're also deeply aware of the risks of deploying AI
and chemistry, especially unmonitored.
A model doesn't need to be malicious to be dangerous,
it just needs to be unaware of the consequences.
So a well-phrased, innocent query that accidentally outputs a harmful compound is something we want to be really careful about. And most models, especially LLMs that aren't explicitly guardrailed, can't tell the difference between a vitamin and a nerve agent, because chemistry risk lives in the context. So that's why the system we're trying to build from the ground up
will include safeguards, traceability, and human oversight at every step.
And this is one of the reasons that building these systems
just takes a really long time to do well.
It's hard to think about all the things that can go wrong.
Right.
So can you talk a little bit more about that?
What guardrails are embedded into the system?
Why can we trust your AI systems to produce accurate results
and work responsibly?
Great question.
We think that this combination of agentic systems with LQMs
opens up this new kind of scientific workflow,
one that's interactive, ideally traceable,
and hopefully grounded in the physical world.
Instead of bouncing between disconnected tools,
scientists can work with AI agents that plan and execute complex analyses,
simulate outcomes using first principles,
and critically show their work.
Every step, assumption, and output within our system is both logged and ideally verifiable.
So this means that
results aren't just faster, they're also more reproducible and as a result more
trustworthy. And this shift towards explainable traceable discovery is
especially powerful in fields like chemistry and drug development where
safety is not optional. A system can simulate millions of molecular
candidates and that means we can explore
a huge chemical space while reducing trial and error.
But it also means we have a great responsibility to ensure that these systems understand what
they're doing and the risk and boundaries of what should or shouldn't be proposed.
So that's why we are so heavily invested in responsible design.
We're trying to build this technology with scientists, not just to meet their needs,
but to ask them what they need and ideally learn from them to anticipate the needs that
they don't know they have yet.
And in the long run, we believe this approach, combining agentic reasoning with physically
grounded models or LQMs, will transform how discovery happens,
not by replacing the scientific method,
but by scaling it, accelerating it,
and making it safer and more transparent than ever before.
That sounds good.
So thank you for your time, Tiffany.
It was a pleasure meeting you.
And hopefully we can talk to you again throughout this journey.
Thank you so much.
I look forward to it.
That concludes our podcast.
Thank you all for listening, and have a great day.