Semiconductor Insiders - Podcast EP296: How Agentic and Autonomous Systems Make Scientists More Productive with SandboxAQ's Tiffany Callahan
Episode Date: July 9, 2025. Dan is joined by Dr. Tiffany Callahan from SandboxAQ. As one of the early movers in the evolving sciences of computational biology, machine learning and artificial intelligence, Tiffany serves as the technical lead for agentic and autonomous systems at SandboxAQ. She has authored over 50 peer-reviewed publications, launched several high-impact open-source projects, and holds multiple patents.
Transcript
Hello, my name is Daniel Nenni, founder of SemiWiki, the open forum for semiconductor professionals.
Welcome to the Semiconductor Insiders podcast series.
My guest today is Dr. Tiffany Callahan from SandboxAQ.
As one of the early movers in the evolving sciences of computational biology, machine learning, and artificial intelligence, Tiffany serves as the technical
lead for agentic and autonomous systems at SandboxAQ.
She has authored over 50 peer-reviewed publications, launched several high-impact open-source
projects, and holds multiple patents.
Welcome to the podcast, Tiffany.
Thank you so much.
It's really nice to be here.
Yeah, so the first question I'd like to ask is, what first got you started in the sciences?
That's a great question.
I found that I've always been really curious how things work. So not just what motivates someone or something to act, but why they end up doing it and what that response really means within the overarching goal. And so for me, I started my journey
in psychology where I thought I could really put my skills to test to understand humans and
human motivation and behavior and quickly learned that humans are really complicated. And maybe
it was more about the data that excited me. So how do we use data and math to understand people? Well, that led me further to say, okay, well, if I can analyze data using math and put it to the test, then I probably need really good computational chops to allow me to apply that math to really big data. And so going from psychology to statistics to eventually
computational biology, I then found myself after graduating at IBM Research, where I was
working on ways to understand how we can help chemists understand how things work using computers. So they're very good at using chemistry-related tools to dissect complicated problems, but using computers is a newer thing to most chemists. So my job was helping
them to do that, which has finally led to this amazing position at Sandbox where I kind of get to
combine all of these aspects together. So how do we help chemists, how do we help biologists
solve complicated problems by building autonomous systems that help them do that, not replace them,
but empower them to do it better.
Yeah, that's a great story.
So why did SandboxAQ develop an agentic AI system,
you know, when others are already in that space?
So that is also an excellent question.
And I'll focus on one particular aspect we think we can use these sort of tools to solve.
And that's around drug development, which today is incredibly slow, risky, and really expensive.
Despite massive investment, most candidates still fail in late-stage trials due to unexpected toxicity or just a lack of efficacy.
So one aspect of this problem we find particularly motivating is that many of these failures
stem from an inability to predict how a drug will behave across the full complexity of human biology.
So that means across different cell types, genetic backgrounds, and tissue environments.
And this happens before ever reaching the clinic.
So what if it were possible to actually develop an autonomous system that could think like a scientist?
Not just run isolated models,
but actually reason across these diverse data sets,
help us to ask the right questions and plan the right experiments,
and then explain their decisions by working in tandem
with a human researcher.
And that's where we think agentic AI can come in.
So we're trying to approach the problem from multiple directions,
but the one I find the most intriguing
is to build a system that's modular and interoperable, and allows agents to work together, test hypotheses, and ideally reason about complex problems in a way that
they can help us accelerate drug discovery timelines, but doing it by making the process smarter,
safer, and hopefully more explainable. So wrapping that up, our ultimate motivation is really
twofold, scientific and strategic. Scientifically, we want to give researchers tools that can model
chemistry and biology at a deeper, causal level. Strategically, we see agentic AI
as this powerful tool to support scientists, help them navigate complex data, generate hypotheses,
and ideally make more informed decisions throughout the process.
Okay, so let's talk a little bit more about this.
So SandboxAQ calls its agentic AI an AI chemist.
So what makes your system deserve that title?
Absolutely.
Our approach to Agentic AI is fundamentally human-centered.
So we're not building tools to replace scientists, we're building tools to extend their capabilities.
So the name really comes from the idea of how do we give researchers an intelligent, dependable partner
that helps them explore, reason, and decide faster without sacrificing transparency or control.
And what we think could really set us apart in this space is the philosophy behind it.
So we're not just building systems for scientists, we're actually trying to build them with scientists.
And that means listening to not only what customers ask for, but also trying to sense what they don't yet know they might need.
So we treat that as a design challenge and as a responsibility, because in drug discovery, which is the primary area that we're focusing on, safety and explainability aren't just features, they're non-negotiables, and getting it right really matters.
So we're trying to invest the time to both get it right and keep scientists in the driver's seat while letting the same
system really handle the heavy lifting behind the scenes with full transparency.
Okay. So what makes SandboxAQ's large quantitative models capable of solving problems that other AI systems might not be able to?
I'm so glad you asked this question. So at Sandbox, large quantitative models, or LQMs, are at the core of how we're tackling some of the world's hardest scientific challenges, especially in drug discovery and material science.
Where traditional LLMs are trained on the internet to reason about text, LQMs are trained on the first principles of physics, chemistry, and engineering.
We think of this as physical AI, AI that reasons about the physical world to create real-world impact, not just digital insight.
Unlike LLMs, which are increasingly hitting a data wall, having already consumed most of the internet,
LQMs get stronger as we generate more structured, high-fidelity scientific data ourselves.
That allows us to simulate, at massive scale, things like 100 million variations of a potential
drug for cancer, or hundreds of millions of new material combinations to lighten next-generation
aircraft. In our agentic systems, LQMs are leveraged to enable agents to act like scientific
engines. They don't just auto-complete answers. They simulate and predict grounded physical outcomes.
That means we would like our AI agents to do more than retrieve information and generate text.
They should work alongside us and our scientists to generate new hypotheses, explore chemical
spaces, and test them virtually before ever running a single lab experiment. And that's a huge
leap in how we can accelerate discovery. Ultimately, the role of LQMs is to bridge deep physical
modeling with flexible agentic reasoning, giving scientists tools that are both grounded and generative.
Okay, and what evidence do you have that your agentic AI really works?
So that's a great question, and one we're excited to answer a little bit further in the future.
So right now, we're still in the early stages of co-developing our system with our internal
scientific partners, meaning we're focusing heavily on thoughtful design, safety, and usability, rather than trying to be the first to do it, because we want to be the first to do it right.
So we're being very intentional about testing in real-world context,
but it's too soon to share any concrete breakthroughs responsibly just yet.
But that said, we're seeing really positive signs that the system can reduce the load that we're hoping to reduce
and is getting people excited about the potential applications internally.
Well, good.
But, you know, there's a lot of concern about AI replacing people, and scientists specifically. But your system seems built to empower researchers, right?
So can you explain the philosophy behind the human-AI partnership at SandboxAQ and how you
ensure scientists remain at the center? Absolutely. And that initial concern is absolutely valid
and one that I and we as a company take very seriously. And our belief is that agentic AI should
amplify the human experience and expertise, not replace it. So we're building systems designed
to assist scientists by handling the repetitive data-heavy tasks so that researchers can focus
their energy on what matters the most, interpreting results, making decisions, and advancing
discovery. So there are places where agentic AI can be incredibly helpful, things like generating
hypotheses, connecting dots across datasets, or even running virtual experiments.
But there are also places where it probably shouldn't take the lead, so things like high-stakes, safety-critical decisions, especially in fields like chemistry or pharmacology, where ethical judgment and human accountability are just irreplaceable.
And in those cases, our agents support this process by providing insight, but not ultimately making final decisions.
We're also deeply aware of the risks of deploying AI in chemistry, especially unmonitored.
A model doesn't need to be malicious to be dangerous; it just needs to be unaware of the consequences.
So a well-phrased, innocent query that accidentally outputs a harmful compound is something we want to be
really careful about, and most models, especially LLMs, that aren't explicitly guardrailed,
can't tell the difference between a vitamin and a nerve agent, because chemistry risk lives in the
context. So that's why our system that we're trying to build from the ground up will include
safeguards, traceability, and human oversight at every step.
And this is one of the reasons that building these systems just takes a really long time to do well.
It's hard to think about all the things that can go wrong.
Right.
So can you talk a little bit more about that?
What guardrails are embedded into the system?
You know, why can we trust your AI systems to produce accurate results and work responsibly?
Great question.
So we think that this combination of agentic systems with LQMs opens up this new kind of scientific workflow, one that's
interactive, ideally traceable, and hopefully grounded in the physical world.
So instead of bouncing between disconnected tools, scientists can work with AI agents that plan
and execute complex analyses, simulate outcomes using first principles, and critically show
their work.
Every step within our system, every assumption, and every output is logged and ideally verifiable. So this means that results aren't just faster, they're also more reproducible
and as a result more trustworthy. And this shift towards explainable, traceable discovery
is especially powerful in fields like chemistry and drug development where safety is not optional.
A system can simulate millions of molecular candidates, and that means we can explore a huge chemical space while reducing trial and error. But it also means we have a great responsibility to ensure that these systems understand what they're doing and the risks and boundaries of what should or
shouldn't be proposed. So that's why we are so heavily invested in responsible
design. We're trying to build this technology with scientists not just to meet
their needs, but to ask them what they need and ideally learn from them to
anticipate the needs that they don't know they have yet. And in the long run,
we believe this approach combining agentic reasoning with physically grounded
models or LQMs will transform how discovery happens, not by replacing the scientific method,
but by scaling it, accelerating it, and making it safer and more transparent than ever before.
Sounds good. So thank you for your time, Tiffany. It's a pleasure meeting you, and hopefully we can
talk to you again, you know, throughout this journey. Thank you so much. I look forward to it.
That concludes our podcast. Thank you all for listening and have a great day.
Thank you.