This Week in Startups - Multi-Agent AI, Open Protocols & Startup Acceleration with Saurabh Tiwary | AI Basics
Episode Date: June 26, 2025

In this episode, Jason dives deep into the future of multi-agent systems with Saurabh Tiwary, VP & GM of Cloud AI at Google. They explore how teams of AI agents, not just single models, can collaborate to solve complex problems, helping startups scale faster with fewer resources.

Saurabh introduces Google Cloud’s open frameworks:
- Agent Development Kit (ADK) for building flexible, debuggable agents
- Agent-to-Agent (A2A) Protocol, now part of the Linux Foundation
- The emerging Multi-Agent Collaboration Protocol, enabling AI systems to work across platforms and organizations

They also discuss:
- Why AI agents are evolving from task-doers to decision-makers
- How multimodal inputs (text, image, video, voice) will supercharge agent capabilities
- Why openness and interoperability will define the next era of AI infrastructure

Timestamps:
(0:00) Jason introduces Saurabh Tiwary, VP & GM of Cloud AI at Google, and sets the stage for a deep dive on autonomous agents.
(1:09) What is an AI Agent Today?
(2:54) When Will Agents Actually Work?
(6:30) Inside Google’s ADK (Agent Development Kit)
(8:35) Agent-to-Agent Protocol & Open AI Standards
(11:35) Why Human-in-the-Loop Still Matters
(13:41) What’s Next: Multimodal Agents & Real Startup Use Cases

Google Cloud Report: “The Future of AI: Perspectives for Startups”
Free download → https://goo.gle/futureofai

Google Cloud Homepage
https://cloud.google.com/

Google Cloud: Agent Development Kit (ADK)
https://developers.googleblog.com/en/agent-development-kit-easy-to-build-multi-agent-applications/
https://cloud.google.com/vertex-ai/generative-ai/docs/agent-development-kit/quickstart

Agent-to-Agent (A2A) Protocol by Google
https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/

Linux Foundation: A2A Protocol Overview
https://www.linuxfoundation.org/press/linux-foundation-launches-the-agent2agent-protocol-project-to-enable-secure-intelligent-communication-between-ai-agents

Google Research: AI Agents & Multi-Agent Systems
https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
https://cloud.google.com/transform/ai-agents-evolving-with-google-agentspace

Google’s Agentspace
https://cloud.google.com/products/agentspace?hl=en

Google’s Vertex AI Agent Builder
https://cloud.google.com/products/agent-builder?hl=en

Watch more AI Basics episodes
https://thisweekinstartups.com/basics

Follow Saurabh:
X: https://x.com/saurkt
LinkedIn: https://www.linkedin.com/in/saurabh-tiwary/

Follow Jason:
X: https://twitter.com/Jason
LinkedIn: https://www.linkedin.com/in/jasoncalacanis

Follow TWiST:
Twitter: https://twitter.com/TWiStartups
YouTube: https://www.youtube.com/thisweekin
Instagram: https://www.instagram.com/thisweekinstartups
TikTok: https://www.tiktok.com/@thisweekinstartups
Substack: https://twistartups.substack.com
Transcript
All right, everybody, welcome back to another episode of AI Basics.
Doing this in partnership with our friends over at Google Cloud.
Go check out their report.
The Future of AI: Perspectives for Startups.
It features insights from 23 leading voices in AI,
and it's available to you for free at goo.gle/futureofai.
And today we're excited to welcome Saurabh Tiwary.
He is the VP and general manager of cloud AI at Google.
Saurabh leads Cloud AI products like Vertex, Kaggle, and the agent development stack.
And his team works closely with DeepMind to bring AI research into production.
All right, Saurabh, it is a crazy time for AI.
Thanks for taking a break.
No, thanks for having me.
And really nice to be here on the podcast.
My lord, have you ever seen a pace like this in our industry?
I mean, it is crazy.
It has been crazy for quite some time now.
I guess you probably feel it as well
that the pace of things happening is accelerating,
if you can imagine that.
Which is really the crazy part of it.
We see all this amazing innovation
over the last two or three years
as AI becomes available to a broader group of individuals,
whether it's consumers,
business executives,
consultants, and of course developers
and people building products.
And I think today,
it's a really good idea to talk
about agents.
Yep.
Everybody's been obsessed
for a couple of years now
with chatting and prompting,
but agents promise
a much different future.
Maybe you could explain to the audience
what the paradigm shift is
between a chat room
and talking to an LLM,
doing deep research,
all these great features
versus having an agent
and setting up an agent.
Yeah, sure.
So I think even the definition
of agents has been shifting
as the technology has been improving.
It's kind of like
the shifting-goalpost
kind of thing. People earlier were like, I will wrap an LLM with a prompt and that becomes
my agent. Then there were things like workflows where you have predictable outcomes and you were
kind of codifying it through LLMs. That became agents. And now I think we are at a point where
we are basically asking these agents to take on indeterminate tasks, right? Meaning take a set
of actions, look at the output, and based on that, define what the next set of actions should be.
And so that leads to very interesting possibilities, if you can imagine in this abstract
kind of way, like, for example, look at my email, see if there is something important.
The definition of important is a little bit vague and broad.
Pick those important emails, see what tasks are assigned, try to do those tasks, etc.,
just as an example over there.
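The loop described here (take an action, look at the output, decide the next action) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ADK code: the tool names, the sample emails, and `choose_next_action` are hypothetical, and the decision function is a hard-coded stand-in for what a model would decide.

```python
from typing import Callable, Dict, Optional


def list_emails(state: dict) -> dict:
    # Hypothetical tool: a real agent would call a mail API here.
    state["emails"] = [
        {"subject": "Invoice overdue", "important": True},
        {"subject": "Weekly newsletter", "important": False},
    ]
    return state


def pick_important(state: dict) -> dict:
    # Hypothetical tool: importance is precomputed; in practice a model would judge it.
    state["important"] = [e for e in state["emails"] if e["important"]]
    return state


TOOLS: Dict[str, Callable[[dict], dict]] = {
    "list_emails": list_emails,
    "pick_important": pick_important,
}


def choose_next_action(state: dict) -> Optional[str]:
    # Stand-in for the model: inspect the results so far and pick the next step.
    if "emails" not in state:
        return "list_emails"
    if "important" not in state:
        return "pick_important"
    return None  # nothing left to do


def run_agent(max_steps: int = 10) -> dict:
    state: dict = {}
    for _ in range(max_steps):
        action = choose_next_action(state)
        if action is None:
            break
        state = TOOLS[action](state)  # act, then loop back and re-evaluate
    return state


result = run_agent()
print([e["subject"] for e in result["important"]])
```

The `max_steps` cap is a design choice for the sketch: because the next step is decided at run time, a bound keeps a confused agent from looping forever.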
When does that future arrive?
When do we go into, you know, open our email box and see,
hey, by the way, overnight, you had seven emails.
Two of them were critical.
And here's some research.
We prepared a reply for you.
And here are the tasks that we've already delegated.
One of the tasks went to the sales team.
Another went to customer support.
And another one went to operations and finance.
When will we start to see that future emerge in your mind?
So I feel some of the early foundations are
already being laid right away. Things are happening. What is happening is, and particularly for
these ambiguous tasks, as models keep on improving, what needs to happen is that these tasks need
to be done at high enough level of quality, because otherwise you or I am not going to delegate
these really, really important tasks to those agents, right? Yeah. And so as the quality of the
foundation models is improving, we are now seeing these bits and pieces getting connected, and that's
why all this excitement about agents, that we are going from one step task to two steps and
three steps and four steps and so on. And so those things are actually happening. There are products,
for example, we have something called Agentspace, which is our enterprise offering, where we are
actually building these types of automations. And so what are the precursors that you need to make
this work and to trust it? So is there a step that you see organizations doing
in order to build trust that the agent is going to do the right thing.
So I think at the end of the day, it boils down to quality,
meaning if you are executing or calling that agent to do certain tasks,
how often does it finish, right?
And let's say the accuracy is like 50% or so it's like,
it's correct one out of two times, not a great thing.
Once you go into a territory of like 80, 90% accuracy,
that's when things start getting interesting.
Another thing that we are observing from our customers is that the ambition of our customers is also increasing.
So let's say you have a two-step task and you can get to 80 or 90 percent.
Then people want to add a bit more complexity and a bit more complexity and so on and so forth.
So I think agents will continue to evolve in terms of what they are doing and their capabilities
and toward the full-on ambition we talk about on social media and other places, about what the agents of the future should do,
but I think some of those basic steps are already being taken.
A lot of companies are already deploying agents into production environments today.
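One way to see why per-step quality matters as tasks get longer is a quick calculation. The speaker does not state this formula; it assumes every step must succeed and that steps fail independently, which is a simplification. Under that assumption a chain of n steps succeeds with probability p to the power n, so two steps at 90% each give 0.9 × 0.9 = 0.81.

```python
def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    # Assumption: steps are independent and all of them must succeed.
    return per_step_accuracy ** steps


for steps in (1, 2, 4, 8):
    print(steps, round(chain_success_rate(0.9, steps), 3))
```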
We're starting to do it ourselves internally when a founder sends us an update and they say,
hey, here's what's going on in my startup.
As investors, we have to read that update.
Sometimes they organize it really concisely.
Hey, here's our growth.
Here's our runway.
Here's our spend.
Here's the number of employees.
Here's the open positions.
Other times, it's just rambling, right?
And we don't have the authority or ability to say, oh, fill out a
form when they send an update. So now we have AI look at it and then normalize and say,
hey, here are the data points in there. Here are the ones that aren't in there. Now, we don't
have it reply and say, hey, you're missing your cash on hand. You're missing February's revenue.
But it will be able, it can now tell us what's missing so that we can save that step. And I would
say, arguably that saves us 15 to 30 minutes and increases our accuracy, but we're still
not ready to have it go back and talk to the founder.
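The normalize-and-flag step described here can be sketched as follows. This is a hypothetical illustration, not the tool Jason's team uses: it swaps the AI model for simple regular expressions so that it runs on its own, and the field names and patterns are assumptions.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical fields and patterns; a real system would use a model, not regexes.
EXPECTED_FIELDS: Dict[str, str] = {
    "growth": r"growth[^0-9]*(\d[\d.]*)\s*%",
    "runway_months": r"runway[^0-9]*(\d+)\s*months",
    "employees": r"(\d+)\s+employees",
    "cash_on_hand": r"cash on hand[^0-9]*(\d[\d,.]*)",
    "revenue": r"revenue[^0-9]*(\d[\d,.]*)",
}


def normalize_update(text: str) -> Tuple[Dict[str, str], List[str]]:
    """Return the data points found in the update and the names of those missing."""
    found: Dict[str, str] = {}
    missing: List[str] = []
    for name, pattern in EXPECTED_FIELDS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found[name] = match.group(1)
        else:
            missing.append(name)
    return found, missing


update = "Growth was 12% this month. We have 14 employees and a runway of 9 months."
found, missing = normalize_update(update)
print("found:", found)
print("missing:", missing)
```

The sketch only reports what is missing; it does not reply to the founder, which matches the level of delegation described in the conversation.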
Yeah, yeah, yeah.
As the quality of these agents improves, right,
we would start having a little bit more confidence as we go along,
and then we start delegating more and more tasks to the agents.
All right.
Now, to do this, there's going to need to be infrastructure.
I know you guys are working on this ADK.
Maybe explain ADK and then agent-to-agent protocols,
because in this situation that I described of reading a founder's update
and then saying, hey, we're missing a couple of pieces of data here that we could use.
on the other side, if that was an email exchange
or a Slack exchange,
you know, iMessage, it could come
any number of ways. The founder could have an agent
who says, hey, yeah, we got that information for you.
It's right here. So maybe we could talk about those two things.
ADK and then A2A, the agent-to-agent protocol.
Yeah, so ADK is our open source framework
for building agents. What we believe is
and actually open source in the sense that you can
go download it, do whatever you want with it in terms of building multi-agent systems using ADK.
It has a lot of nice properties like debuggability.
As I was mentioning, quality is really important.
And so when founders or developers are building agents, they need to ensure they have the right tools for tracing how the conversation goes, debuggability, latency, stuff like that.
So it has a lot of tools within the agent development kit itself, or ADK stands for agent development kit.
So within itself to help build agents, it is not tied to Google per se.
So you can take an agent built on ADK, use any LLM of your choice, any large language model of your choice,
and you can run it in any cloud or within your on-prem environment as well.
So it's a fully open agent development kit.
Similarly, we have this agent-to-agent protocol, and this is where our thought process was that,
yes, we have this agent development kit through which developers
or founders can go build agents.
But at the end of the day,
there will be many, many different platforms, frameworks
on which people will be building agents.
And we can't just hope that everyone converges
to this ADK format, for example,
even though we prefer that and we would like people to,
but I don't think that's practical to assume everyone will be.
And so here what we did was we built this A2A
or agent-to-agent protocol so that if you have fully functioning agents
that you might have pre-built,
Whether using LLMs or whether using traditional rule-based systems,
the agent-to-agent protocol can help you talk across agents.
So that now if you think about the multi-agent kind of ambition that we have,
where you have many task-specific agents and you can call them on demand to do certain things,
you can do it in a very interoperable way,
independent of where those agents have been built, how they have been built,
and on what platform they are running.
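The interoperability idea here, where a caller talks to task-specific agents without caring how each one was built, can be sketched with a shared message shape and a registry. This is not the A2A wire format or any A2A API; `AgentEntry`, `Registry`, the message fields, and both agents are hypothetical, and the "model" agent is a stub that calls no LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AgentEntry:
    name: str
    skill: str
    handle: Callable[[dict], dict]  # every agent accepts and returns a plain dict


class Registry:
    def __init__(self) -> None:
        self._by_skill: Dict[str, AgentEntry] = {}

    def register(self, entry: AgentEntry) -> None:
        self._by_skill[entry.skill] = entry

    def send(self, skill: str, payload: dict) -> dict:
        # The caller only knows the skill name, not how the agent is implemented.
        entry = self._by_skill[skill]
        reply = entry.handle({"skill": skill, "payload": payload})
        return {"from": entry.name, "result": reply}


def rule_based_billing_agent(message: dict) -> dict:
    paid_invoices = {"1001", "1002"}  # made-up data for the sketch
    invoice = message["payload"]["invoice"]
    return {"invoice": invoice, "paid": invoice in paid_invoices}


def stubbed_model_agent(message: dict) -> dict:
    # Stand-in for an LLM-backed agent; it just returns the first sentence.
    text = message["payload"]["text"]
    return {"summary": text.split(".")[0]}


registry = Registry()
registry.register(AgentEntry("billing", "invoice_status", rule_based_billing_agent))
registry.register(AgentEntry("summarizer", "summarize", stubbed_model_agent))

print(registry.send("invoice_status", {"invoice": "1003"}))
```

The point of the sketch is that the rule-based agent and the model-style agent sit behind the same call, which is the property the speaker attributes to a common protocol.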
Yeah, and this protocol, it's very new.
Yeah, we're in year one of this, I think.
I mean, we announced it at Cloud Next.
The growth has been stupendous.
Cloud Next was in March of this year.
Oh, so we're three months into this.
Yeah, actually April, April of this year.
So it's like month two. When we announced, we had about 50 to 60 partners,
different companies who are supporting this protocol.
Now that number has grown to about 120.
Just yesterday, we donated the A2A protocol to the Linux Foundation.
so that it has a completely open governance body
in terms of building and adding new features into the protocol.
And almost, I think, all the hyperscalers,
all the major companies, ServiceNow, Salesforce, Microsoft, Amazon,
they are all part of supporting and they have put in their weight
behind the agent-to-agent protocol.
And this is critical.
If we can get the entire ecosystem to support a standard,
then if in this hypothetical example,
I don't know, we have a billing
agent that goes and checks invoices and which ones have been paid, which ones haven't been paid,
it could ping Salesforce and say, well, who's the account executive here? Let's let them know
that this hasn't been paid. And then who's the customer and who's the support person at that
customer? Let's ask their agent to check. Hey, has invoice number 1234567
been paid or not? Do you have that? It could go back and say, yes, it's been sent.
Okay, then we could escalate that call ticket. And these are things that would be done by humans,
right? These are chores that humans really don't like to do. Nobody likes to look up stuff. I mean,
in the old days, you'd have to send somebody down to, I don't know, the files in the basement to go find
the invoice, pull it, bring it upstairs. And there was a runner who did that. The equivalent here is,
I think, pretty clear. Clerical work, chores, they're going to go away. And that frees up humans to do
whatever the actual business of that business is, the more important high order bits. Yeah?
Yeah, yeah. Completely makes sense.
And actually, we are already, I mean, to your point, it's very interesting.
Like, we are actually working with SAP for some of these procurement-related agents and
working with Microsoft so that we can have an orchestration of agents across these different
very highly tactical or precise agents, right, which understand that particular domain
really, really well and can execute really well at high enough quality and do these higher-level
tasks on top of them.
And there'll have to be some permissions here.
So there's a whole
layer of permissions that will need to be granted. You know, just like if you invited somebody to
collaborate in a workspace or a document, if you're using a Google Sheet, you might invite somebody
from another organization to it and you approve it. This will have to happen between the
procurement departments and the billing departments between companies, but they're not going to be
sending money. That's in the future. And if they do, a human would be in the loop. So maybe you could
talk a little bit about the concept of human in the loop here. And is that part of the standard where it
says, hey, you know what? This seems important. Human in the loop time. Yeah, we actually
both within the ADK as well as in the, so the A2A protocol just talks about the protocol of like
two agents talking to each other and how do you register. The agent development kit actually
has capabilities for human in the loop, right, so that you can intervene, you can have function
calls or an interrupt where a human can jump in, give feedback that it's okay to proceed.
But the point that you are raising is really important, right? Like security,
like if we envision this view of where agents are going, right?
We are providing them a lot of capabilities to go and do and execute tasks on our behalf.
Now, almost the way I think about it is almost all the challenges that were there in the app ecosystem on Play Store or App Store.
You have similar kind of issues, right, meaning you submit code, you check and you make sure that the apps are doing the right things or not.
And also when you install it on the phone, you have permissions and so on, right?
Kind of similar kind of issues will pop up over here with respect to all the different agents which are there, whom should you trust versus not,
and also what kind of controls or ability to make changes
do you want to provide them?
Because even if it is a trustworthy agent, you don't want it to, I don't know, change the pay slip of an employee, for example, and stuff like that.
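The combination discussed here, limited permissions plus a human checkpoint for sensitive actions, can be sketched as a small gate in front of whatever the agent wants to do. This is a hypothetical illustration and does not use ADK's actual human-in-the-loop interfaces; the action names and the `approve` callback are assumptions.

```python
from typing import Callable

ALLOWED = {"read_invoice", "send_reminder"}  # hypothetical low-risk actions
NEEDS_HUMAN = {"send_payment"}               # hypothetical actions that pause for approval


def execute(action: str, approve: Callable[[str], bool]) -> str:
    if action in NEEDS_HUMAN:
        # Interrupt: a person decides whether the agent may proceed.
        return "done" if approve(action) else "rejected by human"
    if action in ALLOWED:
        return "done"
    # Anything not explicitly granted is refused, even for a trusted agent.
    return "denied"


print(execute("read_invoice", approve=lambda a: False))
print(execute("send_payment", approve=lambda a: False))
print(execute("change_payslip", approve=lambda a: True))
```

Denying by default means an action such as changing a pay slip is refused unless someone deliberately adds it to one of the two sets.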
Yeah, we don't want it changing passwords.
You don't need to change the passwords.
You don't need to change people's Social Security
numbers. In fact, you don't even need to look at that. We'll keep it at a high level.
When we're looking at this system a year or two from now, what will success look like for you and
for the standard? What will success look like? What are the common tasks that will have been
accomplished? And then what do you think is going to take more time? I mean, as I said,
like, this will be, will be a journey. I think in terms of the complexity of the task,
Definitely this example that we said about like an SAP agent or a ServiceNow agent orchestrating across multiple different things and doing complex tasks, I think those are going to happen.
Multimodality will become very, very powerful.
Explain what that means for folks, yeah.
Yeah, so multimodality meaning right now a lot of like when we talk to these LLMs, generally people think about text-based operation, right?
You type something, you get something back, right?
Multimodality is where you have speech, image, video, text, all coming into the model and the model is able to reason on top of it.
So, for example, when a receipt is submitted for expense approval, does it look like a receipt?
Or is it actually just a set of numbers which are submitted as an example, right?
That you are just processing text as an example.
So multimodality, I feel, will be a very, very powerful construct.
And the models are already capable, and we are already seeing some progress along that
domain. So the effectiveness of the agents will become much stronger as we leverage
multimodality. Yeah, we started a couple of years ago where different sales training software
started to incorporate AI, started listening to, you know, video conferences, and then
coaching people, hey, you know, this other agent or this other account executive was doing a
great job asking these questions. You might want to incorporate those.
And to have an agent then go correlate, hey, here were five calls with potential partners,
customers.
These three actually followed through and bought the product.
These two didn't.
We're going to look back on the phone calls and the emails and we ping Gmail.
We ping Google Meet or Zoom.
And we put all that together.
Here's why we think we lost that customer.
Or here's the customer support calls.
Here's the tone of their voice and the despair they had and the frustration they had with,
you know, your solution.
Back to multimodal here.
It could be fascinating to be putting these data sets together
and then just getting coaching.
The coaching piece of it to me is so interesting.
Yeah, actually, I will share an example.
Like one of our customers, Shopify,
they actually are leveraging multimodality
for their customers who are setting up storefronts,
and they have their dashboard, right?
The admin dashboard that they have
for setting up the storefront.
Now it has lots of buttons and so on,
and someone who is new to it may be stuck on,
how do I change the price or do something else,
add a new product or something,
what they are doing is they are sharing that dashboard
as part of the agentic workflow with the LLM.
It is able to look at the screen of the user and say,
well, click that button, not that one, the one next to it
and stuff like that, right?
So that becomes really, really powerful.
It reminds me of my early days in tech support
where I had somebody on the phone when I was in PC support
in the late 80s, early 90s.
And I said, okay, you know, we're installing this on Windows.
Just click any key.
And the woman said to me, I see a space bar, I see a control key, I see an enter key,
I don't see an any key.
And I said, oh, that means you can pick the key of your choice.
Like, so let's hit the space bar.
And she said, that's incredibly frustrating.
If you want me to click the any key, why don't you have a key marked any?
And I said, you're 100% right.
But just coaching on those things, watching the path in which a
customer navigates your store and then finding out where they're frustrated or on the back end
where they're trying to figure out how do I find taxes and file my taxes. It's going to be
incredible for engagement on these products and then even getting to the product development team.
Hey, by the way, these are the three times and here are the paths where customers are most
frustrated with your product. Yeah. And here's how to fix them. And then eventually, hey, let's talk to the
UX agent. Let's talk to the developer agent, have them suggest a fix, and then human in a loop it,
somebody puts the bug fix in. I mean, can you imagine the world we're going to be living in in 18 months?
Yeah, yeah, yeah. No, this is, I mean, and we are moving along that particular direction right away.
I just love this idea. So I saw somebody had connected, I'm not sure which programs it was,
but, you know, a copilot for a developer with either Figma or Canva or something. And the two of
them started talking to each other, and they had to build a lot of glue to make this happen.
And this is pre-standards. But they got the two of them talking to each other, where Figma
or Canva started designing a website, the coder started building it. And they were going back and
forth, essentially talking to each other. Yeah. And you could just watch them work. Pretty interesting.
Yeah. If you think about it, like the progress that we make is because of the iterative design or
iterative work that we do, right? And if we can provide that capability to agents and provide that
agency to them. It opens up just new opportunities.
All right, listen, this is incredible. Google's an amazing partner. Thank you so much for coming on.
For more insights, check out Google Cloud's The Future of AI: Perspectives for Startups,
tons of predictions, real-world examples, and startup advice from 23 AI leaders.
Get some great ideas in there, some great techniques, some tactics, some strategy. Go to goo.gle/futureofai
to read the full report. Thanks again for listening. Go to thisweekinstartups.com/basics
to see all our basics. We got legal basics up there. We've got
accounting basics. And now we've added AI basics because this is the fundamental feature that's
going to drive startups and commerce and our lives over the next decade. Thanks so much for
joining us and we'll see you next time. Yeah. Thank you.
