a16z Podcast - Building AI Agents for Enterprise Operations
Episode Date: June 1, 2026
Anish Acharya and Olivia Moore speak with Pablo Palafox and Luis Paarup about the challenges of deploying AI agents in operationally complex industries. The conversation covers the evolution of voice AI, enterprise workflows, and why logistics became an early proving ground for agent-based systems. They discuss context, coordination, and execution inside large organizations, as well as the role of forward-deployed engineering, enterprise deployment, and what it takes to move AI from experimentation into production.
Resources:
Pablo Palafox on X: https://x.com/pablorpalafox
Luis Paarup on X: https://x.com/PaarupLuis
Anish Acharya on X: https://x.com/illscience
Olivia Moore on X: https://x.com/omooretweets
Follow our host: https://twitter.com/eriktorenberg
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
Voice was the unlock to many of the operations that are really needed to move the world
if we talk about supply chain.
This is not a supply chain specific problem that we are solving.
It's actually an enterprise coordination problem.
The bigger problem in the coming years for voice here is really knowing when to talk and when not to talk.
So it's understanding all these nuances in the work more than making the latency faster or making
the voices more realistic, which I don't think that's a limiting factor today.
I feel like Happy Robot has always been at the forefront of kind of humanness.
Do you want the customers to know they're talking to an AI?
Where does that go?
I think it's super important that.
Most AI demos happen in controlled environments.
The real challenge begins when AI has to operate inside large organizations,
where information is fragmented across systems, teams, emails, phone calls,
and workflows that have evolved over years.
Logistics and supply chains have become an early proving ground for these systems.
Success depends not just on model intelligence, but on coordination, context, and the ability to execute work reliably in the real world.
Anish Acharya and Olivia Moore speak with Pablo Palafox and Luis Paarup from Happy Robot about voice AI, enterprise agents, and the challenges of deploying AI in operationally complex industries.
Olivia and I are here with the two incredibly talented founders of Happy Robot, Pablo and Luis. Welcome, guys.
Thank you, guys.
Super excited.
Very excited to have you.
We're overdue to have this conversation.
Well, look, we're here to kind of talk about the company and the incredible journey that you've
been on.
I know when we first met you, there had been a lot of buzz amongst YC founders and other
folks about how you guys are sort of at the edge of the technology and then really getting
a lot of pull from a go-to-market perspective.
So maybe take us back in time to the little office that had four or five people on 20th
street and what the origins of the company and the product were.
100%.
So Luis and I met on
our second day of college, just to set the scene, and ever since we've been building stuff together.
Our other co-founder, Chavi, he happens to be my brother, so I've known him for a little while.
We always wanted to build something together, right? So when we got into YC, we were looking
for complex problems we could solve. Keep in mind that Luis and I had been literally building
submarines for robotics competitions to find mannequins underwater. That is the sort of problems
we were looking for. So when we decided on solving for that complexity, we looked at what Chavi
was doing as a CFO of the largest olive oil distributor in the world.
He was literally moving tons of olive oil across the ocean.
And that was that complexity that drew us into logistics and supply chain.
He literally had to hire interns to call drivers to see where they were,
to see where the shipment was?
Because Walmart was asking him,
where the hell is my shipment of olive oil?
So that was the sort of problems that we wanted to tackle.
And maybe you can talk about why we actually started with voice there.
I guess we took it from a very tech-driven approach.
Really, the limiting factor back then was having an agent that could speak on the phone realistically.
Like, we were in conferences, how he was like traveling all around, like asking people,
hey, if we were to create a voice agent that could pick up the phone and sell these loads and track these shipments,
would you buy it?
And it's like, dude, of course, this is a no-brainer.
I just don't think you can do it.
So it was more so like the idea market fit or product market fit made sense from the beginning.
It was more so like, can we prove to ourselves that we can build this technology?
And you know, LLMs were picking up.
We're talking about late 2023, probably.
LLMs were, like, decent enough.
ElevenLabs was picking up with the text-to-speech
and everything was kind of working together,
but we had to build something that could actually connect all the dots
and actually make something work, no?
That kind of shaped our company where we really had technology
and innovation as the core of our company
and always pushing this frontier and
solving the problem firsthand.
So that's how we got started in the voice space.
Amazing.
One of my favorite memories of working with you guys
is actually when we first met outside a very crowded coffee shop,
and you called one of the live voice agents, and it was seamless,
and it did an incredible job in a very non-ideal environment.
I feel like a lot of people might know Happy Robot from your amazing demo videos of the voice agent.
And that's definitely not all the product is, but it's an important part of it.
So maybe walk us through like, why voice to start,
and then what voice is maybe unlocked for you more broadly.
Yeah.
What Luis was saying is very important.
Voice was the unlock to many of the operations that are really needed to move the world if we talk about supply chain.
So we were going to these conferences and telling people,
you know, we're going to build these things that talk on the phone.
Negotiating rates on shipments was actually a big one.
So we actually fine-tuned LLMs back then.
Like we fine-tuned Mistral and Llama to actually make those voice agents faster
because otherwise using something like GPT-4 at the time was extremely slow.
And GPT-3.5 at the time was terrible at reasoning and actually negotiating.
So we had to do a lot of tricks behind the scenes, build our own agent infrastructure, if you
will, but also build our own voice agent capabilities so that we could innovate faster than
competition.
And that actually gave us a really good edge in logistics and transportation in the early
beginnings.
So we started working with these freight brokers.
Then we expanded to these freight forwarders, then ocean carriers, then trucking companies.
And today we actually serve many of the largest companies in the space of supply chain.
We were discussing before, no, nine of the top 10 freight brokers in the U.S., seven of the top 10 trucking companies,
like some of the largest fleets that actually move our goods everywhere in the U.S., which is crazy.
Two of the largest ocean carriers, those big boats we see in the bay.
That is the sort of customers that we needed to build for and where voice was the unlock for many of the operations.
So it sounds like it wasn't just voice.
It was also voice plus negotiation.
So perhaps track and trace, which is customer support, and sales, which is sort of this negotiation, is where we started.
And I think that forced us to build a deeper set of technology than we otherwise would have built.
Maybe, Luis, take us on the technology journey a little bit.
Yeah.
So before I tackle that, I guess one of the things that we had very clear from the beginning when you're working on the frontier of technology is really what you have to reinvent versus what already exists, no?
And I think people might take an approach where they just reinvent everything just for the sake of it.
Some people would just wrap around anything else and be more of a go-to-market
thing. We started by always tackling the limiting factor. And again, back then,
GPT, as Pablo mentioned, like 3.5 was relatively fast and not so good. So we had to fine-tune
the LLM. Soon enough, all these good models came out and we realized that
prompting was good enough. Scratch that, let's do that. And always focusing on that limiting factor,
then voices, like the background noises. Supply chain is extremely messy. You're talking with
drivers in their trucks with the radio on and background music and noises and accents.
So always focusing on those limiting factors.
On the negotiation part, something we got very often was, how do you prevent the bot
from hallucinating a rate or like max buy?
People were like, dude, I'm building this thing and it's just hallucinating max buy and it doesn't
know how to negotiate.
How are you guys able to do that?
And I think it's because you don't need to show the AI what it doesn't need to see.
And I think we were very opinionated about this from the beginning, where we were building
these proxy servers and actually exposing to the agent only the things it needs to see.
And actually max buy, the max amount of money the bot can actually negotiate up to,
it's not even exposed to the bot.
We were not exposing that.
We were doing external negotiation algorithms so that the bot would just ask for permission,
literally the same way a human would, like, hey, let me ask my boss.
And it was really just calling a tool and asking for permission to do more.
And we would inject back the rate, no?
So those sort of things, instead of like just putting in the context,
or having the LLM just freestyle it,
we do it in a more deterministic approach.
So it's always a mix of probabilistic plus deterministic,
where you don't need to leave everything to the AI, no?
It's building for the real world.
The real world is messy.
Those things are going to happen
where someone tries to jailbreak the agent
and get that maximum amount of money that they can get.
But we needed to build those guardrails very early on
so that we could actually go to the likes of C.H. Robinson
or Uber Freight and all of these big players
that would only trust us if we actually were building real technology.
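The "ask my boss" pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Happy Robot's implementation: the names `LoadPolicy`, `NegotiationService`, and `request_more` are hypothetical, and the fixed-step concession rule is a stand-in for whatever external negotiation algorithm is actually used. The point it shows is that the ceiling lives outside the model, and the tool only ever returns the next rate the agent may offer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadPolicy:
    """Pricing limits kept server-side (hypothetical); never put in the prompt."""
    opening_offer: float
    max_buy: float   # ceiling the agent never sees
    step: float      # size of each approved concession (assumed rule)


class NegotiationService:
    """Deterministic negotiation logic that lives outside the LLM."""

    def __init__(self, policies: dict) -> None:
        self._policies = policies   # load_id -> LoadPolicy
        self._current = {}          # load_id -> last rate released to the agent

    def opening_offer(self, load_id: str) -> float:
        offer = self._policies[load_id].opening_offer
        self._current[load_id] = offer
        return offer

    def request_more(self, load_id: str, carrier_ask: float) -> dict:
        """Tool the agent calls when it wants permission to go higher."""
        policy = self._policies[load_id]
        current = self._current.get(load_id, policy.opening_offer)
        # Never exceed the ceiling, and never offer more than the carrier asked.
        next_offer = min(current + policy.step, policy.max_buy, carrier_ask)
        if next_offer <= current:
            # No room left: the agent only learns it cannot go higher.
            return {"approved": False, "rate": current}
        self._current[load_id] = next_offer
        return {"approved": True, "rate": next_offer}
```

Because the response contains only `approved` and `rate`, a caller trying to jailbreak the agent cannot extract a ceiling the agent was never given.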
It was pretty clear for us. We knew that we didn't want to focus on the long tail of logistics and
transportation because it's a very tricky space, but we knew we needed to serve the enterprise
in transportation and supply chain and logistics. So that was very clear. That shaped the type of products
that we had to build, the type of primitives that we had to build. It's so interesting because
Happy Robot was very early to both voice AI and enterprise agents more broadly, which is great.
And also, it's like the ground has been shifting under our feet because the models
themselves are kind of changing and evolving so rapidly. To your point about
fine-tuning versus prompting versus kind of what to do next, maybe we can talk through a few
of the use cases you have where it's very clear that a smarter model by itself doesn't just do
it and why you need to buy a platform. I can bring up the Kuehne+Nagel use case. We recently
announced our partnership with the marquee freight forwarder, great partners. I was having a
personal lunch with their head of air. Shout out to our friend Inve at Kuehne+Nagel, at his
house. And what I learned from their operations is that this is not a simple customer service
type of create a ticket in Zendesk and you're done or you reply based on a knowledge base.
Customer support for these real economy industries like logistics, transportation, freight forwarders,
broader supply chain, even other industries like the telco space or the utility space.
It's not as easy as just
replying based off of a knowledge base again.
There's a lot that happens afterwards
that really has to get done
to provide that update to the customer.
So example, freight forwarding,
Kuehne+Nagel, they are serving customers,
very large customers, I cannot name who,
but imagine that you are a big customer of Kuehne+Nagel
and you ask, hey, where is my shipment?
What happens now is an agent has to turn around
and go find it.
That "go find it" is very complex.
You need that coordination.
You need basically an orchestration agent that is,
okay, this is an air shipment.
So obviously relates to airlines.
Who is the airline on this shipment?
Okay, let me go to the airline's website.
So we have browsing agents that go and scrape the website of the airline.
Oh, bummer, it's not there.
There's no update.
Damn, I need to go send an email.
Okay, I'm sending an email to the airline.
Two hours later, no reply.
Okay, I need to reason that if they don't reply now,
I'm going to miss my SLA with my customer.
So now I need to call them.
I'm going to keep calling them until someone picks up at the airline and tells me where
the hell is my shipment.
So that is a sort of coordination that we need to make happen for transportation and logistics,
really.
And that has shaped the type of product that we had to build at an IV.
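The escalation described above (check the airline's website, then email, then call once the SLA is at risk) can be sketched as a small decision function. This is an illustrative sketch only: `TrackingRequest`, `next_action`, the two-hour email patience, and the one-hour call buffer are assumptions made for the example, not details from the conversation beyond the "two hours later, no reply" anecdote.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Channel(Enum):
    WEBSITE = "website"
    EMAIL = "email"
    PHONE = "phone"


@dataclass
class TrackingRequest:
    """State of one 'where is my shipment?' request with no update found yet."""
    shipment_id: str
    sla_deadline: datetime
    website_checked: bool = False
    email_sent_at: Optional[datetime] = None


def next_action(
    req: TrackingRequest,
    now: datetime,
    email_patience: timedelta = timedelta(hours=2),  # assumed threshold
    call_buffer: timedelta = timedelta(hours=1),     # assumed threshold
) -> Optional[Channel]:
    """Pick the next channel to try, or None to keep waiting on the email."""
    sla_at_risk = req.sla_deadline - now <= call_buffer
    if not req.website_checked:
        return Channel.WEBSITE       # cheapest option first
    if req.email_sent_at is None:
        # Skip the email entirely if the SLA is already tight.
        return Channel.PHONE if sla_at_risk else Channel.EMAIL
    if sla_at_risk or now - req.email_sent_at >= email_patience:
        return Channel.PHONE         # no reply in time: pick up the phone
    return None
```

An orchestration agent would call `next_action` each time it wakes up on a request, run the corresponding browsing, email, or voice agent, and update the request state with what happened.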
Yeah.
I subscribe to everything.
I guess another example is on the negotiation, which is how we started and where all these
demos went pretty viral. And I guess one point of how raw intelligence really wasn't enough is when
you're negotiating, for example, loads and there's like 10 carriers or 10 buyers calling in
at the same time, you cannot have all those agents like doing work independently, which is what
happens to a certain degree with like humans. They're on a floor and sure, they shout to each other and
they're like, hey, this is a very hot load. Please negotiate hard. I have someone interested, you know. All this
information is really not in the model.
So what we started doing is
when you have inbound calls for the same load,
you can start like sharing context across them.
Like, hey, I have someone.
They're very interested.
Please push harder.
Like, this is a hot load.
So all this information sharing is literally what you put in the context
window at any point in time.
Like general intelligence or the raw intelligence
doesn't really know if someone else is calling on that load.
So it's all that about like, what do you know of the business?
What do you know about the negotiation strategies?
Maybe you know that pushing harder on this load because it's like cross-border is going to be better or whatnot.
Like that's not general intelligence.
That's very specific.
And different enterprises operate differently.
Like you cannot just build an agent, fine-tune it and have it work at any type of company.
All those nuances are outside of the model.
And it's that context layer that we're trying to create.
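The context sharing across simultaneous calls on the same load, described above, can be sketched as a small shared store that each call reports into and reads from. This is a hedged illustration: `LoadBoard`, `CallSignal`, and the wording of the injected note are hypothetical, and a real system would sit on a shared database rather than an in-memory dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallSignal:
    """What one live call has learned so far (hypothetical shape)."""
    call_id: str
    carrier: str
    interested: bool
    last_ask: Optional[float] = None


class LoadBoard:
    """Shared state for all live calls about the same load."""

    def __init__(self) -> None:
        self._signals = defaultdict(dict)  # load_id -> call_id -> CallSignal

    def report(self, load_id: str, signal: CallSignal) -> None:
        self._signals[load_id][signal.call_id] = signal

    def context_for(self, load_id: str, call_id: str) -> str:
        """Text to inject into one call's context window about the other calls."""
        others = [
            s for cid, s in self._signals[load_id].items()
            if cid != call_id and s.interested
        ]
        if not others:
            return "No other interested carriers on this load right now."
        note = f"{len(others)} other carrier(s) are interested in this load; push harder."
        asks = [s.last_ask for s in others if s.last_ask is not None]
        if asks:
            note += f" Lowest competing ask so far: {min(asks):.0f}."
        return note
```

Each agent would refresh this note before its next turn, which is the "this is a hot load" shout across the floor, expressed as context.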
No, and that actually like we can talk about how actually doing the work and executing the work is what gets you that.
It's like learning by experience.
You do something and you learn and you explore that space.
of the context layer so that you can keep learning, no?
Really interesting.
So you talked about two different things there.
Pablo, you just talked about a very cross-functional workflow.
Luis, you talked about the complexity of really mastering sales, you know.
You guys started as sales and support.
And so what are some of the other surprises that you've had,
having started with more complexity, I think, than some of your competitors?
So one thing that we heard from one of the largest trucking companies recently was,
typically when we buy technology,
we see where we can apply that.
With you guys, we actually have a problem
and we come to you guys with that problem
because we know that as a platform that you've built,
we can pretty much build any type of agent for our operations,
from sales through customer service,
back office support and operations and even collections.
So some of the use cases that customers came to us
were, hey, we have a huge collections problem.
Can you build an agent to reach out to customers
via email or voice and collect money?
We're like, of course.
We talked about these use cases
with one of our largest supply chain companies
and customers where we need to call customers
to recover duties on parcels.
And today we're running campaigns of 20 to 50,000
daily outreach to customers,
collecting duties on parcels that otherwise
they would not get if they don't pay the duty on the parcel.
So that's
the sort of surprises, if you will, we've gotten from customers.
No, like, yeah, I also need to recruit drivers.
Can you do that?
We obviously can build an agent that not only just recruits drivers,
actually connects to the operation so that now they know they can service a truck
with a customer earlier because now they have a driver to move that.
So there's all sorts of interesting connections between the functions.
Maybe I'll give you another example.
we built an agent to reach out to maintenance shops to see where a truck or when a truck was ready.
You could just leave that agent in a silo and just have an agent that is proactively reaching out to those repair shops to see when the truck is ready.
Well, it turns out that the sooner you know when the truck is ready, the sooner you can put it in the market to sell it as capacity for your customers to actually move things.
So that was a very interesting realization of how sales in this case and maintenance were tightly connected.
So that is the context that we talk about.
There has to be an underlying context sharing across the different functions in a business.
So that the whole business optimizes for a global maxima, if you will, or a global minima, depending on what optimization problem you're trying to run,
versus just minimizing the problem in one function, if that makes sense.
And then maybe can you talk about, like, how do you discover these workflows?
Who discovers them?
Who builds them?
How do they get built?
I mean, maybe Luis, talk a bit about that.
Yeah.
So we're very forward-deployed.
So early on we understood that really to solve the customer's pain point,
we had to build software that adapts to their operations and not the other way around,
which is like the old era before AI was you build something and ask people to like run their business,
however you think they should be run.
but we think it should be the other way around.
So from the very beginning,
we started like hiring and building this forward deployed motion
with FDEs, forward deployed engineers,
like everyone is talking about them now.
But I think it's about like really being customer obsessed
and really focusing on the value add and their problem
and really sitting down with them
and going to their offices and learning what they need.
So sure, there's a lot of like synergies in the industry
and what you learn from a customer might be relevant.
to another, but very soon we realize that there's not a one-size-fits-all.
Even within inbound carrier sales, even within this workflow, in the enterprise,
maybe there's like a long tail where this might apply,
but enterprises operate very, very differently.
And that's why we build a platform that is flexible enough to adapt to anyone's
operation.
And it's because when we tried to plug and play what we built with one customer
into another one,
it didn't really work.
Like they want something different.
They want to change the procedure.
They want to call these tools.
They want to escalate whenever the carrier is not vetted
and someone else wants to do it automated.
So we really had to build almost like horizontal technology
because of the variety of all the nuances in this industry, no?
And that's how we create a platform that is not optimized for like specific tasks,
but more so optimized for like doing work.
So our primitives are around workflows and data and integrations and, you know,
SOPs, prompts.
You don't see like particular tasks being modeled because that,
that's almost too opinionated.
And customers don't want
their vendors forming opinions
on how to run their operations.
Like, they've been running it for a long time.
They know more about their business than I do.
I just come with the technology,
and I just want to, like, solve their problems, no?
Yeah, absolutely.
I feel like the forward deployed motion
has been crucial for AI application layer businesses.
And also it's prompted a lot of questions about,
what are margins, what is, like, a service versus a product,
but kind of where is the long-term alpha and moat?
Maybe walk us through how you think about productizing the work that your FDEs do,
which I think is kind of a unique strength of Happy Robot.
Where does the forward-deployed motion start and end?
Do you do custom work for one customer?
I would love to understand how that works.
Maybe let's start in the beginning.
Yeah.
I was the first forward-deployed engineer without knowing it, I guess.
Yeah.
Which is pretty much what any founder would do, you know?
Like, you just go to your customers, spend
a week there and just chase down the people that are actually doing the thing that you want to
help them automate, right? So I did that and I would be pinging these guys like, dude, you need
to build this thing because it's going to make my life a lot easier and it's actually going to be
replicable across customers because I've seen it. So please build it. And he would be like,
really? Do I need to build that? So there was that good tension between kind of that forward
deployed motion and the product team. So we kept going with these like separate worlds for a little bit
where I would be leading the FDE team
and the deployment strategists that we realized
at some point we actually needed.
That was a bit of a realization,
a bit of a parenthesis here.
We started just with forward deployed engineers,
and then the customers like,
wait, you have these people building,
but who is managing?
I'm like, I guess that's the deployment strategist
to some degree.
So the deployment strategist is a figure that scopes the problem
so that the forward deployed engineer
can spend more time on building,
although now what we see is that
the right FDE or deployment strategist,
they have to be very cross-functional.
Close parenthesis on the type of profile.
So what ended up happening is
we were too disconnected
from like the forward deployed world
and the product team.
So we realized that that needed to be
part of Luis's world
so that the FDE team
would actually be an extension of product,
which is what they should always be.
It's an extension of product
so that we can implement product faster.
We can gather the feedback faster.
from the customers, and hence capture that context faster than anyone else.
So it's a bit of this iterative loop that we, and Luis really, realized we needed.
Yeah, I'll add to that.
I mean, if we go with like first principles, what are we doing?
We're deploying agents across different functions and channels in the enterprise.
So our product is built for the deployment of an agent.
Like we really understand the deployment lifecycle because we work very closely
with our customers and we are actually deploying these agents,
and something cool that happens is that you have,
to a certain degree, your user in your house,
because we're building for the FDE for the most part.
And it's not entirely true.
Of course, some enterprises, they really appreciate having a platform
and we can talk about that later,
about how interesting the mix of coming with a platform
they can also use and an FDE that they can trust.
It's actually something very rare, and they mentioned that.
But I guess to the point of like the deployment lifecycle,
we really understand what,
what it takes to deploy an agent, no?
There's a scoping phase, there's a building,
there's testing, there's monitoring,
there's like a self-learning loop.
So I guess the point is
every feature we're building in the product
is optimized for that deployment lifecycle,
and the only way to know if that works
is being very close to the deployment, no?
And FDEs are doing these deployments,
or they're getting feedback from the other team.
So actually,
more than the FDEs being very close to the product,
I think it's more so like our product
is a combination of a platform
and a forward deployed motion,
and it would really not exist.
And there's like this conversation
about like services and stuff.
The difference is that the forward deployed engineers
are like catalysts or accelerators to value,
but what we're leaving with the customer
are agents running.
There's a platform.
Once the FDEs have done the work,
they leave a working thing.
So it's almost like you spend that time,
you deploy a thing,
but you're not delivering like an output
that the FDE has done.
You're literally delivering the agent
working on a platform.
So it's a very
different distinction of pure services versus a forward deployed implementation plus the
platform running the value forever, hopefully.
Awesome.
I feel like another thing that has been a topic of discussion is kind of what is the value of
systems of record in the AI era?
Does every application company need to become one of those?
And I know you at Happy Robot have a view on kind of systems of record versus maybe systems
of action or systems of execution.
So we'd love to talk through
kind of your view on that topic.
Maybe I'll start quickly.
We see ourselves as that layer of execution,
really.
That's where the magic happens.
You have to start doing the work
to capture that context.
So it's very important that we start
with executing work,
with getting the thing done,
implementing one agent,
implementing the second agent,
connecting them through that context layer.
But the context layer happens
after you're actually doing the work.
There, more than ever,
the importance is on the execution layer.
So for us, and Luis can comment on that more,
that data piece is a very important piece,
but it happens after the agents.
So what we've built is Twin.
Twin for us is really that data layer
where we connect the systems of record of the customer,
your CRM, your ERP, your transportation management system,
whatever it is, your Snowflake instance,
and where agents can also populate
or store their own context.
It's almost Happy Robot native data points.
So we've basically created this data layer
that holds both customer records
and Happy Robot agent-created records, if you will.
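A data layer of the kind described above, holding both synced customer records and agent-created context, could be sketched as follows. This is only a toy model under loud assumptions: `ContextLayer`, `sync_record`, `link`, and `add_note` are hypothetical names invented for this example and say nothing about how Twin is actually built.

```python
from collections import defaultdict
from dataclasses import dataclass

Key = tuple  # (system name, record id), e.g. ("tms", "SHP-1")


@dataclass
class AgentNote:
    """Context written by an agent rather than synced from a customer system."""
    agent: str
    text: str


class ContextLayer:
    """Toy data layer: customer records, links between them, and agent notes."""

    def __init__(self) -> None:
        self._records = {}               # Key -> dict of fields from the source system
        self._links = defaultdict(set)   # Key -> set of related Keys
        self._notes = defaultdict(list)  # Key -> list of AgentNote

    def sync_record(self, system: str, record_id: str, fields: dict) -> Key:
        """Mirror one row from a system of record (CRM, ERP, TMS, ...)."""
        key = (system, record_id)
        self._records[key] = dict(fields)
        return key

    def link(self, a: Key, b: Key) -> None:
        """Record that two entities in different systems are related."""
        self._links[a].add(b)
        self._links[b].add(a)

    def add_note(self, key: Key, agent: str, text: str) -> None:
        self._notes[key].append(AgentNote(agent, text))

    def view(self, key: Key) -> dict:
        """Everything known about one entity, ready to hand to an agent."""
        return {
            "record": self._records.get(key, {}),
            "linked": sorted(self._links.get(key, set())),
            "notes": [n.text for n in self._notes.get(key, [])],
        }
```

The design choice worth noticing is that agent notes attach to the same keys as customer records, so what an agent learns on a call sits next to the row it concerns instead of in a separate silo.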
Yeah, I think there's an interesting tension
in how much time you need to spend
ahead of deploying the agents on cleaning the data
versus just deploying the agents
and cleaning the data through doing the execution.
And I think it's a mix.
I think what we realize is these agents are creating a lot of information that really hasn't been captured before.
And it really doesn't fit in any of these systems.
Because it's more like high dimensional, semantic, almost like memory intelligence.
So I guess the point is many enterprises, I guess, are waiting to clean their data sources
so that they can power this workforce of agents.
And I think by doing the work and by actually having agents,
execute the work, you're going to clean the data as you go. Because humans are great, of course,
but they have a lot of limitations. Like they can't be in two places at the same
time. They drop a lot of threads. They're not very diligent in putting the data in the right system.
Like sometimes you forget, sometimes you write it down. So actually you can clean all your data
sources and then you can still run with humans and it's actually going to probably get dirty very,
very soon. The good thing about it is very deliant where it puts data. So it's through the process
of executing work, you're going to progressively start cleaning all your data sources because
you're going to get visibility into all these things, no? So not only are you connecting
the data, like the systems of record, like rows and columns and different entities, it's more
so creating relationships across them. So again, the shipment in the TMS is just a record. Like,
that's really not the IP. That record might exist in many different enterprises. Like, it's
latitude, rate, whatever it is. Like, that really doesn't mean anything. What means something is,
how an enterprise is going to, what the enterprise is going to do with that.
Like once it gets into the system, how their processes are built, how their humans are going to
deal with that.
So all that is really not in the system.
It's more so, like, in people's brains. A lot of this context is like tribal knowledge
the operators hold.
And to a certain degree, it's super fragmented, no?
So actually, by doing the work, we're going to learn a lot about this more conversational
record or intelligence, but also we're going to start, like, cleaning the end systems
of record just by doing the work very consistently, no?
You know, Luis, that's such an interesting topic
because it's sort of like my intuition,
my naive intuition is that information about execution
is maybe ephemeral or has value that decays over time.
But I think what we're describing is how the value actually compounds over time
and maybe that actually enriches the information in the system of record.
Which one is true and why?
Yeah, I mean, I think you're,
so what you're doing by doing the execution, as I said,
is one, creating a better understanding about the relationships
of all these different entities.
So you're starting to connect the TMS, the CRM, the ERP, the Snowflake, the Notion page you have, the docs. Everything is so disconnected.
You're going to start connecting it, but you're also going to start enriching the relationship to how to deal with those particular records, no?
So I guess the compounding comes from like two angles.
One is having clearer or cleaner data sources, like literally the data points, which is going to make everyone's life easier,
but also understanding how to relate those different entities across the business, no?
So I think it compounds from multiple angles.
And then how much are you, initially I imagine you're capturing the way work is done on day zero,
but over time you're changing the way that work is done.
What is that interaction like?
Yeah, and I think if you think about it from a context perspective,
the FDEs are really just seeding this state graph.
Like if you try to model the business as a world model or a model of the business,
you need to seed it somehow.
You can't just put the agent to work from zero.
But then there's a point where there's a flywheel
where like the second and third and fifth deployment
takes less time.
But I think the FDEs are the ones going to the business
and starting to seed all this context layer
and actually leaving it there for learning
on the second and the third one.
So there's always this cold start problem.
And we talk about like fine tuning SLMs in the future
like reinforcement learning and all that stuff.
I think that really doesn't make sense
if you didn't have the basics
and you don't have the first and second
agent in production. And that's why
FDEs are so important to, like, actually start
this flywheel. They would go there,
interview
the operators, get all the specs
and actually put those first agents to work.
And from there, the system is going to start
learning and getting all these contexts and sharing
it across functions and across channels, no?
I feel like if I was trying to train someone to do my job,
the context that they need does not live in Salesforce or any traditional software system.
It probably would live in meeting transcripts and emails and casual conversations
and even things that software can't capture, hasn't captured.
I know you guys have this concept of the pyramid of complexity and how starting with some of
these primitives allows you to get into more and more complex work over time. Maybe we could
walk through some examples of the type of work that Happy Robot agents can do.
So the pyramid of work, as we define it, is essentially the easy, repeatable,
low-hanging-fruit type of work at the bottom.
Think about an easy B2B sales call,
an easy customer service type of operation,
some payment collection type of work,
kind of the highly repeatable,
easily automatable type of work.
One thing that we've already talked about here
is how those actually interconnect,
which is very important.
Like you might have these disconnected or siloed functions today
in a company, but very important to keep in mind that those are actually very connected.
And going back to the pyramid of work, what you have at the top is the deep, complex work
that is highly strategic, that is almost the information that the CEO of that company needs
to make decisions.
So when we think about the work that we're doing with our customers,
we might start somewhere at the bottom of the pyramid, but very fast
we're going up the pyramid by combining those agents from sales and customer service and
collections, combining the context, as Luis was saying, so that you build on top of
every layer and every decision you make is based on more context across the board.
When you're talking to that customer that has a complaint, you might want to remember that you
already upsold them last month. And sometimes human agents might not even remember that.
When you're talking to a driver that had an issue at his delivery two weeks ago,
you might want to remember that from the operations team, because maybe now you're more lenient
with the rate that you are giving them.
Those things are highly interconnected and you need to build on top of them so that you
grow into the strategic type of decisions.
Yeah.
And I would add that, in my opinion, the real economic leverage and value for the
enterprises really lives at the top of the pyramid.
Like those are the decisions that are less volume.
Like if you think about it at the base, you have much more volume.
At the top, you have fewer decisions that are actually going to drive the outcomes of the enterprises.
And we keep talking and hearing about like outcome-based pricing or consumption-based pricing and whatnot.
I think really reaching the top of the pyramid is where you make decisions that drive the revenue of the company.
But you cannot start at the top.
Like those decisions are highly contextualized.
The same way you can probably not be the CEO of a company if you don't understand anything
happening below. So actually the only way to get to the top and make those decisions is by actually
capturing all the context underneath. And that's where everyone is getting stuck. Like,
everyone is focusing on that base. It makes sense. It's already to a certain point being
commoditized. Like, those are simpler tasks and people keep talking about, like, you know,
AGI and general models being able to automate that work, maybe. But the point is,
if you get stuck at a corner of that base, you're never going to climb that pyramid of complexity,
because in order to climb, you need to actually capture context across channels and across functions.
We've mentioned this a bunch of times now.
When I was explaining the example of a negotiation, I was talking about phone calls,
but what if you get an email from another carrier actually putting an option?
Like all of a sudden, what if the voice agents don't know that there's an email coming through for the same load?
Like it's the same information.
It doesn't matter the channel, no?
And also what you learn from that carrier is the same, like the same customer you have or the same carrier you have
when you're tracking a load or doing all these things.
So if you focus on automating this part of the base,
that one corner for everyone,
you're probably not going to be able to climb this pyramid of complexity.
So it's about creating a unified understanding of the business
in order to start climbing that pyramid of complexity
and going to like the deeper complex decisions
that actually drive economic value for the enterprises.
Really interesting.
Maybe you guys talk about how that opportunity
has set you up to be pulled into other markets.
Now we're starting to see pull in financial services, utilities, telecommunications.
So why is the work that we've done in supply chain applicable to these other markets?
With DHL, we've deployed over 40 agents across 80 countries,
agents that are sharing context across regions and functions.
What I realized, what the team realized when working with DHL and many others like Kuehne+Nagel
or CMA CGM, second largest ocean carrier in the world,
was, wow, this is not a supply chain specific problem
that we are solving.
It's actually an enterprise coordination problem.
When we think about ourselves as a startup
of like 120 people, you know,
we might have like some like miscommunications here and there,
but really we don't have a coordination problem in the company.
You can easily reach out to the people involved
and you just ask questions.
That doesn't happen in a company as big as DHL
or FedEx or Deutsche Telekom or T-Mobile or Telefónica,
these massive enterprises that have hundreds of thousands of people
just coordinating work.
We recently started working with one of the largest utility companies in LatAm and Europe.
They have over 10 million customers,
tens of thousands of employees across the world.
How on earth are they going to know, in real time,
how to best serve their customers when they themselves
don't even have the tools to interconnect quickly
and to share context across them quickly.
So what we realized is we were not really solving
for a supply chain problem.
We were solving for the coordination problem of the enterprise.
Think about a utility, receiving a customer call
with someone complaining about a leaky boiler.
First of all, you should already know
that that customer already had the problem 10 days ago.
That's for sure.
Second of all, you should also know
that the technician you sent was not the right technician.
So now in this second attempt to fix that boiler,
you need to send the right technician
and the technician that is best suited for that particular boiler type.
So that is now on the operation side potentially,
or you could frame that as an operation type of problem, no,
versus when I started with the customer calling in,
that's more of a customer service type of problem, right?
Again, to the point of how these functions are interconnected.
But what happens after that technician is being dispatched to the customer's house?
Well, now you have an additional layer of coordination between a customer and the technician and the company that is lending the trucks to send that technician.
That is that coordination problem that we saw in these industries in the real economy.
Operationally complex businesses like utilities, oil and gas, telcos.
So we're now seeing this pull from the market.
We're already working, in POCs, with three of the largest telcos in the world.
We're being pulled into home and auto insurance because the sort of coordination problem of
dispatching a tow truck to help you when your car breaks down is very similar to when a trucking
company has a broken truck.
Those sorts of problems are repeatable across the real economy, if you will, when there's this
coordination problem across customers, partners, and your own employees.
I think this market, too, has, you know, broad-based voice-first customer support agents.
There's the models themselves in voice trying to move into business, being agentic.
And then there's more verticalized solutions that can move more horizontally.
How do you think about, like, what is a Happy Robot-shaped problem and where does that expand
into over time versus what are problems that are maybe less interesting for you to tackle
longer term?
Yeah, I would say highly communicational. And actually more than communication,
like an interface to the external world,
meaning also that browsing a website to retrieve the ETA of a shipment is some sort
of interaction with the outside world.
Voice to a certain point is a soft API as we were talking about.
Same as an email is a soft API or a website is a soft API.
Like when you're exchanging information between systems, of course an API programmatically makes
more sense, but sometimes that doesn't really, it's not the case.
So however we can help move the flow of information between systems via voice, email,
browsing a website or whatever it takes.
And also when there's this high complexity
when the decisions are like contextualized
and the SOPs are not super clear, no?
I think that's the bigger point where sometimes
the enterprise doesn't really know themselves.
Like people don't know what they know.
You can ask them what they're doing and it's like,
well, I'm doing this, but they really don't know
the specificity of what they're doing.
So it's actually through doing this execution of work
that we're learning a lot about how these companies operate.
So when the SOPs are not clear and it's like super communication driven,
I think that's where we shine.
Really cool.
Luis, I want to actually pick your brain a little bit about the voice models themselves.
Many of the other companies that we may overlap with rely on ElevenLabs,
which is a fabulous technology. We're, of course, investors in Eleven.
You guys have done a bunch of your own model work.
Why, what are the kind of tradeoffs of, you know,
a vertical model versus a horizontal model?
Maybe take us through a bit of that.
Yeah, I mean, Eleven is great.
We actually used them for a long time and they're great, of course.
I guess to the point before, we were always focusing on the limiting factor and seeing what we need to do to solve the current problems of the market.
I guess we started very soon realizing how there was a problem in turn-taking detection.
Like end of turn is probably the biggest problem in voice AI.
And we realized that very early on because everyone was focusing on making the latency
lower and making the voices more realistic.
And that's fine.
But I don't really think that's the bottleneck right now
to deployment of these agents, not even the intelligence.
Like model capability is high enough.
Like we're using models in certain uses
that were released like two years ago.
Right.
Like sure, like everyone is like pushing the frontier
and increasing context windows and making more reasoning budget.
And PhDs doing customer support now.
Exactly.
Like everyone is waiting for someone to release like a 10 trillion
token context window to like do whatever.
Like, we were using models from one year and a half ago to call drivers and ask if they're going to make it on time.
You don't need PhD-level intelligence for that.
I guess the point is, as we make models faster, we realize how important the conversation handling and the flow of the conversation is.
If you think about it, the faster the models get, the more you're going to interrupt.
And the harder it's going to be to have a normal conversation.
And actually, if you think about it, the bigger problem in the coming years for, like, voice AI is really knowing when to talk and when
not to talk. And sometimes you need to speak fast. Sometimes you need to wait because a
person is not done talking. Sometimes you might need to stop and think. And that's something
that the models are not very good at today, like really stopping and knowing when a question is
hard and when they need to probably trigger a reasoning thread that is more async and just
think about it and say something like, um, and really be thinking, not something you put in the
prompt because it's cool, but just literally have them think, no? So it's all about understanding
the conversation, when is it my time to talk and what should I say, no? So we invest a lot in
this end-of-turn detection, interruption handling, filler detection, background noise. Like if my mom is
speaking at the back of the car, the bot doesn't need to know or interrupt, no? So it's
understanding all these nuances in the work more than making the latency faster, which of course
can be improved, or making the voices more realistic, which again, I don't think is
the limiting factor today.
Yeah, it's interesting. It feels like we're at the point where
the models are so good that as they get better, especially with voice, it actually takes us
further away from humanness in some cases. Like the latency is too fast, or the interruption
handling is too sensitive. Like if someone says a filler word, you don't necessarily want the model
to react. You want it to keep talking. I feel like Happy Robot has always been at the forefront of
kind of humanness. How do you think about how that shapes product development?
How do you think about what that looks like five years from now?
Do you want the customers, the end customer, to know they're talking to an AI?
Do you want it to feel like a perfectly human experience?
Where does that go?
I think it's super important that the experience remains as human as possible, even if you say
that it's an AI.
We're now live with hundreds of thousands of end customers or end users talking to our agents,
not only via email or chatbot or website, whatever it is, but mostly through voice.
Like voice is one of our primary channels.
And one thing we saw is even if you say it's an AI, even if you disclose at the beginning,
hey, Mr. Driver, I'm an AI agent.
I'm calling you because I need to know where you are.
At the beginning, they might be like, what did you just say?
But then very, very soon they forget.
And they forget in a good way because they are now just having a normal conversation
with a system that is smart enough to not make their life or their day even harder than it was already before.
So I think the conversationalness, the human-like capabilities are very important to make technology work.
So for us, the product is shaped around that experience.
Some people were telling us at the beginning,
like, no, you don't need these agents to sound super human.
Why are you investing so much on the text-to-speech?
Why do you care if the agent just mispronounces a load number, a shipment number?
Like, what do you mean?
That's the whole point.
You want the experience to be as good as possible.
So it's very important that we continue building towards a really human-like experience.
Again, voice is obviously a primary channel for us, but even across the board, like, everything should feel human.
Everything should feel like just a very natural exchange of information, as we were discussing before.
We're just trying to build an AI workforce that is almost colleagues to the employees in these companies so that they almost collaborate together.
That is very important to the DNA that we're building in Happy Robot.
It almost goes with the name, if you will.
Like there's that human-like sense in the product we build for our customers.
Yeah.
Well, you know, Pablo, to build on that, it also strikes me that you make the employees,
the human employees of many of your customers, also more human.
Insofar as, you know, I think it was Keeley was telling me a story about DHL and Home Depot,
and the folks that had previously spent all week on the phone trying to just schedule deliveries with Home Depot
were now taking folks out for dinner and building deeper relationships.
Maybe talk a bit about what is the future of sort of humans and agents working together in these enterprises.
It's a bright future.
It's a very cool future because a lot of the work that we're helping our customers automate is work that no one really wants to do.
Think about collecting payments from customers.
Would you really want to be calling your customer to be like, hey, you know, this invoice is past due, man,
are you going to pay?
Who wants to be doing that, right?
Who wants to be calling a list of dormant accounts to see who would want to ship with us?
Or who would want to be picking up a call from an angry customer whose delivery was late
or whose technician broke the boiler or whose technician didn't fix the router?
Those are the sorts of problems that agents can help your human teams alleviate so that, again,
your humans can actually take that steak dinner with your customer and work on building up the relationship,
not on fixing the operational problems.
That's the problem space we're looking at.
The operational complexity that these businesses have that no one really wants to do, but that has to get done.
Thanks so much for joining us today, guys.
We know you are very busy serving a lot of very happy customers, and there's so many more exciting things to come for Happy Robot.
Thank you so much.
Thank you for supporting us all the way.
Thanks for listening to this episode of the a16z Podcast.
If you like this episode, be sure to like, comment,
subscribe, leave us a rating or review,
and share it with your friends and family.
For more episodes, go to YouTube, Apple Podcasts, and Spotify.
Follow us on X at a16z and subscribe to our Substack at a16z.com.
Thanks again for listening, and I'll see you in the next episode.
As a reminder, the content here is for informational purposes only,
should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund.
Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast.
For more details, including a link to our investments, please see a16z.com/disclosures.
