Daybreak - India's AI voice agents can detect your stress, catch your bluffs, and never have a bad day
Episode Date: April 22, 2026

You pick up an unknown number. A bubbly voice starts selling you a credit card. You hang up in seconds. Except now, that voice may not be human. AI voice agents are already live across banks, e-commerce and healthcare platforms in India, with startups in the space raising over Rs 280 crore. But behind that perfectly polite pitch is a more complex rollout, from pilots and script tuning to adapting across languages and dialects. So, what’s driving this sudden funding spree, and how are companies actually deploying these AI callers in India? Tune in.

Daybreak is produced from the newsroom of The Ken, India’s first subscriber-only business news platform. Subscribe for more exclusive, deeply-reported, and analytical business stories.
Transcript
We're all familiar with this, right?
You get a call from an unknown number, you pick it up by accident.
A desperately bubbly voice immediately starts selling you, say, a credit card.
You hang up in less than three seconds.
Everyone has had this experience and everyone finds it annoying.
But because it's a process that still works and is pretty lucrative for banks, it continues.
Except over the last year or so, there is a new voice at the end of the line.
It's patient. It doesn't mind whether you cut the call in three seconds or ten.
When it answers any questions you may have, it's incredibly polite.
It's always on script, never late, and it never has a bad day.
If you hadn't guessed already, that voice is not human.
It's an AI voice agent.
And interestingly, in India, any bank, institution or company using them is actually under no obligation to tell you this.
These AI voice agents are already live across several Indian banks, e-commerce and healthcare platforms.
Think CRED, Practo, Flipkart, and even Snapdeal.
It's a real moment because the startups building them have raised more than 280 crore rupees.
But as The Ken's Manmay found while she reported on the story over several weeks,
the technology is more complicated than just a nice-sounding robot voice.
For example, the AI can detect stress in your voice, though it still can't reliably tell the difference between a stressed Maharashtrian or a stressed Tamilian.
If you bluff about your salary, it can catch you because it has access to your account data.
And no matter how good it sounds, it seems like customers can almost always tell that it isn't human.
Funnily enough, it's the perfection that gives it away.
It's an exciting proposition for any business that relies on inbound or outbound calls. But deploying them in India is no mean feat. Compliance issues, languages and dialects,
and questions of customer trust mean that this new, booming sector is navigating several
interesting grey areas. So what sparked the sudden funding spree in the space, and how is this
new tech being adopted in India? Manmay joined my co-host Snigdha in the studio to discuss
the rise of AI voice agents and what it means for India's BPO industry.
Thank you so much, Manmay, for joining us on Daybreak.
So, you know, there's an argument that voice AI actually may be more honest than a human sales caller,
because, you know, there's no pressure, it also does not have a bad day,
and also doesn't go off script, right?
At any point in your reporting, did you feel that maybe Ananya,
who's the AI voice caller in your story, may be a better option?
Hi. So, I did get to talk to Ananya for my story, and I got to talk to a couple of other voice AI agents as well.
And you're right in the fact that they don't really get tired. They never get cranky.
They never sound different when they've had two cups of coffee as opposed to, you know, very early in the morning.
What I do feel, though, is that while that may make them more effective at purely doing the job,
it doesn't necessarily make them better in terms of user experience.
See, when I speak to customers, people who maybe have an account at the bank
or people who are trying to book an appointment through PRACTO, for example,
these people are used to instinctively interacting with a human being.
And yes, for certain tasks, like maybe getting their insurance approved
or doing some minor checks, they're okay with voice AI agents being there,
but when somebody is selling you something,
they would really prefer to have a human
on the other end of the line.
And they can tell, by the way,
that it's not human.
Like, no matter how realistic it sounds,
I think the perfection of a voice AI agent
is really what gives it away
because they're not mad enough at you, basically.
That's so interesting that you say that, you know,
because sometimes when I am recording myself on Daybreak,
I notice these small little things,
maybe a breath taken in at the wrong place.
And earlier, I think I would find myself editing it out.
But now I'm like, no, I don't want to sound like an AI voice.
So, yeah, that makes a lot of sense.
One of the first things you mention in your story that really stands out is how drastically the amount of investment into these voice AI startups has gone up, right?
You said from seven crore rupees in 2023 to nearly 300 crore now.
Was there a specific trigger that caused this kind of insane amount of investment?
I would say that most of it has been triggered by how technology has evolved.
Voice was always sort of supposed to be the next step for AI agents after text.
And right now we are kind of at an inflection point in this sector because latency has improved
and speech quality has become decent.
So it's actually getting to be realistic
to imagine that a little AI robot
can handle your inbound and outbound calling.
Maybe you add a little supervision to it.
Eventually you can just check that off your mind
and the AI can handle all your calling systems for you.
So I think the main aspect is the technology
and it's also that a lot of really major companies,
for example, Google or Nvidia,
are themselves shifting a lot of resources into voice.
It makes sense.
And India, as we know, is a country with a lot of languages,
with a lot of dialects.
Maybe a chatbot which just speaks Hindi and English
wouldn't be able to reach as many people as a voice AI agent,
which can speak a lot of dialects would be able to.
So that's kind of why the funding, I believe, has increased in the sector.
You know, speaking of languages,
in some ways India seems like both
the perfect market and the worst market for this, because we have 22 official languages,
many thousands of dialects, and also some of the strictest telecom regulations in the world,
right?
How much of the pitch that these startups are making to investors actually holds up
under a stress test?
Okay.
So that's a pretty interesting question.
And in my story, in my research, I tried to look at it from a
bank's perspective, the banks hiring these voice AI companies, rather than from a startup perspective.
Because all the startups told me was that, yeah, we can deploy languages.
Maybe one day we learn to speak Tulu and let's just go ahead with it.
I did speak to someone from Kotak Mahindra.
And one of the interesting things they said was that their first question when they deploy a
voice AI agent is, can this thing get us into trouble?
So anything that can talk to customers has to follow RBI and TRAI rules,
they have to stick to approved scripts,
and they have to record and log every call.
And more importantly,
the voice AI agents have to have a provision
which lets a human step in instantly
if something feels off.
So the hands-off future, which we do imagine,
you know, it's not coming any time soon.
So if a voice AI tool can't survive
a compliance review or an internal audit,
the quality of the demo becomes pretty irrelevant.
Which is why, inside a bank,
voice AI is first evaluated as a control
and governance layer, not as a shiny new toy or a shortcut. I think, yes, we do have an
advantage in the fact that we have a lot of dialects, and people in the country would
respond to maybe a voice request better than they would to a text. But like you said, we do have
strict regulations. And when companies hire these voice AI agents, that's one of the first things
they check. Also, Manmay, another interesting thing is the emotion bit, right? Apparently, some of the
more advanced voice AI systems claim to detect customer emotions in real time, whether it's
frustration, hesitation or confusion, and then they adjust their tone accordingly. How close is that
actually to reality in what Indian startups are deploying today? And did it raise any
concerns for you when you were reporting on this? That's actually one of the more
interesting aspects I found because that was the first time I realized that,
This is exactly what my voice sounds like when it's stressed.
This is exactly what I sound like when I'm a little bit sad.
Because, again, there is nobody who would sit in front of me and tell me you sound stressed right now.
But a voice AI agent catches that, and it kind of modifies its tone.
I think that's more in practice for loan recovery voice AI agents because they are the ones who have to chase up people for their EMIs.
And if someone says something like, oh, no, I can't, because, you know, I didn't get my salary this month,
or I have to pay double rent this month, or something like that.
The AI agent is technically supposed to modify its tone and everything accordingly.
What I did find interesting was that it's not just looking out for your voice inflections.
For example, in the same case that I mentioned, in case of a loan recovery agent,
because it's being deployed by a bank, the agent also has access to data about whether
salary has really been credited in your account.
So yes, it can catch whether you're stressed out in your voice,
but it can also tell when you're bluffing because it has data.
So it's like if you're in an exam and you're caught cheating.
The invigilator can be a little bit sympathetic that you were stressed,
but you were caught cheating anyway.
So I don't know if I find it concerning that an AI agent can tell
whether you're stressed out and that it can catch voice inflections.
Primarily because it's still kind of a very new, very developing technology.
People express stress in very different ways.
And another interesting aspect I found in this one
was when I was speaking to someone who had worked in call centers for ages
and he was telling me that people culturally express stress
through their voice in very, very different ways.
A person from Maharashtra would sound different.
A person from a certain town would sound different.
A person from a different income bracket would sound different.
So if a voice AI can actually encapsulate that
through a very, very diverse range of people,
I'd say that's great tech.
I just don't know if it's here yet.
Manmay, in your story, you also quoted a former executive at a major business process firm saying something quite blunt: that voice AI does not actually get its own budget.
It has to displace something that already exists, right?
How much of the optimism in this market is actually built on that inconvenient truth being ignored?
To be completely fair to everybody who's funding voice AI, I don't quite think they're ignoring the truth.
One of the things I sort of learned when I started reporting this story,
and it's amazing how every story gives you perspectives from very different people,
is that when a VC firm or an angel investor is funding something,
it's not because they expect the company to last forever and ever.
They do expect the company to do very well in a short period of time.
And after that, maybe the company gets acquired, maybe it merges,
maybe it becomes a feature in a bigger company's process.
and, you know, that's completely fine too.
And the company makes money, the VC would make money, and that would be that.
And as for what the WNS executive was saying in my story,
I have found that sort of skepticism to be common across people who are hiring voice AI agents as well.
Like, for example, the bank which has deployed Ananya in its system.
A friend of mine works with a startup
that deploys such voice AI agents.
She introduced me to her boss,
and her boss pretty much told me
that the rule they follow is differentiate or suffocate,
meaning that your voice AI agent
has to either provide you a very different service
maybe it has to be a very efficient voice AI agent,
maybe it has to require a minimum amount of supervision,
maybe it has to, you know, do a lot of auditing
processes along with, you know, just calling customers and stuff.
But this isn't something which people are blindly optimistic about.
They are actually taking it with a pinch of salt and they are really evaluating all
the options they have before them.
Got it.
Also, you know, startups talk a lot about scalability, right?
But in your reporting, you describe something that is very different, because, you know,
there are so many pilots going on.
There's custom scripting.
You know, you're constantly tuning the model.
Then, of course, like you mentioned, there are so many compliances to meet.
At what point does that stop being a software business
and become, you know, a services business?
It's interesting that you would call it a services business,
because I remember in the edit meet, when I had pitched this story,
one of the feedbacks I got, when I was expressing a little bit of
skepticism about startups, was: why is this even a startup?
Why is this not just a feature in an existing company already?
And like you said, the process of deployment is complicated.
It's not just, you know, you join and suddenly your functions are taken over.
Like I spoke to Grey Labs, a voice AI startup, and one of their founders told me that they start with a pilot.
Okay.
You take an enterprise's existing call recordings.
You train an AI agent on those and you deploy that in a controllable way.
Once they are comfortable with how the bot sounds and behaves, they run a slightly larger experiment with maybe a thousand customers.
And to get to that level, for the company to be comfortable with how a bot sounds,
is a process in itself, because, and I think I've mentioned this point in my story, bots have to
sound different for different types of businesses. A bot which is maybe working with banking customers
has to sound different. A bot working with e-commerce has to be more familiar with regional dialects,
more familiar with processes of returning a product. And maybe a bot working in, say, a medical
field has to be more familiar with practices of confidentiality,
with not asking maybe weird questions like,
hey, where's your rash?
So, once a bot does that, once it's approaching a thousand customers,
and at that stage the agent is generating leads at roughly the same level as human callers,
then they deploy the voice AI solution fully.
So, yes, it is quite an extensive process that actually goes in there.
So also, Manmay, in your story, it's very clear that
there seems to be an overcrowding problem across this space, right?
A lot of these startups seem to be solving nearly the same problems for the same type of
clients.
What generally separates the ones which are likely to survive from the ones that are just burning
away investor money on the same pitch?
So, you're right when you say that there is an overcrowding problem in this sector.
I would make the argument that whenever a new technology does come up, there is always a phase
where there is an overcrowding problem in a sector.
I mean, that is exactly what happened to grocery startups.
And I think this is an example I have mentioned in the story,
that this is what happened when the idea of the cloud came up.
There were a lot of companies bringing in tech,
and only a few of them exist right now.
Most of them have been consolidated under big names.
So consolidation, I think, is the future.
And interestingly,
even though voice AI startups right now are saying that,
you know, we have this level of technology,
we have that level of technology,
we have already handled more than 200 million calls
in the past year, and this is how we're going to go forward,
I think what differentiates a company,
I'm not even saying a startup at this time,
what differentiates a company is the amount of distribution that they have.
And that distribution right now does lie with call centers,
with BPMs.
So eventually I believe there's going to be a point where the best startups,
the ones with the most adaptable technology,
the ones with calling adjacent technologies like handling workflows,
those startups may survive on their own a little bit.
Maybe 10 or 15 of them would maybe try something new,
five of them would scale it.
But most of the other voice calling technologies would get absorbed by BPMs,
would get absorbed by large-scale enterprise players.
and that's how it would function.
So, you know, in that case,
you obviously spoke to a lot of these
voice AI startup founders, right?
When they're raising a Series A in this space,
what is their end goal?
Is it actually to maybe have an IPO at some point,
or just to get acquired?
I did try asking that question,
both directly and indirectly.
And in both circumstances,
I was met with the statement of, bro, we just raised series A.
We've just come up with the technology.
We have just started our journey.
We have no idea where exactly we're going to end up.
But what I can say, though, is that most of these startups are built on really good tech.
And they are really, really enthusiastic about the work they do.
And objectively speaking, these are good companies.
Obviously, the market dynamics of a space with this many companies
are not something that one startup can control.
And eventually a startup would do whatever is most profitable for them, whether it's
being acquired, or whether it's diversifying into a new technology so that you become
a major player on your own.
That's really up to them.
But for now, they are beginning and they are very excited about it.
Speaking about excitement, you know, there was this Swedish fintech company that you spoke
about in your story that was also very excited about its voice AI assistant.
It announced,
and this made headlines,
that its AI assistant was doing the work of 700 customer service agents.
And then, very quietly, they went and started hiring humans again,
because they realized that the quality just wasn't there, right?
So what's happening in India?
Are we learning from these experiments that have already happened in the West?
Or do you see us repeating the same mistakes again?
Yeah, that's a pretty interesting story. It's about the company Klarna. I think it was a few years ago when they came up with
their own voice AI agent and they suddenly decided they don't need their people anymore. They fired
a lot of people and deployed voice AI. And obviously they had to backtrack after that. But I do see a
very healthy level of skepticism among, you know, adopters in Indian enterprises. And
As one of my startup founder friends told me,
nobody is firing entire floors of call center agents right now.
They don't see them getting fired in the future either.
There are functions which are shifting predominantly towards voice.
I can tell you that.
Like maybe credit card sales, for example,
is something where you do have a very fixed script.
You don't need a very particular read on the customer's situation.
And again, this is a volumes game.
You call a lot of people.
You get a chance of converting a lot of leads.
That is somewhere a voice AI agent would function very well.
So I can see some functions shifting away from human roles to maybe AI roles.
But yes, we aren't repeating the same mistakes.
And I don't think we'd ever be in a situation where we fire 700 people.
Then we have to bring them back with a very nice sorry.
We don't have to do that.
Right.
That sounds very hopeful.
Thank you for that ray of sunshine in these bleak times of AI,
where I'm thinking about,
am I going to lose my job because of voice AI?
But, okay, on a serious note,
in many countries,
there are very strict legal requirements
that require voice AI agents
to identify themselves to the caller, right?
We don't have that rule in India yet.
Did you encounter any conversation
in your reporting about whether customers
even have the right to know
who, or what, they're speaking to?
Okay, so we do have an interesting precedent in India, where there are obviously rules from the government saying that AI-generated
content has to be marked clearly as AI-generated content.
But I did speak to a lot of lawyers and IT experts for this story, and for a bunch of my other stories as well,
who make the argument that technically this isn't AI-generated. This is an AI bot speaking a script
which you have generated for it.
So there is a little bit of a gray area
as to whether a voice AI agent has to say that it's a voice AI agent.
For example, when Ananya is talking to you
and she's asking you, have you thought about getting a credit card?
She hasn't generated that sentence.
Like, we are telling her to ask that.
It's essentially a company replacing a human talking to you on the phone
with a bot talking to you on the phone.
Then again, do I feel like ethically people should know
that they're talking to a bot?
Yes, obviously.
Like, the more information you have, the better.
And there are people who are obviously very good at telling
that you're talking to an AI agent, like I explained before.
But we do exist in quite a few gray areas in this country.
And I do think we are coming up with new AI regulations, with AI rules to get there.
But it'll take a bit.
All right.
On that note, thank you so much, Manmay, for joining us.
Looking forward to your next story.
Thank you so much for having me and no, I don't think an AI agent could ever replace you.
