The Data Stack Show - 176: The Fundamentals of Event-Driven Orchestration and How Generative AI Is Shaping Its Future with Viren Baraiya of orkes.io
Episode Date: February 7, 2024

Highlights from this week's conversation include:

- Viren's background in data (0:39)
- Evolution of Orchestration (1:52)
- AI Orchestration (3:00)
- Understanding Conductor and Orkes (6:26)
- Event-Driven Orchestration (8:10)
- Viren's Transition to Founder (12:27)
- Non-Technical Aspects of Being a Founder (15:50)
- Democratizing AI for Developers (18:16)
- The evolution of microservices orchestration (21:56)
- Challenges in appealing to the 99% developer group (24:32)
- Value of orchestration for developers (30:31)
- Role of orchestrators in managing faults (37:37)
- The intersection of AI and orchestration (40:27)
- Evolution of AI (44:04)
- Thriving in AI Environment (47:58)
- Final thoughts and takeaways (51:25)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Discussion (0)
Welcome to the Data Stack Show.
Each week we explore the world of data by talking to the people shaping its future.
You'll learn about new data technology and trends and how data teams and processes are run at top companies.
The Data Stack Show is brought to you by Rudderstack, the CDP for developers.
You can learn more at rudderstack.com.
We're here on the Data Stack Show
with Viren Baraiya
and it's
so great to have you back on the show.
Amazing that it's been, I guess
we can say years, about a year and a half.
So thanks for giving
us some time, Viren. Absolutely.
Nice to be here and thanks for
hosting me. Absolutely.
Well, you've covered quite a bit of ground since we last talked, but can you just give
us a quick overview of what you've been up to for the last year and a half and the company
you've been building?
Yeah, absolutely.
So I remember the last time when we chatted, we had just kind of come out of stealth mode.
We were still kind of focusing on building the product.
We were basically taking Conductor and building Orkes to make it available to enterprises on various clouds. And of course,
it almost feels like decades now since we started this. But
in the last couple of years,
we have built out the
product that works in all three
clouds, built the partnerships with all the cloud vendors, have customers onboarded, which is great, keeping us busy.
And fortunately or unfortunately, they are across pretty much every time zone.
So, you know, also keeping us busy all the time. But it has been an exciting journey building a company from ground up and
going all the way from zero revenue to some revenue. Absolutely.
Yeah, that's amazing. And Viren, last time we talked a lot about microservices,
orchestration of microservices, right? And orchestration in general comes up like more and more often lately.
There are like a lot of conversations about when it comes like to, let's say, more of like the application development layer and like this fusion of databases, like transactional
systems together with orchestration.
And then of course there's AI, right?
Which anyone who has tried like to build something around AI, they definitely know that it's all about how to guardrail
all these different models and services
to achieve a consistent result.
So orchestration is becoming a very hot and interesting topic
and broader topic to what we were discussing a year and a half ago.
So I'm very excited to talk about that.
And also hear about, like, the evolution of Orkes from back then to today.
Right.
Yeah.
What about you? What are you excited to talk about today?
Yeah, I think as, as you rightly pointed out, right.
Like when you think about orchestration you know, the humble roots, right. Orchestration has been around for a long time. But lately, you know, it has kind of taken its own kind of form when it comes to AI. And just like everybody else, I think that's one thing that is very exciting that is happening. And you know, how that lands with orchestration and you know, where, I mean, now we hear a lot about AI orchestration, right?
That definitely was a thing before,
but nobody talked about it.
So I think overall,
I think the entire orchestration space
is kind of evolving very rapidly.
And where it is going,
I think it is certainly an exciting place to be today.
Yeah, 100%.
So I think we have a lot of very interesting things
to talk about. What do you think, Eric?
Well, let's jump in.
Let's do it. The day before we recorded this, TechCrunch published an article about Netflix abandoning the Conductor
project, which you helped to build inside of Netflix, and Orkes, your company that you've
been building for the last couple of years, forking it and taking ownership. So love the
timing on that. Can you give us a little bit of the backstory
and sort of tell us about that news?
Yeah, absolutely.
I mean, title is a bit clickbaity,
but in the end, we have been working with Netflix for a while.
And the idea was this, right?
That Conductor has become very popular
as an open-source project.
Today, when you look at the number of companies using it,
these are like the who's who of a lot of tech companies and large enterprises, right? And supporting them as a community is definitely a full-time job; it becomes a challenging thing. So, you know, we kind of stepped in and started working with Netflix, and the overall goal there was, like, you know, how do we enable the community to be kind of, you know, a partner here, right? How do we get them to be more excited about it, you know, give them more of an ownership stake in, you know, the product roadmap, how everything kind of moves.
And the only feasible way there is, you know, essentially, you create a foundation around it and, you know, get everybody to participate. So I think that's basically what we decided to finally do. It took us a while because, you know, you have to kind of go through all the legal and other kinds of things. But the good thing is it has happened now.
So, you know, it's at an exciting place. You know, the initial feedback from the community also is very encouraging, because suddenly they feel like, you know, now they have the ability to be part of it, which is just going to make the project much stronger in terms of, you know, its adoption, its visibility, and, you know, how the community can contribute.
So it's very exciting. It's super exciting for us.
That's great. I think maybe what happened with the clickbait title was that originally it was "Netflix hands off," but then, you know, they asked the LLM to write something that would get more clicks, and it changed "hands off" to "abandons."
Well, let's just, you know,
for the listeners who didn't catch our last episode,
which if you didn't,
you should go back and listen to it.
Just give us a breakdown of Conductor and Orkes
and just describe what the products do.
Yeah, absolutely.
So Conductor, essentially, its core is an orchestration engine.
And orchestration, of course, is an extremely loaded kind of term, right?
It means different things to different people and personas.
But at the core of it, the Conductor was designed and built to build event-driven applications,
right?
Applications that respond to events that are happening in the business context.
And this could be orchestrating microservices,
orchestrating kind of events on different,
you know, messaging bus and things like that.
So that's what Conductor does.
And it does it very well in terms of, you know,
handling both business and process complexity,
as well as, you know, being able to handle it
at a much, much larger scale. And Orkes basically was founded to kind of take Conductor and, you know, provide an enterprise version, realizing that, you know, there is a need and a demand for a product like this.
But at the same time, just like, for example,
the Linux is completely open source.
You can just go to kernel.org
and build your entire Linux
and get all the GNU projects.
But for an enterprise,
you probably want to work with a vendor
to get everything ready, right?
Sure.
And then this is the model that,
I think over the last,
I would say a few years,
has been perfected by a number of companies
in terms of how do you build an open source project and also monetize that.
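To make the idea of event-driven orchestration concrete, here is a rough sketch of what such a workflow definition can look like, written as a Python dict in the general JSON shape the open-source Conductor project uses. The workflow name, task names, and parameters are hypothetical, and the exact schema should be checked against the Conductor documentation rather than taken from this sketch.

```python
# Illustrative only: an event-driven workflow in the general shape of a
# Conductor JSON workflow definition. Field names and task types are
# approximations; consult the Conductor docs for the real schema.
order_placed_workflow = {
    "name": "order_placed",          # triggered when an "order placed" event arrives
    "version": 1,
    "tasks": [
        {
            "name": "charge_payment",            # a worker (microservice) task
            "taskReferenceName": "charge_payment_ref",
            "type": "SIMPLE",
            "inputParameters": {"orderId": "${workflow.input.orderId}"},
        },
        {
            "name": "reserve_inventory",
            "taskReferenceName": "reserve_inventory_ref",
            "type": "SIMPLE",
            "inputParameters": {"orderId": "${workflow.input.orderId}"},
        },
        {
            "name": "notify_customer",           # lightweight notification step
            "taskReferenceName": "notify_customer_ref",
            "type": "SIMPLE",
            "inputParameters": {"orderId": "${workflow.input.orderId}"},
        },
    ],
}
```

The engine walks the task list in response to the triggering event, hands each task to whichever worker implements it, and records the state of every step as it goes.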
Yeah, love it.
And do a quick refresher for us, because like you said, orchestration is a very loaded term.
And when we think about the world of data, you tend to think about pipeline jobs starting or completing or failing.
But Conductor really includes pipelines, but it also encompasses sort of any microservice. And so just help us understand that much, much broader scope, you know, of orchestration.
That is correct, yeah.
Like, when you think about data pipelines, you know, data pipelines tend to be, you know, a lot more, kind of, I would say, coarse-grained, right? You have kind of pipelines running on a daily basis; every step in the pipeline sometimes runs for hours and hours on end. And then there's an extreme end of that where you have microservices or event orchestration, where every step completes in milliseconds, and then you are running millions and millions of them every day. And the audience is very different in terms of
who is writing those things. On one hand, you have data engineers focusing on data pipelines.
And the important thing there is also the dependency management, right?
Like, in fact, data pipelines are a lot more, are kind of really event-driven systems.
Because, you know, you start a pipeline when something happens, right?
A file arrives or some job completes and whatever not.
We don't really think about it that way, but that's really the essence of it.
And on the other hand, it's very similar as well. And what's interesting is somewhere in between comes this process orchestration,
which is a lot more kind of human centric, where you also have human actors in inside the process,
you know, taking different actions. And very good examples here are like, you know,
the approval processes, you know, with various use cases, right? Loan application approval is a classic example where somebody has to kind of review it, right?
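As a concrete illustration of that human-in-the-loop case, here is a minimal sketch of a loan-approval step expressed as a simple state function. The threshold, field names, and state labels are hypothetical; a real workflow engine would persist the "waiting for review" state and resume the flow when the reviewer acts, rather than model it this plainly.

```python
# Toy sketch of process orchestration with a human actor in the loop.
# All rules, fields, and states are made up for illustration.

def credit_score(application: dict) -> int:
    # Placeholder scoring rule, purely for the example.
    return 700 if application.get("income", 0) > 50_000 else 580

def loan_approval_flow(application: dict) -> str:
    """Return the next state of the application; the engine persists this state."""
    if credit_score(application) >= 650:
        return "AUTO_APPROVED"
    if "reviewer_decision" not in application:
        # The flow parks here until a person reviews the application.
        return "WAITING_FOR_HUMAN_REVIEW"
    return application["reviewer_decision"]  # e.g. "APPROVED" or "REJECTED"
```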
Oh, yeah, yeah, sure.
Right?
So, and, like, you know, that's in terms of, you know, what kind of systems you are building.
But then, like, you know, orchestration for, you know, the end user also means different things.
Like, for data practitioner, of course, it's data pipelines.
For software engineers, it's more about microservices and events. But when you start to go outside the boundary of just engineering, right, when it comes to product, it's about how is my product built, you know, and what are the nuts and bolts, right? What are the optimization opportunities that I have? A very good example, I would say, is, let's say in a supply chain, if you look at it
as a product manager, if I look at my process
and see, you know,
how long does it take
for somebody to place an order
and until the order arrives
at doorstep,
if it takes three days,
what are the steps?
How long does it take?
And if I want to cut it down
from three days to two days,
where should I optimize?
Like, where is that process?
For them, it's a totally different way of, you know, thinking about it, right? And if I'm in support, I want to know what's going on and, you know, how I can fix things, you know, more than anything else.
I don't care about how it's built
and what's the use case and things like that.
So I think, and then I think on an extreme end of that,
like as we go more closer to the metal,
you have orchestration of the infrastructure, right?
Kubernetes is an orchestration engine in the end, right?
Orchestrates your components.
Sure.
So I think when people think about it,
depending upon what persona, what head they are wearing,
you know, they think about it differently.
Conductor is, as you said, very broad. As a matter of fact, when we built Orkes, our entire Orkes stack runs on Conductor, which means our CI/CD, our deployment, our entire cloud provisioning infrastructure runs on Conductor. So we use it for everything from infrastructure orchestration to process orchestration.
We run our customer service on Conductor.
We run our stand-up bots on Conductor.
We run our AI bots on Conductor.
It's like basically dog food.
Wow, that's a, what a cool opportunity to dog food your own product and sort of build
your entire company
infrastructure. Well, I'm interested to know, I want us to dig into the tech aspect of this
on the show and just hear about what you've been building, right? Because it's been a year and a
half. But I'm interested to know just on a personal level, you've tackled some gigantic
engineering projects. Obviously, you know, Conductor still has a lasting legacy and you're a big part of that. What has the transition been like, going from being an engineering leader inside of these really large organizations with really gnarly engineering problems, to being a founder and, you know, a couple of years ago, starting out with just a couple of people and building?
Yeah, I think that's an interesting, uh, you know, interesting question. Like, when you think about, like, you know, working at large organizations, right, like Netflix or Google, for example, where, you know, as an engineering leader, one thing is, like, you know, you are part of a really big machine.
So, you know, there are a few things that you never have to worry about.
Things like, for example, you know, how much is it going to cost?
Like a good example is, you know, when we ran predictions engineering at Google,
the amount of resources it took was insane.
Like it took a data center to run those things, right?
Because it's processing data from the internet, which is kind of huge. So, you know, you go from that level of things, right?
Like, and when you think about even the numbers, right?
Either the revenue or the users you talk about in hundreds of millions and billions.
Yeah.
It's your kind of denominator, right?
Like when you say three, it means 3 billion and not 3 million or not 300 million.
And then you go from there to being a startup founder. You know, you have to now think about everything, right? Like, you know, does it cost 200 to run or 300? Where can I optimize? So, you know, cost is one part, but more importantly, now, you know, there's no support system. You are the support system. You are at the end of the chain, right? There is nobody to complain to. Which means, you know, you are not only responsible for engineering decisions, but also business decisions, company strategy. You are an engineering leader. You are a product leader. You are also an HR person, right? At least in the early days. Um, and how you build your product, how you build your team also has a lasting impact on how your company is going to grow. Because, you know, if you invest in the wrong tech or the wrong people, that's not going to kind of turn out very well, right?
So there is definitely kind of that major shift
in terms of, you know, how things kind of move.
At the same time, like, you know,
when you look at like big companies,
when you're working for it, right,
there is kind of a cushion, right?
What's the worst that can happen
when you take on a project?
The project can, you know,
take longer to complete, it can fail.
It does not necessarily materially impact you.
Yeah.
I think company is different.
If a company fails, you fail and there's a lot more at stake, right?
So the stakes really go up quite a bit.
Right.
You have employees, you know, who are relying on you.
And now suddenly you have to think about like, you know,
if you have 20 people in your company now, you know, and if you make a wrong decision, you're going to impact their livelihood and basically impact 20 families.
You have to be very thoughtful about, you know, what you do, how you do things.
And this has nothing to do with the business.
It is, it's purely just people, right?
Yeah.
So you have to also think about the people aspect of building a company, running a company. And it's very different.
A lot of learning experiences.
Early days, we were also the business development people.
We were also doing sales.
I had no idea how sales worked.
If somebody asked me, what's the sales cycle, I didn't know what it meant.
I can today tell you what is my sales cycle, right?
I can talk to, I can interview a sales leader.
Let me tell you this. So you end up learning a lot more. So I think that's an interesting
journey as well. Yeah. You're obviously the CTO and so your job is deeply technical.
What, in terms of the non-technical aspects of your job as a founder, leader, you know, wearing multiple hats, which aspect of your job do you like the most from a non-technical standpoint?
I would say, in terms of non-technical, understanding, you know, how do you kind of sell. Like, we are an enterprise SaaS company, right? So learning and, like, you know, understanding how to sell to enterprises has been a real eye-opener, you know. And, like, you know, understanding those kinds of processes, you know, how companies operate. Yeah, it has been a very fascinating thing, because I have always worked in large companies where, like, you know, the purchasing decisions are always made by the purchasing team. You never directly deal with them; you are on the consumer side. But now you're on the other side. And, like, you know, understanding those nuances is very interesting.
The other thing that I really love is, you know, we are a company founded on the foundations of an open source project.
So I think building communities and working with the community, it's not very technical, but
it is also deeply satisfying
because when you see people commenting good things about the product, you know, adopting
that, even they don't have to be your paid customers.
But, you know, it's deeply satisfying from that end that like, you know, what you did,
like, you know, definitely helps people.
And, you know, so yeah, these are the two aspects that I really enjoy.
That's so great. Well, congratulations. I mean, it sounds like it's been an incredible journey
and learning experience. Let's start to focus a little bit more on the technical side.
I know one of the things that you and Kostas are excited to talk about is, you know, AI and how
orchestration and AI fit together. And you have some really interesting thoughts about
software development. So as a preamble to that, what I want to ask about is the perspective that
you're bringing to that from your time at Google. So at Google, you worked on a product that allowed people to take advantage of Google's machine learning infrastructure. So, you know, a product that made predictions. You know, when you were working on that, it seems like it ran right up to the edge of, you know, this massive explosion and, you know, the AI craze driven by large LLMs. But what perspective do
you bring to AI based on your experience at Google and actually building a product around that?
Yeah, I would say, you know, when you think about like, you know, working on AI or machine
learning, right, there are two aspects to it. Either you are a deeply technical person
researching models or working on foundational frameworks like TensorFlow, for example,
or working on the hardware side and building chips. What was interesting about my time at
Google was our focus was how do we democratize AI for an average developer whose job and kind of
whose primary thing is like, you know, I'm building an app and my app kind of sustains
itself through either ads or kind of, you know, in-app purchases and subscription management.
And, you know, when you look at companies like Netflix, right, who actually pioneered, you know, how do you kind of create higher level of user engagement through A-B testing and personalization?
How do you kind of make the same kind of technology available to kind of an average developer who does not have that kind of resources?
And it's not even possible because they don't even have that kind of data available to begin with.
Sure. Yeah, yeah. And the challenge there was like, you know, how do we kind of take, let's say you are
building an app and this is a simple game with maybe say 10,000 users who are playing
the game.
Now, 10,000 data points is probably not sufficient to train a sufficiently accurate model.
So, you know, how do you solve that problem?
And the way we thought about it was that
yes, you have 10,000 users,
but if you look at internet as a whole,
there are probably
4 to 5 billion apps
in the world
from which you can train a model.
And that's large enough. It's more than a large enough set of data to train a federated model. So basically
what OpenAI did today, we kind of took the same approach that like,
you know, there's all this data coming in, right?
Can we get insights? Can we infer and, you know, figure out the user personas and, you know, essentially make it available as a service to developers?
So that now, instead of, like, me trying to kind of, you know, invest in that myself, if I'm becoming like Unity, maybe I can do that, but, you know, for small-time developers or even an average developer, you don't have to think about it.
I can say, you know, hey, this is a user; tell me how likely it is that this person will make a purchase or click on an ad or, you know, stay engaged in my app. So we actually did two things. First of all, there's, you know, telling me the likelihood of what a person will do next. And the way we thought about it was that that was good enough initially, to let developers make a decision. Now, if you think about it from a developer's perspective,
what decision are you going to make? Most likely you're going to flip a coin and say,
I'm going to try something. So that's basically 50% chance in terms of you're going to be right or
wrong. So we started working on the second part of that to say, you know, can we now optimize this?
So we essentially launched, later on, optimization as a service as part of Firebase, which is still there, I think, in production, where, you know, you tell us your objective, whether you want to increase engagement, spend, or get them to renew the subscription, and we'll figure out what are the right experiences that you should deliver to them.
And you tell us what kind of experiences you can deliver, A, B, C, D, and then we'll find out the
right user bucket. So that was what we did with the AI and machine learning. And I think
the whole thing around democratizing AI has now become commonplace, but those were the days when Google pioneered some of those things.
Yeah. So, Viren, about a year and a half ago, we were again chatting and we were talking about microservices at the scale of Netflix
about microservices of the scale of Netflix
and what it means like to have all these and why they are needed and why we need like software
like Conductor like to do that.
What has changed since then?
Because the reason I'm asking is because like I think one of the, and I'd love to hear your
opinion, personal opinion, like on that too, and experience. But I think one of the things that
happens with people that, you know, they, they work in these very unique environments, right?
Like Netflix or Google that have unique problems in terms of scale, but also like unique, as you said, resources
and unique talent, right?
Like the talent that you find in these companies
like is rare.
But where you go out in the market
and you build a company
and you try like to bring,
let's say all these innovations from these companies
like to the rest of the market,
you start experiencing like differences, right?
Like the rest of the world is not the replication
of like Google and Netflix, right?
So my question is about what you have experienced through this one and a half years, like, working with the market out there, and what the difference is between, like, an organization like Netflix or Google and the rest of the market. And what has changed, that's the second question, about, like, the product or the technology, from purely talking about microservices orchestration to what orchestrators can do today, right? And what's the link between the two?
Because when you build companies,
obviously you react to the market signal, right?
That's why I'm trying to bring these two questions together.
Absolutely.
So I think the first question is a great one.
And that was a very insightful thing
as a founder also to kind of learn.
You know, my history has been like, you know, I worked at Netflix, Google, Goldman Sachs, so, you know, very tech-forward companies, and you get to work with a great group of talent, right? And when you build a product for companies like this, you know, you have a certain user persona in mind, right? These are my developers, this is how they work. And then you try to bring that to market, right? So one kind of thing was that,
like, you know, when you look at the developer side, there is what I would like to call the 99% developer, right? And the 1%, I think, when you think about tech companies, those tend to be more the 1% developers. They like hard problems, they like to solve big challenges and then think about distributed systems and everything. When you look at the rest of the world, let's say you go to, let's say, General Electric, right, GE, or some traditional company, the focus there is: I have this feature to be built, I have this product to be launched, and this is my timeline. How can I get there fast, right? There's less thinking about that. Also, you know, not everybody can pay Google or Netflix-level salaries, and there's not that kind of talent available in the market either, right? Which means you're also working with very different levels.
You know, you have kind of very junior engineers,
you have principal engineers,
but, you know, there are less of them.
So one thing that we quickly realized was that,
you know, for a product to be successful,
it's very important for the product to appeal
to that 99% developer group
who is not interested in solving
distributed systems,
very difficult, NP-hard problems, right? They're interested in solving their current problem, which means usability is paramount. We think about usability in, like, you know, when you're building an app or a site, right? People a lot of times don't think about developer experience, but, you know, developer experience is paramount. In my personal opinion, I don't think we are a hundred percent there. We constantly keep on, you know, improving and try to kind of work with
very junior, sometimes, you know, freshmen developers to kind of figure
out like, you know, where can we improve this?
But that was the number one thing, right?
That what matters to them is very different from what we kind of initially built.
Right.
And their skill sets, and, you know, where you should focus, was an interesting insight.
Um, yeah, that makes total sense.
And I think it resonates a lot with also my experience.
And actually, I would add to that,
it's not just a matter of quality of talent.
It's also that when you're talking about General Electric,
the core competence of the company is not distributed systems.
They don't care about that.
They shouldn't care about that.
That's not their thing.
The same thing with Bank of America.
All the companies out there,
they are obviously Fortune 100 type of companies
because they do something really well.
That's right.
For sure, not how they build data centers
and distributed systems.
Exactly.
For example, like, you know, Bank of America would probably want to spend their efforts and energy on making sure that banking is a first-class thing, right? It's rock solid, it's the best in class. Not how do I solve distributed systems. That's not their core competency. There's no point investing in that, right? It doesn't make any sense from the business perspective. Companies like Google and Netflix, they're all about tech. Tech is what drives those companies. So it's a different perspective.
Yeah, yeah, 100%. So, okay, you mentioned
developer experience and a different, let's say, prioritization in terms of what features
are more important or what value is perceived by the user, right? Like, or what is valuable for the user?
What has changed in terms of like, let's say the use cases?
Because we started and we were talking back then again, a year and a half ago about orchestrating microservices, but what are the use cases today, right?
The dominant ones.
I think it has definitely evolved and in some of the surprising ways as well.
So, you know, microservices were one thing. I think today, when I look at even the current set of use cases, the predominant ones are more event-driven kinds of orchestration, you know, service workers. You know, microservices also kind of got a little bit of negative press recently, right? With,
you know, the blobs and everything. Not necessarily everything there is in the right context, right? But
like, you know, you sometimes don't need
microservice as in like, you know, an HTTP
or gRPC endpoint for every
problem and every solution. Service workers
are actually much more lightweight
and probably
better in terms of infra and speed of development and deployment and everything, right? So that's
one area where I've seen like a lot more kind of usage of Conductor and how people are using it.
They think less about it. And also, like, you know, instead of saying that, like, you know, every deployment is one microservice, you know, sometimes you have a deployment which almost looks like a monolith, but then you have different components talking to each other asynchronously, and therefore it is still not this monolith, but rather kind of an event-driven system. So that's one area where we have seen it.
Another surprising set of use cases that I've
seen is, how do you build
user experiences? And this has come to me as a complete surprise:
there are times where you want to drive your user experience
based on various different parameters and you want to make it dynamic.
And traditionally, one way to do that is that you encode the entire logic
in your UI application or your mobile app.
Mobile apps actually were one of the trailblazers in that area, because, you
know, UI is very straightforward.
You can deploy a new version and everybody gets it.
Mobile apps, people go download.
So developers are already kind of doing that by using things like Remote Config or LaunchDarkly, where they were driving user experiences based on feature flags on the server side.
But now I have started to see those things happening with Conductor as well, where you
have a UI flow designed in Conductor and then UI is driven based on that flow.
And then a product manager is changing the flow based on the experiences they want
to drive and things like that.
That was a very surprising use case.
But when I think about it, it makes a lot of sense.
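For illustration, here is a hedged sketch of that workflow-driven UI pattern: the client asks an orchestrator which step the user is on and maps that step to a screen, so a product manager can reorder the flow server-side without shipping a new app build. The endpoint path, response field, and screen names below are made up; a real Conductor or Orkes deployment exposes its own workflow-status API.

```python
import requests  # assumed available; any HTTP client works

# Hypothetical sketch: the client does not hard-code the screen order. It asks
# the orchestrator which step of the flow the user is in and renders the screen
# mapped to that step. URL and field names are placeholders for illustration.
ORCHESTRATOR_URL = "https://orchestrator.example.com"  # placeholder

SCREEN_FOR_STEP = {
    "collect_email": "EmailScreen",
    "verify_identity": "IdentityScreen",
    "choose_plan": "PlanScreen",
    "done": "HomeScreen",
}

def next_screen(workflow_id: str) -> str:
    """Return the UI screen to render, driven by the server-side flow."""
    resp = requests.get(f"{ORCHESTRATOR_URL}/workflows/{workflow_id}/current-step")
    resp.raise_for_status()
    step = resp.json()["currentStep"]  # hypothetical response field
    return SCREEN_FOR_STEP.get(step, "ErrorScreen")
```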
Yeah.
Okay.
So that's super interesting.
But who is the user here?
We talked about like product people, we talked about developers.
Obviously, we're talking about application development here,
but application development is a complex thing, right?
It involves a lot of different products in there,
different types of developers, from front-end developers
to even DBAs at the end, like managing their own cases.
So who is the user who gets, let's say, the most value, like, from a system like Orkes?
I think it's the software engineer, right?
Like the developer working on the backend or frontend.
I think in the end, this is the persona that we kind of build a product for.
Everybody else gets benefit out of it, but that's not intentional.
It's more of a by-product.
But in the end, it helps the developer.
If I'm using something like
Conductor, now I don't have to think about
handling error cases and
resiliency parameters and all of those things.
I can just
think about building stateless
screens and orchestrate
that separately.
In the end, the developer, their life becomes much easier
more than anything else.
Do you see more of the front-end developers
being, let's say, the owners
of that, or is the
back-end developer? And how do they work
together, right? Because that's always, like, I see the interfaces between developers are always a very interesting topic and a hard problem to solve in general.
I agree. I don't think
we have seen a solution either.
Today, predominantly,
our developers are mostly backend engineers,
the way I see, and
people that we interact with. Frontend
apps,
again, it's still a very
small percentage. It seems to be growing,
but I would say right now it's mostly backend engineers.
How do they work together?
I think systems like Conductor, typically what I've seen backend developers doing is that,
they built out the API using Conductor and mock out the data and then frontend can keep
working on it.
And then they slowly implement stuff that starts to give you real data as opposed to
mock data.
We have at least a couple
of customers that have kind of adopted
that kind of strategy and
have been kind of pretty good at that.
Yeah, well, that makes
all sense.
Okay, there is like
there is a new wave
of, let's say,
transactional systems out there, right?
Like there is this attempt, especially after, I would say,
Heroku went out of market,
to, in a way, build on the legacy of Heroku.
Because Heroku, I think at the end,
was just too early in the market.
But they had some amazing ideas there.
I think the legacy of Heroku will live
and will drive a lot of innovation
now that the market is more mature
for this kind of product.
So we see a lot of conversations
about these new types of backend systems
that are kind of a fusion between like a database
system together with an orchestration system, and okay, like, in some cases some other stuff too, but I'll focus on these two, because I think, like, the main conversation is, like, how you mix workflows together with transaction boundaries there. And what does this mean in terms of managing infrastructure
and building applications?
So what do you think about that?
And what's your opinion?
And what do you see is going to happen at the end?
I think that's... I mean, yeah, you're right.
The kind of void that Heroku left, there have been some attempts to fill that gap.
And I think conceptually, if you think about, right,
like mixing databases with workflows is an interesting concept.
Like I think back in the day, we used to have stored procedures,
which will do kind of almost the same thing.
And it worked well when like, you know,
you had all of your data and, you know,
business logic inside one database.
What has changed now with these new kinds of databases and systems is that instead of a stored proc in PL/SQL, you are writing JavaScript code to achieve the same thing, and you're working
with more like a NoSQL database, like a document-oriented database.
So that's an interesting concept, I think. And in some ways Firebase also did similar stuff with, you know, the combination of the Firebase database and triggers that, you know, executed Firebase functions to, you know, do kind of a lot more stuff, right? And there is definitely a need for this from a group of developers, you know, I would say a lot of app developers, for example, who need a
back-end because you know they are very good at building mobile apps,
for example,
or building the front-end experiences.
But to drive the data and business process,
they need some backend.
And either you kind of have another team
that is focusing on backend development,
which may not be possible
if you are kind of a singular developer
or a small team of developers
focusing on building a game, right?
Focused on the game experience rather than building a backend.
So this is where I see there is kind of a value: where, like, you know, your processes are relatively simple, when you want to insert a record, uh, you want to run some small process or business logic that drives a bit of a workflow, and you have a single source of truth for the data.
So I think there is definitely a place for it.
One thing my experience at Firebase also taught me is that, like, you know, it quickly becomes clear it's very good for prototyping stuff.
Yeah.
Almost 100% of our customers did that, you know, they did use Firebase. And the moment they got more production-ready, they moved out to MongoDB or something like that, or Cassandra, for example.
And then they kind of invested into more proper serverless
or container-based systems and whatever not.
So I think that's how I see those systems
that are very good for getting your prototype up and running
without having to have a full-fledged backend.
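A minimal sketch of that trigger-style pattern follows, assuming a datastore that can invoke a handler on insert. The handler and helper functions below are hypothetical placeholders rather than any specific Firebase or vendor API; in Firebase this role would be played by a Cloud Function bound to a document-write trigger.

```python
# Framework-agnostic sketch of "database write triggers a small workflow".
# All names and rules are illustrative placeholders.

def validate_order(order: dict) -> bool:
    # Toy business rule for illustration only.
    return order.get("amount", 0) > 0 and "email" in order

def send_email(address: str, subject: str) -> None:
    # Placeholder side effect; a real system would call an email service here.
    print(f"sending '{subject}' to {address}")

def on_order_inserted(order: dict) -> None:
    """Invoked by the datastore when a new order document is written."""
    if not validate_order(order):
        send_email("ops@example.com", f"Invalid order {order.get('id')}")
        return
    send_email(order["email"], "Thanks for your order!")
```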
Yeah. My question is not from personal experience, to be honest, but
more of my overall experience as an engineer.
The value of having, let's say, an external orchestrator, and again,
my experience comes more from
the data infrastructure, where things tend to run much longer.
So the possibility of something breaking there is like higher, right?
So the value of having, like, an external orchestrator is that, in case of a fault happening, you have a different system that can take control. If your database
fails, let's say, then your orchestrator can execute logic about how to manage the failure.
But when you put the processes together, things get a little bit more weird there, right? So is there, and that's like a purely,
let's say like engineering question.
And I am asking you like as like,
okay, like one of like
the most experienced engineers
that I know with like
these architectures,
is there like a way
that like we can guarantee
that if we put together,
let's say like a transactional database
with an orchestration system,
that when something goes wrong with the database for whatever reason, the other process that is responsible for the orchestration is going to remain fault tolerant and do what it's supposed to be doing?
Yeah.
And I mean, that's exactly the purpose of having an orchestrator, right?
And especially the next generation of orchestrators, like Conductor, for example, which are basically not a single point of failure.
They are more distributed, right?
So like, you know,
gives you much higher availability.
And, you know,
you are also kind of de-risking
and decoupling yourself
from a single database.
And it does kind of help you
two ways, right?
One is,
if your database goes down,
you can operate on cached data
and apply circuit breakers
and things like that.
So, you know, depending upon what kind of user experience you want to drive, you know, that becomes a real thing, right? At the same time, you can also do things like, especially in a read-only scenario, hedging, you know, sending requests to multiple databases and making sure that, you know, you are able to serve it.
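For illustration, here is a rough, generic sketch of those two techniques: a circuit breaker that falls back to cached data when the database keeps failing, and a hedged read that queries two replicas and takes the first response. The data-access functions are hypothetical placeholders, not any particular client library, and the thresholds are arbitrary.

```python
import concurrent.futures
import time

# Illustrative only: simple circuit breaker and read hedging.
# fetch functions passed in are assumed stand-ins for real data access calls.

class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, fallback):
        # While open, skip the database and serve the fallback (e.g. cached data)
        # until the reset window has elapsed.
        if self.failures >= self.max_failures and time.time() - self.opened_at < self.reset_after:
            return fallback()
        try:
            result = fn()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            self.opened_at = time.time()
            return fallback()

def hedged_read(primary_fn, replica_fn, timeout: float = 0.5):
    # Read-only hedging: issue the same query to two replicas and return the
    # first successful response. Note the pool still waits for the slower call
    # on exit; acceptable for a sketch.
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(primary_fn), pool.submit(replica_fn)]
        done, _ = concurrent.futures.wait(
            futures, timeout=timeout, return_when=concurrent.futures.FIRST_COMPLETED
        )
        for f in done:
            if f.exception() is None:
                return f.result()
        raise TimeoutError("no replica answered in time")
```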
And also, now suddenly it opens up an opportunity for separating out your local transactions with a global kind of transaction.
If you think about it, let's take an example of a payment processing system.
When I want to transact, let's say I want to send an ACH through the Fed's system, I want that system to be transactional. So, you know, that particular service could have a local database that maintains the transaction and operates within a global thing. But globally, you know, you could have other systems also, like, you know, sending out an email, which is like, if I get two emails, you know, does anybody care? As long as I get an email, right? At least once. So now suddenly, you know, there's an opportunity to kind of optimize and, like, you know, make systems a lot more decoupled.
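Here is a toy sketch of that local-versus-global split, assuming a payment step that must commit locally and a notification step that only needs at-least-once delivery (so a duplicate email is possible and acceptable). The function names, retry policy, and schema are illustrative placeholders, not a prescribed design.

```python
import sqlite3
import time

def record_payment(db_path: str, order_id: str, amount: float) -> None:
    """Strictly transactional local step: either the row commits or nothing does."""
    with sqlite3.connect(db_path) as conn:  # context manager commits or rolls back
        conn.execute(
            "CREATE TABLE IF NOT EXISTS payments (order_id TEXT, amount REAL)"
        )
        conn.execute("INSERT INTO payments VALUES (?, ?)", (order_id, amount))

def send_receipt_email(order_id: str) -> None:
    print(f"receipt sent for {order_id}")  # placeholder side effect

def run_payment_flow(order_id: str, amount: float) -> None:
    record_payment("payments.db", order_id, amount)  # must be exactly-once locally
    for attempt in range(5):                          # at-least-once, retried globally
        try:
            send_receipt_email(order_id)
            break
        except Exception:
            time.sleep(2 ** attempt)                  # simple exponential backoff
```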
And the orchestrator essentially maintains your state, right? Overall. And then you can change it however you wish to, depending upon how your process changes. So not only does it give you resiliency, it also gives you flexibility in terms of, you know, how you can, you know, do things.
Yeah, 100%, that makes sense. Cool. So let's move a little bit, like, to AI, because I think, if anything, with AI what happened is that somehow everyone is
trying to figure out, like, a new type of orchestrator there. Because at the end, what we have here is that we have, like, a system that is not reliable by definition. So what we used to do as the exception with, like, distributed systems, for example, where we had provisions for when something goes wrong with the orchestrator, now it's pretty much the opposite.
Every time you get a response, I mean, on a semantic level, it's not like an API error, but you need to ensure that at the end you get what you are looking for, which is in a way,
it's like managing faults at the end, right?
Like it's not that different, like from an engineering and
like design perspective, right?
So what I've noticed is that, like, okay, systems like LangChain, like, well, all that stuff that we see out there, at the end they're, like, specialized, like, orchestration systems.
What's going on with that?
How many different flavors of orchestrators will have?
Because it starts becoming a really hard thing to talk about that stuff.
It's almost funny.
If you talk with a distributed systems person and you say orchestrator, it means something almost completely different compared to what orchestrator means for the data engineer.
And if we talk about someone who builds applications with LLMs, again, something completely different.
So it's very interesting what's going on there with the definition.
So tell me how you think about that and what you see actually happening out there.
So yeah, I think LLMs are interesting, right?
Because suddenly, you know, LLMs are inherently kind of asynchronous, very high-latency systems, right?
And you know, it does this one thing.
By just executing a prompt, you can't build a system, you need to also chain other things,
right?
And that's where, kind of, suddenly the orchestration and workflow engines became a lot more important when you want to build applications that leverage LLMs.
There's one more problem with LLMs, which is, when you think about building an application the traditional way, right, it is through APIs, and they are deterministic.
If I send a simple query, I know what I'm expecting.
LLMs are non-deterministic.
So you also have to handle the non-deterministic aspect.
And the moment you put non-determinism in your application flow,
now you have other things to worry about, which is compliance and security.
And I've seen this now.
Companies want to use LLMs, but they are very worried about the aspects of compliance, security, and reputation damage if something goes wrong.
So now you also have to put some guardrails on top of it.
And the guardrails can come in the way of leveraging another LLM to do some sort of
adversarial validation of the output.
And for very highly sensitive systems, maybe also humans, right? Who can actually review and validate whether this makes sense or not.
But all of this requires orchestration.
It requires you to build flows, which are kind of very flexible, can change.
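A minimal, vendor-neutral sketch of that guardrail flow follows: one model produces a draft, a second "adversarial" check reviews it, and anything that fails review is parked for a human instead of being returned to the user. The call_llm and enqueue_for_human_review callables are assumed placeholders for whatever LLM client and queue a real system uses.

```python
from typing import Callable, Optional

def guarded_answer(
    question: str,
    call_llm: Callable[[str], str],
    enqueue_for_human_review: Callable[[str, str], None],
) -> Optional[str]:
    # Step 1: produce a draft answer.
    draft = call_llm(f"Answer the customer question:\n{question}")

    # Step 2: adversarial validation by a second prompt/model.
    verdict = call_llm(
        "You are a strict reviewer. Reply only PASS or FAIL.\n"
        f"Question: {question}\nProposed answer: {draft}\n"
        "FAIL if the answer is unsafe, off-topic, or reveals internal data."
    )

    if verdict.strip().upper().startswith("PASS"):
        return draft

    # Guardrail tripped: hold the draft for a person to review instead of replying.
    enqueue_for_human_review(question, draft)
    return None
```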
Um, and if you want to kind of run this in an environment, you also need a distributed system, because, you know, now everything could be running in different places, right?
Your LLM is running on OpenAI or Azure or Google, whatever, right?
And your systems are running somewhere else.
And then came an interesting mix with vector databases,
where suddenly now, you know,
you have this retrieval-augmented generation, where now you also look up a vector DB with a namespace.
How do you protect that? Again, the same set of problems.
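And a short sketch of that namespaced retrieval-augmented generation step, where the lookup is restricted to a tenant's namespace before the prompt is assembled, so one customer's documents never leak into another's context. The vector_search and call_llm callables are hypothetical placeholders rather than a specific vector database or model API.

```python
from typing import Callable, List

def rag_answer(
    question: str,
    tenant_namespace: str,
    vector_search: Callable[[str, str, int], List[str]],
    call_llm: Callable[[str], str],
) -> str:
    # Only search within this tenant's namespace.
    passages = vector_search(question, tenant_namespace, 5)
    context = "\n\n".join(passages)
    prompt = (
        "Answer using only the context below. Say 'I don't know' if it is not there.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```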
So I think as you say, right, like suddenly, you know, now a different
class of orchestrators are coming up focusing on just purely prompt chaining.
But prompt chaining alone doesn't get you an application deployed.
You also need to make an API call and look up a database and process and,
you know, add humans and everything.
Right.
So that's where I think, long term, everything is going to converge into probably one or two orchestrators which can do all of these things and do them very well at scale.
Yeah, makes total sense. Eric, I have a feeling you might have more AI related questions,
so I want to give you time to do that. Well, I think we're fairly close to the buzzer here, but
Viren, one of the questions I have is, I think
you have a very interesting perspective on AI in
general. Obviously, there's a ton of hype. There are a lot of statistics thrown
around, a lot of clickbait article titles. You've built products like this inside of companies like Google with vast, let's just call it unlimited, access to data and
unlimited access to compute resources. How can people separate the hype from what is real?
And I think that's, even for people who are very technical,
it's easy for us to see
the immediate sort of practical benefit of,
you know, being able to draft a blog article
or, you know, have, you know,
even get support on a SQL query,
the best way to optimize it, right?
I mean, that sort of very synchronous feedback
on specific problems that would normally take
a long time to research through traditional search methods.
I think everyone's like, okay, this is great, right?
I don't want to go back to the previous world.
But when you talk about, okay, I want to use an LLM to predict the next best action for this user based on all these
inputs. Well, even though you don't have to build the model that recommends the next best action,
right, which is a huge step forward, that is still phenomenally difficult to get right.
And so there's this huge gap. How big is that gap? Or maybe I'm over-interpreting it.
I mean, definitely there is a gap in my opinion. And definitely the gap also is shortening because
there's a lot of investment that is happening both in terms of research, people,
money, infrastructure that is going into it.
But the way I see it is, if you look at the current state of the world, there's a lot
of investment happening onto the foundational models.
It started with OpenAI, but now you have a plethora of models. Some of them are completely open, like Llama 2.
So there is one aspect of that, which is essentially democratizing the foundational models.
Like foundational models are going to become like Postgres.
Everybody has access to it and everybody can use it.
The question is how do you use it and what do you build out of it, right?
So then the question comes is like, you know, okay, I have a very powerful model that can,
you know, do a lot of interesting things, but how do I actually use it to solve my business
use case? And I think that's where the gap is today, you know. Because, as I say, right, LLMs are still non-deterministic, and they lack the consistency that, you know, we are used to having in kind of a normal system. Yeah. So, you know, bridging that gap is kind of where I think there is a need, so that as an enterprise, as a company, I can say, you know, I can safely incorporate these LLMs into my flows and kind of make use of them.
And I think, the way things are probably going to go forward, it is also changing the way we interact with systems, right? Today, you know, everything has to have a UI, and, you know, you have actions and buttons and everything. A lot of times you can build a lot more chat-based interfaces, right? So, like, assistants are probably going to become a lot more commonplace. But that also brings in an interesting question in terms of how do you actually build those things, right? That's one area where we are also focusing on. Yeah, and hopefully by the time this is published, we'll have something
really exciting there. Very cool. Very cool. So do you think that the companies that are
going to thrive in this environment are the ones that help fill that gap? I think so. I think so.
I think so. Because there is going to be like, you know, I think it's almost like
a supply chain, right?
Like at the bottom of the supply chain is the hardware manufacturers and then you have
models, and at the end of that is the kind of end user building an application.
But in between there is nobody right now.
Yeah.
That's where a lot of investment is going to happen.
And I think that's where the big opportunity is.
Yeah, for sure.
Because I think that there is this interesting dynamic right now of essentially wrappers
on an LLM that's just a UI.
And I mean, there will certainly be some companies that make progress there because there are
specific needs.
But I also think there's going to be a ton of companies that fail because the open AIs
of the world are just going to productize all of that, right?
And just completely take that business.
I mean, Google kind of is notorious
for doing that sort of thing.
So I'm interested to know,
when you think about at the enterprise level,
and especially as you think about Orkes,
we're talking about sort of incorporating an LLM
into a much larger
system. But there's also software providers. So let's just, you know, we're, you know, earlier
talking about sending emails, you know, and things like that. There are also a lot of software
providers who are packaging an LLM and then essentially, you know, sort of building functionality
on top of that
within their own software and then, you know, sort of reselling that packaged functionality.
Yep.
Where, what are the limitations there? How do you think about, you know, incorporating an LLM
in a bespoke way, you know, at the enterprise level versus, you know, sort of buying it as
part of a packaged software suite?
Yeah, I think the former is more about, like, you know, the,
how do you enhance a product using LLM, right? I want to send out an email. And if you notice,
Gmail had that feature for a long time, but it could auto complete your sentences.
Yeah.
So, you know, that kind of gives you the personal productivity and like, you know,
you enhance your product using LLM.
Copilot is a good example, right?
I don't have to necessarily write the same amount of code.
I can just autocomplete everything.
IntelliJ has been doing that for a while.
So that's one area where, you know,
there is kind of the application of LLMs
into very bespoke products.
Enterprise applications are very custom.
Like, you know, they are rapidly changing. They always change. Like, you know, usually, you know,
you build an application the
kind of the life cycle of an
application or a lifespan of an app is
probably two years or max two and a half
you have to rewrite that thing again
and then you need basically tooling to be able to kind of
leverage an LLM to kind of
you know redo those things
some stuff or put it in a different
way. So then I think the other class of
application is where you
leverage an LLM to build those
bespoke experiences, but
you are using internally
or selling it to another customer,
but that's the other aspect of it.
And to me, that's a lot more interesting
because instead of a very fixed
problem, these problems are very dynamic. They are changing and they are very different from
company to company. Yeah. That's super interesting. Okay. Well, last question,
which this is always a fun one. Is there anything that worries you
about all of this new technology around AI?
I don't think so.
Like, wasn't it the same thing when people were talking about, when computers came, that computers are going to, like, you know, put everybody out of jobs? I mean, it created more jobs, right?
The same thing happens, right?
I think there is, I don't have the specialization in like, you know, how people think about
things, but I think it's natural, right?
Like, we are always a little bit wary that something is going to come and, you know, take over jobs and everything. I think developers are getting more productive because I don't have to always go to Stack Overflow to search for things.
My IDE can complete it for me.
I can do more stuff.
I can probably do a lot more and I can probably have a lot more quality
time for myself.
Businesses can move faster.
Maybe they can do more stuff.
So I have more work.
So in the end, I think everybody's going to benefit.
Yeah, I agree. I agree.
Well, Viren, this has been such a wonderful time on the show. What a year and a half it's been. Congrats on all the progress, congrats on Orkes, congrats on, you know, the fork of Conductor and the foundation around that. Just so impressed with everything you're doing,
and we'll keep cheering you on from the sidelines.
Yeah, absolutely.
Thank you for having me again one more time.
Yeah, absolutely.
We hope you enjoyed this episode of the Data Stack Show.
Be sure to subscribe on your favorite podcast app
to get notified about new episodes every week.
We'd also love your feedback.
You can email me, Eric Dodds, at eric@datastackshow.com.
That's E-R-I-C at datastackshow.com.
The show is brought to you by Rudderstack, the CDP for developers.
Learn how to build a CDP on your data warehouse at rudderstack.com.