The Data Stack Show - 238: What Every Developer Needs to Know About Microservices in 2025 with Mark Fussell, Founder & CEO at Diagrid
Episode Date: April 23, 2025
Highlights from this week's conversation include: Mark's Background and Journey in Data (1:08), Mark's Time at Microsoft (5:33), Internal Adoption of Azure (9:20), Understanding Pain Points (11:06), Complexity in Software Development (13:15), Microservices Architecture Overview (17:15), Microservices vs. Monolith (22:08), Modernizing Legacy Applications (24:39), Dependency Management with Dapr (29:43), Infrastructure as Code (33:04), AI's Rapid Evolution and Vendor Changes (37:27), Language Models in Application Development (39:05), AI in Creative Applications (42:59), The Future of Backend Development (47:22), Streamlining Development Processes (49:29), Dapr as an Open Source Solution (51:11), Getting Started with Dapr and Parting Thoughts (51:39)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Discussion (0)
Hi, I'm Eric Dodds.
And I'm John Wessel.
Welcome to the Data Stack Show.
The Data Stack Show is a podcast where we talk about the technical, business, and human
challenges involved in data work.
Join our casual conversations with innovators and data professionals to learn about new
data technologies and how data teams are run at top companies.
Before we dig into today's episode,
we want to give a huge thanks
to our presenting sponsor, RudderStack.
They give us the equipment and time
to do this show week in, week out,
and provide you the valuable content.
RudderStack provides customer data infrastructure
and is used by the world's most innovative companies
to collect, transform, and deliver their event data wherever it's needed, all
in real time.
You can learn more at rudderstack.com.
Mark, it has been almost three years since we've had a microservices expert on the podcast,
which is way, way too long. So we're so excited to have you
on the show. Thanks for joining us.
It's fabulous to be here. Thanks for having me on your show. Yeah, a bit of background about myself: I joined and worked at Microsoft for over 20 years, built many different developer technologies there, then started an open source project there called Dapr, and then left Microsoft to start Diagrid as a company. And there we focus on building distributed application and microservices architectures. So it's fantastic to be on the show.
Awesome. So we were talking before the show, Mark, and one of the topics I love to dig into is this microservices versus monolith discussion, which you had a great perspective on. So I'm excited to dig in on that. What are some topics you're happy to hit?
Well, I think the evolution of architectures, from client-server through to modern application development. What are the challenges that developers are struggling with today? Because you've got a lot of complexity happening out there, and how do we simplify their world and make the lives of developers so much easier, rather than them going through the pain and suffering that they do, so that they can focus on their business problems and not on building infrastructure platforms themselves. And of course, the rise of applications with language models, and how that's changing developers' ecosystems. So it's exciting to be here today and talk through all of this with you. Well, let's dig in.
Right. Let's do it. Mark, we love talking about the collision of data and software, and especially how software best practices are sort of infiltrating the world of data. So what a treat to have you on the show to dig into microservices, because there is just so much for us to learn from this specific area. So thanks for joining us. I'm so happy to be here and I'm excited
to kind of dive into this topic with you and your audience.
Okay, I wanna talk about Microsoft,
your time at Microsoft,
because you did some incredible things there.
You sort of did platform and distributed systems
before that was cool.
But can we rewind just a little bit
because you were pretty instrumental in the early days of
satellites and telecom. Can you just give us... I want to hear about space. Yeah, I want to hear
about outer space. So can you give us... Well, if I go all the way back, I actually trained as an electronic engineer, and my first job out of college was with Vodafone, which is a well-known company in the UK, building the first digital cellular base stations.
So in that world, they were going from analog,
which had analog frequencies in your analog phones,
to the digital world,
which allowed you to kind of multiplex the signal
and put more data on top of it all,
and basically take more advantage of your bandwidth
and the whole digital ecosystem.
I worked for a company that was building those first base stations. And it was super exciting at the time, but then I realized that my world should drift more towards software, because there was much more opportunity there. So it was an exciting time to understand what was known as the digital mobile specification at that time, but I moved into software soon after that.
Do you remember a moment when you sort of completed maybe an initial project and then
sort of made this digital connection that was fully digital as opposed to analog? Was
there a moment where you're like, wow, this is pretty wild that we've made this transition?
Well, there were. I mean, you could just see it in the technology at the time, because there were only a certain number of frequencies then. And dialing all the way back now to the end of the 80s, beginning of the 90s, the cellular concept was still new. And of course, you could only attach so many signals to a particular tower, because you were bound by frequency. Now you could put like 100 signals down the same frequency, and you think, right, it was just an explosion. Yeah, you couldn't remotely have the number of phones attached to a tower today without digital multiplexing inside the signal.
So you saw that whole trend happening.
Of course, the biggest innovation they had in
that was they put a 30,
I think it was a 30 character limits inside there,
so you can send a little data message.
And that turned out to be SLS and it turned out to be,
it changed the world.
And I was like, a little side note at the bottom, at the very bottom of the specification
was all about voice.
It was as if everybody was like, oh, we can sit this little packet of openings I did and
post SMS and that changed the world about how people communicate.
Read the footnotes.
Yes, read the footnotes.
I guess today you would like to do an LOL emoji
when reflecting on that.
Okay, so you got into the world of software.
So tell us about your time at Microsoft
because you did a number of things there.
Well, I joined Microsoft back in 2000, a long time ago now. But when I first joined, .NET had been released, and I got the privilege of actually landing in the data team there. In fact, it was part of the SQL Server team.
Oh, cool.
I was actually part of the XML team that was inside the SQL Server team, and we were developing XML technologies, because that was the big thing at the beginning of the 2000s: XML was going to change the world, it was a new data format. And so I actually developed that inside .NET. There was a stack, System.Xml, which was the namespace that did all the XML processing. We had the XML readers and writers and XSLT transformation, and I developed all that as part of .NET. I actually did ADO.NET as well. If you're familiar with any of the data access technologies, OLE DB, ODBC, ADO.NET, all those classic data access technologies, they were around that time and I actually worked on those. I even did XML support inside SQL Server as well.
So it was a very interesting time.
But soon after that, I sort of moved on, went through a few other teams working on communication stacks and services, one of which was Windows Communication Foundation.
But eventually started a team inside Azure where we were imagining a new
platform for the company in order to build its services on.
And what we realized was that if you're going to run these services at scale,
you had to build a platform that can handle the scale of the services.
And in those days, at the beginning of the 2000s, it was the classic client-server design: you had your front-end tier, your middle tier, your backend database, and that's how you built these systems. But of course, cloud systems can't be built like that. They have to deal with millions and millions of requests and be able to scale out, so how do you build a platform that scales out in its design?
And so there we were in 2007, 2008, thinking about this whole thing, and that eventually emerged into a project called Service Fabric, which was the key platform that many of the larger services, especially the data services inside Microsoft, were built on. So Azure SQL DB, Cosmos DB, the whole data stack and the data analytics stack all got built on Service Fabric. It's a distributed systems platform, and it had very core primitives, everything from replication technology and self-healing technology, up to its middle tier where it did discovery of services, up to a developer stack at the top.
And it was a very innovative time inside the company.
What was key and interesting about this platform is that we actually gave it out as a service to customers as much as it was used inside Microsoft itself.
Oh, interesting.
Yeah. And so we had a lot of external customers who also built on top of the service fabric.
Right. Right. Right.
And then of course Kubernetes came along and the world changed a little bit again in 2014, and that sort of shifted it all. But this platform is still a key platform in Azure today. I did that for about 10 years, and it gave me a lot of understanding about how you design these platforms, and particularly, because we shipped it externally, about how other people have to build their own mission-critical services themselves.
So I'm always curious with technology companies, especially ones that have been around, the Azure thing is a big shift.
And you, like, Microsoft's already a company very much at scale.
There you're building this new thing. Obviously, you want to have companies using Azure.
And then how does that work internally at Microsoft?
Because obviously there's some things you're doing
at such scale that maybe in the early days,
Azure isn't the right thing for it,
but you want to dog food as much as you can
and kind of use Azure yourself, I assume.
So like, how does that work?
Well, I mean, it's a large organization,
all of these things are getting very complex.
But I mean, when we were building the platform, Service Fabric platform, in the early days,
there was still a lot of other teams.
You had to convince them to kind of build on top of it.
And you had to be like a startup yourself.
Right.
Right.
I remember one of the early teams there that adopted it. It actually became the Teams team in the end: Microsoft had a Skype-like product that was built on top of it all, and they decided to adopt and build on top of the platform as one of the first teams. You were basically inside there building a platform, having to convince other people that they shouldn't build it themselves, that they should use yours for the benefits. I think the SQL Server team was also one of those early adopters, because they recognized that they had to scale out to tens of thousands, even millions, of databases.
And how do you think about running 10 million instances of SQL Server at scale, where a machine goes down and it might have 300 instances of SQL Server running on it, and all of a sudden those have to be redistributed across the remaining machines inside the cluster without any loss or downtime of your data?
So yes, you are very much a startup, convincing other people to use this.
And no one wants to be first.
Yeah, well.
More and more teams adopted it and it became a standard, of course.
It became easier and easier.
Yeah, yeah, yeah.
I feel as if I'm in the same space right now with Dapr at my company.
Sure.
To get people to come along and use it and be first.
Well, what are some of the key learnings from that? So essentially, like you said, you really did have to sell internally to get developers on board. What were those moments or conversations where somebody was like, oh, this makes a lot of sense?
Well, I mean, you have to always target pain points, don't you?
You have to find that pain point. So you're saying to them, why are you writing a failover replication technology yourself?
Because it always fails and you're having to deal with that. Or why are you writing a service discovery mechanism, where one service kind of discovers and finds another one,
or any of these other sort of distributed systems paradigms,
why don't you hand that problem over to us
so you can actually focus on the business
of making SQL run at scale,
making sure that databases can be found and connected
and give that into the underlying platform,
the problem space,
so that you're not sort of managing infrastructure, the machines and building this all yourself.
So I always refer to this as don't reinvent the pattern.
Developers love building things themselves, but you know, the more kind of, let's just
say conscious development orgs realize that they should focus on their business problem
rather than reinvent the pattern.
So once you've found their technology problem and focused on their business problem rather than the read and write path, then you could get them to kind of move onto your platform entirely.
What are some of the, I mean, it sounds like you saw a lot of this internally at Microsoft, and I'm sure that you're going after the pain with Diagrid, and I want to dig into Diagrid and microservices in a minute, but one of the things that's interesting is that,
even though in a vacuum,
if you just step back and looked at things objectively,
like a reasonable person would say,
okay, there are things that make sense to build
and there are things that make sense to outsource
to a service that only does this group of things
like extremely well.
But somehow we keep returning to, oh well, we can just build it ourselves. You start down this path and it seems really innocuous, and then you realize, well, goodness gracious, I'm building a failover replication service, right? And it's like, well, that is a monumental thing to build and manage and maintain in and of itself, and it's not directly related to the core product you're delivering in terms of the user experience that someone has in solving their pain point.
Well, I mean, I think there's always been this rule of software over the years: if you go back 20 years and compare software 20 years ago, 10 years ago, and today, it's becoming much, much more complex. And I think levels of abstraction are slowly filtering into developers' minds. So if you roll back 20 years, people probably still liked to write their own linked list libraries and put that together themselves rather than take advantage of a prebuilt one. Well, no developer in their right mind would write a collection library,
linked list library anymore.
They're like, well, why would I do that?
It's because I need the speed of my development.
And then I think the next level on that is, well, how is it I can not build, say, for example, my own messaging system for doing pub/sub messaging between services, why can't I just take advantage of one?
And I think people have.
And then of course, the next level of complexity on top of that is, well, how do I now take whole solutions and stitch them together? And you also then get, today, to API design, where developers are just stitching together APIs: call a Stripe API, call a flight API, and build an application out of them as the ultimate thing.
I think developers slowly get more and more pressure on them as these levels of infrastructure become available; they're more apparent and developers get more used to them. And that's what happens. And I think today the complexity is hard enough that developers don't want to go down to that level; they've got pressure from the business and they've got to get these things out as well. But still, developers do love to build things themselves. Yeah, yeah, for sure. I think there's
this thing too that I've seen where I think I'd call it sunk cost fallacy of like, well, I've
already built a PubSub like for this other project. So we're going to use that. Or I've already built
this other thing. Sometimes it's I want to build this thing because I want to build it. But other
times it's like, well, we've already built that. We should reuse it.
And even when that's the case, that still very much minimizes the updates and the maintenance and the patching and security requirements and a million other things, right?
Yeah.
I think there is, I mean, I think what happens over time is that there's a desire, at least from the business leaders' perspective, to choose a level of standardization, because if they choose a level of standardization and consistency, it allows them to deliver their software faster. And that standardization can be: some companies say you can only program in these languages, or only use these frameworks, or use this particular database and not those databases. So they choose some technology standards. But there are also increasingly API standards as well, or standards around protocols, or standards around anything that you can imagine. And I think the more a company can set itself down golden paths, paths that they either define through standards or define with a set of technologies they put together themselves, the more that becomes part of the company's philosophy and mentality, and that helps them deliver things faster.
And so I would say that, particularly for developers who were building these microservices applications, we identified a set of golden paths and a set of API standards, because they were common patterns. And as I said, don't reinvent the pattern. Why build a linked list class inside Java? Why build a pub/sub messaging system when something like an open source project like Dapr can provide that for you? And then, yes, is it a matter of just ripping out what you have today and putting something else in? Because I think the other thing is that over time, it's not just the messaging system that you've developed; you realize there's observability data and telemetry, there's security, and then there's things like failover and retry and resiliency, these cross-cutting concerns that you have to add everywhere. And that's the thing that adds up. Yeah, totally. Well, okay, let's talk about Dapr and Diagrid, and maybe we can do it from the standpoint of kind of a microservices 101.
Yeah, and I know that some of our listeners, maybe they're even managing a Kubernetes cluster.
They listen to this podcast.
Right, exactly, exactly.
But others may be more deep into,
let's say working with data itself
and modeling it within a data store, et cetera, right?
And so there's a broad spectrum.
But microservice architecture is increasingly prevalent
in the world of data and even sort of internal
data stacks.
Yeah.
And so let's talk about microservices 101 as a way to understand Dapr and Diagrid.
Yeah, okay.
All right.
Well, I mean, you know, what a microservices architecture is, is actually driven by the business in the end. The business kind of drove the requirements, because they needed to be able to ship, and ship things faster.
And if you go back to sort of traditional monolithic designs,
that was like a few big executables
that you compiled together,
the team defined interfaces at the code level,
everything got linked together in terms of linked libraries
and you built an executable and you shipped it.
But of course you couldn't wait on everyone. The business said, well, say I'm building a data processing application, and it consists of an ingress service, and then maybe it consists of other services that do filtering and pipeline processing, and they've all got to work together. And the reason why they did that is because one team can deliver some business or technology functionality faster on its own in a distributed architecture.
So microservices architecture is the predominant architecture now that people build when they build backend services, particularly running on cloud technologies. And what I mean by cloud is anything from VMs to containers to platforms like Kubernetes, where you have to put it on multiple machines.
But the breaking up of this application for agility caused a different problem, which
was complexity of communication over networks, having to send messages between things instead
of it all being compiled into one binary, which just had simple local calls.
But it also gave the advantage of you could scale things out.
So if I wanted more of the ingestion service and I wanted a thousand of those to ingest more data,
but only one of say the data filtering service, I could make that choice,
which was another architectural advantage around these things.
So you've got the ability to scale, distribute,
and kind of business agility at the cost
of distributing this round and more network communication.
So what you have to do is solve for that problem.
And so that's where, you know, what we did,
when you start to understand that space
and the architecture of these apps,
how is it you can help common design patterns
for communication and coordination?
And those are the sort of big ones. And when you get down to communication, it's like, how do you send messages between these services? And typically you do that with a message broker in between, so things like Kafka or RabbitMQ, and protocols like AMQP and MQTT around this. These are all message brokers with particular protocols, and that's been a very well-established pattern for years. I mean, if you go back, IBM had MQ Series: you just sent messages between queues, and the queue was a permanent data store for messaging between things. Well, that's what message brokers do today; they allow you to communicate between things. Or there's this sort of request-reply, where I call onto the payment system and I wait for a response, and then it calls me back.
So these messaging patterns, coupled with coordination patterns of how do I call this, then this, became the hard problem to solve. And so that's the sort of thing that we tackled in Dapr, the open source project, to help developers build these microservices applications that the business needed, by giving them common design patterns: messaging; state management, which is how do I save state into common databases; and then workflow orchestration, which is how do I coordinate these things, sometimes known as the orchestration or saga pattern.
So hopefully that's kind of like the 101 of microservices
and today we see this as the predominant design.
Sometimes people call this cloud-native design, microservices design,
distributed systems design for these backend services.
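To make the two communication styles Mark describes concrete, here is a minimal sketch against the Dapr sidecar's HTTP API from Python. The app id ("payments"), pub/sub component name ("orders-pubsub"), topic, and port 3500 are illustrative assumptions that depend on how your sidecar and components are configured.

```python
# Request-reply versus pub/sub through a local Dapr sidecar (a sketch, not a
# definitive implementation). Requires a running sidecar, e.g. started by `dapr run`.
import requests

DAPR_BASE = "http://localhost:3500/v1.0"  # default sidecar HTTP port (assumed)

# Request-reply: invoke another service through the sidecar and wait for its answer.
resp = requests.post(
    f"{DAPR_BASE}/invoke/payments/method/charge",        # "payments" app id is hypothetical
    json={"order_id": "1234", "amount": 42.50},
)
print("payment service replied:", resp.status_code, resp.text)

# Pub/sub: fire-and-forget publish to whatever broker (Kafka, RabbitMQ, ...) sits
# behind the named pub/sub component; subscribers pick it up asynchronously.
requests.post(
    f"{DAPR_BASE}/publish/orders-pubsub/order-created",   # component and topic are hypothetical
    json={"order_id": "1234", "status": "created"},
)
```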
So we talked a little bit before the show about this microservices versus monolith thing, like the classic headline, right? You and I have both probably read dozens of articles that start something like that. Let's talk more about that, because I think with a lot of the things I read about it, it's really hard to tell what people are even talking about, honestly. And then specifically, how your organization actually should dictate more of this, like how the services, let's not even call it micro versus monolith, how services are architected. I'd love your perspective on that.
Yeah, I think this whole battle of monolith versus microservices is kind of silly and moot, because it really comes down to where do you want to draw lines, where do you want to draw boundaries. And yeah, I mean, a monolithic application is perfectly fine. If you know that you just want to build one application, you deploy it, maybe you've only got 10,000 users and you can put it on a large machine, you should totally do that. And it's only a couple of developers who built this thing, and you ship maybe once a month or something like this. Right. So totally fine to do that.
I mean, I'm totally encouraged.
And then you get these other boundaries. Well, there's a team over here who wants to build a payment service, and there's a team over here building the inventory service, and they're going at different schedules, at different speeds. And so you have to draw these interesting technology boundaries, but also organization and application boundaries, in the right sense. And yes, you could definitely design that and have, you know, a thousand microservices all doing teeny little things, or you can design it more sensibly, which is like two applications and there you are, communicating between them. So I think that's the skill, and really the slight art, of picking the right level of segmentation of your application so that you're not going to the extreme.
And in some ways, some people include all these serverless functions; they say they're using microservices and every function is a microservice, and these Lambda architectures have a thousand functions running in them and you can't get your head around it.
Right.
And you think a little Lambda application is fine, you know, but if you look at the Lambda applications at scale, people are like, oh gosh, this is terrible. And so there was this great article, I remember, like a year or so ago, where this team at Amazon claimed that they had moved from a microservices architecture back to a monolith, and everyone was like, well, there you go, microservices have failed. They had like a thousand Lambda functions and no one could tell what the hell was happening. They wanted to get back to like five microservices instead. That was it.
Yeah.
Because they've taken it to the wrong level.
It's a shame there's not more terminology, because in my mind there should be like microservices, nano services, pico services, like there should be different levels. Because the reality is it's about the proper number of services for the organization, for the solution.
Yeah.
But, you know, going back to these designs, typically we see to a large extent lots of people doing migrations of code today with their legacy Java code or legacy applications. Moving to the cloud still dominates enormously what people call modernizing existing applications, and typically they've been running it on a physical machine they built it on before. Right. For them, modernization becomes, hey, I've got to be able to split the thing into a few more different parts, because I want to ship it a little bit faster or scale it out differently. It's often a scale problem, actually.
And also, they feel the need to deploy it on some cloud infrastructure, like a container service, like Fargate in AWS or GCP Cloud Run, or even Kubernetes as a platform, as a way of scaling the underlying infrastructure so that they can put those demands on it.
And so that's where Dapr effectively comes in.
It allows you to have consistent design patterns for that.
And so to take it one typical problem that
Dapr solves is that you get all people go right, I'm going to call between these two services with
Kafka. And they choose Kafka as a message broker. But Kafka is not really designed just for sending
messages. It's like a streaming service. So they spend sort of a couple of months building this
pub sub messaging, publish messages, send messages, receive messages, then they build like
message security inside all this, all of this all around
Kafka. And then the client is built into their application.
They break this all in. And then there was a line of business
code. And then they they they find that while they can't
develop this locally very easily, and then you have to
move it to the cloud. then often they get into a
problem where someone else comes along goes well I never really like Kafka can we use rapid MQs
and they're like well we have to start all over again or often people do multi-clouds like I
deployed and did this on Azure but I've got to move it to another platform. So what Dapper does is it takes away these problems
of you having code evolution, flexibility of change,
as well as multi-cloud and developing locally
because it defines a well-defined interface called PubSub.
And then it implements that
on many different message brokers for you.
So it implements it on Kafka, RabbitMQ, Solace,
all the public cloud providers around
these things. And so you can sort of plug and play your favorite message broker, independent
of the API that developers use, and give a clean contract and giving this design flexibility
and forcing you and taking you down golden paths for designing Microsoft's architects
is very well.
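And the receiving side is just as broker-agnostic. Here is a hedged sketch of a Dapr pub/sub subscriber in Python with Flask: the app tells the sidecar which topics it wants via the /dapr/subscribe endpoint and receives messages on the route it names, so nothing in the code mentions Kafka or RabbitMQ, and swapping brokers is a component configuration change. The component and topic names carry over from the earlier sketch and remain assumptions.

```python
# Subscriber side of Dapr pub/sub (a sketch under assumed component and topic names).
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/dapr/subscribe", methods=["GET"])
def subscribe():
    # Tell the sidecar which pub/sub component and topic to deliver, and to which route.
    return jsonify([
        {"pubsubname": "orders-pubsub", "topic": "order-created", "route": "/orders"}
    ])

@app.route("/orders", methods=["POST"])
def on_order_created():
    event = request.get_json()               # CloudEvent envelope from the sidecar
    print("got order:", event.get("data"))
    return jsonify({"status": "SUCCESS"})    # acknowledge so the message isn't redelivered

if __name__ == "__main__":
    app.run(port=6001)                       # the app port the sidecar was told about
```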
We're going to take a quick break from the episode to talk about our sponsor, RudderStack. John, you know how hard it can be to make sure that data is clean and then to stream it everywhere it needs to go. Yeah, Eric. As you know, customer data can get messy. And if you've ever seen a tag manager,
you know how messy it can get. So RudderStack has really been one of my team's secret weapons.
We can collect and standardize data from anywhere, web, mobile, even server side,
and then send it to our downstream tools. Now, rumor has it that you have implemented
the longest-running production instance of RudderStack
at six years and going.
Yes, I can confirm that.
And one of the reasons we picked RudderStack
was that it does not store the data
and we can live stream data to our downstream tools.
One of the things about the implementation
that has been so common over all the years
and with so many RudderStack customers is that it wasn't a wholesale replacement of your stack.
It fit right into your existing tool set.
Yeah, and even with technical tools, Eric, things like Kafka or Pub/Sub, you don't have to have all that complicated customer data infrastructure.
Well, if you need to stream clean customer data to your entire stack, including your data infrastructure tools, head over to rudderstack.com to learn more.
One question I have, because you've seen this at scale, I would say, more than most people, right? You were doing this at Microsoft internally and for customers at Microsoft. And I know at Diagrid you've seen this with your enterprise customers as well.
You mentioned two things and I think they're interesting because
microservices can help solve both of them. One is removing dependencies for delivering software
and the other is scale. Can you speak a little bit to, and I'm thinking about our listeners who
maybe haven't seen the level of scale and sort of felt the searing pain of some of these problems as you start
to face them, what are the symptoms on, like for each one of those that they should have
in the back of their mind as they think about addressing problems at scale?
So for dependencies and then scale itself?
Well, I mean, just on the dependency side of things, take one of the advantages that Dapr provides, and that is it has a state API, and you just save and retrieve state as key-value pairs. Yes, it's very common for you to save state as a key-value pair into a state store, and that might be just because you've got some session state or a shopping cart or something like this. And so one of the things that Dapr has is a key-value state management API. And then what you can do behind that is plug in any one of the databases of your choice. There are like 30 different databases, you name it, Cassandra, all the clouds, they're all there. And so developers can use the state API to save and retrieve data, but plug in different databases behind it all, and effectively deal with those two independently. And then developers don't need to change their code; they can just swap in a different state database behind the same API. So that's kind of one of the abstractions it provides.
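A small sketch of what that state API looks like from application code, again over the sidecar's HTTP API. The component name "statestore" is whatever logical name the platform team defines; whether Redis, MongoDB, or Postgres sits behind it is invisible here. Names and port are assumptions.

```python
# Save and read a key-value pair through the Dapr state API (illustrative names).
import requests

DAPR_STATE = "http://localhost:3500/v1.0/state/statestore"

# Save session or shopping-cart style state as a key-value pair.
requests.post(DAPR_STATE, json=[
    {"key": "cart-user-42", "value": {"items": ["book", "mug"]}}
])

# Read it back by key; the backing database can change without touching this code.
cart = requests.get(f"{DAPR_STATE}/cart-user-42").json()
print(cart)
```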
So getting into your question about scale, what that really comes down to is: if you've got incoming requests, say, for example, you're doing a payment system or a transaction system, and you've got lots of requests coming in and every transaction has to be processed, well, clearly, if you just have one machine running there, dealing with 1,000 requests, and it scales up to 10,000 requests, it's not going to be able to handle it, because it has to do all the processing behind that. So the whole idea is that you can have 10, 20 things that you scale out. And typically, there are technologies where you look at the queue of data that's coming in, and as the queue starts to grow larger and larger in size and hits some threshold, you spin up another instance of your process that then takes on that load. So you get these consumer patterns where you're looking at a queue of data, it hits some threshold and spins up another instance, and you want this automatic instance creation to deal with the messages that are on a queue, to scale out.
So the classic one is messages on a Kafka queue: let me scale out the number of consumers listening to that message queue, and it might go from one to two to three to four to five. And then you're sort of round-robining across each one, and they're all doing their own business processing, sending it off, doing some calls to a database to retrieve some data, and processing those messages. And then as the queue goes down overnight, because it's nighttime or something like this and you're not doing so much processing, you scale your servers down again from ten, nine, eight, seven, six, five, four, three, two, one. So scaling is very much an aspect of cloud native technology, and this is one of the things that Dapr really helps you with: you scale out these instances, and you deal with the number of instances as your application deals with load over time.
Does that make sense?
Yep. Yep. Super helpful.
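The queue-depth-driven scale-out Mark walks through is, at its core, a small control loop. In practice you would hand this to a platform autoscaler rather than write it yourself; the sketch below only makes the pattern concrete, and queue_depth and set_consumer_count are hypothetical stand-ins for your broker's metrics API and your orchestrator's scaling API.

```python
# Threshold-based scale-out and scale-in of queue consumers (a conceptual sketch).
import time

MESSAGES_PER_CONSUMER = 1000     # assumed capacity of a single consumer instance
MIN_CONSUMERS, MAX_CONSUMERS = 1, 10

def desired_consumers(depth: int) -> int:
    # One consumer per MESSAGES_PER_CONSUMER pending messages, clamped to the limits.
    wanted = -(-depth // MESSAGES_PER_CONSUMER)   # ceiling division
    return min(MAX_CONSUMERS, max(MIN_CONSUMERS, wanted))

def autoscale_loop(queue_depth, set_consumer_count, interval_s=30):
    current = MIN_CONSUMERS
    while True:
        target = desired_consumers(queue_depth())  # e.g. consumer lag on the topic
        if target != current:
            set_consumer_count(target)             # spin instances up or down
            current = target
        time.sleep(interval_s)
```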
Yeah. I think one of the things that I'm curious about
is you've got technologies,
like I think like Terraform comes to mind, right?
Like infrastructure as code type stuff.
How would you position, does that like work with this?
Or is this kind of infrastructure as code as part of it?
Like, how does that work?
Yeah, that's a great question. In fact, I'm glad you asked this question, because it often helps clear this all up. So with Dapr, there's a separation between provisioning the infrastructure and using the infrastructure. Dapr and building the application itself doesn't have anything to do with provisioning the infrastructure; the developers are simply using it all.
And this is actually a very strong topic. I don't know how much your listeners are interested in or aware of platform engineering, but there's a rise of platform engineering today inside many organizations. This platform team is now servicing many application teams sitting on top of it all. And that platform engineering team typically provides a message broker service for communication. They typically provide database services for storing state and data. They typically provide a compute service for where you run your code. And they typically provide some sort of networking service. They provide these to the application teams, and they stitch them all together. So that rise of the platform engineering team
really kind of defines how they set up the infrastructure.
But then the developers who are using that infrastructure
is where they come in and that's where
DAPA fits into their code.
Because they want to say, right, I want this application to just have pub/sub semantics, but you provisioned me a Kafka message broker through Terraform. Yes. So the platform team has given me, through Terraform, or you gave me, these databases. When I'm using that key-value store API in my application to store my session state, I'm just using the MongoDB database you've given me. And if all of a sudden you said to me, well, here's a SQL database instead, I don't have to change my code. I can just do a config mapping over to this. So that provides a sort of platform layer of mapping between the API and the underlying infrastructure.
But it doesn't do any of the provisioning of it.
Right.
This is the key thing around it all.
There's a clear segregation of concerns.
Platform engineering, deployment, and then application team uses it.
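That config mapping usually lives in a Dapr component definition owned by the platform team (normally a YAML file deployed alongside the app). A sketch of its shape is below, shown as a Python dict mirroring the component spec's fields; the host value is a placeholder, and swapping MongoDB for, say, Postgres means editing this mapping rather than the application code.

```python
# The logical-name-to-infrastructure mapping the platform team owns (illustrative values).
import json

statestore_component = {
    "apiVersion": "dapr.io/v1alpha1",
    "kind": "Component",
    "metadata": {"name": "statestore"},    # the name application code refers to
    "spec": {
        "type": "state.mongodb",           # change the type to swap the backing store
        "version": "v1",
        "metadata": [
            {"name": "host", "value": "mongo.internal:27017"},  # placeholder connection info
        ],
    },
}

print(json.dumps(statestore_component, indent=2))
```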
One of the spaces I think this would be super useful would be mergers and
acquisitions, right? Can you imagine if you had a merger and acquisition where
both teams were using this technology? I mean, that would be...
Well, exactly. Yes. I mean, we see this all the time, with people baking the client code from some infrastructure into their application. I mean, I dealt with a large bank that baked Kafka into 2,000 microservices. Yes. Literally every microservice had the Kafka SDK in it. And then they decided they didn't like Kafka and they wanted to move to one of the clouds. So it took them 18 months just to simply re-engineer Kafka onto the new cloud provider, because they had to somehow keep the existing one going and do the new one and do all the testing of this all. Whereas with Dapr, they could have done that in a day: effectively switching from Kafka baked into 2,000 microservices over to using AWS SNS as a service, and the same with their databases around these things. So I think this is a key element that you see with that. And also you see a lot of this coordination as well. Another very important thing that
Dapr does for application developers, and for the business as a whole, is this workflow coordination. I don't know how familiar you are with things like Airflow, those sorts of tools. Dapr has a code-first, developer-friendly workflow engine built into it, and it basically does durable execution. So you can call onto the data ingestion service, then call onto the data filtering service, and maybe the data processing service over here. And if it fails and your machines recover and start up again, it doesn't try to do this all again; it remembers where it was and just carries on and keeps going. And this is the key aspect of workflow engines and this sort of saga pattern, the coordination or orchestration. Super, super hard problems to solve. Incredibly easy with Dapr, to help the microservices architecture coordinate the services in a consistent way.
Yep.
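A hedged sketch of that durable-execution idea using the Dapr Python workflow extension (the dapr-ext-workflow package). The decorator and context names follow that SDK's documented examples, but check the docs for the exact API in your version; the activity bodies and names are purely illustrative.

```python
# Durable workflow sketch: if the process crashes and restarts, completed steps
# are not re-run; the engine replays its history and carries on where it was.
from dapr.ext.workflow import WorkflowRuntime, DaprWorkflowContext, WorkflowActivityContext

wfr = WorkflowRuntime()

@wfr.activity(name="ingest")
def ingest(ctx: WorkflowActivityContext, batch_id: str) -> str:
    # call the data ingestion service here (placeholder)
    return f"ingested:{batch_id}"

@wfr.activity(name="filter_data")
def filter_data(ctx: WorkflowActivityContext, ingested: str) -> str:
    # call the data filtering service here (placeholder)
    return f"filtered:{ingested}"

@wfr.workflow(name="data_pipeline")
def data_pipeline(ctx: DaprWorkflowContext, batch_id: str):
    ingested = yield ctx.call_activity(ingest, input=batch_id)
    filtered = yield ctx.call_activity(filter_data, input=ingested)
    return filtered

if __name__ == "__main__":
    wfr.start()   # registers the workflow and activities with the sidecar
```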
So Eric, we've made it almost 30 minutes
without talking about AI.
That's pretty good for us.
Wow.
But I have to ask-
More than 30 minutes.
Yeah, more than, yeah, you're right.
Yeah, that's good.
But I have to ask, because just thinking through these patterns that we're talking about here: AI is moving so fast, there are so many vendors. I could see this being extremely useful. Imagine essentially baking in, going really all in on, like, OpenAI or Claude or whatever, and you've got it everywhere, you've got a thousand microservices, and then your company is like, well, we just signed a deal with fill in the blank, some new service. And then it's like 18 months, like you said. AI is moving so fast, I can imagine this is super relevant for people.
Exactly. I mean, to speak to that: just in the last release of Dapr, we introduced this API called the LLM conversation API. What it does is it's an API that you call, and you say, here's a prompt, and you can plug in OpenAI or Anthropic or AWS Bedrock or DeepSeek, any one of these. So today it's the same problem: a developer binds the OpenAI SDK into their code, and it only has a certain set of capabilities, and then if they want to switch it, they have to rewrite their app. Well, Dapr has this conversation API that allows you to plug in any language model underneath it. And not only that, it also has some really cool features that it layers on top of this. It actually does caching of prompts, so that if you did a prompt before, it doesn't go to the model again, which saves you money. And data obfuscation, where if you send in social security numbers and email addresses or anything else like this, either going in or coming out, the API will scrub all that information out and prevent you from either giving the language model some sensitive data or leaking something out. So not every language model has this, but Dapr adds this capability.
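A rough sketch of calling that conversation API over HTTP. The component name "openai" is whatever LLM component the platform configured, and swapping it for Anthropic, Bedrock, or DeepSeek is again a component change rather than a code change. The endpoint path and payload shape follow the alpha API as documented around Dapr 1.15; treat the exact field names as assumptions and check the current docs.

```python
# Calling the Dapr conversation API (alpha) through the sidecar (illustrative shape).
import requests

resp = requests.post(
    "http://localhost:3500/v1.0-alpha1/conversation/openai/converse",  # "openai" component is assumed
    json={"inputs": [{"content": "Summarize last week's ingestion errors in two sentences."}]},
)
print(resp.json())   # expected shape per the alpha docs: {"outputs": [{"result": "..."}]}
```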
So we see a lot more now in application development that developers are switching out a piece of procedural code that they had to write before for a language model to do things, and they're using language models to be part of their application, whether it's data translation, or querying your data, or particularly bringing back some data and then using the language model to inspect it and give you some insights into it all. And so language models are being deeply integrated into all developer applications nowadays. And increasingly we're getting to this world of agentic applications as well, which is autonomous agents that are making decisions on behalf of the user to do things, using the language model to make an application much more dynamic, particularly in these workflow scenarios.
So this is a throwback for me, and I don't know if you'll remember this, Eric. Do you remember when there were a couple of open source projects, like operator or agentic models that would run and click things on your desktop? I think AutoHotkey, and obviously it's a little bit different, but I think, especially in the beginning, it's going to be the same problems you had with those. Apple's Automator or AppleScript, do you remember Automator? Yeah, yeah, there you go. Yeah, well, AutoHotkey. They record these 15 steps and then play them back.
Totally. And that's deterministic, and the LLMs are not deterministic. There are advantages, obviously, with LLMs, but I still just think people are underestimating it, because it's like, oh, look at, think of all these applications. Yeah, I was there too, and I really wonder what the curve is going to be where it's extremely reliable to do operations. I mean, you have to get the balance right here. Again, it's just like a design decision.
You're not going to get a language model that's working on a factory floor, welding doors onto a car, to decide, well, I won't weld a door onto this car anymore, I might weld to the left or to the right or something like this. That has to be very well designed and procedural. So yeah, in manufacturing, things can't have an LLM deciding, what should I do next in this car process? Because that'd be a disaster. You'd have terrible results.
But using language models, I think, particularly for human engagement around these things, is basically bringing something back and asking, does this look right?
And if you look at developers today, like every developer I think that I can think of now
who is taking advantage of language models
can make themselves 20, 30, 40% more productive
just because it's doing tedious work.
You just say, well, I wrote this code, write the test for me
and just write the test and you have a test case
and you drop it in there.
Some of the same is happening
inside the applications themselves.
I mean, take anything from a vision model: well, first, trying to create a vision model in code is really hard. Right. Secondly, if it's a language model, can you take advantage of just getting the language model to inspect some data or give you a summary of it back, so you can make some better decisions around these things?
And so that's what you're seeing.
You're just seeing, yeah, yeah, language models give you something back that's 80% good for
people to take advantage of.
Whether it's a salesperson scanning their sales data and just saying, give me the summary of these 15 meetings from last week, just give me the highlights from it all, or figure out the best flights for me, those kinds of things. So it's that human-language-model interaction that's going to happen quite a lot in this world.
Yeah, I think two other interesting ones I've been thinking about are essentially low-stakes applications. Two I thought of: one is graphic design, not like what we're seeing now, where it just generates an image, but actually have it, like, here's Photoshop, here's Figma, here's whatever.
Like I want 10 iterations of this,
like built in this tool as a vector image or something.
And then another interesting one would be like CAD, right?
Like give us the CAD and like, give me 10 prototypes,
write them out on a 3D printer and like,
I'll examine them and I'll tweak them.
Like those things I think are gonna be really cool
for creativity and innovation.
I mean, you've seen this already. Just the other day I was trying out something where I took a picture of a cup, and I said, all right, let's see what I can do with that. And I said to the model, can you fill this cup with coffee? And it came back, and there's a full cup in the same picture with coffee in it. And I was like, well, okay, can you put this coffee on a table now? And there it was, the same picture with my coffee on a table. And you can tell it to do the things that you want to do. And I think as the language models and image models become more and more interactive, you will definitely see editing around these things, rather than me dragging and dropping things. I'm sure one day we'll get to a point where you're just voicing it, just talking through it all, and it's going, hey, move the coffee cup to the left a little bit. Right. Yeah. Squeeze it. It's very Minority Report-ish. Yeah, sure.
One question, and I mean, this is getting a little bit meta, but in terms of using AI to operate Dapr or Diagrid, where you orchestrate microservices and all of that: it makes total sense that storing state, caching prompts, all of those things are provided as a service for companies that are integrating APIs for these models into their applications. I mean, it's a no-brainer to have a framework that helps you do that. But it would surprise me if you are not also thinking about, well, how is that baked into actually operating, using, and developing the platform itself? Well, I mean, in terms of AI being part of that, is it being part of the platform itself? Or? Yes, yeah, yeah.
Yeah, and so I would say, let me give you one example where I think AI is being used heavily from the developer side. We spend a lot of time on this. So Dapr is very good at accelerating your development and building your distributed application faster, and we've shown that if you embrace Dapr, you can deliver your business application anywhere between 30 to 50% faster than you could normally. So there you are. We did a State of Dapr report that we published about a month ago. That report surveyed over 200 engineers, and 96% of them came back and said that at a minimum it helped them deliver their applications 30% faster; most of them said more like 50% faster, because it did all these common patterns. So that was amazing. So all these developers building backend services were accelerated, but they still had to sit down and write a proof of concept and build these things and put together the API.
We do this with a lot of companies today at Diagrid.
We come along, we provide enterprise support,
we provide architectural guidance,
we provide best practices,
and we take you down your cloud native journey.
We clearly use Dapr, because we see the acceleration around all that, to help you build these business applications faster.
And so often we build a POC with a company, and it may take a week to build a POC, maybe two weeks. Well, one thing that we've investigated and started building is: how can we take a diagram of an architecture and turn that into code in a matter of minutes? And in fact, we've now developed technology where we can sit down with a customer and say, well, here's my data ingestion application, here's my filtering application, I'm doing communication between them all, here's my database state store down here. We draw a picture on a whiteboard, we turn around a camera, point it at the whiteboard, and press a button. And it goes off and looks at the diagram, understands all of the annotations, looks at what you put inside there, and starts to generate the actual backend application for you and all the code, and it's guaranteed that it compiles at the end of all of this.
So if you can imagine this sort of diagram-to-code generation, wow, now all of a sudden you're building these things. And I think the exciting thing to me is that there's a lot of discussion around front-end development and vibe coding and all these types of things, but what's the backend side of these things, from architecture to code? I think that's the new frontier, effectively, that speeds up building these applications. In many ways, we have an unfair advantage, because by using Dapr, you also get all the benefits of golden paths, features, APIs, flexibility, multi-cloud, multi-language, and all that sort of thing. That's where we're taking all of this today, and baking that into the runtime itself.
Man, that is so cool. I mean, you mentioned Minority Report, but that feels very futuristic, because anyone who has used any of the models to build a landing page is like, okay, that is shockingly good relative to the entire history.
But to do what you're talking about is a level beyond.
Yeah. And I think this is a very exciting place that we're exploring at Diagrid a lot. We've started with workflows. It's very common to see a business workflow drawn out as a BPMN diagram, for example, or other sorts of state-machine-like diagrams, but BPMN is very common. And then someone says, well, I've got my BPMN diagram here, here's what my business team gave to me, here are the 50 steps in my business process, can you turn this into code? Rather than going through it yourself, workflow, and task one, and task two, and typing it all out.
Yeah.
Well, what I'm saying is you just take that BPMN diagram, you have a language model that understands it all, and it generates all of the workflow code for you, a compilable, running version of that in code with all the right interfaces, in minutes or even seconds, and then you deploy and run that.
Well, maybe you're filling in your business logic, of course, because as the developer you've still got to write the business logic in there.
But those are sort of things that I think AI is amazing at dramatically
speeding up such that you can take business
visualizations and turn them into code.
And you also get to skip the entire, let's load 50 or 75 JIRA issues and do all of that. I mean, there are a lot of steps, especially if it's a bigger team, that you skip; you can go direct to a prototype.
A bunch of people just went.
I'm gonna do it.
Yeah.
Right.
I think that's the AI.
I think that's the AI.
The Scrum Masters.
The Scrum Masters.
Yes.
So AI has huge potential in the developer space; it's been talked about by lots of developers now. I think it has huge potential in the data space. I mean, it's going to change, everything has changed, I should say, and it's changing the whole data space and developer space around these things.
And at Diagrid, we very much embrace that, and we're very much there to sit down and build these distributed systems using open source technologies like Dapr. We continue to be the main maintainers of the project. Dapr is a very well-established, graduated project inside the Cloud Native Computing Foundation, the CNCF, and graduation is recognized as the highest level of endorsement that you can get inside the CNCF.
Yeah, the graduated status. It's also reviewed by the CNCF team. They had to go off and interview lots of end users, and they had to make sure that the project was healthy and had multiple contributors inside it. So you get all these advantages with those projects.
And then here at Diagrid, we take you down that journey, not only helping you run Dapr on top of platforms like Kubernetes, but also, if you just want to have the Dapr APIs as a platform yourself, we have a product for that as well, a fully managed experience.
And if your listeners are interested in, like, well, how do I kind of get started on building?
You took the words out of my mouth.
I was going to say, I'm salivating.
Yeah.
Where do we find you?
Well, you go to diagrid.io and hit Contact Us, and that's where you can find us. You can also go to dapr.io, and that's where the open source project is.
So between those two URLs, you can find out everything you need.
Awesome.
Well, Mark, this has been absolutely wonderful.
I feel like we could just keep talking, but we're at the buzzer.
Brooks put a hard stop on us.
But yeah, we would love to have you back on in the future so we can continue the conversation
about microservices, AI, Dapr, and Diagrid.
Well, thank you for having me.
It's been fantastic talking with both of you.
The Data Stack Show is brought to you by RudderStack, the warehouse-native customer data platform. RudderStack is purpose-built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.