The Data Stack Show - 242: The Data Convergence: How Operational and Analytical Data Are Merging with Ruben Burdin of Stacksync
Episode Date: May 14, 2025
Highlights from this week’s conversation include:
Ruben's Background (1:14)
Defining Operational Data (5:20)
The Convergence of Operational and Analytical Data (10:53)
Evolution from Data Warehousing to... Fulfillment Centers (13:19)
Challenges of API Integration (18:25)
Understanding Data Complexity (22:18)
Database vs. API Calls (25:32)
Real-Time Database Views (28:15)
The Evolution of Data Technology (32:37)
Future Topics on PostgreSQL Scaling and Parting Thoughts (34:02)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
For the next two weeks, as a thank you for listening to the Data Stack Show,
RudderStack is giving away some awesome prizes.
The grand prize is a LEGO Star Wars Razor Crest 1,023-piece set.
They're also giving away Yeti mugs, Anker power banks, and everyone who enters will get a RudderStack swag pack.
To sign up, visit rudderstack.com slash TDSS-giveaway.
Hi, I'm Eric Dodds.
And I'm John Wessel.
Welcome to the Data Stack Show.
The Data Stack Show is a podcast where we talk about
the technical, business, and human challenges
involved in data work.
Join our casual conversations with innovators
and data professionals to learn about new data technologies
and how data teams are run at top companies.
Before we dig into today's episode,
we want to give a huge thanks
to our presenting sponsor, RudderStack.
They give us the equipment and time to do this show
week in, week out,
and provide you the valuable content.
RudderStack provides customer data infrastructure
and is used by the world's most innovative companies
to collect, transform, and deliver their event data
wherever it's needed, all in real time.
You can learn more at rudderstack.com.
Welcome back to the Data Stack Show.
We are recording live on site in Oakland, California, at the Data Council conference.
... you know, 30-second background on yourself.
So my name is Ruben. You know, I grew up in France, and, you know, now I've moved to San Francisco.
I'm co-founder and CEO at StackSync.
So I worked in several startups and medium companies,
you know, medium size companies,
in everything from ETL, reverse ETL, data stack,
as well as on AI, you know,
before it became something big.
Before that, I did a master in computer science
in Switzerland, and as well a bachelor in business.
Actually, this is where I come with
a double perspective on the business and data worlds.
That's awesome.
So one of the topics we're excited about talking about
is this convergence, I would call it,
of analytics and operational data and what that looks like.
So what's something that you're excited to chat about?
Yes, exactly.
So what really excites me at the moment
is really to see how what was traditionally analytical data
and what was traditionally operational data
actually like they really start to merge
and actually meet the same level of SLA.
So actually to see that a data warehouse
can become operational and a database can become analytical.
And like, this is really something which I see coming
in many of the small to large size companies.
So it's really like a trend really in the industry.
And I really think this is here to stay.
Yeah, so this is really something which excites me.
All right, let's dig in.
Let's do it.
All right, Ruben, so excited to dig into
operational versus analytical data,
how and why those two things are converging.
But before we do that, let's talk a little more about your background.
So you mentioned you studied both business and computer science, and that gives you a
unique perspective in your job today.
I'm sure there are some stories there, maybe just kind of throughout your career so far,
what would you say, like how has that impacted
and maybe given you a kind of superpower
to get you where you are today?
Yeah, absolutely.
So when I was a kid, I really started to hack around
and doing so many HTML pages and whatnot.
And so what happened is when I started university,
I started with a bachelor in business administration.
So, and my university just launched what was called
a data science fundamentals program.
So actually like nobody heard about data science
and all that.
I was explaining to everybody what was data science
when they went and told me like, what are you doing Ruben?
And so, and from there, you know, I really understood,
you know, this is where I can have a mean back photography.
I can, I really think I can identify patterns
that nobody is seeing around and this,
then this just turned out to be
the largest company we know today.
So, and this was like probably like seven years ago.
So what happened is that basically I then I went to Singapore to work in startup, also
to do an exchange semester and there I was fully technical and then I came back to Switzerland
where I grew up and then I did like a Master in Computer Science.
So I think really the ability to come up with computer science and data science with a perspective
of business, so if you're not thinking fully as an engineer with all the complexities
and details, enables you to actually simplify some concepts which you just don't know,
but actually you just end up abstracting them away and finding a better solution.
And being able to think as an engineer into a business ecosystem really helps you to actually
frame and make your way much quicker to things even without having the full business knowledge.
And this is really something which enables you to go much faster.
And so that's what I value a lot to do into my education.
And then after doing my master in computer science, I really rolled out in several companies.
I built out data capabilities from zero to hero for several companies.
Also in AI before all this LLM thing was a thing.
And yeah, this is where I come from.
Awesome.
I mean, it makes so much sense too.
I think the topic we'll dig into, operational
versus analytical data, really matches
those two things we're talking about, right?
Like your operational mind, kind of business,
how does that meet analytics?
And we tie those two things together,
which is what you're doing at StackSync.
So super excited to dig into that,
but let's just kind of back up
and even just define operational,
then analytical, like in the traditional sense
before we talk about merging those,
how would you just define operational data?
Yeah, so operational data, imagine yourself more as a full-stack developer, right?
So you're building an app with notifications and a UI and a Postgres backend, for example,
right?
Most typical setup.
So what you have is actually like operational data
is really the data which is based out of events, right?
Out of clicks, you know, out of actions that the user takes
and which need an answer right now.
So actually they are actions which are not batched.
They are, okay, in real time.
So actually there are two characteristics.
Like one is they're real time.
And second, they are not batched.
They are like single events where it is a single JSON, not even a list. It's a single JSON.
I sign up, it writes to Postgres, and then I get access to the app, like, upon sign-in.
Instantly.
Exactly. This is like, there is no delay.
So it's like the data which actually you even need a response to.
And so then you have the contrary, right?
The opposite side of the spectrum, right, is more the analytical data.
So, maybe like the way I define analytical data is data which actually makes sense at
an aggregated level.
So it's not a single event, it's actually at an aggregate level.
For example, let's say your revenue metrics,
they only make sense once you can compute
several deals together, right?
Like a single deal makes no sense
for a revenue metric.
So you would need to actually aggregate.
So aggregate means stepping back enough
to have several data points,
because if you dig too much into a time range,
for example in a dashboard,
you would only have one data point
when you go very precise.
So this is where analytical data
makes sense at an aggregate level,
and is oftentimes data from yesterday or from even earlier.
And so also, if it breaks,
there is no criticality on fixing it.
I mean, like there can be business consequences, but at least a customer does not lose access to the
app.
This is really so it's like batched and it's non-real-time.
This is really like the elements which make analytical data analytical.
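To make the distinction concrete, here is a rough sketch (not something shown in the episode) of the two paths: an operational write that handles one JSON event and answers immediately, and an analytical query that only means something across several records. The table names, fields, and amounts are invented for the illustration, and SQLite stands in for Postgres so the snippet runs on its own.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, plan TEXT)")
conn.execute("CREATE TABLE deals (id INTEGER PRIMARY KEY, closed_on TEXT, amount REAL)")


def handle_signup(raw_event: str) -> dict:
    """Operational path: one JSON event in, one immediate answer out."""
    event = json.loads(raw_event)
    with conn:  # commit this single write right away
        conn.execute(
            "INSERT INTO users (email, plan) VALUES (?, ?)",
            (event["email"], event["plan"]),
        )
    return {"email": event["email"], "access": "granted"}


def revenue_by_month() -> list:
    """Analytical path: only meaningful once several deals are aggregated."""
    rows = conn.execute(
        "SELECT substr(closed_on, 1, 7) AS month, SUM(amount) "
        "FROM deals GROUP BY month ORDER BY month"
    )
    return rows.fetchall()


# A single event, not a list: the user needs the answer right now.
response = handle_signup('{"email": "ada@example.com", "plan": "free"}')

# Illustrative deals; a revenue metric needs several of them to make sense.
with conn:
    conn.executemany(
        "INSERT INTO deals (closed_on, amount) VALUES (?, ?)",
        [("2025-04-03", 1200.0), ("2025-04-19", 800.0), ("2025-05-02", 500.0)],
    )
monthly = revenue_by_month()
```

If the first function breaks, a user cannot get into the app; if the second one breaks, a dashboard is stale until someone fixes it.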
Before we talk about the convergence of those two things, how is that possible today?
What are the tech advancements that have made that possible?
Well, I think there are plenty.
There are plenty.
There is the emergence of ETL.
In the history of data engineering and data in the world, in the beginning, we didn't
have enough, anyway, just storage.
So we just made, hey, we need some storage to store some data. So now we
invented stuff like BigQuery, Snowflake, you know, like database in general. And so, I
mean, just database. Then we made, oh, no, it's bigger data. So now we did data warehouses.
So the different thing. And now to have a data warehouse, it's useful, but we have no
data inside of it. So now we started to create data pipelines, right?
And this is where actors like Fivetran and Airbyte
also come in, among many.
And then I just ship data to the data warehouse.
Yep.
So now you have the data warehouse with some data inside.
Now how do we make it even useful?
Because it's useful, but you know,
just there, just costs money, right?
So now we're gonna make dashboarding, right?
So now Looker Studio, you know, Tableau, Sigma, they start to emerge and become very, very advanced.
Now that we have some data warehouse with some data inside, and also some dashboard,
we have some metrics, something which makes sense, some metrics which make sense from
a business perspective.
And now what?
Now what happens?
At that moment, you actually need to take action
on the data and this is the emergence of the reverse
ETL kind of trend, right?
So it's like taking data from a database
or data warehouse, I mean more data warehouse,
but like shipping it back to the external systems
where your business teams take action and live daily, right?
And so this is where reverse ETL comes in.
And then you have, OK, now we have this loop, right?
But this loop needs to be now faster, right?
The data grows bigger.
You know, we have much more product data.
We have much more logs.
And so now we need it bigger and faster, right?
So bigger means, OK, different technologies to handle things.
Now you have to batch.
You have to take care of API rate limits, quotas and all this, but
also you need to make it faster, right? And faster, the way you build a real-time infrastructure
is completely different on how you build a batch infrastructure, right? You can think
of real-time as smaller and more frequent batches, but actually if you think like this
and you try to sync to your snowflake like this,
it's gonna destroy your cost.
And so this is why streaming is not really equal
to many small batches.
Even though in technical terms it actually is, right?
So in the end everything is a byte.
So yeah, so this is where basically now
you're coming into real time
and where two-way sync comes in, right? So now you don't have just one-way sync plus the other way, and we realize that
one-way sync plus the other way does not make two-way sync.
Because actually you have conflicts which happen, because real time makes it so that you don't have time
to consolidate data before conflicts happen.
Yeah, and this is why, you know, two-way sync does not equal one-way plus the other way anymore, especially in this real-time context.
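A small sketch of why composing two one-way syncs is not the same as a two-way sync. The conflict rule used here (per-field last-writer-wins on a logical timestamp) is an assumption picked for illustration; the episode does not say which rule StackSync applies, and the field names and values are made up.

```python
from dataclasses import dataclass


@dataclass
class Field:
    value: str
    updated_at: int  # logical timestamp of the last change (illustrative)


Record = dict  # maps field name -> Field


def one_way(source: Record, target: Record) -> None:
    """Copy every field from source to target, overwriting whatever is there."""
    for name, field in source.items():
        target[name] = Field(field.value, field.updated_at)


def two_way(a: Record, b: Record) -> None:
    """Merge per field; the most recent write wins on both sides."""
    for name in set(a) | set(b):
        candidates = [r[name] for r in (a, b) if name in r]
        winner = max(candidates, key=lambda f: f.updated_at)
        a[name] = Field(winner.value, winner.updated_at)
        b[name] = Field(winner.value, winner.updated_at)


def make_pair():
    # Both sides start in sync at t=1, then each side changes a different field.
    crm = {"phone": Field("555-0100", 1), "plan": Field("free", 1)}
    db = {"phone": Field("555-0100", 1), "plan": Field("free", 1)}
    crm["phone"] = Field("555-0199", 2)  # a sales rep edits the CRM
    db["plan"] = Field("pro", 3)         # the app writes to the database
    return crm, db


# Naive approach: run a one-way sync in each direction.
crm, db = make_pair()
one_way(crm, db)  # the database's "pro" is overwritten by the CRM's stale "free"
one_way(db, crm)
naive_plan = crm["plan"].value

# Two-way merge with an explicit conflict rule keeps both edits.
crm, db = make_pair()
two_way(crm, db)
merged_plan = crm["plan"].value
merged_phone = db["phone"].value
```

In the naive run the app's plan change is silently lost, which is the kind of conflict that shows up once both sides change faster than a batch can consolidate them.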
And so the beauty of this is that while we talked about this whole transition and evolution
of, like, storage, one way, the other way, two-way,
the evolution of this, we didn't talk about the fact that the data itself, the underlying
asset, by nature became more real-time, right?
Because now you sync it faster.
So it changes faster between the two sides, which is an external system, like
Salesforce, a CRM, and a database like Postgres or Snowflake, a data warehouse.
So now that you have these two systems, which just go more and more into real
time and constantly even more, you know, and to what I think is an ultimate step
into making this fully consistent, then it becomes real time.
So now you are also delivering more real-time dashboards.
It's the ones we talked about, Looker, Sigma, and all this.
And so this actually is the entire quality of the data ecosystem increased on every side.
And so this is where we start seeing a pattern where analytical data becomes more and more
fresh, now like fresher.
And this is where we start actually joining the attributes of operational data, which
is events, right?
So actually, like, single events, more towards streaming, as well as real time,
and analytical data, right?
Actually, you can ship back scalable data, which also shares that root of being an aggregate and making sense at an aggregate level. Because
you have more data available, you can actually ship real-time aggregates.
And this is actually where the two worlds start converging. So that's the convergence,
but what do you think, Eric?
Well, okay, I have a question. This is really interesting, what you're describing, and this is essentially what StackSync does.
You're describing at a very fundamental level communication between two databases, right?
Which was typically done through a series of pipelines and batch jobs and, you know, all that sort of stuff.
And so now we're talking about this. When I think about, just instantly if someone said,
okay, think about an example of analytical data
being used operationally, and you bring real time into it,
I instantly think of something like Flink,
which is very different than two databases talking, right?
But you have an actual event stream,
you're running calculations over that,
and then you're basically using what would typically be considered
like aggregate analytical data,
but you're doing something like maintaining a state
that you can then use to like enforce something
within an application, right,
that needs to happen in real time or whatever,
which is, you know, that's a streaming architecture,
that's pretty common,
but we're talking about a similar type of use case,
but between two database systems,
which is really interesting.
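For readers who have not worked with a streaming engine, the pattern described here can be sketched in plain Python. This is not Flink's API; it only shows the idea of keeping an aggregate as running state, updating it one event at a time, and using it to enforce a rule inside the application. The class name, the spending limit, and the event fields are invented for the example.

```python
from collections import defaultdict


class SpendTracker:
    """Keeps a running total per user, updated one event at a time."""

    def __init__(self, limit: float):
        self.limit = limit
        self.totals = defaultdict(float)  # the maintained state

    def process(self, event: dict) -> bool:
        """Apply one event; return True if the user is still within the limit."""
        self.totals[event["user"]] += event["amount"]
        return self.totals[event["user"]] <= self.limit


tracker = SpendTracker(limit=100.0)
stream = [
    {"user": "u1", "amount": 40.0},
    {"user": "u2", "amount": 10.0},
    {"user": "u1", "amount": 70.0},  # pushes u1 to 110.0, over the limit
]
decisions = [tracker.process(event) for event in stream]
```

The total is an aggregate in the analytical sense, but it is available the moment each event arrives, which is what lets the application act on it in real time.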
Right.
I think another, I really like to overlay like real world physical examples here.
I have a background in logistics and distribution.
Absolutely.
And what I think is really interesting about that is the generation that we're moving past
was called data warehousing, right?
So like a warehouse is somewhere like you store things,
physically store things.
And then I was thinking like, well,
like how did we evolve this?
Like how did Amazon and all of their,
like their distribution evolve?
It's called a fulfillment center, right?
And fulfillment centers are highly operational.
Warehouses are more like you bring something, you store it,
it might sit for a while, it's low transaction,
like there's not much. If you're in a fulfillment center, then like things are constantly going in and out.
You're doing value-add work like white labeling, for example.
And it's just a very different feel than a warehouse. And I think we're seeing this as warehouses have become more operational.
They're closer to these fulfillment centers: increased complexity, higher standards for like low latency and speed, and then lots going in and out
versus just going in and storage.
Give us an example from a customer,
you talked about Postgres and Salesforce,
which I think is probably really familiar
to all of our listeners, right?
Yes.
Give us a real customer example from someone you've worked with at StackSync where that dynamic of real-time two-way sync comes in.
How does that work?
What's the use case?
And then how would you actually build out?
Yeah, absolutely.
So actually, there are several very typical use cases.
And for example, let's say one very typical use case
is how you actually build internal tooling on top
of your CRM system.
So for example, let's say you have a portal where you actually activate user access rights
to a SaaS product, for example.
Or eventually also where, for example, let's say
you would actually check if your invoice has been paid by a customer, enforcing some business
rules which are very hard to actually frame with Salesforce workflows
because it's a very limited tool.
Yeah.
And so you want your business team to actually manipulate
this data in a way which is very customized to your business.
Right?
So, how do you do it?
So, actually now you're going to have to actually build a UI, almost like an app.
And actually, instead of querying the database,
which gives you some consistency guarantees,
you actually would need otherwise to send
API calls to the API all the time, right?
Which involves, you know, like pagination, you know.
Thinking about building this in Salesforce and I'm...
You know what I mean?
I feel like a sick feeling in my stomach.
Oh really?
It's a Lightning app, right?
It's a simple Lightning app.
It's a little Apex code.
That's it.
It's done.
The trick is done.
We just don't want to be that guy to build it.
So we just want it to be someone else.
And this is your alternative.
It's like, okay, now you get StackSync, which powers real-time two-way sync
between your Salesforce and your database.
So you would actually synchronize
all of your relevant data
from your Salesforce to your database.
Actually, you have the exact same schema,
so exact same objects,
so exact same everything, right?
Data types, perfect everything.
And so technically,
instead of writing to the API of Salesforce, well,
let's imagine now you actually have a UI and like 100 employees, a team, who could log in
and actually start doing the same transactions. It's going to make like hundreds
of calls per second, right? And this is not sustainable, right? So this is why actually you
need a database, which actually can handle much more and actually where you can batch this. So
StackSync would just come in.
You have all your data in your database and you would build your internal portal
exactly as you would build a normal app on the database, right?
So very familiar for your engineers to build, so your engineers and your go-to-market know exactly
how to behave, because like the database is a very comfortable zone for them.
It's what they use daily, and what they love. They
know it very well. No need to read documentation. So now they would just read and write from
a database. And so every time there is an access right that needs to be changed, someone
calls you and says, hey, Ruben, please change my access rights for this product. I've got
an invoice and I didn't get it. So I would go into my internal portal. I would actually
be able to change anything. And this would actually change in my database, right?
So very easy to build.
And StackSync would just synchronize this directly
to your Salesforce.
And so you didn't make a single API call to Salesforce,
but you actually modified your data
from the internal portal to Salesforce
because you use a database as an integration tool.
And you can use a data warehouse for this as well
for different use cases, right?
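A minimal sketch of the portal pattern just described: the internal tool changes an access right with an ordinary SQL write, and something other than the portal is responsible for pushing that change to the CRM. How StackSync actually detects changes is not described in the episode; the trigger and the `pending_changes` table below are one hypothetical mechanism chosen for illustration, the table layout and the record ID are made up, and SQLite stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesforce_contacts (
    sf_id TEXT PRIMARY KEY,
    email TEXT,
    product_access TEXT
);
-- Hypothetical change log that a sync process could read and push to the CRM.
CREATE TABLE pending_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    sf_id TEXT,
    field TEXT,
    new_value TEXT
);
CREATE TRIGGER log_access_change
AFTER UPDATE OF product_access ON salesforce_contacts
BEGIN
    INSERT INTO pending_changes (sf_id, field, new_value)
    VALUES (NEW.sf_id, 'product_access', NEW.product_access);
END;
-- Made-up record standing in for a synced Salesforce contact.
INSERT INTO salesforce_contacts VALUES ('003XX0001', 'ruben@example.com', 'disabled');
""")


def set_access(sf_id: str, access: str) -> None:
    """What the internal portal does: a plain SQL write, no API call."""
    with conn:
        conn.execute(
            "UPDATE salesforce_contacts SET product_access = ? WHERE sf_id = ?",
            (access, sf_id),
        )


set_access("003XX0001", "enabled")
pending = conn.execute(
    "SELECT sf_id, field, new_value FROM pending_changes ORDER BY seq"
).fetchall()
```

The portal code never touches pagination, tokens, or rate limits; those concerns live entirely in whatever drains the change log.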
So I mean, like this is extensible to technically some reverse ETL use cases where you actually
have both read and write operations. But technically, imagine the evolution
of software.
In the beginning, when you built software, like in the very early
age, the first days of software, a client of a software would tell you, hey, please give me access to my underlying data. You would give them access to the
underlying database. But the problem is that the customer would mess up
the data, and it became completely unmanageable. So we invented the concept of
APIs. It's a structured firewall which actually enables only
certain types of operations, which can be validated before they actually access the database. And so StackSync, by doing
the two-way sync, actually replicates this database. I mean, it uses an API to replicate
the database. So it's like, okay, we give you back access to the database,
which is the underlying system, and this also breaks vendor lock-in. So instead of
like before you were mentioning, like the Lightning app, right? So John, now instead of building
the Lightning app, you know, which is crazy,
and which you are forced to do, because actually
the data is in Salesforce, so you have to use
whatever can access the Salesforce database.
And then you're more locked in.
Yeah, and now you're even more locked in.
You know, and so what happens is that now basically
because you have the same data, but actually
you have it outside of your Salesforce ecosystem or your NetSuite ecosystem or your server
ecosystem,
you can build exactly like you would build a Lightning app, you know, a customized
thing, but with your stack, which you're familiar with, with the database you're familiar with.
And it's not locked in because actually you're the owner of your own data,
which is within your own premises, right?
And so actually in the end you end up with the exact same product, which is
like a custom app on top of your data, but you are just outside of this lock-in ecosystem,
and two-way sync just removes the API layer, which actually was very useful for many years but
now creates complexity as data scales.
I have a question about the schema.
It sounds like it works like magic, but I just need to verify that, right?
I have a Postgres database that backs my app, and I have Salesforce. And my guess is that listeners who have had
to build custom integrations to do some of the type
of stuff that we're talking about,
or have been exposed to the nightmare of,
we have this Frankenstein Salesforce application
and we need to get data into it or whatever.
The Postgres instance that's backing my app
has a data hierarchy that doesn't necessarily
represent the same information,
the same information hierarchy that's in Salesforce, right?
So one good example would be,
I have the concept of users in my app,
but in Salesforce I would have the concept
of also an account because the sales team rolls that up
and looks at it, like would roll multiple users up, right?
But my app doesn't really have that concept necessarily
in the same way that it's reflected in Salesforce, right?
And so there's a difference in the way
that the data is actually structured.
And so you kind of mentioned,
you know, it's like, okay, you have the same schema,
but I'm sure a couple of listeners were like,
wait, how does that actually work?
Because the schemas are fundamentally different
in terms of the way that the information is represented.
And either there's a subset in one or a subset in the other
and the actual relationship between the data is different.
Yeah, so Eric, I think here you are absolutely right.
So actually here you are mentioning that
because the apps are different,
so your in-house grown app
that you want to integrate with your CRM,
and the CRM itself, actually have different data models, right?
So they organize accounts and contacts differently
from how you would use workspaces and users in your app.
And also like the IDs that you use to associate them.
In Salesforce, you're gonna use Salesforce IDs
to associate your contacts and accounts,
but in your app, you would use your user ID,
workspace ID, and whatnot.
So these data models are fundamentally different, right?
So actually, how does that actually work?
So think about already having all of your Salesforce data
being able to be synced in read and write fashion
from your database, having already abstracted away
all the complexity of the API, of the authentication,
rate limiting, formatting, you know,
like also casting data, right?
Of course, if you write a data type, you know,
like a timestamp or a float, you know, in a database,
you know, it might not be the same as in Salesforce.
So actually you lose precision on decimals, you know,
timestamps, you know, have different time zones.
So actually all of this complexity,
you have to take care of.
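The casting problems mentioned here are easy to reproduce. The sketch below uses only the Python standard library; the helper names `to_utc_iso` and `to_currency` are made up for the example, and the two-decimal scale and half-up rounding are assumptions about the target field rather than anything Salesforce-specific.

```python
from datetime import datetime, timedelta, timezone
from decimal import ROUND_HALF_UP, Decimal


def to_utc_iso(ts: datetime) -> str:
    """Normalize a timezone-aware timestamp to UTC in ISO 8601 form."""
    if ts.tzinfo is None:
        raise ValueError("refusing to guess the timezone of a naive timestamp")
    return ts.astimezone(timezone.utc).isoformat()


def to_currency(value: str, places: int = 2) -> Decimal:
    """Parse an amount as an exact decimal and round it to a fixed scale."""
    quantum = Decimal(10) ** -places  # Decimal("0.01") for two places
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


# Binary floats cannot represent most decimal fractions exactly.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

# The same instant written with a +02:00 offset, then normalized to UTC.
local_time = datetime(2025, 5, 14, 9, 30, tzinfo=timezone(timedelta(hours=2)))
normalized = to_utc_iso(local_time)

amount = to_currency("19.995")
```

Doing this normalization once, in the sync layer, is what lets both sides agree on what "the same value" means.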
And so already let's assume that, you know,
now you just have
your data, your Salesforce data, in your database.
So you would have, let's say, different tables,
so your users table
would not be the one to sync with your Salesforce.
Actually you would have a Salesforce users table
and an app users table.
And the reason why is because also like,
not only the schema is different,
but also the semantic content is different.
A user in the app is not necessarily the same as a contact in the CRM,
which might just be a prospect, you know? So also, otherwise it means, you know,
every time someone texts you on email,
it's obviously updating to your CRM and now you create an account for this
person and they receive a welcome email, you know, like, yeah, it's very strange.
So, you know, like a magic link to log in, right?
It's kind of aggressive.
So what you would do actually,
you would actually synchronize the same,
but actually you would be able to just,
from your app, just write this new user
into this users table of your app,
and this new user would just end up
in your contact table in Salesforce.
So actually you would have one transaction with two tables that you write,
and this will actually synchronize to both places.
So actually you can manage this difference.
So there is still a small difference, right?
But also it's because like the semantic content
of both is not the same.
And also maybe like, so when you want to create,
you know, someone signs up,
it's likely someone that already texted
your customer success team or sales team, et cetera.
So the contact already exists
in your CRM. So you don't want to only insert, but you want to upsert in reality, and also update
some fields. For example, let's say you want to maybe auto-complete, I mean,
autofill your customer profile in your app from the information you have in the CRM,
and vice versa, update the CRM data. So this consolidation actually is something you can actually build much easier in Postgres
with SQL queries, right?
Than you would with a SQL query
and an API call,
which actually are not atomic by definition, right?
So one can fail and the other one not.
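The "one transaction, two tables" idea with an upsert can be sketched as follows. SQLite stands in for Postgres so the example is self-contained (its `ON CONFLICT ... DO UPDATE` clause needs SQLite 3.24 or newer and mirrors the Postgres syntax); the table and column names are assumptions, and keying contacts by email is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_users (
    user_id INTEGER PRIMARY KEY,
    email TEXT UNIQUE NOT NULL
);
CREATE TABLE salesforce_contacts (
    email TEXT PRIMARY KEY,
    status TEXT NOT NULL,
    company TEXT
);
-- A prospect who already talked to sales before signing up.
INSERT INTO salesforce_contacts VALUES ('ada@example.com', 'prospect', 'Acme');
""")


def sign_up(email, status="customer"):
    """Write both tables in one transaction: both succeed or neither does."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("INSERT INTO app_users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO salesforce_contacts (email, status) VALUES (?, ?) "
            "ON CONFLICT(email) DO UPDATE SET status = excluded.status",
            (email, status),
        )


sign_up("ada@example.com")  # existing contact is updated, not duplicated
sign_up("bob@example.com")  # new contact is inserted

# Force the second write to fail: the first write must be rolled back too.
try:
    sign_up("carol@example.com", status=None)  # violates NOT NULL on status
except sqlite3.IntegrityError:
    pass

contacts = conn.execute(
    "SELECT email, status, company FROM salesforce_contacts ORDER BY email"
).fetchall()
app_emails = [row[0] for row in conn.execute("SELECT email FROM app_users ORDER BY email")]
```

With a SQL write followed by a separate API call there is no equivalent of that rollback: the user row would exist while the contact update was lost.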
And this is where the CMO runs to you and says,
hey, the metrics are not correct,
and I didn't get my bonus.
Can you please check your cash flow?
Oh, actually, we had a bug and like 50 customers
actually didn't backfill until you share it.
We got throttled on.
Yeah, Salesforce throttled us.
Yeah, yeah, Salesforce throttled us,
and like, sorry for your Bahamas holidays.
You know, like, I will run back, you know, copy paste manually.
You know, that's a list.
Thank you so much.
Now this cannot work in a reliable setup.
And this is why, you know, these mission-critical use cases
cannot just rely on having like API calls,
which are not atomic with the database queries and, you know,
like which actually have like different realities.
And so, and also like, imagine
also the token is revoked, right?
You know, the person who authorized Salesforce,
you know, got fired.
Like, so now the user is canceled.
So now actually what do you do?
You know, your app just loses access
to your Salesforce systems.
What do you do with the leads?
You need to have a track cache, you know?
Which actually saves all of these records
and becomes like a huge mess.
I usually send to engineering to make it really robust.
This is hundreds of thousands of dollars of consulting or engineering time. And so adding into a database is much better
because now you can just work in a database. If your CRM is out of sync, you know, it's
disconnected, StackSync will just tell you, hey, we can't connect
to your Salesforce anymore, please re-authorize. And once you re-authorize, we're going to
catch up everything which happened in between. From your app perspective, the reality of Salesforce,
which is what is supposed to be in your database,
is still up to date with the app, right?
It's still like atomic and sub-millisecond, you know,
like in sync.
And so this is exactly what two-way sync enables.
And so, if we see it like, you know,
all this, it transforms both the operational world and also like the ETL plus reverse ETL world, because
actually we understand that at real-time pace, you know, the way you
handle conflicts between apps is actually very different.
So with all these errors and all these problems, this is why ETL plus
reverse ETL does not equal two-way sync. But two-way sync, you know, because it has both directions
plus a conflict resolution mechanism
and a consistency mechanism,
this is two-way sync as one would intuitively understand it.
Totally, it makes a lot of sense.
We're gonna take a quick break from the episode
to talk about our sponsor, RudderStack.
Now, I could say a bunch of nice things
as if I found a fancy new tool,
but John has been implementing RudderStack for over half a decade.
And if you've ever seen a tag manager, you know how messy it can get.
So RudderStack has really been one of my team's
secret weapons.
We can collect and standardize data from anywhere,
web, mobile, even server side,
and then send it to our downstream tools.
Now, rumor has it that you have implemented
the longest running production instance of RudderStack
at six years and going.
Yes, I can confirm that. And one of the reasons we picked RudderStack...
... of your stack. ... to learn more.
Right, so marketing or whatever.
100%.
What's interesting is that I'm essentially maintaining a database in the middle here.
And so I can sort of almost this concept
of like create real-time materialized views
that apply accurately to each system
but based off of like a single database
where all of the data lives and I can match the schemas.
Yes, that's totally correct.
And for example, I see a lot of companies
which use Salesforce for sales,
HubSpot for marketing,
and NetSuite for accounting,
but the customer record is in all of them.
And also, there's their app database
to power their products, whatever,
the logistics system or whatever.
So imagine, as we discussed before,
okay, you have a SaaS app, right?
So we have a users table from the SaaS app
and a contacts table in Salesforce.
Now you add NetSuite, HubSpot,
and all your other systems,
and you have the exact same logic.
So the SQL query just gets a little bit longer,
but you didn't have to learn about the APIs of each system.
You just make the same generic abstraction
over all the tables that you are syncing,
and this is exactly what actually happens,
and what we're going to make happen.
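As a rough sketch of "the SQL query just gets a little bit longer", the Python snippet below uses an in-memory SQLite database to stand in for the central database. The table names (`app_users`, `salesforce_contacts`, `hubspot_contacts`), their columns, and the choice of email as the join key are assumptions for illustration, not a schema described in the episode.

```python
import sqlite3

# In-memory database standing in for the single database "in the middle".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_users (email TEXT PRIMARY KEY, plan TEXT);
CREATE TABLE salesforce_contacts (email TEXT PRIMARY KEY, owner TEXT);
CREATE TABLE hubspot_contacts (email TEXT PRIMARY KEY, lifecycle_stage TEXT);
INSERT INTO app_users VALUES ('ana@example.com', 'pro'), ('ben@example.com', 'free');
INSERT INTO salesforce_contacts VALUES ('ana@example.com', 'Dana');
INSERT INTO hubspot_contacts VALUES ('ana@example.com', 'customer'), ('ben@example.com', 'lead');
""")

# Adding another system is one more LEFT JOIN, not another API client.
query = """
SELECT u.email, u.plan, s.owner, h.lifecycle_stage
FROM app_users AS u
LEFT JOIN salesforce_contacts AS s ON s.email = u.email
LEFT JOIN hubspot_contacts AS h ON h.email = u.email
ORDER BY u.email
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each synced system shows up as just another table, so a customer missing from one system appears as a NULL column rather than as a failed API call.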
So I had a customer last week who told me,
"I'm very grateful to you guys for actually reading
the documentation which I don't want to read."
Right?
So he was really, you know, "I'm very thankful for this."
And I'm like, yeah, thank you so much, you know.
Actually, the guy never opened the HubSpot documentation
or the system documentation;
in this case, actually, it was both.
And he was able to get
a production app, really, within two weeks,
without even touching the documentation
or making a single API call.
This guy has no idea how the rotation
of tokens works, which is actually different for most of the systems,
because he didn't even have to do it.
That's the magic and this is the speed you want to give to your teams.
This is how your teams can go faster by having the right tooling.
So final question, why did it take so long to get here?
Why do you think we went the ETL route,
the reverse ETL route?
Like, if you sat down, I think,
with somebody with no context,
this is probably the most intuitive solution,
but that was not how things evolved,
at least in the analytics space,
with reverse ETL vendors separate from ETL vendors,
separate from... so why do you think the progression happened as it has?
Yeah, exactly.
So I think this is a great question, right?
It's like, no...
Why not have some...
Ruben is asking the same question.
Yeah, I want to.
It was great.
But actually, it's essentially...
I mean, my suggestion on this is that,
first of all, the market evolved in a certain manner, right?
And so we can't really, you know,
forecast, and actually it kind of follows
what the market says.
So, it leads to, so, Stacksync is a natural evolution
of the ETL, reverse ETL and two way sync industry, right?
It's like the data industry just evolves like this.
And so, I can ask the same question and say,
hey, why did we use IBM DB2 and Oracle databases
when we could actually just use Postgres, right?
And maybe something else,
even like Snowflake for everything, why not?
It's just because the technology
was not there yet at the time.
And then, I mean, if we think about
maybe seven years ago, streaming use cases
were only for these elite companies
which were able to achieve this.
You know, even Flink, or something like this, was not even existing,
right?
And now some people build white label on Flink, you know, white label on any kind of Kafka
use cases.
So the underlying technology just became different, right?
So Snowflake or even BigQuery, right?
I remember when I was working on Snowflake
a few years ago, right,
seven years ago, six years ago, actually.
So the concurrency limit was only seven concurrent queries,
right?
How do you want to do real time
when you have only seven concurrent queries, right?
So maybe three are taken by the business team
to refresh all of these dashboards,
and you only have three for all of your pipelines.
So how do you want to do real time on this?
It's just not possible.
But now, you know, it's much more.
And Snowflake had no CDC,
but now they have Snowflake streams.
So you can capture changes much faster,
and before, actually,
I don't even know how you would have done it.
So this is how
the underlying technology just evolved
and made this use case possible.
And you know, it's not that the people before
didn't do the job,
it's just because they didn't have the tooling.
And, to be very honest, right,
two-way sync is a much more complex technology than ETL or reverse ETL would be.
And the market is already so juicy for these technologies, which are simpler.
Why would you engage in more complexity if
the market is already juicy, right?
No, it's just.
Compared to that, like just writing a Python script
is way easier.
Right?
I mean, for an individual use case,
like I need to get data from here.
Yeah, limited use case.
Into Salesforce, right?
Right.
You're just gonna write, like,
just write some Python and, you know, whatever, right?
You're not gonna actually introduce a database system
into that equation,
because it does add a lot of complexity.
But as a managed service, it's like, well, yeah, I mean, this makes way more sense.
100%. This is exactly correct.
...also what you're doing in Postgres, because when you think about scale, that's pretty interesting from a tech standpoint.
Come back on and let's get really nerdy and dig into how you're actually...
Yeah, absolutely. How to scale Postgres to hundreds of millions of
records and real-time, sub-second sync, you know, on this.
It's really impressive. So yeah, just for everybody,
you know, Stacksync powers real-time and two-way sync between any CRM or ERP and databases.
So Stacksync supports Salesforce, HubSpot,
which we mentioned, NetSuite as well,
but we also support any other CRM
and any kind of database, right?
So from Oracle DB to MySQL, Postgres, MongoDB,
Snowflake, BigQuery, you know, all the,
I already mentioned MongoDB,
but you can have any database, and this generic pattern is something you can apply
to any system.
And so yeah, more than happy to get in touch
if you guys have any questions.
And so, if we have this discussion about
how to scale Postgres, this is another chapter
to take the technology much further.
Yeah, yeah, yeah.
Awesome. Well,
before we wrap up, if folks want to learn more about Stacksync
or connect with you, Ruben,
where can they find the company, where can they find you?
Absolutely. So people can find us on Stacksync.com or just email me at Ruben at Stacksync.com,
and I'm super happy. I read all of my emails, so you will get an answer from me personally.
Awesome. And that's R-U-B-E-N at Stacksync.com.
Exactly.
Awesome.
Cool.
Ruben, thanks so much for joining us.
That's a wrap on episode two live from Data Council, but we will have another episode dropping,
so tune back in and we will see you on the next one.
The Data Stack Show is brought to you by RudderStack, the warehouse native customer data
platform. RudderStack is purpose-built
to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.