The Data Stack Show - 03: Turning All Data at Grofers into Live Event Streams
Episode Date: August 27, 2020

In this week's episode of The Data Stack Show, Kostas Pardalis connects with Satyam Krishna, a data engineer at Grofers, India's largest low-price online supermarket. Grofers boasts a network of more than 5,000 partner stores, a user base with three million iOS and Android app downloads, and an efficient supply chain that allows it to deliver more than 25 million products to customers every month. Satyam offers insights into how he helped build the data engineering function at Grofers, how they developed a robust data stack, how they're turning production databases into live event streams using Change Data Capture, how Grofers' internal customers consume data, and how the company made adjustments due to the pandemic.

Topics of discussion included:
- Satyam moving from a developer to a data engineer (2:43)
- Describing Grofers' data stack and data lake (6:41)
- Who is consuming data inside the company and what are some of their common uses specific to Grofers? (12:03)
- What are the biggest issues day-to-day as a data engineer? (18:21)
- COVID's impact on business practices and the data stack (21:28)
- The big problem of data discoverability and metadata cataloging (27:44)
- Completely changing architecture to something that can scale up (33:16)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Welcome back to the Data Stack Show.
Today we're going to talk with Satyam, who's a data engineer at Grofers.
Grofers is a huge grocery delivery service in India.
And we had a really interesting conversation about a lot of the challenges that they faced with the pandemic
and the changes that were happening every day and how they had to adapt to that.
It'll be interesting to hear about how they had built out a robust data stack and were able to
manage through that. I think one of the other interesting things that we'll dig into is how
they're turning some of their production databases into live event streams using change data capture,
which is just a really neat component of their stack.
What other things are you excited to discuss
from a technical standpoint, Kostas?
Actually, it was very interesting to hear from him
like the different open source components
that they are using.
I mean, he touches on a couple of different tools
that they use internally, like for example, Debezium to recreate streams out of their production databases
and a couple of other tools to manage, especially to manage their data lake and their data
warehouses. That's one thing that I believe is super interesting. Another very important and interesting concept that we touched on
was what to do with the data after it is delivered.
I mean, there are many issues in every organization,
and I think that Satyam did a very good job
describing these challenges that they have
with the internal customers, like all the product managers,
the engineers, the analysts, the marketing teams or the sales teams that need to consume
this data.
And I think he's really sharing some excellent insights around that: what the challenges are
and what kind of solutions they built internally. And he also touches on some aspects of data governance and data quality
and how this can be addressed inside the organization
and why they are so important.
Great. Well, let's dive right in and talk to Satyam
about his role at Grofers and dig into their stack.
Hi, Satyam. It's very nice to have you here with us today, participating in this episode
of The Data Stack Show. Can you give us a quick intro about yourself, your background, and
the company, Grofers, that you work at? Hey, Kostas. Thanks for having me here.
I'm Satyam, and I lead the consumer data engineering team at Grofers.
I've been working at Grofers for a very long time now.
So it's almost been six years.
And I was actually the third engineer here.
And I started working as a mobile engineer back then.
I bootstrapped the mobile applications for some time and decided I wanted to do something else.
And so I shifted to data engineering two years back and have been working on data problems since then.
About Grofers: Grofers is a low-price online supermarket.
I would say we are basically an e-commerce platform that's making quality products affordable
for Indian customers.
And we have our in-house tech, which manages everything right from our partner stores, enabling the company to run fast, and we try to have a lean supply chain.
And ultimately, the goal is to have an efficient supply chain
to deliver millions of quality products
to the consumers every month.
Sounds great.
I find it very fascinating that you started as a developer
and then you decided to move into data engineering.
How do you feel about this, by the way?
I mean, are you happy with this choice?
Also, can you share a little bit more about what made you make this choice,
going from being a developer
to specializing in anything that has to do with data
and becoming a data engineer?
So it kind of started from the fact
that I wanted to look at my product
from all the different angles.
And I had spent a good enough time
in building that consumer application,
but I wanted to see how the users interact with it,
what's the data around it.
And that kind of always excited me to look at
how we are getting the conversions,
how the different metrics are getting tracked.
So that was always something that was there,
it was happening, and I was excited about it.
And when the opportunity came that,
yes, I can shift from the engineering team
and start learning about data,
that's how I kind of got into that team.
Yeah, that's a great point, actually.
I think one of the characteristics of working with data
and becoming a data engineer is that you have the opportunity
to pretty much have like a 360 view of what is happening in the company.
I mean, you have to work with the product,
you have to work with the rest of the business lines
inside the company.
So I think this is quite interesting.
What do you think?
For sure, for sure.
And the biggest change for me was that
your stakeholders go from the users of your application
to the internal users of the organization.
That was one of the biggest shifts that I actually saw.
So when you're a mobile engineer, you're working towards making your application better,
you are impacting the users outside your organization.
That's a completely different challenge.
But once you start building internal tools, you're building for the stakeholders
and you also get the feedback much faster.
Because they will come to you and say that, on the data platform,
these are the issues that need to be fixed.
Whereas the feedback loop in a typical consumer life cycle is longer, right?
So there were completely different challenges in terms of, you know,
how you manage a mobile application versus how you manage an internal data product.
Absolutely. Absolutely. I totally agree with that.
So moving forward, can you
share some more information
about the current data stack that
you have in Growforce
and what's the infrastructure
and the applications and
technologies that you are using to
manage this data stack?
Sure.
Primarily, we are based on
AWS services
and that is why our primary warehouse is Redshift.
So all our data ultimately resides in Redshift
and ingestion in different tools
happens basically through some of our batch jobs
and some of our streaming jobs.
So these are separate jobs which are responsible
for different kinds of data needs.
We primarily use Airflow as our orchestration layer,
which helps us in managing these jobs.
And we are also working on a Hudi Lake,
which is essentially going to become our data lake
for all our source data.
We also have some of our Spark jobs.
So we also have our Spark cluster running,
which is used to process all the different event data
and a lot of our ML, AI workflows.
So these are primarily our ingestion
and warehouse components.
And over the top,
our people basically query these data marts
on Redshift, using primarily an open source tool called Redash.
And we also have a self-serve analytics use case
where we use Tableau.
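(To make the batch orchestration piece concrete, here is a minimal sketch of the kind of daily Airflow job that could load a partition of data into Redshift. This is illustrative only, not Grofers' actual code; the table, bucket, IAM role, and connection details are invented.)

```python
from datetime import datetime, timedelta

import psycopg2
from airflow import DAG
from airflow.operators.python import PythonOperator


def load_events_to_redshift(ds, **_):
    """COPY one date partition of Parquet files from S3 into Redshift."""
    conn = psycopg2.connect(host="redshift.example.internal", port=5439,
                            dbname="analytics", user="etl", password="...")
    with conn, conn.cursor() as cur:
        cur.execute(f"""
            COPY events_raw
            FROM 's3://example-data-lake/events/event_date={ds}/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
            FORMAT AS PARQUET;
        """)


with DAG(
    dag_id="daily_event_ingestion",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",          # one run per day, after the day closes
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
    catchup=False,
) as dag:
    PythonOperator(task_id="load_events",
                   python_callable=load_events_to_redshift)
```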
Sounds great.
So that's a lot of information
around the current technologies that you are using.
What about the data sources?
I mean, where do you collect data from
and how do you work with them?
Okay.
So primarily because of the nature of our business,
it's a transactional business.
So mostly our source databases are Postgres and MySQL
depending on the service that's using it.
So we have a microservice architecture where you have different services like cart, orders,
last mile.
And because it's an e-commerce company, again, you have a lot of components.
You have a delivery app, you have a picker app, a shopper app.
So a lot of these services are independent and they have their own databases,
which are primarily Postgres and MySQL.
And what we essentially do is we capture the replication logs from these databases.
Like in the case of MySQL, it can be the binlog, or the WAL logs in Postgres.
And we use Debezium, which basically captures those CDC changes
and dumps them into a Kafka stream.
And from Kafka, we dump it into a raw location in S3
where we process it and convert it into a Hudi Lake.
And that's how our lake gets created.
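(As an illustration of the CDC setup described here: Debezium connectors are typically registered with the Kafka Connect REST API. A hedged sketch follows; hostnames, credentials, and table names are placeholders, not Grofers' configuration.)

```python
import json

import requests

# Kafka Connect's REST API accepts a connector definition as JSON.
connector = {
    "name": "orders-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",               # stream changes from the Postgres WAL
        "database.hostname": "orders-db.example.internal",
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "...",
        "database.dbname": "orders",
        "database.server.name": "orders",        # topics look like orders.public.orders
        "table.include.list": "public.orders,public.order_items",
    },
}

resp = requests.post(
    "http://kafka-connect.example.internal:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()  # each captured table now feeds a live Kafka topic
```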
Sounds good.
So this is how you're getting, I guess,
the data that are generated mainly on your production databases.
Are there any other data sources where you get data from?
I mean, cloud applications that you might also pull data from,
or event data that you might be capturing on your web properties or the application that you have?
I mean, the mobile application?
For sure, for sure.
So we use RudderStack to basically capture
event data for all our impression needs
on our consumer application.
And that's another part of the data that we capture.
We also have some vendors through which we collect data
and ultimately all of that flows into our lake and in Redshift.
So yes, we have other sources like event data.
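(For concreteness: RudderStack exposes an HTTP API for events, and its SDKs send payloads of roughly this shape. The event name, properties, and data plane URL below are hypothetical, not Grofers' actual tracking plan.)

```python
import requests

# One impression event, roughly the shape RudderStack's track API accepts.
event = {
    "anonymousId": "device-1234",
    "event": "product_impression",        # hypothetical event name
    "properties": {
        "product_id": "sku-42",
        "screen": "feed",
        "position": 7,
    },
}

# The data plane URL is deployment-specific; the write key authenticates the source.
resp = requests.post(
    "https://hosted.rudderlabs.com/v1/track",
    json=event,
    auth=("<WRITE_KEY>", ""),             # HTTP basic auth, write key as username
)
resp.raise_for_status()
```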
Sounds good.
So between the two main data sources that you mentioned,
all the data that are coming from your production databases
and the event data,
which source actually generates most of the data
that you have to work with,
or are they on par in terms of volume?
No, definitely. The volume of our event data is much, much higher. So we capture around five to
six billion events every month. Whereas if I talk about our transactional data, we must be generating
some terabytes of data every month. So definitely the scale of the event data is
on a completely different level. And we definitely have different workflows to manage our event data
compared to our normal transactional data.
So as I said, for the transactional data, we are using CDC.
And for the event data, because most of the vendors,
what they do is they dump your data into your S3 in a raw format.
And then we have Spark jobs running,
which get them converted into Parquet compressed files,
partitioned by date, so that they can be accessed through Redshift Spectrum. And we don't even keep
it in Redshift because of the size of the data. So yes, definitely there is a very big gap in terms
of the nature of the size and the volume of the data.
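(A minimal PySpark sketch of the pattern just described: read a raw event dump from S3 and rewrite it as date-partitioned, compressed Parquet that Redshift Spectrum can query in place. Paths and column names are invented for illustration.)

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-to-parquet").getOrCreate()

# Raw vendor dump; assume each record carries an ISO-8601 "timestamp" field.
raw = spark.read.json("s3://example-raw-events/2020-08-01/")

(raw
 .withColumn("event_date", F.to_date(F.col("timestamp")))
 .repartition("event_date")              # group rows by date before writing
 .write
 .mode("append")
 .partitionBy("event_date")              # s3://.../events/event_date=2020-08-01/...
 .option("compression", "snappy")
 .parquet("s3://example-data-lake/events/"))

# The prefix is then registered as an external table, so Redshift Spectrum can
# query it without ever loading the events into Redshift itself.
```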
Yeah, yeah. That's an impressive amount of data that you have to work with. That's very interesting.
So, okay, we talked so far about the current data stack that you have and the different data sources and the volume of the data that you have to work with.
So my next question is, what are the, I mean, who is the consumer of this data inside the company?
I mean, you mentioned at the beginning about the difference of being a data engineer
and who is the stakeholder and who is actually your customer as a data engineer
compared to building a product, which I assume when we are talking about this,
the customers are actually the people inside the company
who need this data to drive their work.
And so who are these people?
And also, if you can share with us some common use cases that you see inside Grofers
that are outside the common use case of reporting and BI analytics,
which is the most common use case and the first use case
that every company tries to build?
So definitely.
So Grofers is a data-driven company
and each of the different aspects of the business
are basically grouped together
into these different teams.
And I said, like, for example,
you have a category team
which manages all the different products
which are going to be visible,
how much the margin is on them, and everything, right?
So each team has its own purpose
and each team will have its own data analysts
who help them reach their target metrics,
and who help them show where they stand right now.
So I would say like,
if we have 20 teams inside Grofers
which help manage the different aspects of it,
then each team becomes my data stakeholder
and they use data in some form or another
to get the information they need.
As you said, like your typical reporting cases
where they want to track their L0 metrics
and they use Redash and they use Tableau
to create queries on top of the existing data
to get those metrics.
So these are definitely the users of our data.
And so we have a decentralized team.
So we have a common data engineering team
which manages the data product and the data platforms.
But in terms of data analysis, these are decentralized teams across all the functions.
And in terms of use cases, I would say that for event data, there are a lot of different
use cases that people use.
For example, when they have to test a rolling out a feature, they want to see how the users
are using it. They are running an A-B test,, they want to see how the users are using it.
They are running an A-B test,
how they want to see the conversions happening on it.
So all of that happens through the event data
that is basically added to these different features.
And there are other use cases as well.
We do a lot of recommendations
based on the existing data that we get
and the associations of the data,
right? So, for example, our homepage is what we call the feed.
So once you visit our feed, the view that you get is a very personalized view,
and that's basically created based on your past buying and what other people buy, right? So that kind of association helps us
create those recommendation engines
which help power the different feed APIs
and the sorting APIs.
And I can even talk about
one of the other big use cases for us
is merchandising a product, right?
So we have customers who are coming to our platform
from all different places, and the different companies out there want to basically boost their product or sell their product on our website or on our application. So we give them a page where they can showcase their products.
And that's what we call merchandising.
We sell it as a product to the different brands out there where they can utilize it and make
it like a sponsored kind of product.
So these are some of the use cases that come to my mind.
So recommendation, feature testing, merchandising.
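(A toy illustration of the item-association idea behind such recommendations: count how often products are bought together, then recommend the top co-purchased items. Real recommendation engines are far more involved than this sketch, and the baskets below are made up.)

```python
from collections import Counter, defaultdict
from itertools import combinations

# A few toy baskets standing in for historical orders.
orders = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
]

# Count how often each pair of products appears in the same basket.
co_counts = defaultdict(Counter)
for basket in orders:
    for a, b in combinations(sorted(basket), 2):
        co_counts[a][b] += 1
        co_counts[b][a] += 1

def recommend(product, k=3):
    """Products most often bought alongside `product`."""
    return [p for p, _ in co_counts[product].most_common(k)]

print(recommend("milk"))  # ['bread', 'eggs']
```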
Yeah, it sounds great. I mean, from what I hear from you, data is pretty much used in everything inside the
company.
So it's not just about reporting and business intelligence where you're trying to figure
out what happened with your business in the past and how you can make choices for the
future.
But the whole product actually, I mean, not the whole product, but a big part of the product is also based on using this data
to create recommendations or promote some of the products
that you sell through Grofers.
That's pretty amazing and very, very interesting, I would say.
All right, so moving forward, I mean, we talked so far
about the technologies
and the data stack that you have and the volume of the data
that you have to deal with on a daily basis.
So how many people are currently supporting the infrastructure
that you have and what is the structure of the team
that is supporting data engineering inside Grofers?
Sure.
So as I said, we have a centralized data engineering team
and our entire company or the entire organization
is basically broken into two aspects.
The biggest aspect is the supply chain,
which actually delivers the product.
And one aspect is your consumer team,
which I am a part of, which basically, you know, manages how the consumers are
placing an order, how they are coming onto the application, the acquisition
and retention part of it, right? So we have these two big chunks, which we call the
consumer and the supply team, and we have a data engineering team which sits across both,
right? So we currently have five data engineers who manage all these different products,
including, as I said, managing the PaaS services
and managing our open source systems, Spark clusters, everything.
And we have one product guy who helps us create these products,
figure out what is the next most useful thing.
And in terms of data warehousing
or in terms of data analysts,
as I said, we have two people in our team,
but most of the data analysts
are kind of decentralized into their own teams.
Sounds good.
So what is the most common issue
that you have to deal with as a data engineer right now?
I mean, what's the biggest pain, or,
let's say, what would you like to solve
in terms of your day-to-day work as a data engineer?
Yeah, so as I said,
having a decentralized team kind of has a downside
too, right?
So it becomes very difficult to manage your standard L0 metrics, and there's repetition
of metrics, there's repetition of dashboards, because it is really difficult to get that
communication happening across teams all the time.
So you see a lot of these things happening.
Plus, it's also very difficult to enforce best practices because those people, they
don't work with you day in, day out.
And they have all come from different backgrounds.
And it's not necessarily the case that everyone has worked with Redshift, and Redshift has its own nuances
around how you want to write queries.
And it's very different from other warehouses where they might have already been
using some of their other practices.
So we have to make them understand that Redshift is very different from whatever they have been
using in the past, and we create platforms or we create abstractions on top of it
to ensure that they don't have to know about Redshift.
I think that is one of the most challenging things for us right now.
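(One concrete example of the Redshift nuance being described: unlike a generic Postgres-style row store, Redshift performance depends on distribution and sort keys chosen at table creation. The table and connection details below are hypothetical.)

```python
import psycopg2

# Distribution and sort keys are chosen up front and shape query performance.
DDL = """
CREATE TABLE orders (
    order_id     BIGINT,
    customer_id  BIGINT,
    order_date   DATE,
    total_amount DECIMAL(12, 2)
)
DISTKEY (customer_id)   -- co-locate a customer's rows on one slice for joins
SORTKEY (order_date);   -- keeps date-range scans cheap
"""

with psycopg2.connect(host="redshift.example.internal", port=5439,
                      dbname="analytics", user="etl", password="...") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```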
Very interesting.
So from what I hear from you,
one thing is actually dealing with Redshift
and how Redshift can be used.
I mean, the data warehouse doesn't necessarily have to be Redshift.
Redshift is what we are using right now.
But from what I understand,
one thing is how you can have
all these different people interact
and follow the best practices
interacting with the data warehouse.
That's one thing.
And then there are a lot of,
let's say, issues that arise because of the governance and the communication between the teams, as a result of having a decentralized way of doing analytics and working with data inside the organization.
That's great. Okay, one of the things that's, I mean, a very, very common question lately is about COVID and how it is actually affecting businesses.
And I assume that's true also in your case, because you are a B2C company, and you provide something quite sensitive, an essential service, which has to do with buying products.
And mainly, what kind of impact it had on your work, right?
Like on the data infrastructure that you're managing,
the models that you're building,
and your everyday work with data?
Definitely.
I think, as you mentioned, COVID has definitely impacted us, like all businesses.
And it is an essential service.
So we were seeing one of our biggest organic growth spikes when COVID started.
But as you know, the ground operations were impacted massively.
There were lockdowns happening everywhere and it was getting very difficult to, you know, serve our customers.
So I would say that the initial days were very tough.
And most of the time that we actually spent initially was on changing our business model. Obviously, the first thing that came up was: how do we ensure the security
and safety of our customers? How do we reduce the touch points for a customer? And how are we able to
serve all the customers? We want to give those essential services to everyone so that people get their groceries safely at home.
And that kind of pushed us into thinking around how we can better batch orders,
how we can worry less about exact delivery timelines and worry more about how,
if someone is placing an order today and someone is placing an order tomorrow, but they, you know, go to the same place, we would ideally
want to club them so that, you know, you have fewer delivery routes and you're able to deliver
more orders.
And there were a lot of product decisions that were basically taken at that time.
Like one, we built a completely contactless delivery feature.
We added capabilities in our application
around how you can edit your order
multiple times, which was not even supported before.
So earlier, once you placed an order, it was kind
of finalized. But because we know that in a time of
pandemic, people forget about
things, they want to add something, they
realize, oh, I need to add that,
right? And if they place another order,
it probably might not get
clubbed, or it becomes difficult
for our operations to manage n number of orders. So yes, we did a lot of business changes around
how we can better serve our customers and ensure their safety.
So that's something that definitely happened. And one of the biggest changes for us, and at least for our team,
was that there were a lot of new needs around reporting.
Because once you change your business model,
or once you try to adapt to the situation,
you also want to track metrics at a much faster pace.
So that you know that whatever things you're changing,
and people were changing a lot of things in a given day, they can track that, okay, this change caused this, right? So there were a
lot of new real-time reporting needs that came with that situation. That was definitely one
of our bigger challenges, where we had to get better at our alerting systems and near real-time reporting.
Yeah, it's very interesting. So you mentioned a couple of challenges that you had with the business side of things.
Is there something that you had to change or something that you had to introduce on
your data stack because of the impact that COVID had, like maybe having a
larger volume of data that you had to operate with?
Or, as you mentioned earlier, you said that you had to very rapidly create a massive number of new reports.
So is there something that you've learned or something that you improved in terms of either your best practices
or your infrastructure as a result of having to deal with COVID?
So, excuse me.
So I think one of the challenges we faced
was more in terms of the communication internally in the team
rather than the infrastructure.
Like infrastructure, it was easy to scale
because we usually use managed services
and we have our cloud infrastructure hosted on EKS. So to be very honest, for a data engineering team, the issues are not
in that direction. But because this was the first time that we were working in a, you know, work-from-home
kind of scenario for a longer duration, right? So that was definitely one of the biggest
challenges for us. So I would say it was more from the communication side of things, where if you're managing a team, you have to ensure that the
health of the team is always good. And when you have been locked down for months, you are, you know,
cooped up inside a room. So you want to ensure that, you know, that energy gets released.
And, you know, we figured out games that we can play in our, you know, typical meetings,
when at the end of the day,
we are having some online games to release that energy.
So I think that was definitely a bigger challenge for us.
There were challenges around infrastructure, but mostly around how you have to scale up
services and not increase the cost also massively, right?
So if your business is impacted, you also need to figure out how to save at
other places, because you're losing more on your deliveries. So you can't just scale
it infinitely; you also have to figure out how you can scale it sustainably. So I think those
are the two bigger challenges for the data engineering team once the pandemic started.
Yeah, that's super interesting to hear
the organizational challenges
because of all these changes that happened
during the pandemic.
Moving forward and staying with challenges
for the next question,
let's discuss a little bit more about
the challenges that you are facing with the technologies that you are using.
I mean, previously you mentioned that there are some difficulties in aligning everyone to follow best practices
when they have to work with Redshift, Amazon Redshift.
Can you share a little bit more information around that?
What are the type of challenges that you are encountering
around the data warehouse that you have
and how you are dealing with these today?
As I said, we have created a platform, and we
have democratized it in such a way that anyone using the
data can go and create their own data marts,
right? And they can basically go ahead and do a lot of analysis
on their own. But that also brings in, as I said, a lot of repetition
of data, and at times people
don't know if they're looking at the right data
or not. So I would say data
discoverability right now is
a very big problem. So we have
around
2,000 to 3,000 tables in Redshift
and at times people
don't know whether a given table is a production
table or something that was created on an ad hoc basis.
So data discoverability is one of the things that we are working on right now.
And we are trying to solve it through, you know, your typical data cataloging:
figuring out how you can create a data catalog at Grofers,
which can basically, you know, give you a picture of, okay, this table,
what does this table mean? Who created it?
When was it last
updated? So a lot of meta
information about your
tables and
not just the fields and the
columns, but also about the data inside it
is going to help a lot of folks understand
what this table does.
So data discoverability is one of the things
that I feel is a challenge right now, and that's something that we are investing our time in and solving.
Another, I would say, is that how we are currently managing our real-time pipeline
is a big, big challenge. As I said, it was a very make-do solution to, you know, support
our business requirements at that time, but we know for sure that it's not going to scale up very well.
So what we are doing is, because we are getting those CDC changes from our databases,
and what we want is that people are able to join data across databases in real time.
So what we are doing is we are dumping these different CDC streams into one single database
and then allowing people to query from
that and we keep on pruning the past
data. So that's something that we have built up as
a make-do real-time pipeline.
So you have data changes
coming in, but it's getting more and more
difficult to manage, primarily because
we are dumping it into a Postgres-based kind of
database, and
obviously your
OLAP queries are not going to work on an OLTP system.
So that's something that
we definitely want to
change and we want to move to
Kafka Streams or we want
to move to some different system
which can definitely scale up with our needs.
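(A rough sketch of the "make-do" pipeline just described: consume Debezium CDC messages from Kafka, upsert them into one shared Postgres database so people can join across services, and prune old rows. Topic, table, and column names are illustrative, and the Debezium envelope shape depends on converter settings.)

```python
import json

import psycopg2
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders.public.orders",
    bootstrap_servers="kafka.example.internal:9092",
    value_deserializer=lambda v: json.loads(v) if v else None,
)

conn = psycopg2.connect(host="realtime-db.example.internal",
                        dbname="realtime", user="etl", password="...")

for msg in consumer:
    if msg.value is None:                 # tombstone record, skip
        continue
    # Debezium envelope; whether it is wrapped in "payload" depends on converter config.
    change = msg.value.get("payload", msg.value)
    row = change.get("after")
    if row is None:                       # delete event; a real pipeline handles these too
        continue
    with conn, conn.cursor() as cur:
        cur.execute("""
            INSERT INTO orders (order_id, status, updated_at)
            VALUES (%(order_id)s, %(status)s, %(updated_at)s)
            ON CONFLICT (order_id) DO UPDATE
            SET status = EXCLUDED.status, updated_at = EXCLUDED.updated_at;
        """, row)
        # Pruning would normally run on a schedule, not per message.
        cur.execute("DELETE FROM orders "
                    "WHERE updated_at < now() - interval '7 days'")
```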
Very interesting.
One more question about the first challenge
that you mentioned
that you are trying to solve using data cataloging.
You mentioned that apart from cataloging
the standard metadata around the table,
you want to add some more fields there
and some more metadata that can help you understand
even better what kind of data you're dealing with.
Can you share some more information around that?
Yeah.
So basically, very basic things like even if you get some of the common values that
are present in a column, or how much of the data is null, or what are the maximum or the
minimum values, what are the different values. So
all of these, when you start seeing that metadata for a table, you get an overview-level
understanding of, okay, what are the different values that are possible in the system? And it really
helps. Like if you see that a particular column is null 90% of the time, you either know that,
okay, something is not right here.
Right.
And you can, you know, reach out to the data team and say, this table is not getting populated
properly.
Or even for us, we can start setting up alerts on that.
Right.
And you have some verified tables and you have verified dashboards where you're saying
that this is the metric that you're always going to hit.
And if you're not hitting that, we get to know.
Right now, it's more of a,
you know, situation where we get alerted by different teams that the data is not coming in,
or the data is not properly populated, right? And the downstream services and the downstream
reports are impacted. So we also want to change that. So with data discoverability,
if you're building that system, which can get that metadata for you, you can also kind of set up alerts on top of it and you get alerted
internally rather than someone else reaching out to you.
So it brings in more confidence that the data that you're looking at is right.
And you can trust your metrics, right? Because if the data,
if you start losing confidence in that,
you can never trust your metrics, and then with the business decisions that you're
taking, it gets more and more difficult to, you know, be sure that what you're doing is actually making a positive change.
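(A minimal sketch of the column-level profiling Satyam describes: compute a column's null fraction and min/max so a catalog can display them and an alert can fire when, say, a column suddenly goes 90% null. Connection details, table names, and the threshold are placeholders.)

```python
import psycopg2

def profile_column(cur, table, column):
    """Null fraction and min/max for one column; feeds the catalog and alerts."""
    cur.execute(f"""
        SELECT
            COUNT(*)                                          AS total_rows,
            SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) AS null_rows,
            MIN({column}),
            MAX({column})
        FROM {table};
    """)
    total, nulls, min_v, max_v = cur.fetchone()
    null_pct = 100.0 * nulls / total if total else None
    return {"table": table, "column": column,
            "null_pct": null_pct, "min": min_v, "max": max_v}

with psycopg2.connect(host="redshift.example.internal", port=5439,
                      dbname="analytics", user="catalog", password="...") as conn:
    with conn.cursor() as cur:
        stats = profile_column(cur, "orders", "delivered_at")
        if stats["null_pct"] is not None and stats["null_pct"] > 90:
            # In practice this would page the data team instead of printing.
            print(f"ALERT: {stats['table']}.{stats['column']} is "
                  f"{stats['null_pct']:.0f}% null")
```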
Yeah, that's a great point, actually. And it's very interesting to see how
a metadata catalog can also help with quality around data, and how you can have mechanisms to
figure out if something is going wrong or understand if you have to fix
something around your data. That's very interesting, super interesting.
So moving to the last question of this very, very interesting interview, any interesting projects
that you would like to share with us? I mean, it can be something either internal that you're doing at Grofers or something personal that you do.
Anything that excites you, actually, that has, of course, like to do with data and data engineering.
So I can think of something. As I said in the beginning, we
have been working on creating this lake at Grofers, which is updated at a good frequency.
And I would say that at the start of 2020,
most of our tables were getting updated in Redshift
using a batch job, right?
And as I said, we massively rely on those kinds of batch jobs.
But as we are growing
and as we are seeing the scale of our data, we
know that that's not going to scale up so well. So we started moving to an architecture
where we have more of a Kappa architecture and how we are updating our data and how we
get that data into our system, we have completely changed it. And that's something that we are
working on. So we are using Debezium, which basically, you know,
converts these events from different databases,
each of which has its own format,
into a standard message,
which can then be used to, you know,
create your data lake.
So we are using Debezium combined with Kafka
to get the raw dumps in our S3
and then running our Spark
jobs to basically populate our Hudi lake. So Hudi basically
gives you upsert capabilities on Hadoop.
Uber made it a couple of years back, and it's now a part of the Apache ecosystem.
So we are making an Apache Hudi lake using that raw data
and then essentially populating
Redshift. So that's something that we have been
working for some time now
and I think it's really exciting in
how you can
create a very dynamic lake using
Hudi, where you're
able to upsert the data into the lake
and have those ACID kinds of transactions
happening over S3, which was
not possible earlier.
So we have
a much faster refresh of
how we are updating our lake compared to
our batch jobs.
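(A hedged sketch of the Hudi write path described here: upserting change rows into an Apache Hudi table on S3 from Spark. The record key, precombine field, schema, and paths are illustrative, and exact Hudi/Spark bundle versions matter in practice.)

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("cdc-to-hudi")
         # Hudi requires Kryo; the hudi-spark bundle jar must be on the classpath.
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .getOrCreate())

# Flattened CDC rows landed from Kafka; the schema here is assumed.
changes = spark.read.parquet("s3://example-raw-cdc/orders/")

(changes.write.format("hudi")
 .option("hoodie.table.name", "orders")
 .option("hoodie.datasource.write.recordkey.field", "order_id")
 .option("hoodie.datasource.write.precombine.field", "updated_at")  # latest change wins
 .option("hoodie.datasource.write.operation", "upsert")
 .mode("append")
 .save("s3://example-data-lake/hudi/orders/"))
```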
That's great.
I think we have many reasons to
get on another
interview in the future.
I mean, you're building some very interesting projects inside Grofers,
and it will be very interesting in the future to see how things went with these projects
and what new lessons you've learned from them.
So thank you so much, Satyam, for your time today.
It was a great pleasure chatting with you
and I hope to have the opportunity soon to chat again.
Sure, sure. Thanks, Kostas. Thanks for having me here.
It was a great conversation and I loved it.
So that was our interview with Satyam.
I really enjoyed it and I found it very interesting
with all the insights that he gave to us. I mean, it's always
amazing to hear someone
who has started
as a software engineer
and, starting from a small company, because he
started at Grofers when
they were just a couple of folks there
building the company, ended up
pretty much building and running
the data engineering function inside the company.
So I think it was very interesting to hear how this happened
and the experience that he gathered from there.
There are a couple of things that, as we said also in the introduction,
we touched on there.
But there are many, many other things
that Grofers is doing in terms of how they utilize and extract value out of their data
that we haven't touched. Things like, for example, how they do personalization using this data,
and also, in terms of the organization, how they involve other functions like marketing, for example, to actually drive this personalization, which is quite unique.
Because most of the time when we are talking about personalization, we just think of some algorithms doing it.
But here we have a very complex scenario where people from different departments are also involved. So, yeah, Eric, I think we will have the opportunity in the future
to talk again in a couple of months with Satyam
and learn more about what they are doing internally at Grofers
in terms of using their data
and what kind of interesting technologies they are building.
Yeah, I'm excited.
I think, you know, with my background and early career in marketing,
I would say that
the way that they're using their data at every point in the customer journey, as far as audiences go,
is pretty exciting. So that's going into personalization for users of the app, but they're
also packaging that to go out and do more sophisticated acquisition of new customers. So
we didn't get a chance to talk about that,
obviously, in this episode of the show,
but we're going to circle back up with the Grofers team
to learn about some more of those use cases
and hopefully get a couple of people
from those other teams like product and marketing
to join us as well.
So stay tuned and we'll let you know
when that's going to happen.