The Data Stack Show - 01: Discussing Mattermost Data Infrastructure with Alex Dovenmuehle
Episode Date: August 12, 2020

In this episode, Kostas Pardalis sits down with Alex Dovenmuehle, head of data engineering for Mattermost, an open-source self-hosted communication tool that optimizes dev workflows in highly secure environments. Kostas and Alex discuss:

Alex's background and experience (2:29)
Data stack Mattermost is using (9:25)
How Mattermost built their Data Stack (21:05)
Using data to understand the story of the customer's journey (24:58)
Focus on privacy and security (26:33)
Practical ways Mattermost is using data (37:14)
What's next for data analytics at Mattermost and wrap up (42:45)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Eric Dodds here on the Data Stack Show.
Today we're going to talk with
the head of data engineering for Mattermost.
Mattermost is a company based on open standards.
They create communications tools that optimize
dev workflows in highly secure environments.
Really interesting company. We align
with them pretty closely because we're also based on open standards here at
RudderStack. So really excited to learn more about how they're building their
data pipeline. In terms of technical specifics Kostas, what interested
you about the conversation we had with Alex? Yeah, the whole conversation with Alex was very interesting.
We went through the whole data stack that they have.
So we had the opportunity to learn about the different data sources
where they pull data from,
things like the different data types that they work with
and all the different technologies that they have to incorporate
to create a very up-to-date data stack.
We discussed the problems that they have,
and actually some very interesting points that we touched on were about data modeling
and the importance of data modeling and some technologies like LookML and dbt that they are incorporating in their data stack.
And also how important open source is and the different open source tools out there in order to build and maintain such a complex data stack as the one that they currently have.
So I'm pretty excited. I think everyone will hear some very interesting ideas around how to build a complete data
stack and what kind of issues you might encounter and also some possible
solutions to them. So hi Alex, thank you so much for your time today. It's a great
pleasure to have you here.
And yeah, we are here today to discuss about Mattermost, you,
and learn more about what you are doing with analytics and your overall experience, of course,
and especially what you are doing at Mattermost.
So, would you like to start with giving like a brief background,
introduction about yourself and also say a few things
about Mattermost?
ALEX DOVENMUEHLE: Sure.
So my name is Alex Dovenmuehle.
You'll have to find my name on the internet
to see how to spell it.
My background is a computer science degree in college.
I went to UNC Charlotte, which is in North Carolina.
And for a long time, I just worked, you know, around Charlotte.
I did back end, front end, everything.
I did, you know, VB.net, C Sharp, Ruby, Python.
I've kind of done all sorts of stuff.
And then about five years ago,
I started a job at Heroku,
which is a Salesforce company.
It sort of used to be bigger back in the day, I guess,
but, you know, platform as a service type company.
And I joined on the vault team,
which was essentially the team responsible for the systems that handled
billing the customers.
And then our group was called business operations.
And as part of that group,
we also ran the data warehouse and the analytics for the company,
for Heroku itself, I guess, as a business unit.
And that's kind of where I started to really get into
a lot of the data engineering.
And I'll say big data, but it wasn't really that big.
I always say when people say big data,
because big data to me is like petabytes,
like Google, Facebook scale kind of stuff.
Yeah.
I mean, I think we were using Redshift at Heroku, and it was maybe, I think we were getting up into the 30 terabyte range or something, but it's not like it's that big.
Come on, guys.
Anyway. Yeah. So I really learned a lot there, and I actually switched to an actual data engineering
role about two years into my time there. And I got them onto Airflow. I got them onto dbt.
And we actually, at that company, we built our own.
We were using Segment at the time at Heroku for user analytics.
And we wanted to get away from Segment.
So we actually built a homegrown
analytics pipeline, essentially,
using Amazon Kinesis,
which worked fairly well, and actually
had a lot of traffic going through that.
But yeah, so then,
about six months ago, I left Heroku and started at Mattermost.
When I started at Mattermost, they kind of had a data warehouse, they were using Redshift,
and they were also using Chart.io, which is sort of like a business intelligence kind of tool, like Looker or whatever.
But they really didn't have any actual, what I would consider, data engineering things
set up, and really the warehouse that they had at the time only had essentially user and, in our case,
server analytics, but they weren't pulling anything in from
Salesforce or Zendesk or any of the other myriad of tools that we had. And I came in and built a
data engineering infrastructure. It's on Amazon EKS, so it's Kubernetes.
It uses Airflow, dbt,
and then we're actually using Looker at Mattermost.
But that's kind of the, I'll say,
that was sort of a long-winded introduction,
but we'll just go with it, that's it.
Yeah, yeah, I mean, we will have the opportunity
to dive deeper about on the architecture that you have there
and how the overall setup is and why you decided to go this way. But it's very interesting what you
said and a couple of questions on that. First of all, going from Heroku to Mattermost and it's been
quite some time between these two companies. How have you seen
things progress and how they have changed in terms of data engineering and the role of the data
inside the company, and what are the differences that you have experienced as a data engineer
going from Heroku to Mattermost? Well, I think we've tried to take the best
of what we learned at Heroku to Mattermost.
And really, the overarching goal is get all your data
from all the different places into one warehouse,
in our case, Snowflake, and then use a tool like dbt to make sense of all of it
and kind of aggregate it up to the levels and stuff so that when you present it in Looker,
you can not only know that you're presenting accurate data, but you're also really enriching
all the data that you do have and making all the little connections between all the data to really unlock the
power of it.
Because it doesn't exactly help you if like, I can go log into Zendesk and see this.
I can go log into Salesforce and see that.
Being able to combine it and see it all in one place makes it a lot more valuable, in
my opinion.
Yeah, that's very interesting.
And I think it's even more interesting when we see how what you're describing is a common pattern
even between such different companies as Heroku, which is like an infrastructure company,
and Mattermost, which is around a chat application, but still the need to collect all the different
data from there and put it in one place and try to unlock the value out of it remains
the same.
That's great.
So moving forward, can you give a little bit more color around the stack that you currently
have?
What kind of technologies?
I mean, you have mentioned some of them already. But give some more information about the reasons why you
chose these.
And also, it would be interesting to hear
about how these tools also changed from your experience
these past few years, starting from Heroku to today,
Mattermost.
Yeah, so let's see where to start.
I think we'll start with Airflow.
I did mention that we were running it on EKS,
Amazon EKS with Kubernetes.
And so Airflow, if you don't know,
is really a job scheduler and runner at its most basic,
but it has this concept of
DAGs, which are directed acyclic
graphs, and you essentially
can do lots of
fancy things, like say,
once these two jobs run successfully, then
run this next job, and then it also
has a lot of stuff built in with
automatic retries,
it'll alert you if anything fails,
and it keeps the logs of a specific job in a specific place.
So the UI is kind of nice to be able to make sure
that everything's running the way that you think it should
and being able to monitor everything
and all that kind of stuff.
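To make that concrete, here is a minimal sketch of what such a DAG can look like in Python. The task names, schedule, and commands are illustrative placeholders rather than Mattermost's actual pipeline, and import paths can differ slightly between Airflow versions.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator  # path differs in older Airflow releases

# Retry and alerting behavior is configured once as defaults for every task.
default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,
}

with DAG(
    dag_id="example_warehouse_load",          # hypothetical DAG name
    default_args=default_args,
    schedule_interval="@daily",
    start_date=datetime(2020, 1, 1),
    catchup=False,
) as dag:
    # Two extraction jobs that can run in parallel.
    extract_salesforce = BashOperator(
        task_id="extract_salesforce",
        bash_command="python extract_salesforce.py",   # placeholder script
    )
    extract_zendesk = BashOperator(
        task_id="extract_zendesk",
        bash_command="python extract_zendesk.py",      # placeholder script
    )
    # Runs only after both upstream jobs succeed -- the directed-acyclic-graph part.
    load_warehouse = BashOperator(
        task_id="load_warehouse",
        bash_command="python load_warehouse.py",       # placeholder script
    )

    [extract_salesforce, extract_zendesk] >> load_warehouse
```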
So Airflow was something I discovered three years ago.
I know it's a lot older than that.
I think it was originally an Airbnb tool, if I'm not mistaken.
Yeah, I think so, yeah.
Yeah, and now it's like an Apache open source project.
So yeah, that was definitely like day one,
I was like, we definitely need Airflow.
Like you're just going to have to have it.
So built that out.
And one of the interesting features that Airflow came out with,
I can't remember exactly what release it was or even how long ago,
but it's fairly recent.
Each job in a DAG is called an operator.
And they came up with what they call the Kubernetes pod operator.
And what it essentially does is it spins up a Kubernetes pod
with whatever image you tell it to, and then it runs the job in that pod. Now what's really cool
about that is now you have decoupled the scheduling and orchestration of the jobs from what is actually running the job.
What does the job actually do?
And what's cool about it is Airflow is written in Python, but because you're using a Kubernetes pod operator, you could run a job in any language that you want, because it just needs to be some container,
like a pod running in a container or whatever,
to run it.
So you could have, if the rest of your company
writes Go or maybe Haskell, I don't know,
whatever craziness you want to try,
it doesn't matter, with the Kubernetes pod operator,
you can do it.
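A rough sketch of that operator, assuming an Airflow installation with the Kubernetes provider; the namespace, image, and command are placeholders, not Mattermost's actual jobs.

```python
# Import path varies by Airflow version; this is the Airflow 2.x provider location.
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

# Defined inside a DAG context like the `with DAG(...)` block sketched earlier.
# Airflow only schedules "run this container"; the container itself can be written
# in any language, which is the decoupling described above.
run_dbt = KubernetesPodOperator(
    task_id="run_dbt_models",
    name="run-dbt-models",
    namespace="data",                                    # hypothetical namespace
    image="registry.example.com/analytics/dbt:latest",   # placeholder image
    cmds=["dbt"],
    arguments=["run"],
    get_logs=True,                  # stream the pod's logs back into the Airflow UI
    is_delete_operator_pod=True,    # clean up the pod once the job finishes
)
```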
One thing I will call out is that I'm assuming most people have heard of GitLab, but if you
haven't, it's sort of like an open source, open core kind of competitor to GitHub.
But they actually open sourced all of their data engineering stuff,
just like we have at Mattermost.
But I sort of took some of the patterns that they were using
to get the stuff set up.
So shout out to them for all that stuff, because it's all pretty nice.
Yeah, so that's Airflow.
Next tool to probably talk about,
maybe we'll talk about Snowflake, I guess.
So Snowflake is a kind of a data warehouse tool,
essentially a database columnar data store.
Kind of competes with Redshift.
I don't even know what the rest of them are.
I'm sure there's others out there.
Yeah, BigQuery.
Yeah, BigQuery, right.
Now, the big difference between Snowflake and Redshift is that the way that Snowflake works
is the storage is decoupled from the compute.
And that's really the key to why Snowflake,
in my opinion, is really nice.
Because it allows you to scale your compute very easily.
Whereas if you're using Redshift,
and we used a lot of Redshift at Heroku,
if we needed more space,
we'd have to add more nodes to our cluster.
There wasn't any way to really just add compute
if you needed it.
Now, I do know that Redshift has been starting to add features like that, but I haven't personally
used those.
And honestly, Snowflake's been pretty nice. You know, we use a lot of Snowflake.
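As a concrete illustration of that decoupling, here is a hedged sketch using the Snowflake Python connector to resize a virtual warehouse on the fly; the account, credentials, and warehouse name are placeholders.

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder connection details, not a real account.
conn = snowflake.connector.connect(
    account="xy12345.us-east-1",
    user="ANALYTICS_USER",
    password="...",
)
cur = conn.cursor()

# Compute is a named virtual warehouse that is independent of storage, so it can
# be resized (or a second one spun up) without touching or moving any data.
cur.execute("ALTER WAREHOUSE TRANSFORM_WH SET WAREHOUSE_SIZE = 'LARGE'")

# ... run the heavy transformation workload here ...

# Scale back down (or let AUTO_SUSPEND pause it) once the burst is done.
cur.execute("ALTER WAREHOUSE TRANSFORM_WH SET WAREHOUSE_SIZE = 'XSMALL'")
conn.close()
```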
And then on top of Snowflake, so for our actual, like, data visualization
and sort of business intelligence tool, we're using Looker.
And that's mainly because, actually, when I joined Mattermost,
I actually joined with somebody else. And basically four people from that business operations group at Heroku, four of us came over to Mattermost within two months of each other.
And so we purposely did that.
But anyway, we used a lot of Looker at Heroku as well.
So it was kind of a tool that we're comfortable with.
And the other nice thing about Looker too,
is that we actually have,
Looker uses a, I'll call it a language,
called LookML for defining your models and views
and explorers and all this stuff.
And at Mattermost, actually,
our LookML for everything that we do in Looker
is also open source.
So you can go find that as well,
which is kind of cool.
Another nice thing about being sort of open source
about stuff.
The next one I'll talk about is dbt,
which stands for data build tool.
The purpose of dbt is to essentially take care of all your data transformation in your data warehouse.
It's just super good at what it does, and it really makes it easy to build models
that are not only accurate,
but also they're just easy to build,
and it's easy to figure out what's going on.
It also has this cool
docs site that it will generate for you,
and you can go see,
let's say you have this table that's been aggregated
from 10 different tables,
and all this stuff has gone to make this one big aggregated table.
It'll actually show you the data lineage
of where all that data came from,
like how is this thing actually calculated,
which makes it a lot easier for somebody coming in new
to understand what's going on,
like how did this data get calculated,
and all that kind of stuff,
where if you just have a bunch of crazy SQL
all over the place,
it's just impossible to figure out where it's coming from.
Yeah, actually, from my experience with dbt,
I think what they have managed to do in an amazing way
is to take all the good practices
that engineering in general has and that were always missing
from when you had to work in a database environment with SQL
and actually apply them.
So you have like, you can version your models.
You can roll back and do tests and stuff like that,
things that we were always discussing doing at the database level,
but it just wasn't there for whatever reason.
And I think that dbt managed to do an amazing job on that.
A quick question.
You mentioned LookML and dbt.
And just out of curiosity, because you are using both,
and I think this is interesting.
I mean, there is some kind of overlap between these two.
How do you separate these two tools
and how do you use them inside Mattermost?
Yeah, that's a good question.
So to me, the point of dbt is to take all the raw data that you have
and jam it together in a way that's accurate.
And like you said, you can do the testing and all that stuff
in a very...
It has some actual engineering discipline to it.
And then we generally only build stuff in the LookML,
the views and the models and all that stuff in Looker
on top of those already pre-built models from dbt that we made.
And then we just start showing those ones in Looker
because that way we know, okay, we've verified this model that dbt made,
like this table really, has all this stuff.
It's accurate. It's what we want,
and then we can show it, use Looker to let people explore the data,
and that keeps out a lot of the weird things
where if you have people in Looker
and they don't necessarily understand exactly what the data means,
and then they start making weird explorers that try to map data together that
doesn't really, you know, map the way that they think it does.
So it just makes it a lot cleaner,
and I think it makes it more approachable for the users in Looker.
Like you don't want them to have to care or know about the data model at like a
super deep level.
You just want to give them easy tools where they can just, like, oh, I just want to see this, this, and this, and, you know, give me some group by aggregate sum or whatever.
And allow them to do that without them having to come to you and ask questions about, what does this data mean?
I mean, you're going to get those regardless, but trying to minimize it, giving them the
power, like empowering them to do stuff, just makes it more scalable, and
that's really what you're going for. Yeah, I think there is a good...
Because you mentioned earlier about Snowflake, about the
power of decoupling the storage from the processing.
I think something similar is also happening with
these tools, where we see that the modeling of the data is actually decoupled
from the actual visualization and working with the data and trying to get insights out of the data and all that
stuff.
And I think that this kind of decoupling will appear more and more in anything that has
to do with data.
And I think that it's a very powerful paradigm and we can see it manifesting itself in different
ways. And I think this is also what is happening with products like Looker,
LookML, dbt, and all that stuff.
I mean, this whole industry is still shaping
because it's still a work in progress.
But I think this kind of decoupling
is what is actually moving the industry forward right now.
Cool. So anything else around your data stack
that you would like to mention?
Oh, yes. There is one more.
It's called RudderStack.
Like at Heroku, when I joined Mattermost,
they were using Segment.
And I think it was about two years ago,
two years ago at Mattermost, when I wasn't there.
They started getting billed like an insane amount per month
because they were going like way over their usage on Segment
and they essentially ended up turning off
all except 2% of the events that they were sending to
Segment.
So essentially, I mean, you know, losing 98% of your data, like just gone, because
you can't, you know, deal with paying Segment that much.
So when I came on, I was like, well, I've already gotten away from Segment once.
I'm going to go ahead and do it again.
Not that I don't think Segment is like a,
I think it's fine if that's what you're looking for,
but it was never what I was looking for.
So I wanted to get rid of it as soon as possible.
And so as part of that,
it was actually during my interview with Mattermost,
the CEO Ian was like,
Oh, hey, have you heard of this thing called RudderStack?
I was like, Oh no, I hadn't heard of that.
Let me go check that out.
And it's like, Oh, it's this open source Segment alternative.
I was like, yes, this is exactly what we need.
So starting about, I think, in March, I want to say,
we started this project of getting rid of Segment
and replacing it with RudderStack.
And from a code perspective and everything like that, it was a really simple change.
Like, it's not that complicated to, you know, replace Segment with RudderStack.
It's really quite simple.
But it was more of, like, what are all the downstream effects, and what else can we
do with RudderStack that we weren't able to do with Segment because we had such
limitations on the amount of data we could send? So I've implemented,
or I've had implemented, I was more of like a project manager on this project
more than actually doing the code.
But essentially, Mattermost now has RudderStack implemented
on the server, the web app, the mobile app, the plugins.
On Mattermost.com, we're using RudderStack
because they're also using Google Analytics on there,
but Google Analytics only gives you aggregated views of your data, whereas with RudderStack, not only can you have
custom events that you can just trigger in JavaScript, they click this button, they do
that, whatever, but it also does all the page tracking and everything else for you. And so having that all go through the same tool, you know, it integrates with Snowflake just
fine, and I know RudderStack has a bunch of other connectors to different data destinations or
whatever, but obviously I just care about Snowflake. So getting all that raw data into Snowflake, where now you can really start telling the story of,
and this will, I'll just, some foreshadowing.
Really get the story of the customer journey
with RudderStack.
And the other thing that's been really nice
with RudderStack is now that we have no real limitations
on how many events that we can send,
now it's like, oh yeah, just add an event for that.
Why not? It's not going to hurt anything.
You want to be a little careful on exactly what events you add
and all that kind of stuff so you're not just plowing
a bunch of data in there that doesn't mean anything.
But if it's a meaningful event, then, you know,
it's just opening up a whole new world for,
especially the product managers at Mattermost.
But yeah, that's RudderStack.
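As a rough sketch of what adding one of those events can look like on the server side, assuming the RudderStack Python SDK, which mirrors the Segment analytics interface (on the web the equivalent call is made from JavaScript); the event name, properties, and credentials are hypothetical.

```python
import rudder_analytics  # assumed SDK: pip install rudder-sdk-python

# Placeholder credentials for the RudderStack data plane.
rudder_analytics.write_key = "YOUR_WRITE_KEY"
rudder_analytics.data_plane_url = "https://your-data-plane.example.com"

# A custom event carrying only pseudonymous identifiers (no email or other PII),
# in line with the privacy approach described later in the conversation.
rudder_analytics.track(
    user_id="server-generated-user-id",   # internal user ID, not an email address
    event="thread_reply_posted",          # hypothetical event name
    properties={
        "server_id": "server-uuid",       # which self-hosted server sent the event
        "channel_type": "private",
    },
)
```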
Yeah.
So from what I understand, you're using RudderStack
to capture all the interactions that your customers have
on many different touch points with them, right?
So it's not just your website, it's like many different touch points that you have. And
then these data are pulled into Snowflake and from there, dbt is used to do the data modeling
and then LookML and Looker are used like for visualization and diving deeper into the data.
Because you mentioned at the beginning
that a very important goal in building this kind of analytics infrastructure
is to collect all the data into one place.
We talked a lot about the customer events
and the customer-related behavioral data.
Are there other data sources that you pull?
And if yes, how do you do it and what kind of tools you are using for that?
Yeah, great question.
So actually, one thing, this is sort of tangential, but because
Mattermost is a very privacy and security focused company, the data that we're sending to RudderStack, there's actually no PII in it, if you will. We're not sending email
addresses or anything like that to it. It's literally just, you know, if you're on
the web app or whatever and you're just chatting, all that gets sent to Rudder is the internal user ID that the server
identifies you as and the server ID,
just so we can have that kind of stuff,
which is kind of an interesting take on some of that stuff because,
you know, like, cough, cough,
Facebook is trying to, you know,
figure out exactly what you're doing on everything at all times,
listening on your phone, I'm sure,
all that crazy stuff. So anyway, that's sort of an aside. But yeah, so as far as, like,
you know, getting all the data, and then how do we put it together?
So not only are we getting all the RudderStack stuff, so that's, like you said, the website stuff, like web traffic and that kind of thing,
But also like in product
user events and even server telemetry.
But then also we've got, we use Zendesk for support, Salesforce for sales.
They're using Marketo to do like marketing type stuff.
Those are pretty much the main ones.
And one thing that we do,
and that's one thing that we kind of brought over from Heroku,
is we're using a tool called Heroku Connect,
which allows you to have a bidirectional sync
between a Salesforce instance and a Heroku Postgres database.
And what that allows us to do is not only can we read the data from Salesforce and get
it into Snowflake, but then we can also write back to Salesforce.
And what that's really helpful for is that, like, the salespeople and, you know, maybe
solution architects, sales engineers or whatever, they live in Salesforce all day. You don't want to have to force them to go to some other system,
like even Looker, really.
You want the data to live in there.
So what we do is we'll generate these data points about, let's say,
like an account, you know, like some sales guy is trying to sell
to some account.
But maybe we'll sync some data that, you know, like how many users do they have
or how many active users have they had in the last week
or something like,
just something to give them a little bit more context
into what's going on with their customer.
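As a rough illustration of that write-back pattern, here is a hedged sketch assuming Heroku Connect's usual mapping of Salesforce objects into a salesforce schema in Heroku Postgres; the custom field and values are hypothetical.

```python
import psycopg2  # standard PostgreSQL driver

# Placeholder connection string for the Heroku Postgres database
# that Heroku Connect keeps in sync with Salesforce.
conn = psycopg2.connect("postgresql://user:password@host:5432/dbname")
cur = conn.cursor()

# Write a computed metric onto the mapped Account table. With a bidirectional
# mapping, Heroku Connect pushes this update back to the Salesforce record,
# so salespeople see it without ever leaving Salesforce.
# `active_users_last_7d__c` is a hypothetical Salesforce custom field.
cur.execute(
    """
    UPDATE salesforce.account
    SET active_users_last_7d__c = %s
    WHERE sfid = %s
    """,
    (1342, "0015500000WxYzAAA"),  # example values
)
conn.commit()
conn.close()
```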
But we are using Stitch data
to sync some of the data from the various data sources.
So like Google Analytics, Zendesk,
Jira,
but yeah, Jira, I forgot about that one.
So Mattermost uses Jira for all of its
internal project management,
and
we're actually pulling that data into the data
warehouse, and then we can actually use that
data to give,
you know, like, say the VP of engineering a view of, like, how are these different teams doing, and how many tickets are they doing, and all this kind of stuff.
So you can kind of even do, like, performance metrics with this data.
Yeah, so that's sort of, so it's like, get all the data in with Airflow or Stitch Data or RudderStack,
and then use dbt to transform it, then Looker to visualize it.
So that's the way we do it.
Yeah, sounds great.
I mean, it sounds like you have managed to build this single source of truth
around your data and pull all the different data that you need into
one place. Is there something missing? Are there data sources that you don't touch
yet and you are considering doing it in the future? Do you think there's something missing
from the data that you need? Or is it now more about focusing on working on the data
and creating things like...
And we are going to talk more about that later,
but the customer journey that you mentioned.
Yeah.
So, yeah, I think it's both, really.
One of the challenges we have
with the way that Mattermost as a product is distributed,
because we don't currently have a SaaS product,
is that people have to upgrade their own servers.
And most people don't.
So, like, when I say, oh, we just released RudderStack
and, you know, the 5.23 release that we released, like, a month ago,
like, most people are not on that one.
And so that makes it a challenge
when you're trying to add some of these telemetry items.
You know, if somebody's not on the right version
where that telemetry is even implemented,
you're just not getting it, and so you don't know.
And so I think that's sort of the biggest challenge
that we have, and really until we have a SaaS product or somehow make upgrading your server brain dead easy, which I think is kind of tough in itself.
Like, you know, we're always going to be fighting that essentially. Yeah, so like I said previously, when we moved, or like two years ago at Mattermost,
they turned off all but 2% of those events
coming into Segment.
So now that we've turned them back on
with RudderStack,
now we have to go actually figure out
what this data all means
because we didn't really have any of it.
So we didn't really know how to model it
and do this kind of stuff.
So now that
the RudderStack release has been in the wild for a little
while, we're finally getting enough data
to where we can start
to figure it out, and we're modeling
it, aggregating
it up so you can
visualize it in Looker.
And then the customer
journey piece is,
now that we're using RudderStack across all these different properties, that's if a user visits Mattermost.com,
let's say they go to some blog post,
or maybe we'll host this podcast, and they went there, right?
And then, you know, they're reading around,
and they see all this stuff, and they're like,
oh, I'd like to, like, buy that or do a trial or whatever, right?
And then you can see, like, okay, they downloaded this trial,
and then, well, let's take the version where they actually buy it
they'll go to our customer portal
which again uses RudderStack,
so now we can see
you know what they're doing in the portal
then they buy it and then we can track
which
license
kind of like a license ID
that they're using for that
server, and so we can say, like, okay, they went to this podcast,
they looked at this blog, you know, they read our mattermost.com stuff,
they went to the portal and bought the product,
then they actually started using the product.
And then, you know, here's how they're using, you know,
you actually know how at least that server is using the product,
which can really help, I think. You know, what that enables you to do is, not only, I mean, you can start doing
crazy things like trying to A/B test some marketing site, which is something we're planning on, like,
you know and start answering really detailed questions about like, of these people who visited like, you know, some certain page on mattermost.com.
Like, not only did they even buy the product, but how did they use the product once they did start buying it?
And is there something about like, those set of users that we should be thinking about from, I don't know, a marketing perspective, a sales perspective?
Should we be building more features for these people?
Like, all this kind of stuff sort of allows you to really,
really unlock the value of that data to a level that, I mean,
we never, I don't even, like,
we didn't really even get there at Heroku, to be honest.
So it's kind of, I don't know, it's kind of neat that at Mattermost we're going to have it. So, I mean,
there's a lot of data modeling and, like, you know, stuff that goes into it. So we're not totally there
yet. But that's sort of what we're moving towards. Yeah, the feeling that I get personally, and I
would like to hear your opinion on that. Like, you know, one of the first problems that data engineering
had to face was getting access
to all the relevant data, right?
Getting access to the data, collecting
the data consistently. This is a pretty hard
problem, actually, also from an engineering perspective.
And you can see that from all these
very complex platforms that have been built,
like Kinesis, Kafka;
it's not an easy engineering
problem to solve correctly.
But for all these years,
it was all about how we are going to collect the data,
put it in one data warehouse,
then, okay, now all the things that we can do
in the data warehouse with this next generation
of warehouses like Snowflake that we have.
And it seems like even when we solve these problems,
there are more problems coming up.
And I think that the next cycle will be more about the lifecycle of data, how you track the
changes that are happening there. And I think that was very interesting what you said,
how you can track, for example, events that are coming from many different versions of the product out there.
And I mean, it might be more profound in your case,
because you have many installations and not all people are updating.
But I think that as we move forward, these problems will become even more important.
And it will be, I think, very interesting to see how the industry will respond to that
and what solutions will come up around this.
So moving forward,
we discussed a lot about your infrastructure
and it sounds like Mattermost is a very data-driven company.
Can you, and you touched on some of the use cases
of what you are doing,
can you expand a little bit more
on the role of data analytics and data in general inside Mattermost and how the company is using this data?
Yeah, yeah. A good question.
So, I mean, there's so many aspects to the stuff that we do.
So I'll just touch on a few, but one of
the first things that we actually built, and this
was before RudderStack,
and really it was just
based on the Salesforce data, but
really we were trying to
provide a way for
the executive level
to track
sort of like
financials and stuff,
and do financial forecasting and modeling and that kind of stuff
and have that all in Looker so that they can see what we really –
and what we actually have now is sort of like a health of the business
Looker dashboard that has all these metrics and stuff that we've defined
and populated, you know,
financial numbers, like how much revenue is coming in,
all that kind of stuff. But then also like, how's our support team doing?
Like, are we, you know,
are we getting good feedback from the tickets or whatever?
And then also, you know, are more people installing Mattermost?
Are more people going to the website?
All these really top level things
that they can get a view of so that they don't have to spend
hours clicking around to all these different other,
if we had all these dashboards separate or whatever,
where they have to spend hours just trying to find
where this data is, but providing that data to them
in just one place.
And then actually we even built a board-level view, you know,
because it's a venture capital-backed company.
You know, you have these guys from Battery Ventures.
I should really know these names better, but, you know.
The investors, right, they want to see see like, how is their investment doing?
And by providing them with a like board level dashboard,
that's even higher level, you know,
I think not only does that give like the board members probably be like,
Oh, these people know, you know, like this company knows what it's doing.
Like they're pretty mature on how they like do this stuff.
So I think that's pretty cool.
But yeah, and now that we have access to all this RudderStack data,
we're able to answer so many questions
for the product managers
that they just didn't have answers for before
because the product managers are trying to make,
they want to make the highest impact
and best changes to the product
to either, A, add a feature that we don't have that,
you know, Microsoft Teams has or Slack or something, or just make a feature better.
Like for instance, they're working on a project to revamp how threads are done.
And threads, if you don't use Mattermost, it does
threads a little bit differently than Slack
where Slack has the collapsed threads
and then you have to click it and it's over in the sidebar.
Mattermost
just has inline threads
so you can't collapse
them. And that's always been the biggest
thing when people switch from Slack. They're like, what are you guys doing? Why is this thread stuff
so stupid?
But the product manager for that, he wants
to make sure that he's building
threads, not just
to copy Slack, just so people stop
complaining about it, but how do
people actually want to use threads?
And so we're able to provide
him with a bunch of
really
specific data points on how are people actually using threads, and how many concurrent threads are going on in some of these user chat rooms and stuff, and how many messages on average are posted under a specific thread, or under a given thread or whatever.
And so he's wanting to use all this data so that he can make a better decision on
which way do we actually go with this threading thing.
And I think we're seeing that from really all the product managers at Heroku,
or, not Heroku, at Mattermost, where they've been so data-starved because of the Segment
thing, that now with RudderStack we really have the view of, hey, how are people actually using this
stuff and like how can we make it better? So that's the one I'm kind of most excited about.
Yeah, sounds great. I mean, it sounds like we are talking about a company that is pretty much data-driven. I mean, almost every aspect of the company
works on some kind of data.
And yeah, it sounds like the industry and the tools out there are
mature enough at this point to enable these kinds of use cases, from the top
leadership down to even the product
manager who's going to use the data to drive their decisions.
And that's very interesting to hear
because it's pretty common to hear about data
in very specific areas of the company,
like more about marketing,
because of course marketing was one of the first functions
in the company to rely on data to do that.
But pretty much if you want to be competitive today,
you need to utilize your data that you have
in everything that you do.
Yeah.
That's good to hear that you are doing that at Mattermost.
All right, so we are very close to the end of our discussion.
So one last question.
What is next for the data analytics inside
Mattermost? What makes you excited about it? It sounds like you have all the data
there now to build some very interesting internal and external products.
So yeah, what makes Alex excited? Yeah, so I think for me, it's really all that
customer journey mapping stuff.
To really have that end-to-end view of things
and
really make sense of it for people
so that
non-analytics
people can understand what we're showing
them.
Just being able to do that stuff, I think,
is going to unlock a lot of
stuff
in that direction.
And I mean, you know,
we're on the way to it. We're not there yet.
I will,
you know, I'm going to write some blog posts and stuff
once we actually get it up and running
because I think it'll be really cool.
Yeah.
Sounds great.
I'm pretty sure we'll have more opportunities
to discuss again in the future
about more exciting things
that you will be building at Mattermost.
So, Alex, thank you so much.
It was extremely interesting for me
to hear what you are doing there.
I hope you enjoyed your time, and thank you for the conversation.
And I'm looking forward to chatting again in the future.
Yeah, absolutely. I appreciate you having me.
So that was it with Alex.
I think it was very interesting.
We touched, as we said, many technical details.
And we covered the whole data stack that they have.
A very
interesting takeaway from this conversation is how many different
moving parts a data stack has and how complex it can become and how important
it is to have the right tool for each job in this data stack and how
difficult it is to actually maintain and actually deliver the data and enable everyone inside the company to use the data,
even if you are using all the current best practices
for building a data stack and operating it.
Yeah, I agree.
I think it's really interesting that just having the data now across teams and in a place that is usable across teams is opening up all sorts of new opportunities that are going to reach into other departments at Mattermost.
If you think about marketing and sales and really empowering them with data, much like they're doing with the product.
So it'll be exciting to touch base with Alex in the coming months
and see how the data spreads across the organization.
Until the next one, thanks for joining us on the Data Stack Show,
and we'll catch you on the next episode.