The Data Stack Show - 25: MLOps and Feature Stores with Willem Pienaar from Tecton
Episode Date: February 17, 2021
On this week’s episode of The Data Stack Show, Kostas is joined by Willem Pienaar, tech lead at Tecton, to discuss machine learning, features and feature stores.
Highlights from this week’s episode include:
Willem Pienaar's background in South Africa and Southeast Asia, and from Gojek to Tecton (1:58)
Tecton was founded by the builders of Uber's Michelangelo (6:37)
Defining features and their life cycles (10:05)
Comparing a feature store to a database (16:40)
Data architecture in a feature store (26:16)
How feature stores evolve as a company expands (30:12)
Main touchpoints between the feature and the data infrastructure (37:59)
How Tecton manages productizing complex architectures (41:44)
How Feast and Tecton work together (45:12)
Tecton impressing VCs and preparing for a competitive, emerging market (48:14)
The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Welcome to the Data Stack Show, where we talk with data engineers, data teams, data scientists,
and the teams and people consuming data products.
I'm Eric Dodds.
And I'm Kostas Pardalis.
Join us each week as we explore the world of data and meet the people shaping it.
Welcome to another episode of the Data Stack Show. I'm
Kostas and today I have the pleasure to host Willem Pienaar, Tech Lead at Tecton, in another episode where
we will be discussing feature stores, MLOps and open source. Willem is working in one of the
hottest startups right now around feature stores and he's also the maintainer of probably
one of the best open source feature store solutions out there. So we will have the opportunity to chat
with him and dive into what feature stores are, why we are building them, why they are using them,
what MLOps is about, and how open source is important in this new wave of technology that
is supporting machine learning.
Unfortunately, today Eric is not going to join us, but we will have a good time discussing
with Willem and many things to learn from him.
So let's dive in.
Welcome everyone to another episode of the Data Stack Show.
Today I have a very special guest.
His name is Willem Pienaar. We are going
to be discussing a quite recent development in the space of data in general, which is feature
stores, but also everything around data and machine learning. And I'm really excited to have this
conversation with him. Welcome, Willem. How are you?
Thanks, Kostas.
I'm great.
Thanks for having me on the show.
Yeah, yeah, of course. So would you like to start by giving us a quick introduction and a little bit of a background
story about you?
Sure.
So I can give you a quick background.
I'm South African, born and raised, grew up there, studied mechanical and
electronic engineering. I built a company while I was a student, a networking company, and then
sold that. After that, in South Africa, I worked in control systems, engineering, industrial
automation. I did that for a few years and eventually immigrated to Thailand, where I worked kind of as a software engineer slash where we, you know, built remote sensors and a lot of like streaming data from power plants in the jungle to central control systems and things like that.
So I've been in and around kind of the engineering space, the data space, kind of vertical solutions for a while. And after working there for a few years,
I moved to Singapore where I joined at the time
kind of like a company that had been deemed a rocket ship
that just crossed 1 billion in valuation.
It's an Indonesian company called Gojek.
Oh, wow.
So it's currently a $10 billion company.
At the time, they were mostly focused on ride hailing as their core product,
but they are today a multi-application or a multi-product platform.
So they do food deliveries, digital payments, lifestyle services.
So you can get like somebody to come and fix your car
or you can pay and get, like, airtime, like data for your phone, and things
like that. So there's every single need that you have in the day, like if you need a motorcycle
taxi, a car, you know, delivery, groceries. They've got like 17 different services. So I joined that team.
Our directive at the start was basically to get ML into production, because they were sitting on
mountains of data, just like Uber and Lyft and all of these companies, but they weren't leveraging
that at all. And they had a bunch of data scientists that they'd hired, but those folks
couldn't get into production because the engineers building the products weren't incentivized to
really help them. So I was the engineering lead that helped them kind of build the initial systems, actual
ML systems.
So not products so much.
And our team was kind of, it started off as kind of embedded in the data science
team.
Eventually we became a platform team and then we ended up building a lot of data products
and data tooling.
So at that point, that was about two years into being at Gojek. Our team focused on the end-to-end ML lifecycle. We were only about 10 to 15 folks at the time. Data scientists were like 50 to 60.
And so we kind of pivoted towards building tools that a large amount of teams could use.
APIs, UIs, services:
it's the most scalable approach that we could find.
And some of the things that we worked on
were like the feature stores and model serving
and training and schedulers
and versioning and processing of data
and experimentation systems.
It's all kind of in our purview and all ML focused.
Yeah.
So after that, I joined Tecton.
I was at Gojek for about four years, and I just recently joined Tecton.
I mean, it was just a match made in heaven.
Tecton is a company that's focused purely on feature stores.
And that was kind of my specialty at Gojek.
And, you know, I also led the team that built Feast,
which is the feature store that we built
with Google at Gojek.
And so at Tecton, our focus is primarily
to build a world-class feature store.
And we have two products that we're kind of building out there
and we're trying to build towards a unified vision for them.
So that's kind of a short story in a long form.
Well, that's quite a journey, both geographically speaking
and also in terms of your career. So that's amazing. Cool. Can you share a little bit more about
Tecton? I mean, you said that Tecton is focusing mainly on building a
feature store. Can you tell us a little bit more about the company and the product itself?
And then we are going to get into more detail about both feature stores in general
and also the Tecton product itself.
Yeah, that's a good question. So Tecton was founded by the original folks that built the
Michelangelo platform at Uber. And I think most people in the data space have heard of that.
So that was kind of a seminal internal proprietary platform
that was built at Uber.
And it was sold as something that democratized machine learning.
That's a very overused term, but it was widely used within Uber
to kind of productionize both data and models.
And what a lot of people told us is that that system is actually used for a lot of EDA and iteration and development.
It's not just for productionization, but it's a very famous system.
So they left there.
So it's Mike, who was the PM on the project.
Kevin, who is well-known as an engineering leader.
And they founded Tecton.
And I think they started in stealth in 2019.
So they've been secretly at the start
building a feature store startup.
And they've grown the team prior to me
joining to about 23 people.
I think I was the 24th or the 25th person to join.
So they've got a very advanced,
I'd almost go as far as to say it's the leading
feature store right now that is at least publicly available whether open source or proprietary or
paid. And it's a complete end-to-end feature store, and it addresses both enterprise and
kind of small startups. It's not fully open to the public right now.
So you need to obviously sign up and pay
and go through the normal sales channels.
But it's something that we want to get
in everybody's hands in the future.
But there are some specific differences
of the products between Tecton and Feast,
but we can get into that a bit later.
Yeah, absolutely.
Quick question before we move forward.
You mentioned about being the most advanced
feature store right now in the market.
I mean, my background is mainly in data engineering,
to be honest.
I'm not a person who has worked in ML.
So I know about feature stores,
but I haven't used them extensively myself.
So I did a bit of research
and I try to see what is available out there.
And what I've seen and noticed
is that there are many technologies
that are coming from very big corporations.
Like you mentioned, for example,
Michelangelo and I found like what Airbnb is doing.
I think they have, how they call it,
like the Zipline, I think.
Yes, Zipline is the feature store
and Bighead is the ML platform.
Yeah, so it looks like every big corporation
has pretty much come up with their own architecture.
But you don't, at least I didn't manage
to find that many open source solutions.
Are there open source solutions outside of Feast?
So there's Hopsworks.
I'm not sure if it's S or not, but they're one.
They came out more or less the same time as us.
They were kind of a Hadoop-focused one.
They had like proprietary underlying technologies,
like file systems and things.
There are smaller ones.
I think there's one called Butterfree that I recently saw
that seems a little bit nascent, but they're coming out.
And I believe that some of the proprietary feature stores
will be open sourced this year as well.
At least there have been some rumors.
That's good. That's exciting.
Cool. Okay.
So let's move forward and do a little bit of more technical details.
And let's start with, let's talk first of all, what is a feature?
I mean, we're talking about feature stores.
Yeah. I mean, the simplest answer to that, I mean, it could be an advanced answer, but the simple answer is it's an input to a model. So it's literally
a data point that is used in a model to make a prediction.
And how is it different compared to, let's say, the typical data types that we have in the database?
What's the difference there?
Or is it pretty much the same thing, just packaged in a different way?
I think it's more an abstract label that is assigned to specific data, because it's just about what context the data is being used in. You can take
raw event data and then feed it into a model, and it can be considered a feature. So it's when it's
fed into the model, and it has some kind of influence on the outcome that the model is,
you know, producing, that's when it becomes a feature. But in terms of the data types that
you're feeding to the model, it's almost always integers or floats or binary values.
If you're feeding strings or, let's say, bytes,
often the model has then the capabilities
to interpret those types.
But it's normally primitive types
that you're feeding into a model.
And in most cases, though not in all cases, these features are only valuable
once you've aggregated them to some degree.
So if you look at like the amount of purchases
that a user has made or some kind of value
that allows the model to make a stronger inference
on a user or a customer or whatever the entity is
that you care about,
that you're making the prediction about.
So typically it's something that features aggregated data,
but it can also be raw data.
But the most common case is to have, I guess,
some kind of aggregation, right?
Yes.
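To make the aggregation point concrete, here is a minimal sketch of turning raw events into per-entity features. The event data, the function name `aggregate_purchases`, and the feature names are all hypothetical and chosen only for illustration; nothing here comes from Feast or Tecton.

```python
from collections import defaultdict

# Hypothetical raw purchase events: (user_id, amount).
raw_events = [
    ("user_1", 12.50),
    ("user_2", 5.00),
    ("user_1", 7.50),
    ("user_1", 30.00),
]


def aggregate_purchases(events):
    """Aggregate raw events into per-user features (primitive values only)."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for user_id, amount in events:
        counts[user_id] += 1
        totals[user_id] += amount
    return {
        user_id: {
            "purchase_count": counts[user_id],
            "total_spend": totals[user_id],
        }
        for user_id in counts
    }


features = aggregate_purchases(raw_events)
```

Each entry in `features` is keyed by the entity (the user) and holds integer or float values, which matches the description of what typically gets fed to a model.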
Okay.
That's interesting. And can you give us a little bit of background
around the lifecycle of a feature?
As you said, it can be from raw data,
aggregation.
How do we come up with a feature?
How do we start from the raw data that we get
and how we end up with a feature
that we can store on a feature store
and use it on our models online or offline for training?
I'll give you the non-feature store flow first
because the feature stores are all different.
The non-feature store flow is the user exports
some historical data from a warehouse or some lake
that the company's organized for them.
So they sample the data and then they, you know,
take like 10,000 or 100,000 rows
and then they just process that data.
That's as a Pandas data frame or something.
And then they train a model on that,
and they look at the model's performance.
And then typically they would ship that model
into production somehow and then get an engineering team
to kind of rewrite those transformations
on the real-time event stream
or transactional data that's available in production.
And as the systems are transacting,
that data is fed to the model and they can make predictions.
And if you looked at that flow,
you could also productionize that flow by, you know,
the training part that the data scientist did at the start could be
extended to have more data and it could be automated through Airflow or some
pipelining system, but that's kind of the high level flow.
And so the feature in that story is the transformation that's made on the raw data and it is fed
into the model during training.
Often you will log a list of features as strings, column names with the model binary and you can then reference those
same features in production because all of your models will probably have different lists of
features that they're, you know, kind of referencing. And so the lifecycle continues in production,
and then somehow you need to tie the data sources that you have in production with the list
of features that's saved with that model binary. So your model serving infrastructure needs to know how to select the right columns and data
points in production and feed that to the model. Otherwise you're going to have a skew or some,
if the wrong features are being fed to the model, it's just going to be an inaccurate prediction.
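The idea of saving a list of feature names with the model and using it to select the right columns at serving time could look roughly like the sketch below. The metadata layout, the model name, and `build_model_input` are assumptions made for illustration, not an API from any particular serving system.

```python
import json

# Hypothetical metadata logged alongside the trained model binary.
model_metadata_json = json.dumps({
    "model_name": "example_model",
    "features": ["purchase_count", "total_spend"],
})


def build_model_input(metadata_json, production_row):
    """Select the features the model was trained on, in training order."""
    feature_names = json.loads(metadata_json)["features"]
    missing = [name for name in feature_names if name not in production_row]
    if missing:
        # Failing loudly is safer than silently feeding the wrong inputs.
        raise KeyError(f"missing features in production: {missing}")
    return [production_row[name] for name in feature_names]


# A production row may hold extra columns in a different order.
production_row = {"total_spend": 50.0, "days_since_signup": 12, "purchase_count": 3}
model_input = build_model_input(model_metadata_json, production_row)
```

Here the extra `days_since_signup` column is ignored and the remaining values are reordered to match training, which is the coupling between model binary and production data that the speaker describes.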
So that's a typical flow and how the feature stores fit into this,
I'm not sure if we want to get into the feature stores,
but the lifecycle is extended to,
it's kind of split in that the feature store
provides two interfaces,
one at the training time and one at the serving time.
And it prevents you from,
or it removes the need to kind of re-engineer features
and it gives you a kind of unified interface
to the same data, same features. We can get into that in a bit, but just the final part
on the lifecycle, I guess the final place where you would look at the lifecycle of the feature,
because you've made that prediction, is you would have an experimentation system that tracks the
outcome of the prediction. And if the outcome is good, then you could go back and say, these features are
actually predictive. And if the outcome is bad, then you can say, well, maybe these features are
the problem, or maybe the model type is the problem. Maybe there's some intrinsic problem
with the kind of way that we frame the problem domain. But yeah, so you'd want to have the model
itself and all the logic that you have around it and the features as part of the collection of artifacts
that are associated with an outcome in an experiment.
And by experiment, I mean like,
let's say if you've got a website,
you could maybe be testing two models
and those models might be recommending specific products.
So you can measure based on user behavior,
which model is doing the best and
the features are the primary influence there.
That's super interesting. Actually, I find it fascinating. Like it's a completely different
type of complexity when you are serving models compared to a software product and how you
serve it. When you have again operations, we have again like lifecycle and you have
like also similarities, but at the same time, the tools that you need
and that's like the feeling
that I'm getting from you
and the methodologies that like,
they are different.
And I'm really happy
that I have you here today
to learn more about that.
Okay, we chatted about
like what the feature is
and we touched a little bit also
about feature stores.
Let's get a little bit more
into like the feature store itself.
You mentioned something about
providing two different interfaces, one for the training part
and one when the model is online.
What is a feature store at the end and how it is different from a database or a data
store in general where we store data?
And what are the components there?
Yeah, this is something I've kind of thought about a lot. And the best way I can explain it is that the feature store is an opinionated data system that allows you to operationalize data for machine learning.
So it's a data system meant for machine learning, and it has some unique properties based on the requirements that machine learning models have. So by the way, this definition is not universal
because all feature stores are basically different and people have different opinions
of what a feature store should be. But there are some characteristics that make up most feature
stores. So the one that I think is extremely important is that a feature store provides a
kind of unified, consistent interface for you in the offline and the online worlds.
So with models, on part of the lifecycle,
you're training the model, and then the next side,
you are serving that model in production.
That production could be an online serving,
or it could also be a batch scoring where you're doing
a large batch of data that you want to make predictions on.
But an important failure mode
that we often see in production systems
where they don't have a feature store
is there has to be a re-engineering of features
in both environments
because typically there are different teams
working in different environments.
You'll have data scientists working with Python
in the offline side,
and then you have Golang and Java
in the production side with engineers.
And so they end up re-engineering
a lot of these features
and that causes drift and problems with models.
So the feature store provides a single interface
between your model and the data.
And so it literally is an API or SDK
that allows you to pull data
and it serves the data to your model.
And it ensures the quality of the data to that model.
And that fundamentally removes this kind of data
drift, concept drift problem. It depends on the architecture of the feature store, of course.
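One way to picture the single-interface idea is a transformation that is defined once and called from both the offline (training) path and the online (serving) path. This is only a sketch: the trip fields, `trip_features`, and the two wrapper functions are hypothetical names, and a real feature store would put storage and an API between these pieces.

```python
def trip_features(trip):
    """Single definition of the transformation, shared by both paths."""
    return {
        "fare_per_km": trip["fare"] / trip["distance_km"],
        "is_long_trip": int(trip["distance_km"] > 10),
    }


def build_training_rows(historical_trips):
    """Offline path: apply the transformation to a batch of historical rows."""
    return [trip_features(trip) for trip in historical_trips]


def build_serving_row(live_trip):
    """Online path: apply the same transformation to one live request."""
    return trip_features(live_trip)


historical_trips = [
    {"fare": 20.0, "distance_km": 4.0},
    {"fare": 60.0, "distance_km": 15.0},
]
training_rows = build_training_rows(historical_trips)
serving_row = build_serving_row(historical_trips[0])
```

Because both paths call the same function, the same input produces the same feature values in training and in serving, which is the property that re-engineering in two languages tends to break.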
Another problem that feature stores solve is feature reuse. So it allows you to kind of define,
both in those two contexts,
but between the kind of offline and online world,
sorry, the streaming and batch world,
consistent definitions of features.
So you can define a transformation once
and other teams can see that definition
and they can consume your features.
They can fork the transformation
and then reapply that and create new features.
So it allows for collaboration.
It allows for reuse.
That's actually one of the biggest problems we had at Gojek
was teams were just copying and pasting each other's code
if they knew about it.
But often they were just re-engineering the same features over and over
so, recreating the same transformations.
Now, this aspect is not necessarily unique to a feature
store, but it's something that it's very uniquely positioned to do because it really sits at the
center of your machine learning. It's essentially the foundation to your machine learning
architecture. So the feature store provides that consistent view. It provides also an abstraction
between the model and your data infrastructure.
So this is also something that we had massive problems
with at Gojek where teams would build training pipelines
and then they would write SQL queries
that are basically running before model training.
And in production, they would have access to Redis
and a lot of connectivity and boilerplate code.
So feature stores decouple the process of creating and
materializing features from the consumption of that, which in turn makes your models highly
portable. So there's no direct coupling or assumption that certain boilerplate code will be
packaged with your model. And so I think those are the kind of key things
that make a feature store unique.
It's this kind of consistent view
between both environments.
It provides also online serving capabilities.
So it gives you low latency access
to features in production.
It also gives you often
the kind of more advanced feature stores
provide point in time guarantees.
So it ensures that when you are training a model, that the view that the
model sees on historical data is accurate, and that it represents the same view that the model
will see in an online case. This isn't always easy to do, because you need to do a lot of
kind of fuzzy as-of joins with data in order to ensure that you don't accidentally leak future data to models.
So to drill a little bit into that, it's very easy to, as a data scientist, accidentally,
when you're doing like a join of like 20 or so tables to produce a training data set,
to easily just accidentally give some future data, like maybe it's an aggregation that's
over a day.
And you think that data that's
stored on today's timestamp means that it was from the previous day, but actually it's
from the coming day.
Now your model can see into the future when you're training it, but when it actually gets
deployed into production, you can't get that data.
And so it's just wildly inaccurate.
So those are like subtle little things
that trip up a lot of teams
when they productionize models
and that a feature store helps with.
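The point-in-time guarantee described above can be sketched as an as-of lookup: for each training event, take the latest feature value whose timestamp is not after the event. The data, timestamps, and the helper `as_of_lookup` are hypothetical, and this is a simplified illustration rather than how any specific feature store implements its joins.

```python
from bisect import bisect_right

# Hypothetical feature history for one user, sorted by time: (timestamp, value).
# Each value only became known at its timestamp, not before.
feature_history = [
    (100, 1),  # purchase_count as of t=100
    (200, 2),
    (300, 5),
]


def as_of_lookup(history, event_time):
    """Return the latest value with timestamp <= event_time, else None."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, event_time)
    if idx == 0:
        return None  # nothing was known yet, so nothing can leak in
    return history[idx - 1][1]


# Hypothetical label events to build training rows for: (event_time, label).
label_events = [(150, 0), (250, 1), (50, 0)]
training_rows = [
    {"event_time": t, "purchase_count": as_of_lookup(feature_history, t), "label": y}
    for t, y in label_events
]
```

The event at time 150 sees the value from time 100, never the one written at time 200, so the model is trained on the same view it would have had in production at that moment.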
That's very interesting.
I'll go back and ask about the feature again,
just because I tried to make it more clear
to myself, to be honest.
So if I understand correctly,
if you want to think about the feature
in an abstract way,
because initially, to be honest,
like when I was thinking about features
and reading about it,
I was thinking that at the end,
there is a database somewhere
where you have like some data stored there,
which is the result of doing a pre-aggregation, right?
But the more we talk together,
I tend to think that the feature at the end
is something much more complex than that.
And it has to encapsulate like more information than just the output of a transformation. So is it accurate to
say that like at the end, the feature is a piece of code that actually executes like the aggregation
or defines the aggregation or the type of processing that you want to do on the data,
together with source, because the data needs to come from somewhere, and this cannot be arbitrary.
It has to be well-defined as part of the feature. The model, of course, that we associated with
at the end, and also the time, right? Because something that we observe today, even if we are
talking about the same data source, or we use the same aggregation, it doesn't mean that it's going
to be the same again tomorrow, or it was the same yesterday. That's what I'm saying makes sense. To some degree, but I would challenge you on some of that. So going to be the same again tomorrow or it was the same yesterday. Does what I'm saying make sense?
To some degree, but I would challenge you on some of that.
So are you saying the feature is the definition of all those things?
Yes.
It's not clear to me how the model is associated to the feature here or connected.
Because normally a model has a dependency on a range of features,
but the feature has no awareness of models that consume it.
Okay.
Yeah, I was thinking more about the model as being the entity that's going to consume
the feature.
So in this sense, it makes sense to associate with it.
But yeah, I get your point now.
The feature can live there and you can reuse the feature also with different models, if
I understand correctly.
Yeah.
So if you disconnect the model there, you've got your input source data, and then you've
got the transformation.
Those are actually the only, that's all you need to produce a specific feature.
I don't think time would be in the mix there because, yes, over time things would change,
but if you change the transformation or the source data, then that is the input artifact that is changing.
If you have a deterministic function that produces a specific feature.
So if the input data changes or if the transformation changes, it's a new feature or it's a new version of the same feature.
And feature stores also help you with tracking that. So if you have a feature store that allows for tracking of versions,
then if one of those two things change,
then it will be a new version of the feature.
And interestingly, then when you consume that feature,
if your model has a dependency on an old feature,
you'll consume the old data and the old transformation.
And if you consume from the new version,
it'll be the new transformation or the new data.
Also, I mean, there is an aspect where
it does depend on how you partition your data. Like, the time element does come in there.
So if you're just doing a refresh of the data, every week or month it will be different, right?
There's a seasonality effect in data. So what we typically do is
we consider those to be the same feature, but different models.
So it depends on, you can be really pedantic about the versioning there,
but for refreshing models, it's typically not that serious.
As long as you have the right validation on your source data
and you can make sure that the effects of seasonality is not too wild.
Sorry, I'm digressing a bit here, but yeah, I'm with you.
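The claim that a feature is determined by its source and its transformation, and that changing either one yields a new version, can be illustrated by deriving a version identifier from those two inputs. This hashing scheme, the function `feature_version`, and the example source and transformation strings are assumptions for illustration only; the transcript does not say how Feast or Tecton track versions.

```python
import hashlib


def feature_version(source, transformation):
    """Derive a version id from the source and transformation definitions."""
    payload = f"{source}\n{transformation}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


# Hypothetical definitions: same source, two different transformations.
v1 = feature_version("warehouse.purchases", "count over 7 days")
v1_again = feature_version("warehouse.purchases", "count over 7 days")
v2 = feature_version("warehouse.purchases", "count over 30 days")
```

With a deterministic scheme like this, `v1` and `v1_again` match while `v2` differs, so a model pinned to `v1` keeps consuming the old transformation. A plain data refresh does not change the identifier, which fits the view that refreshed data is the same feature feeding a different model.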
Okay, okay.
Thank you so much.
Now it's much more clear about the feature.
Sorry, I really find this conversation that we have
like as an amazing opportunity for me to learn more about that stuff.
So I might ask some silly questions.
I know that there might be some people out there that might be much more advanced and work in this space.
But yeah, I'm selfish.
Okay, thank you.
All right, so moving a little bit forward,
staying in the feature stores,
I'd still like to understand a little bit more
how a feature store is architected.
What are the components?
If you see it from a software engineering perspective,
let's say I would like to start building a feature store.
What kind of architecture I should expect to see there
and what are the main components of it?
The traditional feature stores have an offline store.
This is a place where you are going to materialize data.
So essentially, you're going to take data from some source. You're going to use, this is another component the feature store has,
some kind of compute layer, some transformation system like Spark,
you know, Airflow. It could even be like an ELT stack, like warehouse. And then you're
going to produce data and then you're going to store it in the offline store.
That store is used by the feature store,
and often you have an API that's your feature store API that you query.
It'll then hit the offline store with a query,
produce a training dataset, and export that for you
to train your model on.
The feature stores also have an online store,
and so it'll have typically an online API,
which you will hit with a query in production.
And that will be backed by, let's say, a Dynamo, a Redis,
some kind of low-latency store, key-value in almost all cases.
And that store is also populated by these jobs that transform the data.
The more advanced feature stores have some operational components as well.
So if you talk about Tecton, Feast also has some of these capabilities, but not as advanced as
Tecton. It plugs into monitoring systems. It also has feature transformation, on-demand feature
transformation services. You can do something like not just pre-compute features to be served,
you can also do a transformation on the fly.
So sometimes you have,
like let's say you've got a driver
making a booking on a ride-hailing app.
You only have their location
when they're making the booking
and you only have the location of the customer
when they're making the booking.
So you can't pre-compute that.
But you still need to produce features
that are dependent on those input variables.
So Tecton has this ability to do
on-the-fly feature computation,
and you can actually define those transformations
ahead of time, but they execute at runtime.
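As a rough illustration of a transformation that is defined ahead of time but executed at request time, the sketch below computes a pickup distance from locations that only exist when the booking is made, and merges it with precomputed features. The haversine formula is standard; the function names, request fields, and the `driver_rating` feature are hypothetical and do not reflect Tecton's actual API.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def on_demand_features(request, precomputed):
    """Combine precomputed features with one computed from request-time data."""
    features = dict(precomputed)
    features["pickup_distance_km"] = haversine_km(
        request["driver_lat"], request["driver_lon"],
        request["customer_lat"], request["customer_lon"],
    )
    return features


# Hypothetical booking request and a precomputed feature for the driver.
request = {"driver_lat": 0.0, "driver_lon": 0.0, "customer_lat": 0.0, "customer_lon": 1.0}
result = on_demand_features(request, {"driver_rating": 4.8})
```

The distance cannot be precomputed because both locations arrive with the request, yet the logic itself is fixed in advance, which is the distinction being drawn in the conversation.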
So integration with monitoring systems,
on-the-fly computation, pre-computed computation,
offline store, online store.
I'd say those are the primary components.
And then you have the computations are either batch jobs or they're streaming jobs.
So if you're doing transformations on streams, they're long-lived.
And if you're doing batch, then they're just running on some schedule
like every day or every hour or something like that.
I'd say those are the canonical components of a feature store.
But if you were listening to what I was saying earlier about what makes a feature store unique, and if you look at what Feast has implemented,
I'd say the only thing that really needs to be there is the online store and an ability to create training data from your offline data.
So that's kind of the essential complexity.
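The components listed above (an offline store, an online key-value store, and jobs that populate both) can be sketched in memory as follows. `MiniFeatureStore` and its methods are hypothetical stand-ins: plain Python containers take the place of a warehouse and a store such as Redis or Dynamo, and `materialize` takes the place of a scheduled batch job.

```python
class MiniFeatureStore:
    """Toy in-memory model of a feature store's two stores."""

    def __init__(self):
        self.offline_store = []  # full history: (entity_id, timestamp, features)
        self.online_store = {}   # latest features per entity, key-value

    def materialize(self, rows):
        """Stand-in for a batch job: append history, refresh latest values."""
        for entity_id, timestamp, features in rows:
            self.offline_store.append((entity_id, timestamp, features))
            current = self.online_store.get(entity_id)
            if current is None or timestamp >= current[0]:
                self.online_store[entity_id] = (timestamp, features)

    def get_online_features(self, entity_id):
        """Serving path: low-latency lookup of the latest values."""
        entry = self.online_store.get(entity_id)
        return None if entry is None else entry[1]

    def get_training_rows(self, entity_id):
        """Training path: historical values for building a dataset."""
        return [(ts, f) for e, ts, f in self.offline_store if e == entity_id]


store = MiniFeatureStore()
store.materialize([
    ("driver_1", 1, {"trips": 3}),
    ("driver_1", 2, {"trips": 5}),
])
```

After materialization, the online lookup returns only the most recent values while the offline query returns the full history, mirroring the split between serving and training-data generation.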
That's great.
Actually, I was checking Feast at some point.
And if I'm not mistaken, like in Feast, for example, you don't have like transformations
there, right?
Is that correct?
Yeah.
And that made me think and together with, I was reading an article at some point where
there was some kind of like critique around feature stores.
And actually what they were saying is that feature stores are great, but feature stores
are also something that needs to evolve as, let's say, like machine learning inside the
organization evolves, right?
Like if you start today to try and experiment and come up with some models and all that
stuff, probably getting a full feature store
is going to be like an overkill.
So you mentioned two things that you said
that like they are the basic requirements
to have like a feature store,
which is the offline training
and the online serving of the data.
What is the evolution as the company grows
and as the company starts becoming more and more serious
around the ML and the data science teams that they have, how do you see also the feature stores evolving in there?
That's a great question.
So this is something we've been thinking about a lot as well.
Could a single data scientist use a feature store?
Can a two, three-man team deploy and run a feature store for a single use case?
We haven't found a use case for a single data scientist,
but we believe that it's possible for small teams.
Like let's say there's a company, they've got one team,
this team has to build one model and get into production,
and they need a system that gives them kind of a structured way to get data
into production without engineers being involved. They would deploy a feature store and they would
kind of just use that themselves. When more teams start to depend on, or want to use, feature stores,
like they're going to get more ML models into production that require features, or when that
team iterates on the same ML system,
but with different iterations of the same model.
So like the type of model is the same,
the problem it's solving,
but they've got different variants
and each model needs to be tracked
with a list of different features.
Then it makes sense to kind of double down
on the feature store and get some,
I guess you'd either need a more advanced feature store,
like depending on if you're using Feast
or a proprietary solution,
as opposed to something you built yourself.
But at some point, you can't just have like a Redis
and maybe some Airflow scripts
that are pushing data into production.
You need to have something that's providing you versioning,
providing you tracking of features,
battle-tested APIs and things like that.
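To make the gap between "a Redis and some Airflow scripts" and a store with versioning and tracking more concrete, here is a minimal Python sketch. It is not the Feast or Tecton API; the names `FeatureDefinition`, `FeatureRegistry`, `register`, and `track_usage` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class FeatureDefinition:
    """One immutable version of a feature (hypothetical structure)."""
    name: str
    version: int
    description: str
    owner: str


class FeatureRegistry:
    """Toy registry: versions every definition and tracks which models use it."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[FeatureDefinition]] = {}
        self._consumers: Dict[Tuple[str, int], Set[str]] = {}

    def register(self, name: str, description: str, owner: str) -> FeatureDefinition:
        # Re-registering a name never overwrites; it creates the next version.
        versions = self._versions.setdefault(name, [])
        definition = FeatureDefinition(name, len(versions) + 1, description, owner)
        versions.append(definition)
        return definition

    def latest(self, name: str) -> FeatureDefinition:
        return self._versions[name][-1]

    def track_usage(self, model_name: str, name: str, version: int) -> None:
        # Record that a model depends on a specific feature version.
        known = {d.version for d in self._versions.get(name, [])}
        if version not in known:
            raise KeyError(f"unknown feature {name!r} version {version}")
        self._consumers.setdefault((name, version), set()).add(model_name)

    def consumers(self, name: str, version: int) -> Set[str]:
        return set(self._consumers.get((name, version), set()))


registry = FeatureRegistry()
v1 = registry.register("driver_avg_rating", "7-day mean rating", owner="pricing")
v2 = registry.register("driver_avg_rating", "30-day mean rating", owner="pricing")
registry.track_usage("eta_model_a", "driver_avg_rating", v1.version)
registry.track_usage("eta_model_b", "driver_avg_rating", v2.version)
```

With this in place, two variants of the same model can each be pinned to their own list of feature versions, which is the tracking need described above.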
But you can emerge and evolve from a solutions team
that's solving one problem
to having that feature store owned by a platform team.
That's, I guess, the next step.
So it's a central engineering team
that manages the feature store.
They do things like provide access control.
They make sure that data gets garbage collected in stores.
They make sure that SLOs and SLAs are being met,
that the performance guarantees are being met,
that if jobs are failing,
that they're going to be the ones fixing that.
Then you've essentially separated two worlds, right?
On the one side, you have data engineers, data
scientists, that originally, they were creating data like features, and they were taking their
own models into production, and they were doing like end to end. But eventually, it becomes two
worlds. One is data engineers or data scientists creating features, features that may or may not be used by them.
It could be for other teams.
And often what I've seen at kind of large companies
is that analysts are being asked to do this.
So they ask analysts to write like SQL,
BigQuery SQL, Snowflake and all that stuff
because analysts are really good at that.
It's efficient.
And you create this wealth of like transformations on the one side and then
the feature store is just this layer that productionizes and operationalizes that data
and then on the other side you have this catalog of features that you as a user you can just pick
the ones that you want based on metadata that's stored on those features train your model iterate
on that until you're happy and then productionize that. But you probably are not going to engineer any features. You might just reuse existing
features. So I think that's kind of like the final point at which you are at the end of your
evolution. Then it's mostly about security and access control and scalability and enterprise
functionality. And kind of that's what Tecton is currently very good at.
So Feast is something that is mostly deployed by teams
that are more advanced than the single solutions team.
It's almost in all cases a platform team,
but it's not an enterprise feature store like Tecton.
It's very fascinating.
Yeah, absolutely, absolutely.
It's very interesting to hear about that.
So feature stores are something like quite new, right?
It's a new concept in terms of technology.
You touched on many different parts of it.
And I assume that there's like also different maturity on these parts today.
What parts and components of a feature store do you see that have a lot of space for improvement right now,
and where do you think that direction is going, both from your experience at your previous
company, where you were also working on Feast, and also at Tecton? Because from what I understand,
Tecton is also interacting with a different, more enterprise type of company, which
probably usually has somewhat different requirements. That's a very good question.
This is a very, I think the tricky one here to solve is who you're addressing.
The biggest problem with the feature store today
is that it solves many problems
because it's uniquely positioned to solve those problems.
And so it becomes this platform that, you know,
it's kind of a Frankenstein monster.
So I think feature stores will evolve in different directions
and they will be more focused over time.
So I think you'll see kind of a split between feature stores
that are more focused on the solution teams
and the kind of smaller teams,
and then you'll see ones that are focused on the platforms and enterprises, and their needs are different. So I think the basic
problems are already somewhat solved. If you look at Spark transformations or dbt, it's not perfect,
but there are solutions for creating features. And the kind of focus right now is not so much how do you create features,
how do you compute them, how do you store them,
and how do you serve them?
It's how do you do everything around that,
the kind of discovery and reuse, access,
how do you do things like the lineage between features, dependencies,
how do you track how models are performing that use features,
how do you integrate with adjacent monitoring systems
and data validation and quality systems?
Those are kind of the enterprise needs.
And then if you look at like a lower scale
kind of solution team focus,
it's a little bit more on how do you make it easier
to get started with feature stores?
How do you make it easy to integrate
into existing workflows?
How do you make it less kind of overwhelming for teams?
And I think all of the feature stores today
are still kind of tough to get started.
So I bet that if you went to Feast,
you didn't install and run it.
You probably just read the docs
because it's not just the pip install, right?
You have to spin up infrastructure.
You need a use case and you need to do quite a lot
to go end to end with it.
So it depends on who you're kind of targeting,
the kind of smaller teams, larger teams, platform teams.
But I think the V1 problems are solved.
The V2 problems are different for those two.
And those are the ones I kind of mentioned earlier.
Yeah, that's great.
It's very, very interesting to hear about the enterprise
where it looks like a lot of value in this organization
is always around governance
and all these things that have been addressed
or we are trying to address also in different spaces,
but how do they apply specifically in the case of a feature store,
which is super interesting, to see the same story,
but narrated from the side of a feature store.
So one last, let's say, a bit more technical question
before we move on.
And I'd like to discuss a little bit more about Tecton.
How do feature stores in general
integrate with the rest of the data infrastructure
that the company has?
You mentioned that setting up a feature store
is not like a simple process usually
because there's a lot of different components
of data infrastructure that you have to deploy there.
What are the main touch points
with the rest of the data infrastructure
that a feature store has today?
Main touch points are you have data sources,
either batch or streaming, and you have some
kind of job runner or compute engine.
So like Cloud Dataflow, Kinesis, Spark, something that can run a process that can take data
from that source, pull it in, do some transformations or take transformations.
There's an ETL system, essentially, and then load that into stores, one or more stores.
So in the old Feast architecture, you'd pull data from the source and you'd push to a stream.
And from that stream, it will get sunk into online and offline stores. But in the new Feast
architecture and then the Tecton architecture, what happens is you pull from, let's say, a batch source.
It could be a warehouse like Redshift. It can be a bucket. And you can pull from streams like Kafka or something like PubSub and do transformations
and then just push to a single online store and a single offline store.
So there's the compute layer.
There's the two sources.
There's the storage engines.
The storage engines may be existing infrastructure.
So feature stores, at least the good ones, reuse existing infrastructure and they don't
create new data islands.
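The flow described here (pull from a source, transform, then push to a single online store and a single offline store) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Feast's or Tecton's actual implementation: the in-memory `OnlineStore` and `OfflineStore` classes, the `ingest` function, and the `trip_features` transformation are all hypothetical stand-ins for real infrastructure such as a key-value store, a warehouse, and a Spark or Dataflow job.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class FeatureRow:
    entity_id: str
    feature_name: str
    value: float
    event_time: datetime


class OnlineStore:
    """Keeps only the newest value per (entity, feature) for low-latency serving."""

    def __init__(self) -> None:
        self._latest: Dict[Tuple[str, str], FeatureRow] = {}

    def write(self, row: FeatureRow) -> None:
        key = (row.entity_id, row.feature_name)
        current = self._latest.get(key)
        # Ignore late-arriving rows that are older than what is already stored.
        if current is None or row.event_time >= current.event_time:
            self._latest[key] = row

    def read(self, entity_id: str, feature_name: str) -> Optional[float]:
        row = self._latest.get((entity_id, feature_name))
        return None if row is None else row.value


class OfflineStore:
    """Append-only history, used later to build training datasets."""

    def __init__(self) -> None:
        self.rows: List[FeatureRow] = []

    def write(self, row: FeatureRow) -> None:
        self.rows.append(row)


def ingest(
    source: Iterable[dict],
    transform: Callable[[dict], List[FeatureRow]],
    online: OnlineStore,
    offline: OfflineStore,
) -> int:
    """Compute layer: read raw records, transform them, load both stores."""
    written = 0
    for record in source:
        for row in transform(record):
            online.write(row)
            offline.write(row)
            written += 1
    return written


def trip_features(record: dict) -> List[FeatureRow]:
    # Hypothetical transformation: convert trip distance from meters to km.
    return [
        FeatureRow(
            entity_id=record["driver_id"],
            feature_name="trip_distance_km",
            value=record["distance_m"] / 1000.0,
            event_time=record["ts"],
        )
    ]


batch_source = [
    {"driver_id": "d1", "distance_m": 1200, "ts": datetime(2021, 1, 1)},
    {"driver_id": "d1", "distance_m": 3400, "ts": datetime(2021, 1, 2)},
]
online_store, offline_store = OnlineStore(), OfflineStore()
ingest(batch_source, trip_features, online_store, offline_store)
```

The same `ingest` function would work for a streaming source, since it only assumes an iterable of records; in a real deployment the two stores would be existing infrastructure rather than new in-memory objects.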
And then there's also integration with operational systems like, you know, if you've got a Grafana
and a Prometheus, or you've got some kind of logging system like Stackdriver or Kibana
or the ELK Stack, feature stores integrate with those.
And because they're production systems, right, you're depending on, like, literally the business decisions
are being made on the fly with this data.
So they are critical to have operational excellence on.
You need the logs, you need the metrics, you need alerts.
So they integrate with all those systems,
like a PagerDuty or Sentry and all of these
kind of monitoring and metric systems.
And then, of course, the kind of critical integration is into the model serving layer.
So the feature servers, the model server and the feature server speak to each other.
So the models will call out to get features.
And this also happens during training.
So if there's a pipeline training model,
then that also calls out to the feature store.
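A minimal Python sketch of these two call paths follows. The class and method names (`FeatureStoreSketch`, `get_online`, `get_historical`) are hypothetical, and the "as of" lookup used for training is an assumption about how historical retrieval is commonly done, not something spelled out in this conversation.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple


class FeatureStoreSketch:
    """Toy store that answers both serving-time and training-time requests."""

    def __init__(self) -> None:
        # (entity_id, feature) -> list of (event_time, value), kept sorted by time.
        self._history: Dict[Tuple[str, str], List[Tuple[datetime, float]]] = {}

    def write(self, entity_id: str, feature: str, event_time: datetime, value: float) -> None:
        series = self._history.setdefault((entity_id, feature), [])
        series.append((event_time, value))
        series.sort(key=lambda item: item[0])

    def get_online(self, entity_id: str, feature: str) -> Optional[float]:
        """Serving path: the model server asks for the latest known value."""
        series = self._history.get((entity_id, feature))
        return series[-1][1] if series else None

    def get_historical(self, entity_id: str, feature: str, as_of: datetime) -> Optional[float]:
        """Training path: the value that was known at the time of the training example."""
        series = self._history.get((entity_id, feature), [])
        times = [t for t, _ in series]
        idx = bisect_right(times, as_of)
        return series[idx - 1][1] if idx > 0 else None


store = FeatureStoreSketch()
store.write("d1", "avg_rating", datetime(2021, 1, 1), 4.5)
store.write("d1", "avg_rating", datetime(2021, 1, 10), 4.9)

serving_value = store.get_online("d1", "avg_rating")
training_value = store.get_historical("d1", "avg_rating", as_of=datetime(2021, 1, 5))
```

Here the serving call returns the newest rating, while the training call for an example dated January 5 returns the value that existed on that date, so the training pipeline does not see data from the future.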
And depending on your feature store,
it'll either be deployed to Kubernetes
or it'll be deployed to kind of like a managed environment.
But I'd say most of them actually require
Kubernetes these days to run.
And if your feature store allows you to train locally, like in your notebook, then that's also another integration touch point. And then recently there's been, I don't know if you know of Lyft's Amundsen?
No, I haven't heard of it.
Wait, is it Lyft or is it another company?
But it's a metadata, it's a discovery system,
kind of like a data discovery, metadata tracking system
that you deploy in your organization.
And it basically pulls or collects information
about all the systems that have data across your org.
So that's something that has been becoming quite popular.
DataHub is another one.
And they've recently integrated with Feast as well.
So the integrations between those systems
and feature stores are also important.
Super interesting.
I see there are many touch points there.
So it requires from what I understand,
like quite a lot of effort to set it up
and also have probably complex operations around that, which takes me to my next question.
Let's chat a little bit more about Tecton and more specifically about what it means to productize such complex architectures, especially on the cloud.
So how did you manage to do that with Tecton?
Can you tell us a little bit more about this?
Well, I'd love to give you the finer details,
but I've just joined the team two months ago,
so I wasn't really involved with most of those small things,
but I can tell you at a high level how we operate.
There's multiple aspects to it.
Tecton today runs as a managed service
where we have architected the system in such a way that
we can run a single Tecton control plane,
basically the brain of operations,
and we have a separate data plane, and this data plane
can be deployed into a customer's cloud environment.
Essentially, what this provides
is a way for us to horizontally
scale out the amount of customers that we can support and provide them data locality, like their data doesn't have to leave their environments.
So we have a large engineering team that's heavily focused on ensuring the reliability and stability and performance, as well as just the functionality that's available in that system, both from a kind of control and operational standpoint
as well as execution standpoint.
So how do you do computations for the customer
and how can you make that efficient
and how can you save them money
and how can you give them earlier alerts and warnings
and how do you integrate into the stores
that they're already using?
Then on the other side, we've got product teams.
I'm a little bit closer to the product side. So we have a lot of conversations on
what is the most intuitive way for users to define features? How do you allow them to specify
the configuration that tells the feature store how to operate? Because in the data space,
it's unlike engineering in that you're not reining in the chaos, right?
You're not reducing complexity.
There's an innate complexity to data.
And the more features you create, the more uncertainty and complexity and entropy is introduced into the system. So you kind of want to give them as much structure as you can,
while at the same time, giving them freedom to, you know, operate. Like you can't just say,
you can do an average or a min-max, right? You have to allow them to write any kind of
transformations, bring their own code if they want to bring their own dependencies, but at the same
time, prevent them from, you know, taking down a production system and accidentally bringing in some sleep function or something.
Yeah.
So on the product side, we're heavily focused on understanding
how the users think and what to provide to them.
And the great thing about this is that we have two worlds here.
We have Feast, the open source side, and we have Tecton,
where we have different customers and different users.
So, and then finally, it's just, yeah,
I mean, we have amazing founders
that have, you know,
seen a lot of great implementations of feature stores
like Uber's Michelangelo, and at other companies.
And they're very well connected.
And we have great investors as well
with Sequoia and Andreessen Horowitz
that really guide us in our venture.
Yeah, yeah, absolutely.
That's very important.
So how do Feast and Tecton work together?
What's the vision there,
both from your side and also from Tecton's side?
Because you joined the company there,
you will be working on a product, Tecton, and at the same time, I assume you
are going to continue maintaining Feast.
So what's the story behind this?
Well, that's a great question.
I think that when we started, we started independently.
And then at some point, we just realized we're trying to solve the same problem, and we'll
probably be better doing this together.
And we have these two great
products. So for us, it's just about figuring out how to build the best feature store. And we believe
that, you know, there will be large overlap between these two, but that Feast and Tecton will
kind of gravitate towards solving problems for different groups of users, where Feast will be a little bit more for teams that just want to get started quickly, solve specific
problems.
They're more at the kind of nascent stage.
But if you go to a large bank or corporate, something that requires companies or teams
that require high scale or multi-tenancy or advanced access control, then you're more
likely to go towards Tecton.
So for us, we're still trying to kind of converge
these two visions.
So we're working very closely,
I'm very close to the Feast and Tecton sides.
We can be unifying these visions.
But I think over the next three to six months,
it'll become much clearer exactly what we are,
what we have decided.
That's as much as I can answer right now. I hope
that was satisfying enough for you, but... No, no, no, that's good. That's good.
I totally understand. How's your experience working on an open source project, by the way?
It's extremely rewarding and it's also kind of draining at some points. So you don't really have
often closed-loop feedback. So you only see the tip of the iceberg in users.
So like 2% of users will make an issue or give you feedback,
but that'll often be negative.
So you really have to kind of have conviction
that what you're doing is right.
Luckily, I had to run Feast internally at Gojek
for like three years or two years at least.
So it was very rewarding to work with our customers internally and just get them to
use it and make them happy and see how impactful the software is.
And so you don't need to have conviction that an open source project is successful.
We just kind of put it out there and do it out in the open.
And if people like it, they like it.
If they don't, they don't.
But it turns out they kind of do.
And so for the most part, it's been very rewarding, but it can be a lot of work, so it's best if you're paid to do it. Yeah,
I totally understand how it feels. So we are almost close to our time, and we have many
things to discuss, to be honest. I mean, it's very fascinating, this whole space with feature stores. But one last
question. Tecton recently raised quite an impressive round from some very impressive VCs here in the
Silicon Valley. You mentioned some of them already. Can you tell us a little bit about what does this
mean? I mean, both for the company itself, like what excites you about what's going to happen in the next couple of months?
And also what it means about feature stores in general, right?
And this market, let's say, that is like emerging.
Yeah, so the market is going to get a lot more competitive.
We've already seen Amazon release their feature store.
Not sure if you had a look at that.
We believe that, you know, all other cloud providers will also bring them out. And so raising that round is kind of a vote of confidence from our investors that, you know, they believe that we are one of the stronger players in this big round.
And I think that Tecton is probably the most, you know, it's the right environment and the right people to build an industry-dominating
or the most successful feature store,
which is part of the reason why I joined this team.
What you can expect to see going forward,
I think in the short term,
is a lot easier access,
a lot more transparency in terms of our APIs
and the functionality that we provide.
And we'll be going towards users a lot more.
So previously, I mean, from a technical standpoint,
we're going to be a lot more open towards integrating
into existing infrastructure and reusing existing infrastructure
instead of providing a managed service with
specific types of infrastructure. So I think that's, we've collected a lot of feedback, we've got a lot
of great customers that have been working really closely with us, and so there's a lot of things
that will be landing in the next couple of months. But I think the thing that I'm the most excited about is getting more eyes on the product itself
and opening up what we've been working on.
Yeah, and also see how it's going to work together with Feast,
because from what I understand from what you said earlier,
there are also things going to happen there.
Thank you so much.
It was a great conversation.
I really enjoyed it.
I learned a lot and I really appreciate that. And yeah, I hope to meet again like in a couple of months and see how things are going and learn more about this.
Definitely. I'm going to take you up on that offer.
Thank you so much for your time today.
Thank you everyone for joining us in another episode of the Data Stack Show. I hope you enjoyed today's episode with Willem as much as I did.
When we started recording this episode, I had many questions and I would say even some doubts about the importance of the feature stores.
I know many, many more things right now about them and I truly understand why they are important.
Willem did an amazing job explaining that to us.
And I'm really looking forward
to having another recording with him in the near future.
Willem has many things to share with us
about this exciting new world of MLOps.
Thank you so much again for listening to our show
and see you on the next episode.