Orchestrate all the Things - From machine learning observability to responsible AI: Aporia raises $25 million in Series A funding to go the distance. Featuring CEO / Co-Founder Liran Hason
Episode Date: February 23, 2022
Is there a line connecting machine learning observability to explainability, leading to responsible AI? Aporia, an observability platform for machine learning, thinks so. ...
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Is there a line connecting machine learning observability to explainability,
leading to responsible AI?
Aporia, an observability platform for machine learning, thinks so.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.
Cool. So nice to meet you. I'm Liran. I'm the CEO of Aporia.
We're a full-stack observability platform for machine learning models.
I'm a software engineer by background; I've literally been doing that since I was a kid, just for fun.
I spent five years in an elite technological unit of the intelligence forces
in Israel, graduated as a captain, then joined a startup company called Adallom, a cloud security
one, where I was leading the architecture of our machine learning models. That's where essentially
I had firsthand experience of what happens when machine learning models reach
production, both the nice side of it and the less nice side of it. After the company got acquired
by Microsoft, I spent three years in a venture capital firm investing in early stage companies,
mostly in the AI and cybersecurity fields. And you know, after witnessing and
experiencing firsthand the issues with machine learning when it reaches production, and
also witnessing how many companies are adopting AI, I just realized that there couldn't be
a better time to start Aporia and to help companies with monitoring their machine learning
models.
Okay. So in a way, it was, as they say, kind of trying to scratch your own itch, since I guess you had experience in trying to deploy, to go through the entire pipeline,
let's say, from development to deployment of machine learning models. And then you realized that, well, having more visibility, having better observability would
help.
So you decided to address that, right?
Yeah, absolutely.
Like in Adallom, what we had to do, by the end of the day, is we had this problem.
Our way to deal with it was writing our own scripts to kind of monitor our models. By the end of the day,
it wasn't a very good solution. It required a lot of maintenance. And I always had in my mind what it
should look like to have such a system in place. So yeah. Okay. All right. So then I would like to ask you to give like an end-to-end example, let's say, of how one might go about using what you've developed.
So because, you know, it's fine to describe it at an abstract level and say, well, okay, machine learning model observability, fine.
But how does that translate in practice? So what parts of the entire pipeline are you
able to monitor and how? So how do I go about using your product?
Yeah, absolutely. So maybe I'll explain a bit, at a high level, how our solution works,
and then I'll get more into a specific example. So in a way, our system is composed of four parts.
The first is visibility, allowing data scientists and ML engineers
to perceive and see what decisions are being made
by their models in production, for what populations, and so on.
Then they can also explain these predictions
with our explainability engine.
This is the second part.
The third part is
the monitoring engine that constantly tracks and monitors the data and model behavior and
proactively alerts them when there is performance degradation, potential bias or discrimination,
or even if there's an opportunity to improve the model by retraining it with new data.
Aporia will raise an alert to Slack or Microsoft Teams
and will let you know about it.
And the fourth part is saving some time
in the investigation with our investigation toolbox
that allows them to slice and dice the data,
find out what is the root cause for the problem
so they can remediate it.
So this is kind of a high level overview about our solution.
The way it works is that Aporia integrates at two main points.
The first is the training phase, where our solution takes this data as a baseline.
When you think about it, machine learning models essentially use the training set, this kind of snapshot of reality that they have learned,
and they are very, very good
at predicting for that kind of data. And we have an implicit assumption that because they were good
with the training set and the test set, they will probably be as good in real life, right?
Now, real life is constantly changing, whether it's the data pipeline itself, which is constantly changing
with new version updates and so on, or whether it's reality itself, you know, COVID, marketing
campaigns, cultural changes, seasonality, and so on. All these changes affect the behavior and the
performance of the model. So that's why the main integration
point for our system is during the production phase, where Aporia tracks all the predictions that are being made by
the model. So maybe I'll pause here for a second to ask if there's any question before getting
into an example. No, not really. So I was actually going to ask you
if you can show or share an example.
So, yeah.
Yeah, absolutely.
So let's talk about one of our clients,
a financial institution
that is using the machine learning model
to predict credit risk
and according to that,
to decide whether to approve or deny a loan.
So in their case, in one instance, what happened is that they released a new version of the web application,
the web form where users come and ask for a loan.
And as a result, there were miscalculations with a few parameters. To be more specific, the software engineer who developed
the web application thought about income as annual income, while the data scientists, who come
from a different country and a different culture, thought about the income field as monthly income.
So you can only imagine what happened when the model started to receive
incomes that were 12 times higher than what it expected.
So eventually, a lot of loans got wrongfully approved by this model.
But thanks to the fact that they had Aporia in place,
they got an alert about this weird behavior,
first for the input data, second for the distribution of their predictions,
and they were able to quickly find out the problem and remediate it. Okay. I think you also mentioned something about monitoring datasets themselves. And well,
the example you just shared is instructive, but I think it refers to something else. It refers to basically a mismatch in expectations and in feature parameters and so on.
If you're talking about datasets that get updated, well, okay, even though, of course, this can happen, in many cases, it won't.
It's just that you will have basically what's usually described, I think, as concept drift.
So something which was defined based on an initial dataset, but its definition gets slightly, or not so slightly, modified over time.
So are you also able to detect that in some way?
Yeah, absolutely. So maybe I'll split up the types of changes that happen. So one type of change that you have in production is data integrity, like the example I gave before: a change
within the data due to inconsistency, some wrong logic, or a missed synchronization between different
teams. The third type, which relates to what you said, is concept drift: the world is changing. I think
the most obvious example is COVID, but other than that, these changes happen on a daily basis.
In another instance, we had a model.
Actually, it was another financial institution that was also using machine learning for loan applications.
And in that case, what happened is that the marketing department launched a new campaign targeting students specifically.
And suddenly their model started to see a lot more applications from students,
while the training set, which the model was very, very good at, was not representative of what happened in production.
So this is also a kind of concept drift.
And our system is able to detect all three types of differences and changes that I've described.
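To make the drift checks he describes a bit more concrete, here is a minimal sketch of the kind of comparison a monitoring system can run: a feature's training distribution against a recent production window, using a two-sample Kolmogorov-Smirnov test. The function, data, and threshold below are illustrative assumptions, not Aporia's actual implementation.

```python
# Minimal drift-check sketch (illustrative only, not Aporia's API).
# Compares a numeric feature's training distribution against a recent
# production window using a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

def looks_drifted(train_values: np.ndarray,
                  prod_values: np.ndarray,
                  alpha: float = 0.01) -> bool:
    """Return True if the production sample is unlikely to come from the training distribution."""
    result = ks_2samp(train_values, prod_values)
    return result.pvalue < alpha

# Toy example echoing the loan story: "monthly income" suddenly arriving as annual income.
rng = np.random.default_rng(0)
train_income = rng.normal(loc=5_000, scale=1_500, size=10_000)   # training baseline
prod_income = rng.normal(loc=60_000, scale=18_000, size=2_000)   # roughly 12x shift in production

if looks_drifted(train_income, prod_income):
    print("Alert: income distribution in production drifted from the training baseline")
```

In practice a production system would run checks like this per feature, per time window, and per population slice, which is essentially what the split between a training baseline and production tracking described above implies.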
Okay. So then maybe that's a good segue to go a little bit into how you do what you do.
So I wandered around your website a little bit and saw a number of references, including actually, I think, an entire sub-site dedicated to a number of open source projects, many of which I was familiar with because it's, well,
in my general area of interest and coverage, let's say.
And I was wondering if you're actually leveraging that under the hood and whether it's a mix with proprietary technology and how exactly does this thing work?
Are you talking specifically about ML Notify and projects like that?
Yeah, I think for example among the projects that you list in mlops.toys which is this
website that I think you put together, you can correct me on that if I'm wrong, I saw projects
such as DVC, I think Streamlit was also listed, like a whole array of projects there.
Yeah, so MLOps toys, we can talk about the MLOps space,
but it's kind of, there are many solutions out there
because this space is very exciting.
And there is a need for many solutions.
So you might see a lot of different solutions, and
it might create some confusion, and that's why, after experiencing this ourselves, we decided
to help the community better understand what solutions are available under each category,
and that's why we created MLOps Toys as a project for the community. And as you mentioned, this was kind of our initiative, but it's backed by the community.
So it's an open source one.
Now, as for the question about the way Aporia works behind the scenes.
For the most part, we don't use the solutions that you mentioned that are listed in MLOps Toys.
Most of it, 99% of Aporia, is custom-built, things we've developed in-house.
And the way it works, and it's actually very, very interesting to talk about that.
So we ingest the data from production, every input that gets to the machine learning model.
So, for example, all the loan applications with their data and all the predictions that are being made.
Now, what our system does behind the scenes is it actually analyzes all of this data and information.
And for observing and monitoring machine learning models, essentially, you need to create a lot of statistics and metrics and aggregations.
So that's what it is actually doing behind the scenes: creating distributions for training and distributions for production, for different time slices, different populations.
And after having these metrics, our system treats these metrics as building blocks.
Now, this is very cool because when thinking about it, every machine learning use case is different, right?
Like even if we take fraud detection in one company, it will be highly different from a fraud detection model in another company.
Why? Because the data will be different.
The business KPIs are different.
And that's why we need data scientists in the first place.
So we thought about how we can solve that. How can we allow proper monitoring for each
of these use cases? And that's where the power of Aporia comes into play. Aporia is highly flexible
and customizable. Users can essentially take each of these metrics and decide how they want to
monitor their model using these metrics. So whether they want to monitor for concept drift,
for data integrity issues and stuff like that,
they can just use these building blocks
and build a monitoring engine using Aporia.
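As a rough illustration of the building-blocks idea, the sketch below aggregates production predictions into per-day, per-segment metrics and applies a user-defined check on top of them. The schema, column names, and threshold are assumptions made up for the example, not the product's actual configuration.

```python
# Sketch of "metrics as building blocks" (hypothetical schema and thresholds;
# not the product's actual configuration).
import pandas as pd

# Hypothetical production log: one row per prediction made by the loan model.
prod = pd.DataFrame({
    "timestamp": pd.to_datetime(["2022-02-01 09:00", "2022-02-01 10:30",
                                 "2022-02-02 11:00", "2022-02-02 12:15"]),
    "segment": ["student", "salaried", "student", "salaried"],
    "approved": [1, 0, 1, 1],
    "monthly_income": [900, 6200, 1100, 5800],
})

# Building blocks: daily, per-segment aggregates over the predictions.
blocks = (prod
          .set_index("timestamp")
          .groupby([pd.Grouper(freq="D"), "segment"])
          .agg(approval_rate=("approved", "mean"),
               mean_income=("monthly_income", "mean"),
               volume=("approved", "size")))

# A user-defined monitor composed from those building blocks.
def slices_above(metric: pd.Series, upper: float) -> pd.Series:
    """Return the (day, segment) slices where the metric exceeds the allowed upper bound."""
    return metric[metric > upper]

alerts = slices_above(blocks["approval_rate"], upper=0.9)
if not alerts.empty:
    print("Alert: unusually high approval rate for slices:")
    print(alerts)
```

The point of the design, as described in the interview, is that the aggregates are computed once and any number of monitors can then be composed from them.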
Okay.
Well, I have a number of follow-up questions on that,
some of which I said already
and some of which just popped up, you know, by
hearing you give the description. So you mentioned that what Aporia's platform does
is ingest both data and predictions. And I'm wondering, well, about the exact nature of that
ingestion, I mean, besides the mechanics of, you know, how exactly do you ingest that data?
I wonder whether you ingest entire datasets
such as training datasets, for example,
or models, or just if you take some, I don't know,
some deltas over those,
because if you were to ingest datasets in their entirety,
that would cost you rather, you know, substantially in terms of storage and all the infrastructure.
Yeah, so it depends on the choice of the user.
In cases where they don't have any mechanism to store their data, usually they'd like to use Aporia for that as well,
because you want to store the data also for future retraining.
So that's why it's useful to have this data stored. And in cases where they already have it
stored in, maybe, an S3 bucket or Azure or any kind of storage, Aporia can just import this data
from the existing storage.
Okay. Well, in either case, you end up storing lots of data, I guess. So you must have your own infrastructure for this as well.
Yeah, so data is highly sensitive when it comes to AI, whether it's due to privacy, regulation, or just business sensitivity.
And that's why, like from the very first day when we built Aporia,
we made it in a way that it's natively self-hosted.
So with Aporia, like 99% of the system runs within the customer's cloud environment,
so then no sensitive data or information ever leaves their premises.
Okay, well, you anticipated my next question,
which was going to be precisely about that,
because it's a very common scenario that many organizations have issues in letting their data go beyond the confines of the organization.
But I guess you foresaw that, and that's why you built it that way.
You also mentioned, you also referred to something else, some key concepts, key definitions, let's say, that I also
saw in the examples that you have on your site, such as drift, or metrics such as F1 and so on
and so forth. So I was wondering whether these are metrics that are sort of built in or
hard-coded in some way in your platform, or whether these are things that users can engage with,
let's say, and shape to their own needs and definitions. Yeah, so this is a wonderful
question because by the end of the day, the ability for users,
for data scientists to be able to customize and see the logic and the implementation behind the
scenes of these metrics is crucial for proper monitoring. And that's why the architecture of
building blocks that I described allows us, first, to come up with out-of-the-box metrics
that are common for ML use cases, like F1, accuracy, precision, and so on and so forth.
So in these cases they don't need to implement anything, it just comes out of the box and they
can use it. But in cases where they either want to customize these calculations or even to come up with
their own metrics, it's actually pretty common not to use only these metrics and also to kind
of create your own performance metric depending on the business case. So they can do it easily
within Aporia. They can just write Python code within our system to create a new custom metric of their own, and then gain
all the features and capabilities of, you know, visualizing these metrics, getting alerts
on anomalies in these metrics, creating automated workflows and explaining predictions,
all of that, just by focusing on this metric.
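Since he mentions that users can write Python to define their own metrics, here is a generic sketch of what a business-specific metric might look like; the function signature is a hypothetical convention for illustration, not Aporia's SDK contract.

```python
# Hypothetical custom metric (illustrative; not Aporia's actual SDK contract).
# A cost-weighted error for a loan model: wrongly approving a bad loan is
# assumed to cost more than wrongly denying a good one.
import numpy as np

def cost_weighted_error(y_true: np.ndarray,
                        y_pred: np.ndarray,
                        false_approve_cost: float = 5.0,
                        false_deny_cost: float = 1.0) -> float:
    """Average business cost per prediction (lower is better)."""
    false_approve = (y_pred == 1) & (y_true == 0)
    false_deny = (y_pred == 0) & (y_true == 1)
    total_cost = false_approve.sum() * false_approve_cost + false_deny.sum() * false_deny_cost
    return float(total_cost / len(y_true))

# Example usage on a small batch of labeled production predictions.
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 1, 0, 1, 0])
print(cost_weighted_error(y_true, y_pred))  # 1.2 in this toy batch
```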
Okay, I see.
So actually, I get the impression, the more you talk about the platform and what it
does, I get the impression that actually, it's not just about monitoring and observability,
but actually a part, perhaps a big part of what it does has to do with data governance. So,
from the fact that you have to ingest
and monitor all those data sets
and all those predictions,
I would say that almost by definition,
that kind of touches on data governance, really.
So that is right.
Data governance, because when you think about it,
models are highly affected by the data
that they're being trained with.
And therefore, when we look at observability, most of it is about the data.
And that allows us also to be model agnostic.
So whether our users are using neural networks or decision trees, it doesn't really matter
for the system itself to make proper monitoring.
I will say, though, that when it comes
to explainability and providing explanations for why a specific prediction was made, this is the
part where we do look at the specific model in a white-box manner and we analyze the way the model has
made this decision. Thank you. Well, actually, that's another quite big area of interest to explore.
And I was going to ask you about that next, because, well, in my mind, at least, well,
having observability and visibility into your pipelines and things like that is obviously
a good thing to have and organizations can
benefit from that.
But I don't really see the direct connection to explainability because, well, if you have
something like a deep neural network, you can monitor it, you know, and its input and
its output all you want, but that won't really make it explainable.
It will tell you, you know, whether it's biased or not, it will tell you, you know, whether its predictions are on the mark or not, but how it gets to that prediction is, you know,
just its inner workings. So I'm wondering, you know, if there is a way to go,
you know, from monitoring to explainability. Basically, in my mind, I don't see the direct
connection, but, you know, perhaps I'm missing something.
Yeah, so I think you're right.
By the end of the day, explainability and observability or monitoring,
these are two different capabilities.
But there is a link connecting these two.
And I think the link connecting these two is maybe the long-term vision
for us at Aporia, which is responsible AI and how we can help companies, and us as a society, to implement AI in a responsible manner.
Now, when you monitor your models and when you observe their predictions in production, from time to time you do see things that you don't expect.
It could be unintentional bias or just weird predictions.
And then the next question that comes to mind is, okay, but why did the model end up with this prediction?
You know, you want to debug the model.
And because it's a black box, it's really cumbersome and really challenging. And then having the
ability to explain these predictions and to see what was the contribution of each input,
for example, okay, the fact that the annual income of this person was at this level,
led to denying their loan, this is crucial. And in addition to that, when you have visibility,
it might also serve not only the data scientists and ML engineers,
but also some business users within the organization.
So, for example, from a regulation perspective, in the U.S., you have to provide an explanation to a loan applicant.
Why did you deny their loan?
And what can they do in order to improve their score so they will be
eligible for a loan? This regulation is in practice in the US regardless of whether you
use AI or not, but when you use AI it just becomes much more challenging, and therefore having this
capability as part of the platform is really useful.
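The regulation point above comes back to explaining individual predictions, for example which inputs pushed a loan toward denial. Here is a minimal sketch of that general idea, local feature attribution, using a linear model where per-feature contributions are easy to compute; the feature names and data are hypothetical, and this is not a description of Aporia's own explainability engine.

```python
# Sketch of per-prediction attribution for a linear model (illustrative only;
# the interview describes Aporia's engine as model-specific and white-box,
# this just shows the general idea of attributing one prediction to its inputs).
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["annual_income", "existing_debt", "years_employed"]   # hypothetical features

# Toy training data for an approve (1) / deny (0) model.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)
model = LogisticRegression().fit(X, y)

# Explain one specific production prediction: the contribution of each input
# to the log-odds, relative to the average applicant seen in training.
applicant = np.array([-2.0, 1.0, 0.3])                    # standardized feature values
contributions = model.coef_[0] * (applicant - X.mean(axis=0))
for name, c in sorted(zip(feature_names, contributions), key=lambda t: abs(t[1]), reverse=True):
    print(f"{name}: {c:+.2f} toward the log-odds of approval")
```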
Yeah.
So as far as I know, there are different techniques, basically, that people use depending on the level of explainability they want or have to achieve.
So I think the most famous example is, well, decision trees, whereby basically if people
want to have, like, perfect explainability, they either use decision trees from the beginning or they
choose to rewrite, let's say, existing algorithms into decision trees.
Yeah, that's right.
So, you know, when dealing with neural networks, for example, it's too
complex for us as humans to really understand or comprehend the way a decision is being made.
However, there are different aspects of explainability. I can look at explainability
as a question of how the model works in general, which is a much more difficult question, as it's
an open question in academia. But I might simplify this question and say, instead of trying to figure out how the whole model works in general, let's focus on a specific problem. So, for example,
what led the model to this very specific prediction, and choose maybe one or even
ten predictions and take a look, focus just on those predictions. So with Aporia, what Aporia
does is essentially focus on specific problems.
So you can select the predictions that you want to explain. We don't explain the model itself.
And then we allow you to understand, even if you're using a neural network, what led this neural net to that prediction. There's one thing that I'd like to kind of take a step back to go into, because it was
in my list and we missed it somehow.
One of the things that kind of piqued my interest, poking around your website basically, was
this ML Notify function, let's say. And I'm wondering, you know, you can, again, you can correct me if I
didn't get it correctly. But the way it seems to work, it looked like a sort of
Python package or function that you would install, and then you would somehow add this API call to
your code, and then you would get notified, you know, whenever a certain event would trigger,
basically. And I wonder if this is a model for how people would tie their code or their
datasets to Aporia in general. Is this the same level of injection that they need to do?
So, it's very similar, I'd say.
And you're right, in ML Notify, it's very, very simple.
You just install the Python package,
you import it to your notebook or to your Python script.
And automatically, once the training starts,
you'll get a message with a QR code and link
so you can get notified.
Installing Aporia is really as easy as that.
It's also a Python library that you
install. You instrument it after every prediction in the serving code. And essentially, that's all
you need to do. Okay. And I guess there must also be like a platform with the UI and everything so
that people can have an overall view of their entire data sets and models and so on.
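To picture the integration described a moment ago, where a Python library is instrumented after every prediction in the serving code, here is a rough, generic sketch; the MonitoringClient class and its log_prediction method are placeholders invented for the illustration, not Aporia's documented API.

```python
# Rough serving-side instrumentation sketch (hypothetical; the monitoring
# client below is a stand-in, not Aporia's documented API).
import uuid
from typing import Dict

class MonitoringClient:
    """Stand-in for an ML observability SDK: ships inputs and prediction per request."""
    def log_prediction(self, model_id: str, prediction_id: str,
                       features: Dict[str, float], prediction: float) -> None:
        # A real client would send this record to the monitoring backend, likely asynchronously.
        print(f"[monitor] model={model_id} id={prediction_id} "
              f"features={features} prediction={prediction:.3f}")

monitor = MonitoringClient()

def serve(features: Dict[str, float]) -> float:
    """Toy serving function: score a loan application, then log it for monitoring."""
    prediction = 1.0 if features.get("annual_income", 0.0) > 30_000 else 0.0   # stand-in model
    monitor.log_prediction(model_id="loan-risk-v3",
                           prediction_id=str(uuid.uuid4()),
                           features=features,
                           prediction=prediction)
    return prediction

serve({"annual_income": 42_000, "existing_debt": 5_000})
```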
Yeah, actually, that's one of the things we're proud of,
is the fact that Aporia is the only observability platform for ML that is completely self-serve.
I'm a software engineer.
I know that I like to try things just hands-on.
And with Aporia, you can do just that.
You can just go to aporia.com, sign up.
You'll get introduced to some demo model with demo data and a notebook.
And you can start playing around and see whether it works for you or not.
And as part of that UI, you can also see how to add new models.
If you want to integrate your own pipeline or your model, it will be
completely guided by the system. Okay. And probably it's also a good time to ask you,
I guess it's a subscription model, so what are the different subscription levels that you offer?
Yeah. So first we have the free tier, the community edition, which people can freely use.
All the features and all the capabilities of our platform are available in the community edition, except for a few integrations, and this is the free tier.
And from there, depending on the company size and the number of models they have, we have an annual license based on the number of models that are being monitored by Aporia.
So you can have an unlimited number of users, an unlimited amount of data.
All we charge for is the number of models.
Okay.
And then now I guess it's a good time to also talk a little bit about the business background of this all.
Well, after all, the occasion is you raising a funding round.
So I guess that it's probably not the first funding that you've received.
I mean, judging from the fact that the company was established in 2019, so just a little bit over two years ago, and you have already developed a platform.
And if I'm not mistaken, you seem to have some paying clients as well.
So I imagine you must have secured some funding at least to be able to do that.
So let's go over the business fundamentals, basically. So first, the funding.
If you can refer to any existing funding that you have received,
then the one that you're going to receive now,
and things like headcount and some business metrics.
Yeah, absolutely.
So we've raised a total of $25 million as of today.
The last round, the Series A of $20 million,
was led by Tiger Global Management, which is also an investor in companies like GitLab, Uber,
Discord, and so on, alongside Samsung Next, which we have a great relationship with.
As for traction and where we're standing,
so, you know, we launched our platform less than a year ago,
and it's been really amazing to see the amount of traction
and the positive comments we get from our users.
We're seeing people talking about Aporia in the community,
in Slack communities, giving recommendations
and explaining why they like it so much.
So it's really amazing. As you mentioned, we have a bunch of paying customers. We have
hundreds of users that are using Aporia to monitor their models. And it's been really great so far.
Okay. And just out of curiosity, basically, you just said that you seem to have a good amount of traction.
Is that entirely organic or how were you able to reach those people?
Yeah, so I think like any other company in our very early days, we worked with design partners.
I think this was a very strong point that allowed us to launch out of stealth
with a very mature product
and to make it self-serve,
to be very confident about our product
that people can just try it out
without ever speaking to us.
That's how confident we are in the product.
And not only that,
we see that they do get
nothing less than an amazing experience.
So I think that was mainly thanks to that. And
yeah, basically, like, that's the kind of experience we're having. And, you know, as for
where our users are coming from. So I think that a lot of it is organic. As you mentioned,
we've released some open source projects for the community
like MLOps Toys, ML Notify,
and Train Invaders, which in
my opinion is very, very cool.
Whenever you start training your model,
you have a Space Invaders game
which you can play around with in the
meantime.
I think these projects and these
initiatives allowed us to
be more heard and to be more known in the community.
Okay. All right. And what about your future plans then?
In other words, where are you planning on spending that money that you're receiving?
Yeah. So the inbound interest and traffic we've been receiving in the past few months has been really crazy.
And now with this funding, which allows us to expand our team, we plan to triple our team size in Israel and in the US as well, to support all of this growth. We want to expand our product and, you know,
we really aim to be a full-stack observability platform
and to deepen our capabilities with computer vision
to come up with more explainability solutions.
And the way I see it with the traction that we have so far
with the product that we have,
I believe this funding could make us the market leader
by the end of next year.
Okay, well, two follow-up questions on that.
First, you may have mentioned it before,
but I'm not sure I got it.
You said you want to triple your headcount.
What is the current number of employees?
21.
21, all right.
And then you also mentioned that you want to become like a full
stack observability platform. I'm kind of assuming that you're still referring to MLOps specifically,
so machine learning models and data sets. You don't have plans to extend to other domains?
That is true. And we also don't plan to extend to anything
other than observability.
Okay.
Well, I think it's probably a wise decision.
You found your thing.
You don't need to open up the scope too much.
Yeah, you know, it's also interesting to see the different verticals
we're working with and the differences in the companies.
It really starts from small startups, literally 10, 12, 15 people, that just have AI at the core of their product,
through mid-sized companies like Lemonade and Armis, ranging from hundreds of employees, and up to Fortune 500 companies
that are specifically looking
for an observability solution
for their ML models.
Okay, great.
I think that pretty much covers
everything I had in my list.
So as far as I'm concerned,
we're good.
And if you have any closing thoughts,
anything you'd like to share,
feel free.
Otherwise, I think we can wrap up here.
Yeah, I think that sounds good.
Thank you for the question.
It was really great.
Yeah, great.
Thank you.
Thank you for your time as well.
And yeah, I imagine, you know, it's probably crazy days for you.
It always is for founders, you know, just before they announce their funding round.
So I appreciate your time as well.
Absolutely. Thank you very much, George.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.