Orchestrate all the Things - The rise of first-mile observability: Calyptia wants to enable enterprises to log all the things. Featuring co-founders Anurag Gupta & Eduardo Silva
Episode Date: March 1, 2022
Cloud-native is the name of the game for application development. The creators of the Fluent Bit and Fluentd popular open-source projects for cloud-native observability are launching an offering ...aimed at the enterprise. Article published on ZDNet
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Cloud native is the name of the game for application development.
The creators of Fluent Bit and Fluentd, popular open-source projects for cloud-native observability,
are launching an offering aimed at the enterprise.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data
Orchestration on Twitter, LinkedIn, and Facebook.
Perfect. Well, I'll start. My name is Eduardo
Silva. I'm one of the co-founders of Calyptia. As for my background, I worked at several companies
before; at Oracle, I was working in the kernel space.
Then I joined Treasure Data, where Fluentd and Fluent Bit were created.
And after this, well, we started this company called Calyptia.
A bit of background: I'm from South America, I'm from Chile, and I live in Costa Rica.
And yeah, so that's pretty much my intro.
Anurag?
Sure.
Yeah, I'm one of the other co-founders.
Started my career off at Microsoft doing Unix and Linux monitoring back in the day with
System Center.
So I've been in the observability space since the very start.
I went to Treasure Data, and that's how Eduardo and I got to work together pretty deeply.
And then had a stint at Elastic where we worked on cloud services.
And we kind of saw a lot of the opportunity with Fluent Bit and Fluentd as projects growing in 2020.
And we decided, hey, let's go do something about that.
And that's kind of the first foray into how we started Calyptia.
Okay, great.
Thank you both for the introduction.
And well, the next thing I wanted to ask is basically,
you'll have to excuse my ignorance,
but I honestly admit that I didn't know the first thing about Fluent Bit
and Fluentd up until a couple of days ago.
So, you know, my superficial first look suggested that, well,
I was trying to associate, let's say, with the closest thing I know of.
And the one that I could manage was, well, to me,
it looks a little bit like something like Zipkin or Jaeger.
So distributed, you know, distributed trace management and all
that stuff. And so I wanted to ask you, is that correct? Is my impression correct? And if yes,
well, what's the difference? So what's the differentiation? And if not, well,
how would you describe it in your own words?
Yeah, actually we can say that Jaeger and Zipkin are our cousins,
right? We are in the same space, we kind of maybe live in the same village, but we play different
roles. Jaeger and Zipkin exist because you want to generate
trace information from your applications.
You have instrumentation inside your application and you generate this information. Fluentd and
Fluent Bit come from the logging space. So if you take observability as a whole thing,
there are primarily three pillars, right? One of them is traces, one is logs, and the other is metrics. Zipkin and Jaeger
are from the tracing space. We are from the logs and metrics space. Now, Fluentd and Fluent Bit,
the reason they exist is that when you deploy applications and you want to monitor those
applications, there are many mechanisms.
One of them is to instrument your application with external components for tracing, for
example.
Or the other one, which is the most basic, I would say not basic, but the oldest mechanism,
is to just let your application write messages to a file saying, I'm doing transaction A, I'm doing transaction B,
I'm behaving like this, I found an error. So when it's time to analyze how your application
is behaving or what is misbehaving, the first thing you want to do is take a look at the logs.
So Fluentd and Fluent Bit play a role in that space because when you have a distributed
system, you can think about Kubernetes, when you have, for example, a hundred applications
and you want to perform data analysis, meaning how my applications are behaving,
you want to take a look at the logs. But in order to solve that, you cannot go to each one of the
applications, every single
file, and take a look at what this application is doing.
You need to have a specialized tool that you can tell it, hey, these are the applications,
here are the logs, please take the logs and centralize them in a database or a cloud service
so we can perform data analysis. So Fluentd and Fluent Bit become this specialized tool
that is able to collect this log information,
process this information, and centralize it
in one or multiple endpoints for analysis.
And traces take a different approach,
so I would say they're kind of our cousins, because
it's a kind of different family of instrumentation.
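To make the collect-and-centralize idea concrete, here is a minimal Python sketch. It is not how Fluentd or Fluent Bit are implemented; the function names, the `*.log` file layout, and the JSON-lines destination are all assumptions chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def collect_logs(log_dir):
    """Read every *.log file under log_dir and return tagged records."""
    records = []
    for path in sorted(Path(log_dir).glob("*.log")):
        for line in path.read_text().splitlines():
            if line.strip():
                # Tag each record with the application file it came from.
                records.append({"source": path.stem, "message": line})
    return records


def centralize(records, destination):
    """Append records as JSON lines to a single destination file."""
    with open(destination, "a", encoding="utf-8") as out:
        for record in records:
            out.write(json.dumps(record) + "\n")
    return len(records)


def demo():
    """Write two tiny application logs, then gather them in one place."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "app_a.log").write_text("doing transaction A\nfound an error\n")
        Path(tmp, "app_b.log").write_text("doing transaction B\n")
        central = Path(tmp, "central.jsonl")  # hypothetical central store
        count = centralize(collect_logs(tmp), central)
        return count, central.read_text().splitlines()
```

Calling `demo()` gathers the lines of both application files into one place, each tagged with its source, which is the part nobody wants to do by hand across a hundred applications.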
Okay, thanks for the explanation. Of course, that opens up more questions, so you're going
to have to answer. And that's right. Okay, so let me try and see if I got it right. So is the role then of FluentBit to collect, let's say, different logs from,
I don't know, from a multitude of microservices, for example, that your application may be using,
and then try to recreate like a collective log, let's say, like create the collective log by
joining together all these disparate ones or just getting them all in
one place and then something else has to do this job?
It's really what you just said.
We collect all those logs and then send it to a backend for further analysis, further
investigation.
So we host FluentCon.
It's a conference about the Fluent projects. And some examples there include Neiman Marcus, a retailer in the US, which uses Fluent Bit to collect metrics from its cloud services and then centralizes them into backends like, say, Grafana, Prometheus, etc. Likewise, on the financial side, you have folks like
Fidelity that will go and deploy this on their cloud services and collect a ton of data and send
it to various backends. And the power that comes with this tool and that makes it a little different
than just A to B is as we collect that data, like in the Fidelity use cases, we can do a lot of
transformation and parsing and logic in the middle.
And that would be things like Lua scripting or adding specific key-value pairs. In the
Kubernetes space, for example, when you use these technologies,
it's really important to have context about your logs.
And a log traditionally will just give you information about the application.
But what Fluentd and Fluent Bit have done for the past 10 and 5 years respectively is add
the ability to talk to APIs within Kubernetes, enrich all that data, make it so it's much
more meaningful, so you can go diagnose and debug much, much faster.
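As a rough illustration of what enrichment means here, the Python sketch below merges Kubernetes context into a log record. The real projects obtain this context from the Kubernetes API; in this sketch a plain dictionary stands in for that lookup, and the record fields, pod name, and `enrich` function are hypothetical.

```python
def enrich(record, pod_metadata):
    """Return a copy of record with Kubernetes context merged in."""
    enriched = dict(record)
    # Look up context by pod name; unknown pods get placeholder values.
    meta = pod_metadata.get(record.get("pod"), {})
    enriched["kubernetes"] = {
        "namespace": meta.get("namespace", "unknown"),
        "labels": dict(meta.get("labels", {})),
    }
    return enriched


# Hypothetical stand-in for what a Kubernetes API lookup would return.
POD_METADATA = {
    "checkout-7d9f": {"namespace": "shop", "labels": {"app": "checkout"}},
}

raw = {"pod": "checkout-7d9f", "message": "payment failed"}
enriched_record = enrich(raw, POD_METADATA)
```

The original message is untouched; the added `kubernetes` key is what lets someone filter by namespace or label when debugging.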
Okay. So then the value is not just in aggregating all those different logs in one place, but
basically, it seems like the value is in contextualizing. So adding to this raw data
that's in the logs, right?
Contextualizing, routing. For example, you might have multiple backends, right? You might have
Amazon S3, you might have Google Stackdriver, and you want to send data to both. And a lot of tools
lock you into one or the other. Whereas Fluentd and Fluent Bit, because they're so vendor agnostic
and powerful, let you route that data wherever it needs to go. If it's two backends, if it's 100 backends, it doesn't matter.
And the truth is those backends just keep evolving, right?
We're seeing more and more services spin up every other year.
So how do we keep up with that?
And these are the values that Fluentd and Fluent Bit as projects can really offer.
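Routing one stream to several backends can be sketched in a few lines of Python. Fluent Bit routes records by matching their tag against each output's match rule; the sketch below imitates that idea with shell-style wildcards, and the output names and tags are made up for illustration.

```python
from fnmatch import fnmatch


def route(record_tag, outputs):
    """Return the names of all outputs whose pattern matches the tag."""
    return [name for name, pattern in outputs if fnmatch(record_tag, pattern)]


# Hypothetical outputs: (name, match pattern). One record may go to many.
OUTPUTS = [
    ("s3_archive", "*"),          # keep a copy of everything
    ("stackdriver", "app.*"),     # application logs only
    ("security_index", "audit.*"),
]

app_destinations = route("app.checkout", OUTPUTS)
audit_destinations = route("audit.login", OUTPUTS)
```

Adding a backend is one more entry in the list, which is the point being made about two backends versus a hundred.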
Actually, you just touched on something that I wanted to ask you about anyway. So,
the backend side of the story. Because in the press release that I saw, I got the impression
that part of your offering seemed like it had to be somehow connected to some backend. So,
I was wondering whether this is something you have developed or it's something that you integrate with and if you can shed some light on this one.
Sure. For us, it's not like we've built a backend. What we've built is an overlay service on top of
Fluentd and Fluent Bit. And this is called Calyptia Cloud. We have thousands of users who signed up for Calyptia Cloud.
It includes two key things.
One is a control and management and monitoring layer.
And then the second is it includes developer tools
for folks who just want to use the open source
and are trying to figure out some of the key aspects
about regular expressions
or whether or not things are working properly.
And with the management and monitoring side, we take all of the information that Fluentd and
Fluent Bit have at the edge layer as they're collecting the data. Who am I routing that
data to? How much data am I routing? Where am I routing that data to? What IPs? And we visualize it, we add that as context for the user.
When they go in, they can say, wow, we just sent 800 gigabytes to this backend. Did we mean to do
that? And you can start to surface those types of insights ahead of that data landing in that backend
and then getting billed for it. So these are just some of the ways that we've built services where it's not
necessarily a proprietary backend for Fluent Bit and Fluentd.
It's just an overlay, Calyptia Cloud, on top.
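The "how much data am I routing, and to whom" question can be illustrated with a small Python sketch. This is not Calyptia Cloud's implementation, which the conversation does not describe; the function names, the sample deliveries, and the byte budget are assumptions.

```python
from collections import Counter


def bytes_per_backend(deliveries):
    """Sum the payload size, in bytes, sent to each backend."""
    totals = Counter()
    for backend, payload in deliveries:
        totals[backend] += len(payload.encode("utf-8"))
    return totals


def over_budget(totals, budget_bytes):
    """List backends that received more than budget_bytes, sorted by name."""
    return sorted(name for name, sent in totals.items() if sent > budget_bytes)


# Hypothetical delivery log: (backend name, payload sent).
deliveries = [("splunk", "a" * 10), ("s3", "b" * 3), ("splunk", "c" * 5)]
totals = bytes_per_backend(deliveries)
flagged = over_budget(totals, budget_bytes=10)
```

Counting at the point of routing is what allows a surprise like the 800 gigabytes example to be noticed before the backend's invoice arrives.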
Okay.
I have to say, I'm still not as enlightened as I was hoping I would be.
I was hoping to get something a bit more specific, to be honest with you.
Like, I don't know, what are you using under the hood?
So it could be, I don't know, a file system, HDFS or S3, or it could be a database.
Are you back-end agnostic, basically, or is it something proprietary that you've built?
So as products, with Fluentd and Fluent Bit,
we're very backend agnostic.
If you want to send data to Splunk,
Datadog, New Relic, Elasticsearch, OpenSearch,
we'll support all of those.
From our cloud service side,
we run on top of Amazon.
We have services within Google.
We use a lot of Lambda functions.
We store our data in Postgres. So some of
the stuff that we use as part of our cloud service kind of go all over the place.
Okay. Okay. Thanks. The other thing that I wanted to ask you about was this term that you seem to be using:
first-mile observability.
And to me, again, I tried to wrap my head around it, basically. And I was wondering, okay, so it seems to be connected to this distributed nature,
let's say, of applications with some services running here and some services
running there.
And I was wondering, okay, so how...
You kind of touched upon that already.
So you seem to basically collect all the logs, all the information from many different places
and just aggregate it into your central repository, basically.
Is that correct? Because again,
by reading the press release, I got the impression that it seems somehow to imply that you can also
do some analysis on the edge, so where your local service is running. And I was kind of trying to
see how that would be possible.
Yeah. So maybe I can talk a little bit about
the first-mile observability
part. And one of the big things
that Eduardo had mentioned earlier is
there's three key pillars that folks
talk about with observability. Logs,
metrics, traces. But we also see
observability as a journey, right?
Of, hey, I'm doing some monitoring first
and I'm evolving that monitoring. I'm adding
these new data sources.
And how do you take that journey essentially from where the data is created to where you're
debugging, you're diagnosing, and getting insights?
And with that journey, we see the first mile as really where that data gets generated,
where it gets contextualized, where it gets processed, and where it gets routed. And that's where our open source roots, Fluentd and Fluent Bit, have
been for a number of years. Fluent Bit has been deployed a billion times across container
environments. And we've taken that expertise that we've had and, with Calyptia, we add that overlay
on top to say,
okay, with first-mile observability,
how is that data getting routed?
Because a lot of times,
it's interesting, these users will say,
we have no idea how our data gets routed.
Why did the data choose this path
versus this path?
Why do we send so much data?
Was it because of a debug log?
Was it because of XYZ? And these are some of the
questions we can start to answer because of the unique placement that these open source
technologies generally have within this larger observability journey.
Okay. The other thing, again, trying to kind of position, let's say, what you do in the bigger picture of observability.
And I was wondering, you mentioned previously Grafana.
And to me, on the surface, the approach kind of seems similar in the sense that, well, you know, they also say, well, we don't really have a centralized database and we
aggregate everything. And you mentioned that actually Grafana is one of the services, let's
say, that you integrate with. And I was wondering if you could compare and contrast and at the same
time explain how you work with Grafana?
Actually, the way that it works,
if you draw a line and this is observability,
you go from the left to the right, right?
On the left side is where all the data collection and
pre-processing happens,
and then you're aggregating all the data in a central point, right?
It could be in a database or a premise.
When you get the data in that place, then you can do data analysis,
then you can do data visualization, right?
Grafana specializes in visualization,
but also is agnostic on where you have the data stored, right?
So we can decide, for example, Fluent Bit can decide,
let's store the data on, I don't know, on Prometheus.
We send the data to Prometheus or Loki.
Then you can tell Grafana, further to the right,
hey, the data is here in Loki and Prometheus,
and they're going to pull the information
to build all these beautiful dashboards.
You can consider that Fluent Bit and Calyptia, we are the first mile of this observability
on the left.
And then all this data, when it transitions to the right, you have all these other companies
with great products showing you the value of that data.
So it's like a side-by-side job.
So I would say that we integrate with all of them.
Okay.
Actually, though, it seems to me like
what you are now unveiling,
so Calyptia Cloud,
kind of cuts in the middle
of this continuum that you just described
because, well, it kind of adds a layer
on top of the open source Fluent Bit and Fluentd
and just lets you, well, again, aggregate, visualize.
It gives you a UI.
And actually, that's what I was able to gather on first look.
And I was wondering, well, if there are any additional features
that I didn't get to see.
And what's the value basically on top of the open source projects?
Yeah, absolutely.
So there's one key product that you probably didn't see,
which is our Calyptia Enterprise for Fluent Bit, which has really three components.
It is the cloud. It is a CLI, a management interface.
And then it is also something that's deployed within a user's or customer's infrastructure.
And this is where, after Eduardo and I have been in the community, and Eduardo has created and maintained the project for a number of years,
we've talked to probably 100, 150 community members
just in the past few months alone.
And we've realized there's a key set of problems around the manual process
of getting something like Fluent Bit and Fluentd ready
for observability at enterprise-grade scale.
So the product that we've built here and have some folks using it in production today
is really meant to increase the capability for a single observability practitioner,
reduce the amount of time that they're going to be spending on doing these manual tasks like scaling or auto
healing or trying to route that data. And then the last bit is really reducing costs.
So one of our production users, they were leveraging an expensive backend. And they
essentially siphoned off a lot of that data, put it into something like Amazon S3, and they were able to save and cut costs pretty significantly,
where trying to architect and organize all this before was quite hard.
So, yeah, Calyptia Cloud is definitely the dev tools, the monitoring, the visualization piece.
But we also have the management aspect that we come in with Calyptia Enterprise for Fluent Bit.
Okay. And I think you've mentioned, and again, I was going to ask you about that, that
people can also operate Calyptia Cloud as a self-hosted version.
Calyptia Cloud is a fully hosted visualization layer, but Calyptia Enterprise for Fluent
Bit, which is the main solution folks will use, that one can be deployed within your
Kubernetes cluster in your environment.
So, you can have it in your public cloud, you can have it in your private cloud, you
can have it on your Red Hat OpenShift, you can have it wherever it needs to be, and you can have multiple of them.
You could have 50 of them across multiple different environments, and they all communicate
with Calyptia Cloud as that control layer.
Okay.
And I imagine it's probably a subscription-based model, and would you be able to just mention
the different layers that you have?
Sure, sure.
So from Calyptia's side, we have really two main ways of transacting.
We've done a lot of business with enterprise support, making sure that folks who are leveraging this in the financial services industry, telecom, cloud providers, etc., can really get to that next level.
And we've even had some commercial partnerships, technology partnerships with folks like OpenSearch that we've publicly discussed.
And then we have the second side, which is that subscription model where we take this product,
we allow you to go in and deploy into your environment.
And then there's a component about the data that moves through that product, right?
A more value-based stream.
We're still working on a case-to-case basis.
So a lot of our funding is actually dedicated to making sure that we can meet that really high demand that folks are asking for.
Hey, how do we solve XYZ?
And we have a suitable and really simple, easy way to go out and do these things.
We can meet that demand that we see.
Okay. You mentioned the funding, and that's a good thing
because that was going to be my next question, actually.
Well, this is what you're announcing, but before we got to that part,
I wanted to make sure that I had a kind of decent,
let's say, understanding about what you do.
So you're getting, I think it's a seed round of 5 million.
So I wanted to ask you a little bit
about the business side of things,
like when was the business entity, let's say, incorporated?
What are the major milestones leading to this funding announcement?
You know, headcount, use cases, clients, this sort of thing, if you would like to share.
Sure.
I can start with the first part of the story.
The company was incorporated in 2020.
And before going crazy building something that we thought was
useful for the enterprise, we took a couple of months interviewing prospects and our users.
As you know, as maintainers in the
open source ecosystem,
our users are Microsoft, AWS, Google Cloud,
plus many other companies who
need to solve the observability problem at scale.
So we pretty much joined like 70 or more calls,
Zoom calls, more than an hour each,
extracting information, understanding the pains.
We knew where things were going,
but we wanted to take some time to do some research.
We did our research, and after a few months,
we started creating these kinds of developer tools, right?
One of the first problems: configuration validation.
Deploying pipelines
sometimes is a problem. What if you got the configuration wrong and you went to production,
and then you find out that the data is wrong? It's really hard. It's expensive to get to that point.
We started Calyptia Cloud with some free tools for configuration validation to make sure
that if you're going to send your data to Splunk, the pipelines that you are setting
up are correct.
They are not going to miss anything.
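A toy version of this kind of pre-deployment check might look like the Python sketch below. It does not reflect Calyptia's validator or Fluent Bit's actual configuration format; the dictionary schema with `inputs`, `outputs`, `tag`, and `match` keys is a simplified assumption used only to show the idea of catching a pipeline that would silently drop data.

```python
from fnmatch import fnmatch


def validate_pipeline(config):
    """Return a list of human-readable problems; empty means it looks sane."""
    problems = []
    inputs = config.get("inputs", [])
    outputs = config.get("outputs", [])
    tags = [item.get("tag") for item in inputs]

    for idx, tag in enumerate(tags):
        if not tag:
            problems.append(f"input {idx} has no tag")

    for idx, out in enumerate(outputs):
        pattern = out.get("match")
        if not pattern:
            problems.append(f"output {idx} has no match pattern")
        elif not any(tag and fnmatch(tag, pattern) for tag in tags):
            problems.append(f"output {idx} match '{pattern}' selects no input")

    # Every tagged input should reach at least one output.
    for tag in tags:
        if tag and not any(
            out.get("match") and fnmatch(tag, out["match"]) for out in outputs
        ):
            problems.append(f"input tag '{tag}' is not sent anywhere")
    return problems


good = {"inputs": [{"tag": "app.web"}], "outputs": [{"name": "splunk", "match": "app.*"}]}
bad = {"inputs": [{"tag": "app.web"}], "outputs": [{"name": "splunk", "match": "db.*"}]}
```

Here `validate_pipeline(good)` returns an empty list, while `validate_pipeline(bad)` reports both that the output selects nothing and that the input goes nowhere, which is cheaper to learn before production than after.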
Then we also started iterating and saying, okay, Fluent Bit is great in Kubernetes clusters.
People are using it.
We get nowadays like 5 million deployments per day. And now, in total, we have
1 billion for the last
two, three years. And now it's time
to do something that
will provide more value to the
users. And all of them have
different needs, right? But for all of them,
there are cases associated with security.
I want to put something
in the middle of my pipeline to be able to
control the data to make sure that I can reduce cost because I'm using Splunk, but I also want to send
the data to S3.
So that is the first part of the story.
I think I can continue with the talk.
And from a growth perspective, we have around 16 folks now part of the company.
And part of that funding, again, is to really accelerate all of that
kind of open source demand for finding easier ways to go and solve this.
And so we're really taking all of that funding and putting it right back in, whether it's into the open source and, of course, to our business side to make sure that we can really solve those problems at scale effectively.
And are you planning on increasing your headcount?
Yeah, we're always hiring.
So here's our plug.
If you're looking to join an awesome, awesome startup,
please do reach out to us. We're looking for great talent.
And one of the good things is that we are a really fully distributed team.
So we're a global team.
We have people in Japan, Spain, Chile, Argentina, Costa Rica, United States.
So we are really looking for really good talent, for people who are very proactive
and have been involved in open source from some angle.
So I would say that open source is kind of the DNA of Calyptia.
And yeah, still looking for more people who share the same vision as us. And
observability, there's a lot of challenges. And we always say that this year, oh, we solve
this problem, but next year is always more data, more complexity, more distributed systems.
It's a non-stopping thing.
Yeah, it's like an arms race because the development ecosystem is always evolving. And so observability has to keep up, I guess.
Okay.
So I think that was pretty much everything I had in my list.
So I think we're good. I don't know if you have any closing thoughts
or comments. If yes, feel free. If not, I think we can
wrap up here.
No, just to mention that we work with different financial institutions and
cloud providers, and this is where our major traction came from before, from the open source,
but now all of them are being converted to customers. And yeah, with this announcement,
we're going to have another announcement,
the product announcement that is coming today.
I think we had mentioned we had
thousands of signups for Calyptia Cloud. So that kind of also helped us realize some of those
key pieces about where to go invest. So there's some other stuff.
Yeah, I think I saw on your site also some nice traction
on GitHub and so from the open source side of things.
So yeah, it looks like you have an interesting journey
ahead of you.
I hope you enjoyed the podcast. If you like
my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.