PurePerformance - OpenTelemetry from a Contributors perspective with Daniel Dyla and Armin Ruech
Episode Date: April 18, 2022

OpenTelemetry, for some the biggest romance story in open source, as it took off with the merger of OpenCensus and OpenTracing. But what is OpenTelemetry from the perspective of a contributor? Listen to this episode and hear it from Daniel Dyla, co-maintainer of OTel JS and member of the W3C Distributed Tracing Working Group, and Armin Ruech, who is on the Technical Committee focusing on cross-language specifications. They give us insights into what it takes to contribute to and drive an open source project, and give us an update on OpenTelemetry: the current status, what they are working on right now, and the near-future improvements they are excited about.

Show Links:
The OpenTelemetry Project: https://opentelemetry.io/
Daniel Dyla: https://engineering.dynatrace.com/persons/daniel-dyla/
Armin Ruech: https://engineering.dynatrace.com/persons/armin-ruech/
List of instrumented libraries: https://opentelemetry.io/registry/
Contribute to OTel: https://opentelemetry.io/docs/contribution-guidelines/
OpenTelemetry tutorials on IsItObservable: https://isitobservable.io/open-telemetry
Transcript
It's time for Pure Performance!
Get your stopwatches ready, it's time for Pure Performance with Andy Grabner and Brian Wilson.
Hello and welcome to another episode of Pure Performance, everybody.
My name is Brian Wilson and as always is my trusty sidekick, my lead, my partner, my co-captain,
my everything, Andy Grabner.
Andy, how are you doing today?
Did you just say captain?
You know that it's always me when I go captain.
Captain, yeah. Yeah, no, you always give me crap when I say captain. I was not saying, I'm saying in terms of, like, a captain of a ship, and we're, you know, two ships sailing in the distance or whatever, in the sun. I don't know, there's some stupid saying. It's a little bit romantic there, but we won't get into anything.
We don't want to shy away.
This is not a romantic episode. We never had, I don't think we've ever had any hint of romance in the episodes, not even a Valentine's Day episode.
I think today we'll talk about the romance between maybe logs,
traces and events coming together on the one roof.
You're so good at that, Andy. You're so good at the segues. It's amazing. Absolutely amazing.
Ladies and gentlemen, I present Andy Grabner.
Thank you. And I was just saying, the first company I worked for was actually called Segue Software. We built SilkPerformer. So I think segues are just kind of in my nature.
But let's stop with the introduction and talking about ourselves.
Today's topic is OpenTelemetry.
I mean, OpenTelemetry is obviously the role.
I liked it.
So Armin, this is a perfect way to introduce yourself, because you just brought up an even better way of putting it. How can we bring romance into this? Armin, you go right away. What did you say?
Sure, yeah. You brought up romance, and I thought of the marriage of OpenCensus and OpenTracing, and that brings us to OpenTelemetry, today's topic.
Exactly. And Armin, as you already kind of took over,
do you want to maybe start actually with introducing yourself and then we also bring
Dan in because I know we actually have two guests today. Sure. So my name is Armin. I'm a team lead
and software engineer at Dynatrace. And as part of that, I contribute a lot to OpenTelemetry
where I'm on the technical committee.
I started at the company a couple of years ago, at first on the OneAgent, later on the OneAgent SDK, which is kind of a proprietary way of extending the agent.
And then the natural segue to move on was to go to the OpenTelemetry project and contribute there.
Great. And first of all, thanks again for the great intro and kind of segue into OpenTelemetry.
OpenTelemetry is a big topic for us. That's why we wanted to make sure we actually get people on the show who don't just talk about it in theory, but are actively involved
in that important project.
And now let's also open up the stage for Dan, Daniel Dyla.
Hello, welcome to the show.
You may introduce yourself quickly.
Yeah.
Hey there.
I'm Dan Dyla.
I also work for Dynatrace.
I'm an open source architect on Armin's team, on the open source team.
I'm primarily working on OpenTelemetry these days, but I also work on the W3C Distributed Tracing Working Group, among some other things.
And I guess we'll get more into what OpenTelemetry is and what it means later, but I'm on the
governance committee there and I also maintain the JavaScript client.
Can you quickly fill us in? Because there are a lot of things it seems you do in your day-to-day job. Can you tell us how much time you actually spend on, let's say, open source project governance, things like the administrative parts, being in meetings, in sync meetings, working on GitHub issues? How much time would you say you spend on the, what I would call administrative part, which is very important, or the community-driven part, versus the actual implementation?
Yeah, I guess it's kind of
transitioning more and more to the community aspect of it every day. When I first started working on OpenTelemetry,
I would say I was coding 80 percent and doing
community management 20 percent and now it's almost the opposite.
Now it's more like 20 percent coding and 80 percent community management.
By that I mean not just stuff like this,
but pull request reviews, issue triage, governance topics,
running the meetings, stewarding new members and stuff like that.
But this also means then, if I hear this correctly, you obviously have more people contributing
and obviously these people then are contributing through pull requests, which need to be reviewed.
And so the community itself, the contributing community is growing.
And they need some guidance.
Yeah, that's great.
Yeah, of course.
The community is growing.
And the amount of code that I do as a percentage of my time has certainly gone down.
But that doesn't mean that the project itself is getting less.
When I first started on it, it was really only me and a couple other people
on the JavaScript side of things, and it has really grown. So as it grows, it's obviously
good for the project, but it means that there's a lot more administrative overhead, I guess you
would call it.
Yeah. Armin, is there anything else from your end that you would like to add, something from your perspective on the groups that you are actively involved in?
So I'm mostly active in the specification, so less coding is involved there. The only code we touch there is Markdown and some things with the build tooling to make the spec checks pass and so on.
But in the specification, it's a lot of issues and PRs going on there.
And being part of the TC also involves some housekeeping around things like GitHub repo organization, community membership, and those kinds of things.
And thank you so much for all of this work, because I think, from the outside at least, it is often missed that this is very important work. Without it, nothing would actually move along, nothing would be coordinated, and the output wouldn't be what it is. And therefore, thank you for also giving us some insights into what it actually takes to run open source projects successfully.
So I think with the governance part, the amount of time that goes into it is often underestimated, because people think open source is easy: someone else will write the code on your behalf and you don't have to do anything, right? But luckily, that naive approach is not what experienced people think, though some might. At first, I also thought there would be less governance and less community work, and more progress moving along. But in every issue and every PR you have a lot of people with a lot of opinions, and so in the spec, things naturally move a bit slower.
So, in the OpenTelemetry specification, you know, looking at what we just talked about: there's a lot of pull requests, a lot of work, a lot of coordination going on, a lot of making sure people can actually become active and be productive.
What are the other metrics, KPIs,
that you guys are looking at to measure the actual output
and also compare a little bit how other open source projects
are running?
Just to give an example, maybe looking
at the number of pull requests that are open versus how long
do they take.
Are there any metrics that we should
be aware of what open source communities actually
are looking for?
Because when I think about this, a lot of new people
are entering the open source space
and they want to contribute.
And there's 1,000 projects to choose from.
Are there any good health indicators of, hey, this is actually a well-run project, which I assume yours is?
Yeah. So I guess since I'm the governance committee member, I should take that one. The things that we primarily look at are how long it takes from when an issue is opened to the first contact from a maintainer, and then until it's closed. And the same for pull requests: time to the first review, and then time to merge or close. It is part of the mandate of the governance committee to look at those things periodically. Honestly, I wish we looked at them more often. But at this point, I would say we look at it quarterly or twice a year, give or take.
And from there we can decide where to act; identifying the shortcomings is the first step to fixing them, right? So, for instance, a year ago we were having a problem with, I guess, what you would call throughput. We had a lot of pull requests being opened that would just sit for a really long time. And as the governance committee, it's our job to come up with processes and such that lower that time, at least until someone makes contact.
Because as a contributor, especially a first-time contributor, it's a terrible experience to open an issue or a pull request, and then it just sits for a month and nobody has responded to it.
And you just feel ignored, especially when you can see that there are
things going on in the project. So you know, there are people working on this. Why are they not
interacting with my pull request? And a lot of that comes from just the dynamics within the project that you have in most of what we call special interest groups, the groups that maintain a component. You have a core group of people that contribute regularly.
And you have like, you know, maybe five or six people that are used to seeing each other. They're
used to seeing pull requests from each other and stuff like that. And then when someone new comes
in, it's just not a part of their regular day-to-day flow. So as the governance
committee, we try to see what we can do to bridge that gap a little bit.
It's interesting because when you think about it, you're all volunteering, basically.
And there's no, I mean, obviously people are taking lead roles and trying to help push this stuff.
But number one, it's a matter of how much time can you put into that.
And also for anybody who is trying to make improvements, it becomes a lot more challenging in an open source project, I would imagine, to effect change when you don't have the implicit, let's call it authority behind you of being a
hired manager at a company, right? So I imagine there's a lot more diplomacy involved and just
even trying to get people to move along. It's a passion project. And Andy, you faced a lot of that with Keptn, I think, right? In terms of, how do we get it moving along, how do we keep the enthusiasm and the momentum going? So yes, these are probably things most people haven't thought of when it comes to open source projects, right? The real daily running of it. So it's interesting to hear this perspective, and thank you for sharing that.
Yeah. Of course, ideally it's a meritocracy,
right? Like, whoever is doing the best work should be, you know, moved up or promoted or whatever you call it. But the reality is that there is a lot of diplomacy involved and stuff like that. And even if someone is, you know, a genius contributor, they come in and they make their first PR and nobody knows that they're a genius. So there is definitely a progression that you have to make, you know, with your first PRs and such.
And most of our like TC and GC members have been around with the project for a really long time.
And there are obviously disadvantages to that in terms of getting new contributors in,
but there's also advantages.
The people that have been around for a really long time are more familiar with the vision of the project
and the long-term roadmap and where we see it going and stuff like that,
which prevents us from having too many course corrections along the way.
It's a good thing that everybody is moving towards the same goal.
And maybe the last question on this before I want to give you the chance to also talk
a little bit more about OpenTelemetry.
But the last question is, there are clearly people that just come in from anywhere in
the world and try to contribute.
However, the two of you are employed by Dynatrace and dedicated to these projects, which means in open source projects like OpenTelemetry and others, we have organizations that see the value in it and actually contribute engineering resources full time to really drive these things forward. Does this help? I assume so, right? At least it helps to really move things along in the direction that some of the vendors, like us, also want, but always in collaboration, obviously, with the bigger community.
Yeah. So, I mean, I think everybody has this idea of open source projects as being done by volunteers.
And in a lot of cases, that's true.
But in most cases, I would say, at least for larger projects, you tend to find that most of the people working on the project are working for some company that is interested in moving the project forward.
And as a company, there's several ways you can contribute to a project.
One way is by giving money.
But in a lot of cases, that doesn't really help as much as contributing engineering resources and product management resources and things like that.
Money contributions are never
discouraged, but a lot of times they're not the most effective way to help.
So you do find a lot of organizations donating engineering resources and things like that.
And that's why a lot of the contributors tend to be from organizations. But that said,
we definitely do have quite a few contributors
that are working in their free time, whether they are just passionate about open source, or whether they are OpenTelemetry users whose company doesn't dedicate time to it, so they dedicate their own time; they identify shortcomings and they say, as a user, I want this to be as good as it can possibly be, and things like that.
But yeah, the bulk of the contributions
do tend to come from organizational resources.
Cool.
Well, thanks, both of you, for giving us the insights from the inside out, especially to understand what's actually happening when running open source projects. Now, OpenTelemetry, right? The romance between OpenCensus and OpenTracing. I think I need to write this down. This is going to become part of a child.
A love child, yeah.
Brian and I,
we had guests in the past
talking about OpenTelemetry, but I think as this project is moving so fast, a lot of things have changed.
And therefore, it would really be great for those that listened to our previous episodes, but also those that might be completely new, just to give everybody that is listening an update.
OpenTelemetry from a high-level perspective: what is it? Who is supporting it? Who is the beneficiary of it? What does it look like if I want to use it? So I would like to pass it over to either of you to give me, and our listeners obviously, a little overview of OpenTelemetry.
So the general high-level concept of it, luckily, has not changed since your last episode. It still fulfills the same purpose, and it can be described as an API, an SDK, and other tooling and integrations that are designed for collecting telemetry, processing it, and then exporting it to certain data sinks that can analyze this data. And the beneficiaries are all people that have anything running in production, because in the long run, the vision would be that everything out there comes instrumented, ideally with OpenTelemetry, out of the box, so that you can just magically monitor everything that you deploy in your environments.
I was just going to say, I think the key differentiator, not the differentiator, but one thing to highlight for people new to this concept, is that this is the collection and exporting of this data. It's not the processing or consumption; for that you still need either a vendor tool like ours or another third-party analysis tool to ingest all this data and make sense of it. Correct? The OpenTelemetry piece is the collection and exporting of that data. Is that a fair assessment?
Yeah, it is. So there are tools to work with that data, to analyze it. In open source there is Jaeger, for example, that can handle traces, and obviously vendors have their own backends to import such data. But OpenTelemetry itself is about collecting and managing and processing, or pre-processing, the data.
And the way I see it, and Brian, I'm sure you see it too, the reason why organizations are really looking forward to OpenTelemetry, at least one of the arguments I hear, is that people want to become kind of independent of any vendor. They don't want to build in something proprietary. I think this is also why Kubernetes, at least in theory, is great: because Kubernetes itself is open, whether you run it on premises or in one of the cloud vendors' managed offerings, it doesn't matter. You can always move around, you kind of really get this hybrid cloud becoming a reality, and you can just run your containers anywhere. And I guess with OpenTelemetry it would be the same. As long as everything is instrumented with OpenTelemetry, you get your metrics, you get your logs, you get your traces, and then you can obviously switch backend tools, as you said. You can switch from an open source one to a commercial one. You can switch around, or you can let people decide what they want to do with the data.
One issue, though, I have when people talk about that idea of vendor lock-in, and I know this is probably a bit of a hot topic in some circles, is that if we're talking about everything except for serverless, right, or some of the brand new stuff, there really is no vendor lock-in, because this is not 1997 or 2005 or whatever year you want to go back to, where the only way to instrument the code and collect this data was to write stuff into your code. You know, all of the vendors are doing it dynamically. So there really is no lock-in. You're just dynamically capturing the data with one agent versus another. But whenever vendor lock-in is brought up, it's always done in this boogeyman style of, oh, you're going to have to instrument your code. That's such an old-fashioned view. It just kills me. Obviously, when we start going into things like serverless, where there are no agents and things like that, this becomes a reality, right? But most people are talking about OpenTelemetry in combination with Kubernetes, which is still talking about JVMs, containers, and everything. And I just wanted to get that off my chest, because it really bothers me when I hear the whole vendor lock-in side. I see the point they're making, but they're using an ancient argument against it, which bothers me. Well, maybe you guys have a different view on that, because obviously I'm only getting it from my perspective.
Yeah, of course.
I would say that there are high-quality agents
that do instrument automatically and stuff like that.
And in a lot of cases, they work really well.
But there are always corner cases and edge cases, like serverless is one that you brought up, but there are others as well, where automatically instrumenting applications misses some key component, one that's maybe not missed from a technical perspective, but is important to your particular business case, right? Like, a metric isn't captured that is important to your business but is maybe not common from a technical perspective.
Or maybe some API is particularly difficult to observe from the outside. What really makes OpenTelemetry different, and I think what its core value proposition is, is that because it's open source, you can have libraries and open source applications and stuff like that that build it in as first-party support. So when you're looking from the outside, you can, of course, capture a lot of things. But if you're the developer of a library, you're always going to know what's important at a much deeper level than could ever be done by an outside agent or something like that. So if you have a database driver, for example, that builds in first-party support for OpenTelemetry, the data is most likely more reliable, more stable, and more resistant to changes and updates in the library. For example, a new version of a database driver might be released and break some outside instrumentation, whereas first-party built-in instrumentation would be updated as part of that release.
Yeah, those are fantastic arguments for OpenTelemetry. I'm not against any of that. I just don't see the vendor lock-in side. But that's more a feature, function, and benefit discussion, and more out of the box. But anyway.
Maybe on the vendor lock-in, I just have one comment where I can also see it being valid. Let's say you have one APM tool and it provides you, let's say, 10 metrics and you need them, and then you're switching to another tool and you don't get the same metrics. It means you're kind of locked in, because in order to get out of it, you need to figure out how to get the data that you always had in the other tool with the new tool. If you base everything on OpenTelemetry, you don't care about the tool, because you know that you always get the data, whatever backend system you have. I think this is one of the arguments with the lock-in that I could understand and see.
I've got a question now for both of you. The way I see it now, if I'm looking at it again from a very naive perspective, I have the option that developers instrument their code, and we encourage them to instrument libraries or their own applications, because they know what's best. Then I can also see that if I am a developer and I'm using third-party libraries, hopefully somebody instrumented them. But what about the status of dynamic instrumentation? Because I believe that already exists, at least for some technologies, so we have some automated agents in OpenTelemetry that also inject OpenTelemetry. Can either of you fill us in here? Because maybe I don't want to instrument my code manually, and I think that option also exists. And if you're not the experts, then feel free to say so, but at least I think there are some auto-instrumentation options available. I'd just be interested to hear an update.
Yeah, I think everybody kind of has a different
definition of what auto instrumentation means. And it sounds like what you're referring to is
something where I don't have to modify my code at all, right? And I just, like, it's automatically
injected. I think, and maybe Armin can correct me if I'm wrong here, I think Java has some concept of this in OpenTelemetry, and possibly .NET, but it's honestly not a very mature story in OpenTelemetry yet. For the most part, in most of the language
implementations, you do need to do some level of manual work.
Most of the time what that means is during my application startup,
I need to configure OpenTelemetry and
configure instrumentations and stuff like that.
But I don't need to then go and
modify every single class in my application.
For example, in JavaScript,
I can enable the HTTP instrumentation
and the Express instrumentation,
and then anywhere I make or receive an HTTP call, my Express application and router and all that are automatically instrumented.
It's only like five lines of code to get all of that.
So it's semi-automatic, I guess you would call it.
But from a developer perspective,
this feels automated, right?
I don't need to go into my code and find all the places.
So, yeah.
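To make that concrete, here is a minimal sketch of the semi-automatic setup Dan describes, assuming current OTel JS package names (@opentelemetry/sdk-node and the HTTP/Express instrumentation packages); check the OpenTelemetry JS docs for the exact API in your version:

```typescript
// tracing.ts - load this before the rest of the application starts.
import { NodeSDK } from '@opentelemetry/sdk-node';
import { ConsoleSpanExporter } from '@opentelemetry/sdk-trace-node';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';

const sdk = new NodeSDK({
  // Print spans to stdout for demo purposes; swap in an OTLP exporter
  // to ship data to a real backend.
  traceExporter: new ConsoleSpanExporter(),
  // These two instrumentations are enough to get spans for every incoming
  // and outgoing HTTP call and for Express routes and middleware.
  instrumentations: [new HttpInstrumentation(), new ExpressInstrumentation()],
});

sdk.start();
```

No route handler has to change; the instrumentations patch the http and express modules when they are loaded, which is the "semi-automatic" feel described above.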
Yeah. Armin, you probably have more, better things to say than I do.
I was just going to say, of course,
we want it to be as automatic as possible.
Ideally, you wouldn't need to change your code at all or configure anything, and we would just figure it out. As the project matures, it'll hopefully get closer to that ideal scenario, but we're just not quite there yet.
Yeah, the Java agent is certainly one of the closest there, because the JVM has built-in support
or built-in hooks for monitoring agents.
And so you don't even have to add a line
and then recompile the entire app again,
but you can just pass the auto-instrumentation agent
on the startup of the Java virtual machine,
and then it wires up things automatically.
And that's quite close to fully automatic instrumentation already.
And I wanted to ask, harking back slightly to vendor lock-in: when we talk about languages like Python and Ruby, where for the most part that is SDK work, that is manually instrumenting your code, with this concept of automatic instrumentation, do you see anything on the horizon? Is it either just run this batch script against your code and it's going to instrument the common things, or is there going to be some function built into languages like that, where for unmanaged code it's at least going to be able to pick up the ingresses and egresses and complete a span through that? What do those non-managed code bases look like on the landscape of automatic instrumentation?
Well, I mean, on an infinite time scale and
assuming the success of the project and all of that,
we would hope that it's built in as many places as possible.
One really great example of this is actually .NET, which has built first-party support for OpenTelemetry directly into the runtime.
So far, that's the only place where that's true.
And it is really cool.
It kind of does feel like magic when everything works the way that it's supposed to. And the more success the project has and the more users it has,
the more incentivized library developers and such will be to build it in.
And then once you attain some critical mass of libraries that have built in support,
it then makes sense for the runtime maintainers to look and say,
our community, this is clearly something they care about,
and maybe we can build in,
if not automatic support,
then at least some features that enable it to be
used more easily and stuff like that.
I think this is all really long time-scale stuff.
None of it's coming next year necessarily,
but that is the long-term
vision of the project.
I would like to bring Armin back into the loop. Armin, did we miss anything here? Because initially, when we prepared for the talk, we wanted to make sure we give a good overview of where OpenTelemetry stands and where it came from. Anything else we missed that the audience should understand? Maybe also where it came from, so that we understand where we are.
Sure.
So, on the history of OpenTelemetry: there were two competing open source, vendor-agnostic instrumentation libraries, one being OpenTracing and one being OpenCensus, both with their upsides and downsides. I think OpenTracing was first published in 2016, and OpenCensus two years later, around 2018. And then in 2019, which is also when we came to the project, they were merged together and joined forces as OpenTelemetry. That's where we are right now.
Yeah. And is there still, I mean, just out of curiosity, because I don't know, is there still some OpenTracing and OpenCensus out there, because somebody put it in back then and they're still maintaining it somehow, or is this all converted over?
No, it's not converted over everywhere yet. So there will certainly be quite a few deployments with OpenTracing out there. But what OpenTelemetry provides, and calls shims, is some backwards compatibility support with both OpenTracing and OpenCensus, to some extent.
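As a rough illustration of what such a shim looks like on the JavaScript side, here is a sketch assuming the @opentelemetry/shim-opentracing package and its TracerShim class; details may vary by release:

```typescript
import * as opentracing from 'opentracing';
import { trace } from '@opentelemetry/api';
import { TracerShim } from '@opentelemetry/shim-opentracing';

// Wrap an OpenTelemetry tracer so it satisfies the OpenTracing Tracer
// interface, then register it globally. Existing OpenTracing call sites
// keep working, but their spans now flow through the OpenTelemetry SDK.
// 'legacy-service' is just an illustrative tracer name.
opentracing.initGlobalTracer(new TracerShim(trace.getTracer('legacy-service')));
```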
And one question that I have, because at least Brian and I, we've been doing this instrumentation with our product for a while, and I remember in the early days we always had the challenge to not misuse the power of instrumentation. In the old days we had the option to configure rules to instrument every method in a certain package, right? We called it shotgun instrumentation, which meant we were just collecting a lot of information that nobody really needed, unless you were developing on your local machine and really wanted to have everything, but then it broke everything later on. From your community work, are there best practices or any approaches to make sure that as a developer, I'm not instrumenting too many things? Because too much of something doesn't help anybody either. It either becomes overhead, or you're capturing information that might even be confidential. Are there any best practices or any discussion in this area in the community?
So if you're instrumenting things manually, if you're instrumenting your application or your library, then you will know best what kind of information is interesting and relevant. So you can identify the operations that you want to track and then only cover those. So that's probably less of an issue. And for instrumentations that are provided, for example, by OpenTelemetry as separate instrumentation libraries, that is also taken care of. So when an HTTP client library or an HTTP server library is instrumented, then of course the focus is on the requests and less on the internals. So the granularity there should be manageable.
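A small sketch of what "only cover the operations you care about" looks like with the stable OpenTelemetry tracing API; the service and operation names here are made up:

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api';

// Hypothetical tracer name for illustration.
const tracer = trace.getTracer('checkout-service');

// One deliberate span around the business operation we care about;
// internal helper functions stay un-instrumented.
async function processOrder(orderId: string): Promise<void> {
  await tracer.startActiveSpan('processOrder', async (span) => {
    try {
      span.setAttribute('order.id', orderId);
      // ... actual business logic goes here ...
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```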
And are you aware, because Brian and I are both performance enthusiasts: do you know if, especially for those libraries that are heavily used in all sorts of applications, there is any performance testing that is done, instrumented versus un-instrumented? Just in case you happen to know if there's anything happening there.
I've seen some benchmarks being performed in the respective language implementation repositories, but I don't know if that's done everywhere.
Yeah, I'm curious. I don't know if this is under either of your realms of expertise, but Andy, you mentioned exposing data. So I'm curious as to what the OpenTelemetry security thought process is, right? Either in an OpenTelemetry project I can possibly do this, or as a vendor I might sneak into my code something that's going to expose something. Is there any sort of security review when these libraries come through, or is it just, hey, take our word for it, it's good? What's the risk that you might see? I mean, obviously there's risk with anything you use, vendor or not. Is there any, I don't know quite the word, any approach to that within the OpenTelemetry project?
So the OpenTelemetry specification defines, let's say, a data model for how the collected telemetry should look. We call these the semantic conventions. And they were written with that in mind, so that any potentially sensitive data should not be captured. But of course, you can encode, I don't know, credit card numbers and passwords in your URL query parameters or whatnot. So those have to be handled with care. There is no inherent way of prohibiting or preventing any such sensitive data from being exposed in OpenTelemetry. That's something that the respective backends are to take care of. So you can send your data to a backend, and then it's up to the backend to make sure that only people who should have access to such data do have access. There are also some ideas around handling this at the point of collecting it. So when you instrument your code, you would say, I want to add this attribute to my trace or to my span, and I define this one as being sensitive. But that's at an early stage, just ideas floating around there.
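Until such a mechanism exists, handling this at collection time means redacting values yourself before they land on a span. A hypothetical helper, as a sketch (the function and attribute choice are illustrative, not part of any OpenTelemetry API):

```typescript
import type { Span } from '@opentelemetry/api';

// Hypothetical helper: strip the query string (where tokens and passwords
// often hide) and any embedded credentials before recording a URL attribute.
function setRedactedUrl(span: Span, rawUrl: string): void {
  const url = new URL(rawUrl);
  url.search = '';   // drop query parameters entirely
  url.username = ''; // drop embedded basic-auth credentials, if any
  url.password = '';
  span.setAttribute('http.url', url.toString());
}
```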
Cool.
Guys, this now goes to both of you, and whoever wants to answer first. I know you've both been involved in the project for a while now, and we've just learned, again, the history and where we are. But what's more interesting, not more, but also very interesting, is what's happening now. What are you guys excited about? What's upcoming? What are the new cool things, the innovation that is happening, maybe the new features that are coming in, let's say, the next three, six, to twelve months, that will be beneficial for the community, that the community has been asking for? Whoever wants to go first.
Sure.
So what's currently going on is that the metrics specification was rewritten in large chunks. We know that OpenTelemetry consists of three pillars: one being traces, one being metrics, and one being logs. Traces have been out for a while now, and now it was time to finish up the metrics spec. It was actually just marked as stable last week, as part of the OpenTelemetry feature lifecycle. Now all of the language SDK implementations are catching up with the latest specification. In the upcoming weeks, we should already see the first OpenTelemetry SDKs with stable metrics support being published. And that's nice to see. It took us quite a while; it was underestimated at first how long it would take. The roadmap that was anticipated initially would have reached some stability in the metrics space already in 2021, towards the end of the year, but that was not achievable. But now we are there, and there are no road bumps ahead that we would be aware of at this point.
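For a feel of the API surface that is stabilizing, here is a sketch using the metrics API as merged into recent @opentelemetry/api releases; the instrument names are made up, and a configured SDK/MeterProvider must be registered elsewhere for anything to actually be exported:

```typescript
import { metrics } from '@opentelemetry/api';

// Hypothetical meter name for illustration.
const meter = metrics.getMeter('checkout-service');

// A monotonic counter and a histogram, with attributes as dimensions.
const orders = meter.createCounter('orders.processed', {
  description: 'Number of orders processed',
});
const duration = meter.createHistogram('order.duration', {
  description: 'Order processing duration',
  unit: 'ms',
});

orders.add(1, { 'payment.method': 'card' });
duration.record(42, { 'payment.method': 'card' });
```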
Yeah, I think when we first started working on metrics, everybody looked at the community of contributors and said, we have a lot of really smart people from a lot of companies, with a lot of experience and a lot of good ideas, and everybody generally knows what they're doing, so we should get this banged out pretty quick. And then what happened was, when we started actually working on it, we had a lot of really smart people with a lot of really good ideas, and there ended up being a lot of debate about which good ideas should win out, because sometimes you have two ways to solve a problem that are both good ways but have trade-offs. And a lot of times those debates just take longer than you expect, because both sides have merit. It's much easier to settle a debate when one side is clearly wrong or one side is clearly way better. But when you have, honestly, a really smart community that cares a lot, there are a lot of good solutions. And that can be as much of a cause of delays as mistakes that you've made.
Yeah.
It's the pros and the cons of a democracy.
Yeah, we lovingly refer to it as the open source tax.
Yeah.
And any future work? I mean, anything else we should highlight beyond that? It's great, obviously, that the metrics spec is now stable and we see the SDKs. Anything else where, if we were to speak again in, let's say, three to six months, you would say, yes, we're getting close to this? What else is coming in the future?
Yeah. So of course, as Armin already said, the metrics stability is a huge thing that's happening, like literally right now; we were just in a meeting talking about it. And in the next three months, six months, a year, we're going to see more development of metrics. There were some features that were left out of the initial release in the interest of stability and timelines, and we do want to finish those and get them out. One example of that might be exemplars in metrics, which is a way of connecting your metrics and traces together so that your metrics are, I guess, what you would call tracing-aware, which we're really excited about. Potentially also a hints API, which would be: if I'm a library developer and I'm instrumenting my own code, I can provide a hint to the way that the SDK may be configured. When the defaults maybe won't work for this use case, I can suggest different defaults. But then the final end user can always override that hint. So that's why we call it the hints API.
Then similarly to making metrics tracing aware,
there's been some work to do the same with logs,
to apply trace and span IDs to logs in order
to make them also tracing aware.
I guess, again,
talking more about long-term timelines,
the goal of the project is to not have separate traces
and metrics and logs,
but to have one interconnected stream of data.
A smart scheme, if you will.
Yeah, exactly.
Armin, something you wanted to add?
One thing that will be added to tracing, for example,
is also the consistent trace sampling.
So consistent sampling of distributed traces
across multiple processes or multiple realms
with potentially differing sampling rates.
That is also something that's currently
in an experimental state and being prototyped.
Right, and if somebody wants to see all these things in action: our colleague, Henrik Rexed, has his YouTube channel, Is It Observable, and he recently did some sessions on OpenTelemetry where he explains it, first of all, very nicely visually, but he also has some great tutorials. So folks, if you want to see what OpenTelemetry really looks like and how you can get this done on a sample app, just check out Is It Observable as well.
Armin, Daniel, I know you are both obviously
relying on people that actually contribute to the project, and I think a community needs to grow. Therefore, I want to give you the chance to say where people should go in case they listen to us now and say, hey, this sounds like an interesting project. Are there good places to start if they want to contribute in whatever form? Because there's code contribution, and I guess there's documentation contribution. Any thoughts on where to get started?
So one great way of contributing would be to look up the project,
get started with any tutorial that looks appealing to you, and then try it out. And if there is anything that you notice that you don't like, then bring it up with us. So feedback, user feedback, is in high demand in OpenTelemetry, and we would love to hear back from actual users trying it out. There is the CNCF Slack that everyone can sign up for, with a bunch of otel-prefixed Slack channels where you can reach out to the respective SDK maintainers for any questions or any issues. And of course,
if something looks like a bug or anything,
then make sure to raise an issue
in the respective repository.
That would be one great way of getting started
with any kind of contribution.
Then of course, as always, documentation is something
that can be worked on.
And if you feel ready for your first code contribution, then you could look at the respective SDK implementations, and usually you will find some issues labeled with help wanted or good first issue, although those are not always applied. But you can get started by just mentioning that you would like to work on something, and if no one is already working on it in parallel, or if you think you can do it in a nicer way, then you can get started with that.
It's amazing how fast time flies.
I want to be respectful of everyone's time here, but I also want to make sure we didn't miss anything that we want our listeners to know. So Dan, Armin, any things we missed? Anything you want to make sure people know about OpenTelemetry?
No, I think we did a pretty good job of covering
the history of the project, where we're at, and where we're moving forward to,
as well as how people can get involved.
I would stress, from a getting-involved standpoint, what Armin mentioned about users and providing feedback. I think a lot of people, when they think about how can I contribute to an open source project, always tend to think about how can I make code contributions, or how can I actually, you know, make changes to the project. But really, the most important community members are the users.
So just using the project and providing feedback is immensely valuable.
And I think the value of that is often lost or at least not talked about.
We can build whatever we think is cool, but if we never hear back from users that they like this, or they don't like this, or they really have this gap that we need to fill, then all we're doing is building what we think people want or would use. For me, the most valuable contributions are honestly just people using the project and then telling us where it can be improved.
That lack of feedback is how you end up with Microsoft Bob.
How you end up with lots of things.
Now Armin needs to look up Microsoft Bob. I saw his face.
Yeah, it's not the paper clip that shows up in Word, is it? It came out of that, I believe. I believe Clippy, if you go back to the Windows 98 theme packs, was what ended up coming out of Microsoft Bob.
I will hear no
blasphemy against Clippy.
Oh, we all
love Clippy. Because if we didn't have Clippy, who could we complain
about?
You know, the thing I'm most excited about, just wrapping up my last thought on this, I'm most excited to see the third-party integrations, you know, where vendors are baking OpenTelemetry into their code, into their packages, so that everyone can use it. What I'm really curious to see is whether any of the historically obtuse vendors, I won't name names, but we all know some of them are out there, will refuse to put OpenTelemetry into their code to help everybody out, to monitor what's going on inside.
Or if someone's going to be like, oh, this is a great idea, but we're going to do our own proprietary version of baking instrumentation into our code, just because, you know, companies do that.
It's going to be interesting to see how the political landscape of the larger code vendors
or package vendors plays out.
Obviously, when you're dealing with more open source-based projects and packages of code,
you're not going to have that.
But there are these large companies that maintain a lot of their own code.
And I'm interested to see how they play along with that.
Which ones really embrace it and say, we have nothing to hide, and other ones who are going
to be like, we're not going to let you see this.
So you're talking at this point not about observability vendors but about application vendors?
Yes.
Yeah, I completely agree. But I do think that as OpenTelemetry gets more momentum, it will be more and more difficult to say, we're going to do our own thing. Because as more tracing backends support it and more platforms support it,
it hopefully will become an obvious value.
And even a competitive disadvantage if you don't.
Exactly.
Because in the end, the consumer of the software demands it, because somebody needs to run it and then troubleshoot it. And they say, hey, I can go with you or with the other one. And if the other one gives me better ways of operating your software, then I go with the other one, because that's more valuable for me, maybe.
Yeah.
There's always that behemoth mentality, though, of, again, the vendor lock-in idea: you know, you're stuck using us, and we'll sell you services to analyze the code instead of allowing you to do yours.
Anyway, it'll be interesting to see how it plays out.
I hope it doesn't go that route. But, you know, there's one company I have in mind in particular, which I won't name, that I can see doing that kind of stuff. But hopefully not; hopefully I'll be proven wrong.
Awesome. Hey, Dan, Armin, thank you so much for being on the show.
And thank you for having us.
And thanks for doing your work on OpenTelemetry
because you make all of our lives easier in the end.
I'm glad to hear it.
Thank you.
And Brian, I hope I can also speak for you, but we want to make sure we invite them back in a couple of months to see what's happening.
Andy, speaking of the thanks,
Andy and I come from a world
where we always wish people took performance seriously.
And it's amazing to see that there's this huge project now
where people are taking performance seriously
and things are going to start being baked in eventually and all.
So from that perspective, I thank you as well
because it's always been a passion of ours
to see stuff working properly.
So thanks to you guys and everyone else on the community
and everyone using it and providing feedback.
Yeah, we look forward to having you back.
And if anybody has any questions or comments... Andy, was there something else you wanted to say? I can see you looking. Okay, he had a look of anticipation. That's the problem with video.
Thank you, everyone, for listening.
If you have any questions, comments, or feedback, reach us at pure_dt on Twitter, or email us at pureperformance@dynatrace.com.
We'll have a bunch of links to a lot of the things that Andy mentioned in the show in the show notes.
So thanks again for listening and making this all possible, everyone.
And thank you to Armin and Daniel for being on our show today. Thanks, everyone.
Bye-bye.
Bye.