Lex Fridman Podcast - #496 – FFmpeg: The Incredible Technology Behind Video on the Internet
Episode Date: May 6, 2026

Jean-Baptiste Kempf is lead developer of VLC and president of VideoLAN. Kieran Kunhya is a longtime FFmpeg contributor, codec engineer, and the person behind the now-infamous FFmpeg account on X. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep496-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/ffmpeg-transcript

CONTACT LEX:
Feedback – give feedback to Lex: https://lexfridman.com/survey
AMA – submit questions, videos or call-in: https://lexfridman.com/ama
Hiring – join our team: https://lexfridman.com/hiring
Other – other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:
FFmpeg on X: https://x.com/FFmpeg
FFmpeg: https://ffmpeg.org/
VideoLAN (VLC): https://www.videolan.org/
VideoLAN on X: https://x.com/videolan
Jean-Baptiste’s Website: https://jbkempf.com/
Jean-Baptiste’s LinkedIn: https://www.linkedin.com/in/jbkempf/
Jean-Baptiste’s GitHub: https://github.com/jbkempf
Kieran’s X: https://x.com/kierank_
Kieran’s LinkedIn: https://bit.ly/3OORhmC
Kieran’s GitHub: https://github.com/kierank

SPONSORS:
To support this podcast, check out our sponsors & get discounts:
Larridin: Measure AI adoption in your business. Go to https://larridin.com
Blitzy: AI agent for large enterprise codebases. Go to https://blitzy.com/lex
BetterHelp: Online therapy and counseling. Go to https://betterhelp.com/lex
Fin: AI agent for customer service. Go to https://fin.ai/lex
LMNT: Zero-sugar electrolyte drink mix. Go to https://drinkLMNT.com/lex
Perplexity: AI-powered answer engine. Go to https://perplexity.ai/

OUTLINE:
(00:00) – Introduction
(03:00) – Sponsors, Comments, and Reflections
(10:48) – Weirdest things VLC opens
(15:12) – How video playback works
(24:33) – Video codecs and containers
(35:20) – FFmpeg explained
(56:20) – Linus Torvalds
(1:00:59) – Turning down millions to keep VLC ad-free
(1:15:17) – FFmpeg & Google drama
(1:34:31) – FFmpeg developers
(1:41:08) – VLC and FFmpeg
(1:45:42) – History of FFmpeg
(1:48:59) – Reverse engineering codecs
(2:02:14) – FFmpeg testing
(2:06:21) – Assembly code (handwritten)
(2:30:39) – Rust programming language
(2:39:55) – FFmpeg and Libav fork
(2:48:17) – Open source burnout
(2:56:04) – x264 and internet video
(3:09:20) – Video compression basics
(3:16:17) – CIA and fake VLC
(3:26:52) – Ultra low latency streaming
(3:44:20) – AV2 codec and video patents
(3:54:12) – VLC backdoors
(4:04:27) – Video archiving
(4:11:04) – Future of FFmpeg and VLC
Transcript
The following is a conversation all about FFmpeg and VLC
with Jean-Baptiste Kempf and Kieran Kunhya.
FFmpeg is an open source software system
that is the invisible backbone behind YouTube, Netflix, Chrome, VLC, Discord,
and basically every platform that touches video or audio on the internet.
It can decode, encode, transcode, stream, and play
almost any video or audio format ever created.
To me, it is one of the most incredible software systems ever developed,
and it's all done by volunteers.
VLC is also a legendary piece of software.
It is an open source media player that plays basically anything you throw at it,
any format, any platform, no ads, no tracking.
It has been downloaded over six billion times,
and again, for me, it has been one of my favorite pieces of software ever,
with the most legendary logo, which I, of course, had to honor in this conversation
by wearing the VLC traffic cone hat the whole time.
So again, above all else, thank you to the incredible volunteer engineers
who put their heart and soul into this code that has been used and loved by billions of people.
Thank you.
And a bit about the two great engineers and human beings I'm talking to in this episode.
Jean-Baptiste is the president of VideoLAN and is a key figure behind VLC and FFmpeg.
Kieran is a longtime codec engineer, FFmpeg contributor, and the man behind the now infamous FFmpeg account on Twitter/X that I recommend everybody follow
for the memes and for the unapologetic celebration of open source and great low-level software engineering.
Let me also say that it's inspiring and humbling that so much of modern civilization rests on software built by people who are not chasing fame or money,
but are obsessed with the craft of engineering.
We live in a world where billions of people consume video every day without ever thinking about the invisible media
machinery underneath it. But that machinery matters. Open source infrastructure matters.
It is one of the great examples of human beings quietly collaborating across borders to build
something useful, durable, and elegant for the rest of us. And so this conversation is not
just about codecs and media pipelines. It is also about the deeper spirit of engineering
and generosity that makes projects like FFmpeg possible.
Again, I can never say it enough. Thank you.
And now, a quick few second mention of each sponsor. Check them out in the description or at lexfridman.com/sponsors.
It is, in fact, the best way to support this podcast.
We got Larridin for understanding how AI is used in your business, Blitzy for code generation in large codebases, BetterHelp for mental health, Fin for customer service AI agents, LMNT for electrolytes, and Perplexity
for curiosity-driven knowledge exploration.
Choose wisely, my friends.
And now, on to the full ad reads.
I try to make them interesting,
but if you skip, please still check out the sponsors.
I enjoy their stuff.
Maybe you will too.
To get in touch with me, for whatever reason,
go to lexfridman.com/contact.
All right, let's go.
This episode is brought to you by Larridin,
a platform that helps organizations
understand how AI is being used
across their business and what it is doing for productivity and performance.
There really is a transformation happening at the individual developer level.
Many people switch from writing, let's say, 50%, 40%, 30% of their code where the rest is
written by AI.
They switch to basically where it's 0% of code written by hand via so-called agentic engineering.
And so we could see that in the individual developers.
Now, the question is when you scale that to two, three, four, five, 100 developers, what does that look like?
What does that look like from the perspective of a company that's trying to actually ship products
and trying to coordinate teams so they can build and collaborate together?
How is AI being used to increase the productivity of the individual contributors and teams seen as a whole?
That's what Larridin does.
If AI is part of your organization, now is the moment to get control of it.
Head to larridin.com to book a demo
and to start maximizing impact from AI.
This episode is also brought to you by Blitzy,
an AI-powered autonomous software development platform.
Built for large, complex code bases,
huge number of cooperative agents,
really optimized for huge code bases,
optimized for scaling speed
when you're talking about
a very large number of agents working together.
And that's the big, interesting question.
When you have a huge number of agents, a huge company, a huge code base, how do you then, seen at the
big-picture code base level, have the growth and development, the evolution of that code base,
where a very large percentage of that code base is continuously worked on autonomously?
The question is, when you have a large code base that already delivers value, that already sells stuff,
that already has a huge number of customers,
how do you then use agentic engineering
to continue adding features,
continuing improving,
continuing the usual kind of development
with the testing, with the security,
all that kind of stuff.
How do you do that without messing stuff up,
without filling up your code base with AI slop,
and nevertheless doing so, for the most part,
autonomously, not fully autonomously,
semi-autonomously,
but the majority of the code is written autonomously.
That's what Blitzy specializes in.
The future of autonomous software development is here.
Learn more or speak to a member of their team at blitzy.com/lex.
That's blitzy.com/lex.
This episode is also brought to you by BetterHelp, spelled H-E-L-P, help,
moving away from AI to the human.
The human mind is still, to this day,
out of reach of our understanding
from the perspective of creating intelligence.
There are so many intricate psychological complexities to the mind,
which I think contributes to what makes humans incredibly special.
But those complexities get all tangled up in ways that are counterproductive,
and so they need to be untangled.
I'm a big fan of talk therapy as a set of tools as a methodology
for untangling the complexities of the human mind.
The easy, discreet, affordable way of doing that is BetterHelp.
That's why I keep recommending it.
It's a really good way to take your first steps if you haven't done talk therapy.
Get a licensed professional therapist in under 48 hours.
BetterHelp makes it super easy.
Check them out at betterhelp.com/lex and save on your first month.
That's betterhelp.com/lex.
This episode is also brought to you by Fin, the number one AI agent for customer service.
As I mentioned already about humans, humans are complicated.
And customer service is ultimately about looking at each individual human and really listening.
That's what I try to do with the podcast: to truly listen to each individual person,
whether we're talking about a super technical topic or we're talking about the big questions of the human condition.
And sometimes when you're talking about the details of the technical topic underneath there,
you can feel the presence of the big picture questions of the human condition.
All of those complexities peek out of the shallow surface interactions
that at first glance you might think customer service interaction is.
But really, we're trying to solve the big problems of the human condition
and specific to that individual person.
So customer service is a really hard problem, but it's a really important one.
And that's what Fin specializes in: how to leverage,
how to use, how to utilize AI in the task of customer service.
Go to fin.ai/lex to learn more about transforming your customer service and scaling
your support team. That's fin.ai/lex.
This episode is also brought to you by LMNT, my daily zero-sugar and delicious electrolyte mix.
I'm traveling to parts of the world that at first feel, at first experience are so foreign
that they border on the uncomfortable.
All new things can be uncomfortable.
And for me, LMNT, on the health side, psychological side,
is a source of comfort.
It's a reminder of home.
It's a place where I'm healthiest.
It's a place where I have everything sort of in line.
I'm exercising, running every day,
whether that's jiu-jitsu or running or lifting.
I am on a good diet.
I'm fasting a lot.
And for all of that, you need the electrolytes
that LMNT provides: sodium, potassium, magnesium.
Really, if you get the electrolytes right,
everything else is a little bit easier.
You don't get the headaches, you don't get just the weird,
iffy feeling when you're fasting.
Favorite flavor, as always, I should mention,
because I love it so much, is watermelon salt.
Get a free 8-count sample pack with any purchase.
Try it at drinkLMNT.com/lex.
This is the Lex Fridman Podcast.
Please check out our sponsors in the description, where you can also find links to contact me,
ask questions, give feedback, and so on.
And now, dear friends, here's Jean-Baptiste Kempf and Kieran Kunhya.
So the legend goes, VLC can open everything.
What's the weirdest thing that you know that it can open?
You know there is a ton of people who are using VLC to record VHS videos, right?
Like, it's just like you plug it with a capture card and you can basically record VHS video.
Well, how does that work?
Basically, it's, you know, those types of capture cards where you can plug in a Péritel (SCART) or RCA cable.
And actually, VLC can play from those types of cards.
And there is a module which allows you to directly control those VCRs and camcorders.
We support DVD-Audio lately, right?
We spent the summer working on DVD-Audio support.
And, like, there is no one else making any DVD-Audio support,
with the custom encryption schemes.
What about Lucasfilm?
Oh, yeah.
And there is, of course, all the weird codec support, game codecs supported by FFmpeg.
The one Star Wars video game, the first 10-second opening sequence,
someone has gone and implemented that and made sure that's bit-exact on one disc that
existed at one time, of one little sequence in the game.
And then, funnily, at one VideoLAN conference,
we made a competition to make the weirdest and most horrible file ever and see if VLC could play it.
What did it end up being?
What's the file?
It was an MKV file made by Derek,
in which each frame was changing resolution,
aspect ratio, rotation,
and it was like,
Did it work?
Yes.
And there was another one where the whole video
was actually animated subtitles, right?
SSA, right?
Yeah, remember that?
This one was.
So each frame was a black frame,
but on top of that,
there was a subtitle that was animated for each frame.
There was a file
that's a valid zip and a valid MP3 at the same time or something like that.
So yeah, we made a competition of stupid files.
And it worked.
It opened all of the stupid files.
Yes.
By the way, for people who are not familiar, I am wearing a hat.
Would it be fair to say this is the best, worst logo of all time, the cone?
Yeah, by far, right?
The logo of VLC is so iconic, right?
Like, we are a team with a small number of people, and the icon is known everywhere.
I go to middle of nowhere in India or in China, people know the cone, right?
And 25% of the website traffic that comes to our main website is cone player, right?
So many people don't know VLC, right?
They know the cone player.
That's something they Google for, "cone player"?
Yeah, they go on Google and they put cone player and they download VLC, right?
So that's iconic.
And once we tried to change it as a joke, right, we said it was going to be a type of caterpillar construction.
And we said that on April 1st.
We had around 10,000 emails saying, no, don't change the logo and so on, right?
So it's so iconic, right?
It's so distinctive, right?
If you want to do a video player, you're going to put a play button on a TV, right?
And that's a YouTube logo, right?
It's unoriginal.
This one is orange, right?
It's very bright.
And it's weird.
And it's ridiculous and it's absurd and it's hilarious.
It becomes meme and meme becomes culture.
And you keep it.
And you know about it.
And you know that in 20 years,
you're still going to have the cone and remember,
oh yeah, that was a video player.
Yeah, and we'll talk about, you know,
the mission of FFmpeg being a kind of,
the archival aspect of it.
So you can think about a thousand years from now,
we'll have all these videos that only VLC can open.
Human civilization has already destroyed itself multiple times,
and the only thing that will remain is this like,
you know, the cockroaches will be crawling around,
and it'll be the VLC logo with some of the archival footage
that VLC can open.
and the aliens will show up and they'll press play and they'll get to see it all.
Well, we really hope so, right?
But there's also so many memes where people say,
well, I'm sure I can put a pancake inside my DVD drive and VLC will play it.
Can they?
No, we tried. It doesn't.
But we actually have a video of us trying that.
Didn't work.
A codec for physical reality?
I don't know what that would even look like.
There was a guy who did that, right?
He printed a small cone, right?
Like the ones we distribute as goodies.
And inside he put an RFID chip, which was
his way of playing a movie, right?
And so he put this on the RFID player,
and when he put that,
it was playing, like,
the last Star Wars and so on.
So instead of having, like, DVD boxes,
he had, like, VLC cones all around,
and he plugged that,
and there was, like, physical objects.
So the thing that we're talking about
is everything around video codecs,
video encoding, video decoding,
video streaming, video player client,
that I'm wearing on my head,
the entire ecosystem,
enabling free media.
We'll talk about FFmpeg,
we'll talk about VideoLAN, VLC, and all the other incredible video technology that is used probably by billions of people.
So, JB, you're the lead developer behind the legendary VLC player.
Kieran, amongst many other things, you're lead developer behind the legendary FFmpeg handle on Twitter.
And both of you have spicy opinions, I would say.
So today we want to talk about FFmpeg and VLC.
For context, for people who are not aware,
and I'm sure basically everybody listening to this
has used these two technologies, probably regularly, without knowing it.
So FFmpeg underlies basically most video on the internet,
including YouTube, Netflix, Chrome, Firefox, of course, VLC,
and countless other video platforms.
It is estimated that over 90% of video processing workflows online and offline involve FFmpeg.
VLC has been downloaded at least 6.5 billion times.
But likely that number, because it's impossible to really count the number, is much higher than that.
It runs on virtually any operating system and supports virtually any media format.
The limitation being it can't open pancakes.
So can we just lay out some of the basics to help people understand what's involved in all of this?
So when we press play on a video player like VLC, what happens?
How does it go from the file or the stream to the pixels on the screen and the sound on the speaker?
What are the big stages to be aware of?
So there are several stages, right?
The first stage is to get from an address, right, which is a type of URL,
to a stream of bytes, right?
So this would be, for example, HTTP, file, DVD, right?
You give the path to the media and it gives you a stream of data.
The stream needs to be cut up by what's known as the container,
the demultiplexer or demux.
We'll try and keep the jargon light throughout this,
but it needs to go and start demarcating video and audio frames.
So it just gets data from the operating system, blocks at a time,
and needs to start cutting that up into frames of compressed data.
It then needs to start doing simple parsing of the video frames,
mainly to figure out whether that codec is GPU decodable
or needs to fall back to software.
We're very sort of used to assuming the GPU will play all of these things,
there'll be hardware acceleration.
I think up to 45% of files are not GPU decodable.
So these need to be probed, they need to be detected.
There can be variants of a given codec,
some of which are decodable on the GPU.
Different vendors of GPU might have different capabilities,
So those need to be detected.
So if it's GPU capable, you pass it through to the GPU black box.
Now, if there's a software fallback, the first step is to do entropy decoding,
so removing the mathematical coding of the bitstream.
This uses techniques such as Huffman coding or arithmetic coding to actually decompress the mathematical layer of the bitstream.
We then need to start reading the syntax elements for intra-prediction.
So intra-predicted frames are like still images of the video, so your I-frames.
This works and operates in the spatial domain.
So you do your intra-prediction in the spatial domain, and you have a residual
because your prediction isn't quite matching that of reality.
So you've made a prediction, but then there's a little bit left,
and that's what's known as the residual.
This is stored in the frequency domain,
and these are quantized to compact their space.
We then need to do the inverse transform to bring them back to the spatial domain,
and apply these residuals.
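As a rough illustration of the residual step described here, the following Python sketch takes a prediction, stores the residual as quantized frequency coefficients, and then reverses the process. It is a toy under stated assumptions: a 1-D block of 8 samples with an orthonormal DCT, a flat hypothetical prediction, and a single made-up quantization step. Real codecs work on 2-D blocks with integer transforms, and none of the names below come from FFmpeg or VLC.

```python
import math

N = 8  # block length; real codecs use 2-D blocks, this sketch is 1-D


def _scale(k):
    """Normalisation factor that makes the transform orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(block):
    """Forward DCT-II: spatial samples -> frequency coefficients."""
    return [
        _scale(k) * sum(
            x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n, x in enumerate(block)
        )
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse transform: frequency coefficients -> spatial samples."""
    return [
        sum(
            _scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs)
        )
        for n in range(N)
    ]


def encode_residual(actual, predicted, step):
    """Residual = actual - prediction, transformed and quantized to integers."""
    residual = [a - p for a, p in zip(actual, predicted)]
    return [round(c / step) for c in dct(residual)]


def decode_block(levels, predicted, step):
    """Dequantize, inverse transform, and add the residual back to the prediction."""
    residual = idct([level * step for level in levels])
    return [p + r for p, r in zip(predicted, residual)]


actual = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up pixel values
predicted = [54] * N                        # hypothetical flat intra prediction
levels = encode_residual(actual, predicted, step=4)
decoded = decode_block(levels, predicted, step=4)
```

The decoded block is close to the original but not identical: the rounding in `encode_residual` is where information is deliberately thrown away, and a larger `step` trades more error for smaller numbers to store.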
So a lot of the process of the decoding
is this thing is compressed.
Yes.
And you have to predict the highest quality thing
that's supposed to go there.
The I-frame is the best representation
you have spatially.
And then there's a lot of temporal compression
that can happen depending on the codec.
And then you're predicting.
You're predicting the reality
that was captured in this raw form.
Yeah, because what people don't realize
is that the compression on video
and audio is 100 times, right?
Like, people don't realize how much compression we do, right?
For audio,
when you go from normal audio to MP3, you compress by 10 times, right?
When you move to video, you need 100 times, 200 times, right?
So you need to remove all the details that you don't care about
because all the compressions that we do, and that's very important,
people forget about that, is to be viewed by humans, right?
So all the codecs for audio mimic basically how your ear works, right?
And a lot of things are about the response of the ear, and the same for your eyes, right?
And so, for example, on video, we don't work in RGB, right?
Everyone expects us to work in RGB.
We don't, right?
We move to YUV, where basically one component is luminance, brightness, and the others are colors.
And this matches your eyes, where inside your eyes you have the cones and the rods, right?
Some of them respond more to brightness and the others more to colors, right?
So we need to compress a lot, and so we need to degrade,
but in order to degrade, we need to match the human perception.
And this is why it's so difficult.
And then we need to use the maximum power, mathematical power,
very complex technologies.
We move to the frequency domain, as Kieran said.
We do a sort of quantization
in order to get the best compression, but it still looks good.
You're trying to compress while maximizing the quality for human perception.
That is correct.
And that is correct.
And this is very important, right?
Compression is not like a zip, right?
A zip, you have data in, you get data out, right?
And you try with all the zip compression to arrive at the limit.
Here, we are degrading the signal, right?
And so we need to degrade both the audio and the video signal in the best way possible.
And we can do that, but it involves first
a lot of theoretical knowledge about how
the eye works, but also a lot of mathematical changes,
a lot of mathematical tricks, right?
For example, when you move from RGB to YUV,
what we do very often is that we scale down
the resolution of the color compared to the brightness.
And most of the time, just this, without compression,
divides the size by two, but most people don't see it, right?
And so on and so on, right?
And then you go to very complex mathematical changes.
So, of course, Fourier transforms,
which de facto are not Fourier transforms.
They are like discrete cosine transforms,
but that's the same idea.
So frequency domain.
We split the video by blocks, right?
So that's why when it's wrongly decoded,
you see those blocks, and when it's badly encoded, you see those blocks,
and so on, to arrive at compression rates that are insanely high, right?
And each generation of codec is like 30% less
for the same quality, right?
And this requires amounts of power,
of computational power, that are huge.
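The "divides the size by two" figure for scaling down the color resolution can be checked with a few lines of Python. The sketch assumes the common 4:2:0 layout, where each color plane is halved in both width and height, and a hypothetical 1920x1080 frame; the averaging filter is only one simple way to downsample.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_count(width, height, subsampled):
    """Samples per frame: one full-resolution luma plane plus two chroma planes."""
    luma = width * height
    if subsampled:
        chroma = 2 * (width // 2) * (height // 2)  # two quarter-size planes
    else:
        chroma = 2 * width * height                # two full-size planes
    return luma + chroma


full = sample_count(1920, 1080, subsampled=False)  # 3.0 samples per pixel
half = sample_count(1920, 1080, subsampled=True)   # 1.5 samples per pixel
```

With full-resolution color there are three samples per pixel; with 4:2:0 there are one and a half, so the raw frame is half the size before any real compression has happened.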
You should elaborate,
it's 30% better,
but an order of magnitude,
perhaps even two orders of magnitude,
more compression power.
That's the big difference.
What do you mean about compression power?
So CPU power to achieve that level of compression.
Oh, yeah.
So you have to be able to leverage the CPU
and sometimes GPU, like you mentioned.
And then we should mention
that a lot of this programming
is done at the lowest possible stack,
whether it's C, and of course,
as the legendary Twitter handle reemphasizes over and over, a lot of assembly.
So what happens globally is that you have an address, right,
which gives you with the operating system a stream of bytes,
a stream of data, right?
And this is the first step.
And the second step arrives with demuxing,
where you're going to separate audio, video,
and subtitles into different tracks.
And then on each of those tracks,
you're going to decompress them, decode them,
audio with an audio codec, video with a video codec,
and subtitles with a subtitle codec.
And once you've decompressed those types of things,
you have raw images, raw audio,
and then you're going to talk with your graphics card and your screen
and display that.
And same for the audio,
you're going to talk to your audio card,
which then is going to go in analog to your audio speakers.
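The demuxing stage in this summary can be pictured with a toy Python sketch. Everything here is hypothetical: the `Packet` type, the track names, and the placeholder `decode` function are invented for illustration and do not mirror the FFmpeg or VLC APIs. The only point is the shape of the flow, one interleaved stream in, separate per-track queues out, each handed to its own decoder.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    track: str      # "video", "audio" or "subtitle"
    pts: int        # presentation timestamp, arbitrary units
    payload: bytes  # still-compressed data


def demux(packets):
    """Split one interleaved packet stream into per-track lists."""
    tracks = defaultdict(list)
    for packet in packets:
        tracks[packet.track].append(packet)
    return dict(tracks)


def decode(packet):
    """Placeholder for a real decoder: just describes what it would output."""
    return f"raw {packet.track} for t={packet.pts}"


# A container interleaves packets from all tracks in one byte stream.
stream = [
    Packet("video", 0, b"v0"),
    Packet("audio", 0, b"a0"),
    Packet("video", 1, b"v1"),
    Packet("subtitle", 0, b"s0"),
    Packet("audio", 1, b"a1"),
]
tracks = demux(stream)
decoded = {name: [decode(p) for p in pkts] for name, pkts in tracks.items()}
```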
And everything we've just said in the past couple of minutes,
every sentence is someone's lifetime's work.
There are books about every sentence.
So the level of complexity in many cases is inordinate.
Every sentence has thousands of people working on it in the industry as a whole, and books written about it.
So there's a lot of detail.
There's a lot of subtleties.
There's a lot of both academic and practical realities, both of which matter.
We mentioned codecs, but I don't think you mentioned containers.
So what's the actual containers for some of the stuff we're talking about?
So people are familiar with MP4, MOV, MKV.
So anyway, what are containers versus the thing that goes inside?
So the container is what we also call the muxer, right?
When I said demuxing, it means decontainerizing, right?
So actually, if you look, mux means multiplexer and demux means demultiplexer, right?
Mux and demux are those.
And same,
codec is actually coder-decoder, right?
And so
containers are this
collection of multiple
tracks, right?
So it's what normal people
call the file format,
but it's a bit more
subtle than that.
But the most known one,
of course, is MP4,
but when I started,
it was AVI,
right?
AVI was the video format
from Microsoft.
And MOV,
which became MP4,
was a format from
Apple. In the open source
community, one of the people who is still
active in VideoLAN is Steve Lhomme,
who started this Matroska
format, which is a bit more complex
and more future-proof.
And there are
so many others. So, I mean,
it's a pretty common thing and maybe
it'll even happen in this conversation
that people confuse container
and the codec, right? So
confuse MP4 and H.264,
for example. Is that a horrible
violation? No, it's not because
technically the name of H.264 is MPEG-4 Part 10,
because MPEG-4 is actually a meta-specification,
which has several things in it, right?
There is Part 2, there are audio codecs, right, AAC
de facto is MPEG-4 audio something.
There are actually several video codecs, right, inside the MPEG-4 specification.
One of them is MPEG-4 Part 10, also called AVC,
also called H.264, right?
So it's completely the fault of the industry to make things difficult to understand.
So that's very difficult, so people then don't understand why sometimes you talk about
MPEG-4 Part 10 when you mean H.264, and why it's not MP4.
So you can technically shove all kinds of different codecs inside containers and
vice versa.
But broadly speaking, though, MP4 is understood to generally be H.264 plus AAC audio.
99% of the time, it's that, and the rest are de minimis, small effects,
you know, edge effects really compared to that.
So it's not the end of the world. There are people who do get annoyed by that,
but also in reality, with something like VLC, just to point out, the file may say .mp4,
but it may be something completely different.
And that's one of the challenges both FFmpeg and VLC have: the real world is a completely
different place to a three-letter file extension.
And this is very important to say, right?
Like, for example, in VLC and in FFmpeg,
we disregard the file extension, right?
We look into the file to understand what's in it.
Because so many people, like, they say,
oh, it's a video, it must be MP4,
but technically it's an MOV or maybe it's MKV, right?
So we analyze in real time everything that we have,
and we don't trust the extension.
So what information does the fact that it's .mp4 give you?
It helps, right?
It gives you a hint, right?
It's just like, oh, it ends with .mp4.
I will start first by probing it
with the MP4 container demuxer to see, well, it should be that.
But I don't trust it, and if I'm lost,
I say, okay, maybe I'm going to try something else.
So it bumps the priority of the module.
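The idea of treating the extension as a hint that only changes probing order can be sketched in Python as follows. This is a simplified illustration, not how VLC or FFmpeg actually implement probing: the function names and the extension table are made up, only four containers are covered, and each check looks at a single well-known signature (the EBML magic for Matroska/WebM, the `ftyp` box for MP4-family files, `RIFF....AVI ` for AVI, `OggS` for Ogg). Real probers score many more bytes, and some older MOV files do not start with `ftyp` at all.

```python
import os

# Each probe inspects the first bytes of the file, never the file name.
PROBES = {
    "matroska": lambda head: head[:4] == b"\x1a\x45\xdf\xa3",
    "mp4": lambda head: head[4:8] == b"ftyp",
    "avi": lambda head: head[:4] == b"RIFF" and head[8:12] == b"AVI ",
    "ogg": lambda head: head[:4] == b"OggS",
}

# Hypothetical mapping from extension to the probe worth trying first.
EXTENSION_HINTS = {
    ".mkv": "matroska", ".webm": "matroska",
    ".mp4": "mp4", ".m4v": "mp4", ".mov": "mp4",
    ".avi": "avi", ".ogg": "ogg",
}


def probe_order(filename):
    """Return probe names with the extension-hinted one moved to the front."""
    hint = EXTENSION_HINTS.get(os.path.splitext(filename)[1].lower())
    return sorted(PROBES, key=lambda name: name != hint)  # stable sort


def probe(filename, head):
    """Identify the container from content; the extension only sets priority."""
    for name in probe_order(filename):
        if PROBES[name](head):
            return name
    return None


# A Matroska file that someone renamed to .mp4 is still detected as Matroska.
renamed = probe("movie.mp4", b"\x1a\x45\xdf\xa3" + b"\x00" * 8)
```

The extension never decides the answer; it only saves time in the common case where the file really is what its name claims.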
So how do you get to, just to take a bit of a tangent there,
you know, the dumb thing is if you try the MP4,
but it turns out it's a different codec than you would have expected,
most players just break there.
Yes.
So how do you not break?
There's just philosophically,
I'm sure there's a bunch of stumbling blocks
along the way where it's easy to just break and stop, freak out.
That's it.
How does VLC not?
This is why VLC is popular.
But the reason is because actually VLC
was just a client of a streaming solution called VideoLAN
from a very long time ago, from the late 90s.
And when you're playing videos which are on UDP, right, on the network, they might be damaged, right?
So you don't trust your inputs.
And this is very important for security, that you don't trust your inputs.
So everything in VLC is prepared to work with broken files.
And it's a philosophical idea from the beginning and everything is engineered around that.
And it's a culture, right?
And so, for example, VLC became very popular because of that, because a long
time ago when people were pirating content, which they do a lot less today.
And none of us ever have.
No, of course not.
The metadata to play some files like AVI is at the end of the file, right?
And when you're downloading, you don't have that, right?
So VLC was just like, hey, this file is broken, but I'm still going to try to interpret it.
And this was very useful.
We hinted at the awesomeness of the various different stages,
we hinted at the awesomeness of codecs,
the depth and the richness and the complexity
of everything involved there.
Let's try to define what is a video codec.
What's involved there?
What does it mean to compress something?
You already started to hint at it,
but can we elaborate a little bit more?
So there's a huge amount of redundancy in any video,
both spatial and temporal.
And the point of any video codec
is to remove this redundant data,
and use mathematical properties as part of this reduction process,
more often than not using several orders of magnitude more compute to compress,
so compression is more costly, both financially and in CPU resources,
than the decompression.
So it's asymmetric in that respect.
That's often the case because compression is done once,
but there could be lots of viewers of that file.
So to take that information and compress it by 100x, 200x,
removing redundant information and using mathematical properties to make that small,
but also have properties such as error resilience.
So as JB suggested, VLC in the beginning was used to play UDP network feeds.
And UDP network feeds lose packets.
And so some of the design goals of a codec is also to be recoverable.
You need to actually be able to join a stream.
It's not necessarily a file.
You need to join, get on the decoding process, and start decoding.
And to give more of an image to people who are not familiar, right?
Like when you're going to see any type of movie, right,
you're going to see the camera is going to pan, right, and travel.
And you realize that, for example, all the background is the same for like a minute, right, or 30 seconds, right?
So you can reuse the cloud that you see on the background.
You can reuse that from a frame to another, right?
And so the more memory you have, the more power, the more comparisons you can make, right?
And so the more compressed you can be.
And most of the modern codecs are basically doing that.
So just to make it even more explicit, so what is video?
Video is a bunch of pixels, often RGB, you have three values,
and you have a grid of pixels, and you have, let's say, 24, 30 or 60 frames a second,
and you just have all these pixels repeating and showing different stuff 30 times a second.
And so the question, the philosophical, the technical question is, how can I compress all of that, store all of that at 100x?
1,000x, right?
1,000x.
The target is 1,000x, right?
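To put rough numbers on that target, here is a back-of-the-envelope sketch in Python. The 1920x1080 resolution, 8 bits per value, and 30 frames per second are assumptions chosen for illustration; only the three RGB values per pixel and the 100x and 1,000x ratios come from the conversation.

```python
def raw_bitrate_bps(width, height, fps, channels=3, bits_per_channel=8):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * channels * bits_per_channel * fps

# Assumed example: 1080p, RGB, 8 bits per value, 30 frames per second.
raw = raw_bitrate_bps(1920, 1080, 30)
print(f"raw: {raw / 1e9:.2f} Gbit/s")

# What the stream would need if it were compressed by the ratios mentioned.
for ratio in (100, 1000):
    print(f"{ratio}x: {raw / ratio / 1e6:.2f} Mbit/s")
```

Under those assumptions the uncompressed stream is about 1.5 gigabits per second, and a 1,000x reduction brings it down to roughly 1.5 megabits per second.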
And when you say redundancy, redundant means stuff that, at best, humans wouldn't notice if it was missing.
So, for example, you have a picture of a cloud, right?
And in the next frame, there's still going to be the same cloud.
So it's redundant.
You could just put it once and not do it, right?
Or you have a black background behind me, for example.
The black is the same on the whole picture, right?
So you can say, well, you know, in this picture, take the pixels that you have on the top left,
and the one on the top right, I'm not going to give the value.
I'm just going to tell you it's the same as the top left.
And then you can say for frame one, reuse something from the previous frame, or the frame
before that, and so on and so on, right?
So you could basically, it's unlimited,
but then it's limited in terms of memory
or in terms of compute power.
Because, for example,
if you need to compare pixels on 200 frames in the past
on 4K resolutions,
it's a huge amount of compute.
And then when you're showing it,
you have to decompress all of that.
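The idea of reusing pixels from the previous frame can be sketched in a few lines of Python. This is a toy illustration only, assuming a frame is a flat list of pixel values: it stores just the pixels that changed. Real codecs predict whole blocks with motion compensation and then apply transforms and entropy coding, none of which is shown here.

```python
def encode_delta(prev_frame, cur_frame):
    """Keep only (position, value) pairs for pixels that differ from the previous frame."""
    return [(i, cur) for i, (prev, cur) in enumerate(zip(prev_frame, cur_frame)) if prev != cur]

def decode_delta(prev_frame, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(prev_frame)
    for i, value in delta:
        frame[i] = value
    return frame

# Hypothetical 12-pixel frames: a mostly black background where two pixels change.
frame1 = [0] * 12
frame2 = [0, 0, 0, 255, 0, 0, 0, 0, 128, 0, 0, 0]

delta = encode_delta(frame1, frame2)
print(f"stored {len(delta)} changed pixels instead of {len(frame2)}")
assert decode_delta(frame1, delta) == frame2
```

The decoder has to keep the previous frame in memory to apply the changes, which is the memory and compute cost described above.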
So the codec,
the encoding and the decoding,
is a coupled process
that you're developing?
Exactly, right?
And those are two different tradeoffs, right?
Are you going to compress more,
but then it might be more difficult to decode?
Are you going to make a codec
that is more complex to encode and easier to decode?
Are you going to make a codec
that is easier to encode because you need to be fast?
But then the client side,
the player is going to spend more time.
That's why you have so many different types of codecs,
is that it's not always easy.
And to make it even more complex, modern codecs like AV1, AV2 or VVC are actually not codecs.
They are a collection of tools, right?
There are multiple tools, multiple codecs in the same codec to, depending on the image, get the most compression.
So just to elaborate, codecs like AV1 and VVC have a wide audience.
It could be a screen share content.
It could be video.
It could be animation.
All of these require different
coding tools. So what happens these days is a collection of tools is put in and called
AV1, called AV2, called VVC, to allow for different use cases. So you may be on Zoom and sharing
your PowerPoint, and then you need to show the audience a video, and the codec needs to start
changing its tool set depending on the content to compress in a different way.
And like you said, there's a bunch of incredible engineers behind each part of that,
each part of the tools that make up AV1, for example.
Sure.
So we've kind of danced around it.
We talked about VLC, the logo, the hat.
Let's talk about FFmpeg.
What is FFmpeg exactly?
FFmpeg is basically the low-level libraries for codecs.
So compression and decompression,
muxers and demuxers, and filters.
The core is this,
and then you have several tools
which allow you to create a type of pipeline
to process any type of video file.
And it's used as a library,
absolutely inside everything,
from VLC to Chrome, to your smart TVs,
to basically any video that you see online,
you usually use FFmpeg.
And FFmpeg has all those types of tools in it
and sometimes depends on other libraries
like x264, libvpx and others, right?
So it's really now the de facto
tool to process images.
From a philosophical level, I think it's incredible that your home videos, your grandmother's
home videos and trillion dollar corporations effectively are on a level playing field using the
same technology stack.
It wouldn't be a surprise.
These big companies just have 3,000-line FFmpeg commands.
There are some that use the API, but there are some that just have long command lines.
So, yeah, there's a bunch of tools, like literally command-line tools, ffmpeg, of course, ffprobe, and there's libraries: libavcodec, libavformat, libavfilter.
But ffmpeg on the command line is like legendary because you can cut, there's so many parameters, you can customize everything to hell.
It's a language. It's an actual language.
Yeah, you could think of it as a programming language.
Yeah, of course, I'm sure.
Because most people, they're going to take FFmpeg, file in, file out, and specify the format, right?
But we've seen thousands of characters.
And we've also seen people doing programmatic generation of command lines for FFmpeg.
There is a ton of people who are using AI to generate command lines for FFmpeg because you have no idea what it is.
But you can specify so many filters, right, on the command line, right?
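As a small sketch of what generating a command line programmatically can look like, the Python below builds an ffmpeg argument list, from the plain file-in, file-out case to one with a filter chain. The file names, the helper name build_ffmpeg_command, and the particular filter values are hypothetical; the code only prints the command and does not run it, and -c:v libx264 assumes an FFmpeg build that includes libx264.

```python
import shlex

def build_ffmpeg_command(src, dst, video_filters=None, video_codec=None):
    """Assemble an ffmpeg argument list; nothing is executed here."""
    cmd = ["ffmpeg", "-i", src]
    if video_filters:
        # -vf takes one comma-separated chain of video filters.
        cmd += ["-vf", ",".join(video_filters)]
    if video_codec:
        # -c:v selects the video encoder.
        cmd += ["-c:v", video_codec]
    cmd.append(dst)
    return cmd

# File in, file out: the container format is inferred from the extension.
simple = build_ffmpeg_command("input.mp4", "output.mkv")

# The same idea with a filter chain and an explicit encoder.
fancy = build_ffmpeg_command(
    "input.mp4",
    "output.mp4",
    video_filters=["scale=1280:720", "fps=30"],
    video_codec="libx264",
)

print(shlex.join(simple))
print(shlex.join(fancy))
```

Keeping the command as a list of arguments, rather than one long string, avoids quoting problems if it is later handed to something like subprocess.run.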
So FFmpeg is this
toolbox for multimedia processing that everyone,
everyone uses. And everyone that is watching your videos is also using it, right?
You're on YouTube. Well, it's FFmpeg on the server side.
The client side is probably Chrome.
Well, you're using FFmpeg also. And you're using OBS to record.
Well, it's FFmpeg, right? You're using a ton of important, like, big
professional boxes. Well, it's very possible that inside, some part of FFmpeg is running.
I mean, there's like so many, just to give people an idea.
Like, I use FFmpeg a lot on everything.
Just trivial stuff like take a video,
add an intro video and an outro video,
and fade one into the other, like, what is it called,
dip to black, like where it dips and then shows the next video
and does the same thing with audio.
There's like a cross-dissolve of the audio.
It quiets the audio and makes it loud again.
And there's a bunch of stuff like showing the captions on screen, like baking the captions in.
You can customize the font, you can do all kinds of layering of audio and video. There's a million
things, and of course all of that works, like, magically with basically any codec. Like, anything you can
shove in on the audio and the video side, it works. But it's like, if you look, for example, you
can do things that you would do with Adobe After Effects, on the command line with FFmpeg, right?
It's very interesting because, for example, for images, there is no such tool.
There are a few tools, but not with the breadth of FFmpeg.
So ImageMagick has a similar kind of spirit.
Yes, but you will not do some filters, complex filters.
You don't have the equivalent of Photoshop on the command line, right?
But for video, you have FFmpeg on the command line.
Yeah, it's incredible.
I mean, it's like, and it's an example of a thing
where a bunch of great people get together and they get a vision,
and they stick by that vision for many years, which is incredible.
And the vision behind it, and it's the same for VLC and FFmpeg,
is that we make everything that is very complex, easy to use for the normal people, for everyone, right?
Our goal is to make something that is insanely complex technically and make it easy to use, right?
And people, they use VLC, they drop a file.
They don't realize how complex the file is,
they play it. Or people put any type of thing inside FFmpeg with complex filters and it just
works, like, magically, right? And this is our mission, right, make very complex things easy.
We wouldn't be here and you wouldn't be here if this required, you know, a traditional television
studio setup. It's tools like FFmpeg that democratized this. The podcast and streaming revolution,
the YouTube revolution was caused, you know, FFmpeg was a big player in that, because it
democratized this technology. In the 90s, for example, you needed equipment that
cost hundreds of thousands of dollars to do compression. It was the size of a car. And now everybody
has that at almost an exact level playing field. And that's something that's so remarkable.
It gave voice to a lot of people. And just to clarify, we say, you wouldn't be here, not the human,
but the podcast.
Sorry, sorry. You is the, sorry. VLC did not have anything to do with the biological,
well, creating me as a human.
You realize also everything moved from text to images and images to video.
Look at social networks.
Video is everywhere.
It's the most powerful medium there is, right?
And when you see Shorts and Reels and TikTok, right, it's amazingly powerful to give video.
It's amazing for that, right?
But the complexity is important.
It's what people don't realize.
I mean, this really
gave power to the individual all across the world.
It's real freedom.
And I think I can't believe it,
but we still haven't mentioned the actual obvious thing
for people who are not familiar,
which is that it's open source.
And there's an open source community
of users and developers behind it.
So it's really, it's a movement.
So like, we'll talk a bunch
in a bunch of different ways about the community behind it.
But can you speak to the open source element?
So when we say what is FFmpeg, it's an open source project.
Yeah, so FFmpeg, VLC, x264, VideoLAN, everything we do is fully open source.
And for the people who don't understand what open source is, my usual analogy is about a chocolate cheesecake.
Usually, when you want to buy your cheesecake, you go to a bakery, they give you the cheesecake.
The other way of having a cheesecake is have your grandma give you a recipe of how to make that.
When we do open source, we give you the chocolate cake.
and we give you the recipe to actually remake the same cake,
but at the same time tell you how to build the oven
and also how you're allowed to modify the recipe
and resell it to someone else.
And this is because software is just a very long recipe of small instructions.
Computers are not very clever.
They go very, very fast.
So a normal program has tens of billions of instructions
instead of the tens you have in your chocolate recipe.
So a lot of the software industry
was about selling software,
where you just have, like, the final cheesecake.
In open source, we give you everything.
And that managed to get a lot of people to work together, right?
Because then you decide that you're going to make the best program,
the best recipe for video,
and you create communities in FFmpeg.
Since the beginning of FFmpeg,
probably 2,000 to 3,000 people have contributed, right?
And then it's exactly like the Linux kernel, right?
The Linux kernel has probably 10,000 people contributing from everywhere.
And they get together, well, mostly online, right?
So they virtually get together to create the best tool for something.
And on FFmpeg and VLC it's just like, well, this codec doesn't work.
So I'm going to work on the codec and I'm going to add the support for this file inside FFmpeg.
So it will be beneficial to everyone.
Because again, we work for the greater good.
We work for everyone.
And that is what open source is.
And we should mention, depending on the licensing,
you could probably build a billion dollar,
maybe even a trillion dollar company around,
basically as a wrapper.
Well, yes, people do.
People do, right?
There were a lot of problems with mostly cloud providers
who are basically running some open source tools in the cloud
and just give you the API to access
that. And there were a lot of
databases
like Mongo or Elastic who
changed their license in order to avoid those
types of scenarios. This is
a question we get a lot in FFmpeg:
why don't you do that?
And you can't. We have thousands of contributors,
some of whom aren't even alive anymore.
You would need all of their
agreement to do that. And J.B. will go
maybe a bit later and talk about how challenging
that process was in VLC to do the
relicensing. The license
is a social contract, in
Rousseau's terms, de facto, of the community.
The community does not agree on much besides the license.
People go around, discuss around because of the license.
And the license also allows forks, right?
Sometimes the community splits, but it's possible because of the license and to merge back.
And we've seen that so many times, right, GCC and EGCS in the past.
We have seen, for example, all the web browsers, right?
They started as KHTML, which became WebKit, which then became Blink, right?
So the open source license is like the core of the community.
And people are coming from all over the world, very different types of religions, political borders.
They work in the same way on a project to solve a specific problem.
And the specific problem we're working on is to make multimedia easy for everyone.
Looking it up on Perplexity here, looking at the different open source licenses.
Most major open source licenses fall into two buckets: permissive, very few conditions,
and copyleft,
share-alike requirements for derivatives.
Below is a brief practical summary of the main ones you'll see in the wild.
MIT license, BSD, ISC, Apache, GNU GPL, GNU AGPL.
Where's LGPL? Yeah, LGPL, let's see,
there's the Mozilla Public License,
there's Eclipse Public License,
it goes on, there's a lot of variety.
I mean, I think the really popular ones are MIT,
GPL, LGPL,
and BSD. Apache, sometimes you'll see.
Unlicense, that's an option,
attempts to dedicate code to the public domain
with a fallback permissive license.
There are many licenses for many different things.
What people don't understand
is that public domain is something that
doesn't exist worldwide, right?
So all the open source licenses use copyright law, right,
international copyright law, in order to give rights on how you use the software or how
you modify it.
It's de facto a copyright license contract that you give to the end user or to the developer.
And so you have the first ones, which are basically very permissive, MIT, BSD:
you give the code and basically you do whatever you want, right?
You take it, you modify it, you do what you want,
and this is popular for JavaScript
and the BSD type of operating systems.
So some of them, one of the parameters is whether they require
attribution, meaning if you use the code you have to say so.
Yes.
So in those types of permissive licenses, for some you need to say if you use it,
which is called attribution, and for some you don't.
And then there is the other family of licenses, which are copyleft,
where you need to give back to the community your modifications
and with different strings attached,
from some weak copyleft licenses like the Mozilla Public License
to some which are a bit stronger like the GPL
or even very strong like AGPL.
So all of those are different types of licensing
that depend on what your goals are
and how you want to structure your community,
which is why I spoke about social contract
because this is very important to understand
FFmpeg and VLC are mostly GPL or LGPL.
The Linux kernel is GPL,
but Android is Apache,
a ton of the JavaScript frameworks that you are using are mostly MIT.
All the BSD kernels, OpenBSD, NetBSD, are, of course, BSD.
And so it's philosophical change on how you want people to contribute back, basically.
So there's, I think you talked about that you've moved at one point from GPL to LGPL on certain parts of the project.
Can you describe the difference between the two, and what does it take to move to, I guess, a more permissive...
So that direction is more permissive.
LGPL is more permissive than GPL.
Yeah.
So you have to realize that you can always go from more permissive to less permissive, right?
Because of course, those licenses are basically stacked.
So if you restrict, you can always restrict more, right?
So in a GPL project, you can take MIT code.
But you cannot do the opposite, right,
because there are more constraints to match.
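That one-way rule can be sketched as a simple ordering in Python. The ladder below is an assumption made for illustration, a deliberately simplified ranking from most permissive to most restrictive. Real compatibility also depends on license versions and specific clauses and is not a strict ordering, so this only mirrors the reasoning in the conversation.

```python
# Simplified, assumed ranking: a higher number means more conditions attached.
RESTRICTIVENESS = {"MIT": 0, "BSD": 0, "LGPL": 1, "GPL": 2, "AGPL": 3}

def can_include(code_license, project_license):
    """True if code under code_license can flow into a project under project_license,
    following the 'you can always restrict more, never less' rule of thumb."""
    return RESTRICTIVENESS[code_license] <= RESTRICTIVENESS[project_license]

print(can_include("MIT", "GPL"))   # permissive code going into a GPL project
print(can_include("GPL", "MIT"))   # the opposite direction
```

In this model, moving existing code down the ladder, for example from GPL to LGPL, is not something the rule allows on its own, which is why it needs the agreement of the copyright holders.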
Indeed, in fact, I changed the core of VLC, libVLC,
which is the engine of VLC,
from GPL to LGPL.
And there were two reasons to do that.
The first one is that so people can use the VLC engine,
libVLC, in third-party applications.
So a lot of applications which are playing video on your phone or on your tablet
actually have the VLC engine in them, which is calling FFmpeg.
So that was one of the ways to create one of the companies I created,
which is doing consulting and integration of those applications
where you integrate VLC into third-party solutions,
like inside game engines or stuff like that.
With GPL, you couldn't do that because that means you needed to open-source
everything, and there are a lot of commercial companies who don't want that.
So with LGPL, you can create a company around it.
You can do a commercial thing. You don't have to open source it. So it's a big, big leap.
So you can play video in your game. Yes. The problem is I'm a game developer and I want to
play some videos and I don't want to be forced to open source the entire game just to play those
videos. So that's where the consulting business, libVLC under LGPL, allows you to do that. The
LGPL, the library GPL as it used to be known, allows you to do that.
And FFmpeg is exactly the same.
It forces you to give back what you change on this component, this library, which is why
it's library GPL.
And so you can use FFmpeg as LGPL in any type of application, even non-open source,
but you need to give back the modifications you made to FFmpeg, same for libVLC.
Is it limiting from an open source perspective to go GPL?
because if your library, if your code is GPL, it means
you're basically discouraging companies from building a business around it, right?
Is that fair to say?
It depends on the company, but for a company whose business model requires the source,
the application, to be closed source, yes, it's limiting.
So that's why, for example, I moved to LGPL.
The second reason, a bit more obscure, is that the terms and conditions of the
App Store, the Apple
App Store for iOS, make it
very complex to have GPL application on
it, while it's easier to
have LGPL applications on it.
So, VLC on Windows and on
Linux is GPL.
The core is LGPL,
but on iOS, the
iPhone version and the Apple TV version
is under a different type of license
called the MPL.
And yes, I went and changed
the license and it was a long
story. Yeah, so I think
Basically, to change the license, you have to contact all the contributors.
Yes, it's very important to understand that open source projects are what we call in US copyright law joint works, or in civil law collective works or collaborative works.
That is, you all work together toward the same goal and then it creates one piece of software, which is one release.
But the copyright is kept by all the individuals.
Some open source projects don't do that.
They force copyright assignment,
but this is not what we do with communities.
So everyone has basically copyright on what they changed.
And this copyright stays even if at the end your contribution was deleted
because the new contribution was based on your previous one, right?
So if you want to properly relicense, you need to find all the contributors.
And at that time, I had to contact more than 350 people.
And sometimes, well, they are just an email, right?
So you need to actually track down.
I actually traveled to some places that I had found online, to go to their job and say, well, you licensed that, do you want to change from GPL to LGPL?
Most of the times they don't even care.
They wanted to help VLC.
But also it brought me to a very complex situation.
I arrived to the work of a person who was a factory worker.
And I said, well, I need you to sign that, because it was his son
who died, who actually wrote the code, right?
So I had to explain what all this open source stuff means,
and no, I was not a company trying to rip out the two lines or five lines that that guy did,
but it was useful, and the whole community agreed on that.
And he had no idea; he was a factory worker.
And I was a lot younger, right?
Like, it was 14 years ago, and like, I was almost in tears, right?
It's very difficult, right?
We are talking about life of people, and he explained, and we went,
talk about the photo of this guy.
So it's important to do it right and to do it correctly.
But yes, that means tracking down everything because every contribution works.
There are some projects who don't respect that and who do relicensing a bit aggressively.
But as I said, it destroys the heart of the community because we only agree on this license.
So that's important.
I would emphasize the community is such a wide-ranging group of people.
There's people in a Syrian
war zone with electricity part-time, there's all people from all walks of life, rich, poor, young,
old.
So it's quite remarkable to get, you know, a group of people aligned on something.
And now that's an achievement in itself.
Yeah, it's incredible.
And a lot of them are introverts.
So you coming to find them and getting them to answer an email,
it might be quite difficult.
Most of us are introverts, right?
You need to be more precise.
You are extremely introverts, extremely introverts, right?
It's just like a whole spectrum of different people.
It doesn't matter.
The important thing is, is your code good?
Is your code great?
Is your technology great?
We care about excellent code.
We don't care who you are.
Sorry, it's just like, we have no way to check.
We cannot check, right?
Like maybe you're a dog.
I don't care, right?
I don't care where you come from.
I need to look at your code.
And this is important because people don't understand that.
And they come to the community and send some patches.
And they get rejected.
And they don't like that because, I mean, you're just like, sorry, it's not up to our standards.
Oh, yeah, but I'm an engineer at this very large company in Italy, in Germany, in the US.
We don't care.
We care about the quality of your code because this is what defines our community.
And which means that we have a lot of people who contribute who are from very different backgrounds and very,
very introverted, sure, but that's okay, right?
So one of the legends of the community is, of course,
Linus Torvalds who created Linux and is a long-time maintainer of the Linux kernel.
As the legend goes, he can be pretty harsh on this meritocratic process
of reviewing the code and saying it's not good enough.
Can you just speak to the legend of Linus Torvalds?
Linus is one of a kind, right?
And I would even go and say that what he did on Git is more interesting
than what he did on the Linux kernel.
He's very harsh, but what people don't see is that usually, when he's harsh,
it's with people who are maintainers of part of the kernel, right?
So they know him, right?
So he's not very harsh like that to everyone.
The thing is what he created in his room is basically powering every server online, right?
Even at Microsoft Cloud called Azure, I'm quite sure,
70, 80% of the servers are running Linux.
All your Android phones are running Linux.
What he did with the power of open source, sure, is amazing.
And yes, the quality of the Linux kernel is very high, and yes, it's difficult,
but we cannot compromise on that.
We cannot compromise on quality.
Because in the end, and you have to understand that,
the core community of VLC is five people,
the core community of FFmpeg is 10 to 15.
And we are the ones who are going to maintain your code, right?
Because out of 1,000 contributors over time, just 10 stay:
a 1% chance that someone comes and stays.
1%.
So you will have change of jobs, change of wives,
you have children, you have accident in life,
you're going to change jobs, whatever.
You're not going to come back.
It's most likely.
So we are the one going to maintain your code.
It needs to be maintainable.
It needs to be excellent.
And yes, sometimes that means that you need to rework your work, because it was good, but it's not excellent.
And we need excellence because we have very few to maintain something that is critical for the whole.
But we should also mention that there's some spiciness, some harshness of language that's sometimes used when you're keeping this high bar of excellence.
Is there something to say to that?
It's true, right?
it's also the fact that, for example, what we're doing is low level, it's extremely technical,
you get into this community, the tone gets very, like, it's a type of subculture, right?
So people who arrive from the outside basically don't know the subculture.
Most of those people around FFmpeg and VLC, we do VideoLAN Dev Days, VDD, every year.
They are so fun in real life and they love it.
But it's true that you're online and sometimes, like, the tone,
you don't realize how it is.
But that's okay.
It's a culture.
I mean, you get this in the gaming culture.
It's pretty harsh and intense,
the way people communicate.
And everyone understands
that the way you show love and respect
just looks different in different communities.
Sometimes people, it depends.
If it's a book club,
usually people are going to be much sweeter.
If it's an open source project
that's very high stakes
and used by millions of people.
But it's not very often insults
like you see, for example, in gaming, right?
So Linus's tone is a bit unusual, even for the open source community.
It's more that it's harsh on the result, saying, no, this is not good, this is crap,
those types of things that you will see.
Try not to make it about the person, make it about the code.
Yes.
It's very matter of fact.
I think you've got to look at it in terms of, you know, the famous FFmpeg is developed
almost entirely by volunteers, and that's true.
And you've got to imagine someone's done a hard day's work at their day job, they come home.
you know,
terseness might be a thing,
you know,
and that's not something
to take personally,
you're tired,
you're busy,
but you still care
about this open source stuff.
But you may not be able to explain
and handhold someone
on every subtle detail.
And also,
you have to realize
that most people
don't speak English
as a native language.
And this is true
especially for open source projects
like FFmpeg and VLC,
which are mostly centered in Europe.
Sometimes, like,
people who are from the US are just, like, very unhappy about the tone.
But most of the time it's also like they don't know better, right?
It's difficult.
English is a difficult language.
There are so many subtleties of tone and so on that you don't have, right?
So often it's also difficult in those types of communities because of different cultures and languages.
So as the legend goes, J.B., you repeatedly turned down millions of dollars to keep VLC
open source, free for everyone, without ads.
So take me through the reasoning behind that decision
of leaving millions of dollars on the table.
Yeah, that's like almost a meme, right?
On Reddit.
There literally is a meme on Reddit,
9GAG and, yeah, yeah,
you looking like a wizard in the VLC hat on Reddit.
This is J.B., the creator of VLC media player.
He refused tens of millions of dollars
in order to keep VLC ad-free.
Thanks, Jean-Baptiste Kempf.
You can even summon him on Reddit.
And usually, if you see, right,
it's usually like people tag me, right?
And then there is me.
And then like I say, good morning.
I got 24K upvotes, which is great, right?
My karma on Reddit is amazing,
at least on that account.
So the question needs to be answered first.
What is the story of VLC, right?
Because yes, this is true,
I refused tens of millions of dollars.
Yes, several times.
Yes, I could be a multimillionaire
and be somewhere on the beach.
But I did not do it because I thought it was not moral
and it was not the right thing to do.
And this is very important for myself
is to be like, I work for the greater good,
I work for people and I don't want,
it's not just by myself.
But the reason is also because I did not feel
that I'm completely legitimate to do that
and let me explain why.
VLC's story is a very weird story.
In France, we have universities
and we have a type of top college,
and those top schools of excellence
are engineering schools,
business schools, and basically law
and medical, right?
But they're outside of university.
And in order to enter those,
you spend two years working like crazy
on math and physics to enter those best engineering schools.
One of the schools is called the École Centrale Paris.
It has changed names since, but it was called the École Centrale Paris.
And because it was central, they had to move it, because it was too small after World War II.
And they wanted to move it to the center of France, to a place called Clermont-Ferrand.
And the alumni decided that this was not okay, right?
It is the school that Eiffel, right, the one who did the Eiffel Tower, attended.
So they said, no, no, we are amazing, great school, we cannot do that.
And so they bought a piece of land south of Paris, very near Paris.
And it was a campus managed by a non-profit of the alumni.
Okay.
Because of that, everything on the campus was managed by students.
The university did nothing, right?
So radio, TV, supermarket, library, defining who was going into which rooms.
Everything was managed by the students.
That's amazing.
That's an amazing experiment.
that it all didn't go to hell quickly.
It somehow flourished.
It worked great, and I learned so much in my life doing those side activities, right?
Because you're 22, and you need to run your campus or else you don't have electricity, right?
So you care about that, right?
But anyway, in the 80s, they did a full experiment of deploying a network,
mostly sponsored by IBM and 3Com, which was a token ring network.
So token ring is something that probably almost
no one knows about anymore.
It's a networking technology
where you don't have routers, right?
Everyone is linked, it's like really a ring,
and when you want to send a message,
you talk to your neighbor who's going to put the message
to the next one, who's going to put the things to the next one.
In terms of ring.
The issue with token ring is, of course,
is that it's very slow,
because every computer on the network
needs to open the message,
see if it's okay, is it for me,
no, it's not, and then send it back.
Like a token, which is
traveling around the ring.
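The neighbor-to-neighbor forwarding described here can be sketched in Python as follows. This models only the behavior as explained in the conversation, where each station looks at the message and passes it on until it reaches the addressee; it is not an implementation of the real token ring protocol, and the station names and the deliver helper are hypothetical.

```python
def deliver(stations, src, dst):
    """Return the stations that handle a message sent from index src to index dst,
    passing it one neighbor at a time in a single direction around the ring."""
    n = len(stations)
    handled = []
    i = src
    while i != dst:
        i = (i + 1) % n              # hand the message to the next neighbor
        handled.append(stations[i])  # that station checks: is it for me?
    return handled

ring = ["A", "B", "C", "D", "E"]

# D sends to B: the message has to wrap around through E and A first.
path = deliver(ring, src=3, dst=1)
print(" -> ".join(path), f"({len(path)} hops)")
```

With the roughly 1,500 students mentioned later on a single ring like this, a message could pass through close to 1,500 machines before arriving, which gives a feel for why latency became a problem once games arrived.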
In the 80s, you're doing some telnet and sending mails at university.
That's okay, right?
But then come the 90s.
In the 90s, video games start, and when you have high latency in video games, basically, you die.
So in 1994, 1995, with Doom and Duke Nukem coming around, they want a faster network.
So the students go and see the university and say, you know what, we want a faster network,
we need to work, but we also play video games.
and the university tells them that basically,
oh, I'm sorry, we cannot help you.
Because you understand, the campus is not ours.
You manage it, so do something.
And you should go see some, basically, partners of the university.
Basically, go away.
And they go, and they actually go and see the CIO of Bouygues,
which is a large French company,
and which is doing some TV in France.
And he says, well, you know what?
the future of video is satellite.
Well, today we know it's not,
but at least it was a good idea in 1995,
with the first satellite dishes.
And he says that instead of having one satellite dish
and a big decoder for each of the students,
which are 1,500,
what about you build, like,
you put an enormous dish and only one decoder
and you send a video directly on the network?
And that required a very fast network.
Today, it's obvious.
but at the time it was like being the first to do video streaming.
So they built this project, which was called Network 2000.
Of course, we're in the 90s, right?
Everything that is futuristic is called 2000.
Yeah, 2000, yeah.
And so they do the Network 2000 project.
It's completely hacked.
It crashes after 45 seconds.
That's okay.
The demo is 40 seconds.
It leaks memory.
That's okay.
They put 64 megabytes of RAM instead of the 8 or 16 you have.
And the demo should have stopped there.
And that was the Network 2000
project by the students.
What was the format of the video that they had to work with?
MPEG-2, because satellite is MPEG-2 TS for transport, MPEG-2 video, and MPEG-2 audio at that time.
And the project should have stopped there.
Everyone was happy.
They had, like, an amazing ATM network at 155 megabits per second.
They had probably one of the best networks in Europe at that time.
And they stopped the project.
Six months or a year later, two students arrive and say, well, you know what?
Maybe other people care about video
streamed on a local network.
And they create the VideoLAN
project, VideoLAN,
and one of them is called
Christophe Massiot, who is a good friend of
both Kieran and me. And they start
the project. It's not even open source yet,
and they spend around
three years to get the school to
agree to make it open source,
because the university wanted to,
because of the IP and
copyright of the students, wanted to
basically monetize these
MPEG-2 decoders.
Let's be clear, so what's the main application?
Streaming on a local network?
It was streaming on a local network.
By the way, just
to put all this in context.
This is before YouTube.
This is before...
10 years before YouTube.
You have a Pentium 60 or 75,
right?
The main machine was a 486 DX
at 33 megahertz, right?
Bear in mind, television was the main form of video
at the time.
You could get new channels.
In the 90s, having even one new channel,
when you grew up with four channels,
having a fifth or a sixth,
was a big deal.
And so having this satellite service with, you know, dozens, even hundreds of channels was so groundbreaking.
Especially because this is a university where you had a ton of different nationalities, right?
So there was a ton of people who wanted.
So in the end, they had like several dishes on different types of satellites, right?
Because, for example, a lot of people were coming from the Maghreb or the Middle East.
And so they went to different types of satellites.
Anyway, the solution worked great.
And they started the VideoLAN project.
The VideoLAN project has several solutions, and some are completely crazy, like one on
how to create multicast on a unicast network, but let's not get into that.
It's too complex, but the VideoLAN Client part is what became VLC.
Actually, they basically strong-armed the university to force it to open source, because the university
did not understand that, and in 2001 it's still early.
But basically, yes, the university agreed in early 2001 to make
it open source. I joined the project in 2003 because that's when I joined the university.
So the first thing is I'm not the one who created VLC because actually no one did, right?
It just kind of naturally emerged from the VideoLAN project.
We should mention, again, you just said it, but to make it clear: VideoLAN, at the time,
was a set of technologies around video, and VLC is what you called the client.
That's the thing that most normies think of as the thing, which is the thing that pops up when you click on a video and you play it.
So I arrive in 2003 and then I create the open source non-profit organization called VideoLAN.
And I took everything out of the university to make it a non-profit project and something sustainable.
Yes, it's true that I spent more time than anyone on VLC and VideoLAN, that is sure.
But it's a continuity of a previous project, VideoLAN, the student project,
which is a continuity of the Network 2000 project,
which is a continuity of that and that.
I'm sure there were moments along the way where
you were thinking, like, what is the future of this from the open source perspective?
Because as the Internet is blowing up,
and there's companies, I mean, for people who don't remember,
like, there's companies making huge amounts of money.
And I can tell you that in 2005, the project
should have died.
And I made the project continue.
At some point, we were only two active developers.
And I thought it was great technology and was useful and it will be useful.
And I made that my life and my time.
And I made that grow from a few hundred thousand users, to millions of users,
to what we have now, which is probably billions of versions of VLC around the world,
and used everywhere.
So that's a bit the story of VLC.
There are a ton of very funny stories around that.
Many people from around the world working on it,
like you said, in Syria or middle of nowhere in India.
But along the way, I got several offers,
which were either to bundle toolbars, right?
You remember those horrible toolbars,
which were basically spyware,
or changing your web browser or your search engine
or even like advertisement inside VLC.
And I didn't like that, right?
And people don't understand that.
It's not, I'm not against money, right?
I'm very happy to make money.
I created several startups, and one,
I hope, is going to work very well.
It's the fact that I believe that you need to earn money ethically.
There is the right way of doing that.
And doing sneaky advertisement or stealing data is not the correct way, right?
For example, if Netflix arrived at some point to say, well, we want to put Netflix inside VLC,
probably the story would have been different, right?
But they didn't.
The only people who came to us were shady ad companies.
And if I had done that, right, I would have had a ton of money, right?
And then three years later, the project is gone, right?
Someone forks it and something else happens.
So it's not even necessarily ads or any of that.
It's the shadiness, the dishonesty of it. So you had a good radar, you had a good threshold
of like, no, this compromises the spirit of what this is supposed to represent.
But also, it's for me, right?
I'm like very selfishly, I need to go to bed at night and be happy about what I've done, right?
Maybe it's my upbringing, maybe it's my parents' fault or whatever, right?
But I believe there is right and wrong, right?
And this was the right decision at the time.
It still is.
I want to be proud of what I've been doing.
and if I had sold out,
I would have betrayed so many other people who work.
Yeah, well, I should say,
me and most of the Internet,
thank you for that decision.
It's inspiring for others, I think,
who are pushing the open source movement forward,
that it's okay to do these kinds of huge sacrifices
if you believe it's right.
And I think in that case it was right,
and it was the reason that VLC became as successful
as it was because it's an embodiment.
It's a symbol of, like, you know, freedom
and what the open source community can create.
Yeah, and be a service for so many people around the world,
and this is important.
We should emphasize in the 2000s,
it was really normal to download a program
and it secretly installs some spyware.
Yeah.
It was buried in very faint text
or in the license text box that nobody reads
at the bottom, oh, I will be installing this toolbar
and changing all these things.
And it was very common at the time to have that
when you installed a program of any sort to do something.
To put yourself in the mind of a developer at that time,
I think it's very easy for everybody listening to this to see:
it was very easy at that time to convince yourself
to take a few thousand dollars to do it.
To say no to much more money
takes guts and takes vision.
The last offer I had was obscene.
And they say, yeah, but imagine with all that money,
you could build something new, open source, right?
It was like the mind trick was, it was difficult.
But for me, it was just like, no, this doesn't work like that,
or this is not the right thing, so I don't do it.
And again, right, it's not that I don't like money or whatever.
It's just like it wasn't right.
Well, once again, thank you from me and from the rest of the internet.
Let me talk a little bit more about the open source movement,
about the fact that, as you say, over and over and over and over,
FFmpeg and many open source projects are built by volunteers.
So there's a bit of drama recently, Kieran, on the interwebs on Twitter.
You have a spicy style on Twitter that I think articulates and celebrates
all the incredible developers and development and the code,
especially assembly that's involved in building some of these codecs
and building some of this incredible technology.
But that brings us to a bit of a debacle that happened.
Tell me the full saga of what happened with the Google security engineers.
Just to be clear, Google are one of the biggest supporters of open source out there.
They have been for a long time.
It's just, I think, some things kind of went a bit overboard this time.
So, FFmpeg itself, and this is not like a secret,
it's on the homepage, you know,
it processes untrusted data.
There can be security issues when you parse untrusted data.
That's very normal.
But recently, what changed was Google started using AI to create security reports on an open source project,
FFmpeg.
Volunteers had to deal with that.
They provided very limited funding.
And they even went to the media first announcing how good their AI was before the issues could be fixed.
And this is in the public forum.
Yeah.
So reporting an issue, using AI to find an issue in the code,
which is a security vulnerability,
and reporting that publicly
before you're able to fix it.
It's announcing how good their AI is, and
that they provided a standard 90-day industry deadline
without really understanding
the nature of volunteer-driven development.
In addition, this vulnerability was
on an obscure 1990s game codec.
The way, and let's look at it
from their standpoint to begin with.
Let's, you know.
Yeah, can you steelman
their case? Yeah, sure. They have substantial resources working on the security of open source projects
that, you know, are ubiquitous. And they've used, you know, a lot of compute to do that and
very expensive and very capable security researchers to do that. And that's their viewpoint is they
are contributing by doing that. But I think that's where opinions differ. It opened up a lot of
interesting fissures, I would say. It does seem that there's a portion of the security community
that look at themselves a bit like building architects that never have to go to site.
Going to site is something that is a little bit beneath them, the actual day-to-day construction.
They're there to do their security things and it's someone else's problem.
The security industry also kind of has a very aggressive tone towards things. The language they
use is extremely aggressive. They use very strong language like "you will get popped." And to Joe
Public, "get popped," you know, means something quite bad. For them, it means to get hacked. The way I
look at it personally is a little bit like the padlock on your home. A padlock on
your home, or, you know, the lock on your home, is there to protect against the capabilities of
what it's there to protect. It's not there to protect nuclear secrets. It's not there to
protect Fort Knox. And it could be looked at that they're using AI at a level of scale to go and
pick those locks and then say, hey, your lock's not secure. You need to deal with this. Whereas actually,
they're the ones with the resources to be able to fix this. But that seems to not be something either
they'll contribute to in terms of patches or financially. And the scale of AI is kind of
the issue: the bug reports are very wordy. It's almost a denial of service
by AI-generated bug reports on very niche codecs. And the other issue the security community has is
everything is marked high priority. You're going to, you know, this is the most important thing in
the world and you need to deal with this. High, high, high, vulnerable, scary, scary, scary on a game
codec used on one disc in 1993. Yeah. And that's where the dichotomy lies.
Going around telling everyone that their padlock's not safe, well, that's a hobby project of somebody.
The safety of that codec is commensurate with what that person thinks. It's their hobby. It's good that
they're security analyzing it, but it doesn't need a big, scary warning: this is a critical vulnerability.
We recently also saw that there was another quote-unquote vulnerability. It wasn't Google in this
case, but a filter could have an integer overflow, and one of your pixels could be
the wrong colour. And this was marked high, 7.5 severity in red. And at some point, the security industry
needs to realise you can't keep crying wolf like this, because this just leads to people, you know,
doing the equivalent of putting password stickers on their PC, you know, you can't just keep crying
wolf every day. And I appreciate, you know, that their modus operandi
is to create as much scare and fear as possible.
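To make the "one of your pixels could be the wrong colour" point concrete, here is a small Python sketch. It is not the actual FFmpeg filter or bug, which the conversation does not identify. It is a hypothetical brightness filter on 8-bit samples, with the function names invented for illustration, showing how an unchecked overflow changes a value rather than anything more dramatic in this particular model.

```python
def brighten_wrapping(pixel: int, gain: int) -> int:
    """Scale an 8-bit sample and keep only the low 8 bits.

    This mimics storing the result into an 8-bit variable without a range
    check: results above 255 wrap around instead of saturating.
    """
    return (pixel * gain) & 0xFF


def brighten_clamped(pixel: int, gain: int) -> int:
    """Same hypothetical filter, but saturating to the valid 0..255 range."""
    return max(0, min(255, pixel * gain))


sample = 200  # a bright 8-bit sample
print(brighten_wrapping(sample, 2))  # 400 does not fit in 8 bits and wraps
print(brighten_clamped(sample, 2))   # saturates at the brightest valid value
```

In this sketch the wrapped result is a darker value where a bright one was expected, which is a visible glitch but not, by itself, a way into the machine. Whether a given real overflow is that harmless depends on how the overflowed value is used afterwards.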
But from the Google standpoint, at the end of the day,
they need to contribute either financially or with patches.
Google uses FFmpeg at a scale probably you or I
couldn't even contemplate: millions of CPU cores.
And yes, they contribute in areas mostly regarding their own products,
so VP9, AV1, but in a wider sense,
There's a disproportionate level of contribution.
Yes, they fund students.
Yes, they fund Summer of Code.
And I think, so Alex Strange is a former FFmpeg developer,
I think posting in a personal capacity.
So he posted about security engineers on Hacker News.
His post reads,
The problem with security reports in general is security people are rampant self-promoters.
(Linus once called them something worse.)
Imagine you're a humble volunteer open source developer.
If a security researcher finds a bug in your code,
they're going to make up a cute name for it,
start a website with a logo,
Google is going to give them a million dollar bounty,
they're going to go to DefCon and get a prize,
and I assume some kind of secret security people orgy
where everyone is dressed like they're in the Matrix.
Nobody's going to do any of
this for you when you fix it.
Basically commenting on the incentives for the different people involved being
misaligned.
The problem here is a disproportion of means for discovery compared to patching, right?
And this is the biggest issue, right?
And after that debacle, Google did some changes.
They are now starting to send patches, which is...
And they also now have rewards
for fixing issues.
So it has changed a bit because of that debacle.
So it's good, right?
But we've seen, and we talk about Google,
but we have seen like some other large companies saying,
oh, you need to fix this bug because it's critical in our product.
Can you explain the XZ fiasco?
The FFmpeg tweet reads,
The XZ fiasco has shown how a dependence on unpaid volunteers can cause major problems.
Trillion dollar corporations expect
free and urgent support from volunteers. Microsoft, Microsoft Teams posted on a bug tracker
full of volunteers that their issue is high priority. After politely requesting a support contract
from Microsoft for long-term maintenance, they offered a one-time payment of a few thousand
dollars instead. This is unacceptable. We didn't make it up. This is what Microsoft, Microsoft
Teams, actually did. And then they give the image and the details and all that kind of
stuff showing that these trillion-dollar companies are not giving much money, not giving much
support.
They think an open-source project is a traditional vendor with whom they have an SLA.
They think a public bug tracker is actually, you know, a third-party vendor's Jira where
you can do all of these things.
It's not.
It's there to report bugs.
I think the thing that made this particularly heinous was the name-dropping of Microsoft,
the name-dropping that this is a visible product.
If this was just a general bug
report, I think that would have made it a lot better.
Yeah, so they literally said, like, this is a big deal because a lot of people are using
it in Microsoft.
I wonder what happens psychologically.
So I think what happens in these companies, maybe you can correct me, is they, you're right,
they just think of FFmpeg as like a vendor that Microsoft surely is paying a huge amount of
money to.
They kind of assume that in their interaction, and nobody anywhere
in the stack is going like, wait a minute,
shouldn't we be giving millions of dollars to FFmpeg?
And this is a very big problem in large,
like we're talking about some companies,
but it's the same everywhere, right?
A lot of those companies, like when we talk to that person, right,
he was just like a manager on one project in Microsoft Teams, right?
He had never really talked with the open source community.
He had no idea, right?
It was like, and, but the problem is that
usually there is what we call OSPOs, right, open source program offices, in those types of companies.
And they are the ones who are supposed to talk with open source vendors or open source communities.
But like they often don't explain that correctly internally, right?
And here it's just like, we are not your supplier.
If you want me to be a supplier, I'm very happy, right?
I will send you a contract and SLAs.
Like I created five companies that are doing that around open source projects,
so that's okay. We should say that some of the spicy tweets that, Kieran, you're behind, and some of the
debacle, produced results, positive results. Donations have increased substantially. They're still not
enough to cover even a single full-time developer, but on both a, you know, awareness level and a
technical level, there's substantially more technical awareness and sort of awareness of the importance
of FFmpeg as a result of X and what
happened. I can say, you know, it served its purpose. People realize the level of importance
FFmpeg has. And on VideoLAN, it's the same, right? Like, for example, a very simple example.
For more than a year, we couldn't update VLC on Android because of a bug on the Play Store,
on Android Play Store, right? The only way we got someone to answer was to put a very spicy,
as you say,
tweet saying that
we are going to stop
distributing VLC for
Android, right?
And we have around
100 million people
using that.
And then
someone from Android
actually came and talked
to us, right?
We had the same issue
with Microsoft,
like saying
that we were going
to stop distributing
VLC on the Windows Store.
And unfortunately,
we are so small
that the only very
strong power
we have to solve those issues
is blaming on
social networks, because it makes noise,
and now they listen to us.
But those large companies often have
difficulty
talking to us. Like for example,
VLC, right, is probably one of the
top 10 software used on Windows.
I am not part
of Microsoft
ISV programs, right?
I don't have a point of contact at
Microsoft, right? While I'm sure
any other software at that level, like
Spotify, has a point of contact.
I don't have that, right?
So raising awareness works.
It's sometimes very spicy, a lot of drama.
I'm okay with that, and it's efficient.
So everybody listening to this should go follow FFmpeg on Twitter, on X.
Follow VideoLAN on Twitter, on X.
Go donate, donate to FFmpeg.
Thank you, Lex.
Over the years, several years, you've been a supporter of, you know,
FFmpeg and VideoLAN on X, you know, giving us shoutouts,
appreciating, you know, what we do.
FFmpeg for life.
And for example, like, Tim Sweeney, Carmack, and a few others,
like very high level people have raised also the awareness on our X accounts,
and that helped a lot also.
Karpathy.
Karpathy as well, yeah.
Yeah, I mean, also, you know, outside of the fact that so many people use it,
it's so impactful on the world.
It's also a great representation of a great open source project.
Like the value of assembly and C, and making sure that, like, you take programming seriously
for real world systems.
It's not just that.
We'll talk about assembly later, I'm sure, because that's a whole topic in itself.
But it's also celebrating people like Andreas Rheinhardt who do maintenance.
I believe unpaid.
I believe as a volunteer, he's doing massive refactoring.
Andreas Rheinhardt and Anton Khirnov rewriting ffmpeg.c with threading,
celebrating those guys, celebrating the untold labor that's gone into this
that actually doesn't change anything from the user standpoint.
The files are exactly the same, but wow, the airplane has been rebuilt whilst it's in the air.
Christian Garcia said, is a teenager running this account,
referring to the FFmpeg account, and you responded,
teenagers have written more assembly in FFmpeg than Google engineers,
but also just pointing out that there's a lot of incredible contributors who are teenagers.
Like J.B. said, we don't care who you are, where you're from, what you do.
Teenagers have written thousands of lines of assembly over the years.
Give a shout out back in the days to Daniel Kang.
So also highlighting the work of people like Ruikai Peng, this is a 16-year-old,
some of his first contributions to FFmpeg, actually putting some of these, quote, unquote, security
researchers to shame by actually finding issues and fixing them, and being 16.
There's no barriers.
There's no barrier like you have to study at college under this person and understand these things.
If you can learn C, and let's be honest, it's from the K&R book, learn C, you can learn assembly.
We'll talk about that maybe a bit later.
You can contribute to world-class technologies.
In VLC, one of the oldest contributors is called Felix.
He's the one doing everything on Mac and iOS.
When he started working on VLC,
he was 16.
We had a guy called Edward Wong,
who used to be a Google Summer of Code student
who stayed for three years around VideoLAN.
He was 14, right?
And part of Google Summer of Code and Google Code-in,
which were programs where basically we had students
or high schoolers, they wrote a ton of assembly for x264
and for VLC and for FFmpeg, right?
So everyone can contribute.
And he also did a good job because he didn't play
the alarmist CVE heist, create a CVE, which is like a public exposure of security, and do these big, scary, red, 7.5 priority.
He just found an issue and, within three days, just fixed it.
He didn't need to go and play a big security drama about it.
And I think, I posted, you know, the kids are all right, whereas, you know, I'm not saying all security people do this, but there is a portion of the security community, as Alex said, that
likes to hype themselves up by creating drama,
they would have happily raised,
this is a high priority CVE 8.0 or whatever,
on an issue that actually was in Git,
it wasn't even in a release,
it was in development,
and three days later was fixed.
Well,
I just want to put a little bit of love out there,
even to the bigger community.
Much love and respect to Google engineers.
Like you said,
they're some of the best software engineers in the world,
and they do contribute a lot,
on the security front.
And also, you know, I'm a big fan of Theo.
Much love to Theo.
He was part of this debacle and drama a little bit.
I think when you just zoom out on the grand arc of human history,
the drama contributed positively to everybody involved.
Donations went up.
It brought more attention to the topic,
allowed everybody to bicker in a way that ultimately got them to figure out
whatever FFmpeg is all about.
So the way we looked at this,
is like, it's a rap battle at the end of the day.
No, but it is.
We say stuff.
We say stuff.
Yeah.
But we can leave it on X; it's a perfect place for, you know, an international rap battle.
You say stuff.
I say stuff about your mama, but it doesn't mean, you know, I have an actual personal
issue with her.
Yeah.
And that's what it looks like.
The Theo situation, you know, J.B. can maybe expand, went a little bit too far.
And there was a little bit.
But, you know, it's just a bit of fun.
It's just a bit of rap battle.
It's a bit of WWE.
You know, everyone's having a bit of fun on X.
It doesn't need to be taken seriously.
You know, the teenagers thing, you know, so that guy was a Google employee saying, hey, you know, there are other ways to run an open source business, and so on, and I was like, oh, man, just have a bit of fun, you know.
That's what the point of this account is.
And furthermore, if you can teach people about the ways of open source projects, assembly, et cetera, by doing that, I think there's a lot to be offered here.
It's not dunking on people for dunking's sake.
It's showing actually the story that I think X learned is these are not big corporate open source projects.
It's not Kubernetes where there's, you know, hundreds,
maybe thousands of people paid to develop this stuff.
These are just people in their basements in their spare time.
And if you can address that topic in a fun and entertaining way,
I think that that's the good thing.
And that's the value of X and in the reach we have.
And to be honest, right, like even at Google,
Google is one entity, but so many different people, right?
And there are a ton of Google engineers
we work with all the time, and even Google, from YouTube to Chrome to Chrome Media to the rest of Google,
those are very different types of entities.
But what we do is efficient.
And for example, for Theo, right, it went a bit too far.
I had him, like, I calmed everyone down.
I had him on the phone.
We said, okay, like this goes too far and so on.
But in the end, yeah, it's a rap battle, but it's positive for the project.
The awareness we have on
open source, and I mean true open source from communities, right,
has increased dramatically in the last two years, and this is useful.
What is it that motivates all the incredible contributors that we've been talking about?
What's the engine?
It's so interesting to see.
Like you said, they're sitting in the basement.
What's the driver?
What's the engine there?
There are many drivers, but weirdly the main one is that what we do in multimedia plays videos
and video is cool, right?
And for example, we have so many people in the community who arrive because they loved watching anime, right?
And this is like the advice when people ask me, what should I work on in open source?
How do I start?
And my answer is always the same.
Work on something you love.
I am working on VLC because I love movies, right?
And I love watching the same movies over and over, even if my wife hates me when I do that, right?
But because it's interesting, right?
because it's a topic that you like, right?
That's the first thing where people come to,
usually, to VLC and FFmpeg.
The second thing is that technically,
because we search for excellence,
this is the best school ever, right?
This is the best school ever of programming.
If you're good in C, in FFmpeg,
if you know how to write assembly,
I assure you you're going to be one of the best programmers ever,
even if you're working on writing TypeScript,
because this is the most amazing,
thing to do. And you will
have to get reviews
by some of the most
seasoned programmers ever, who are
going to look at every part of
your code and tell you why it's not great.
It's like we are the best teachers that you've
ever had in programming, right? Andrew
Kelley started Zig. He was an FFmpeg developer
and started Zig after his FFmpeg
school. I mean,
it's the place to learn
so many aspects of programming
in the real world, in a thing used by
billions of people. You have nowhere
to hide. You have to be open and honest about your flaws and how you can learn and be better.
And what is also interesting in multimedia is that you have 16 milliseconds to display a frame.
It's not like a game engine where you can basically slow down and wait a frame. So you need to be
good, right? There is no choice, or else you don't have your video. And because of how codecs work,
if you miss a frame, you're going to destroy the look of the video, right? So you need to be good.
You need to be perfect to have the right thing.
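The 16 milliseconds figure comes from simple arithmetic: at roughly 60 frames per second, each frame gets 1000 / 60 ≈ 16.7 ms. A minimal Python sketch of that budget follows. The decode timings are made-up numbers and the function names are invented for illustration. Real players also buffer decoded frames ahead of time, so this is a simplified per-frame view, not how VLC or FFmpeg schedules work.

```python
from typing import List


def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at a given display rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def late_frames(decode_times_ms: List[float], fps: float) -> List[int]:
    """Indices of frames whose decode time alone exceeds the per-frame budget."""
    budget = frame_budget_ms(fps)
    return [i for i, t in enumerate(decode_times_ms) if t > budget]


# Hypothetical per-frame decode times, in milliseconds.
timings = [9.5, 12.0, 21.3, 15.9, 17.2]

print(f"budget at 60 fps: {frame_budget_ms(60):.2f} ms")
print("frames over budget:", late_frames(timings, 60))
```

With these made-up timings, two of the five frames blow the 60 fps budget, while all of them would fit comfortably at 25 fps, where the budget is 40 ms.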
But also, it's not just pure programming in the mathematical sense, right?
A lot of people don't understand, but in order to program correctly in the open source multimedia community,
you need to understand how a computer works.
And when you write assembly, you need to understand about CPU pipelining, right?
You need to understand how SIMD works, how the ALU works, right?
You need to understand how IO works, right?
And this is what I think that is missing to a lot of engineers and software engineers today
is understanding what we call computer architecture.
And like seriously, like some of the debates is like, should we use this assembly instruction or this one?
And people say, well, no, it's going to be like three cycles on this type of CPU and this one.
And it has massive impact on the output, right?
We should expand.
FFmpeg is probably one of the biggest CPU users in the world.
It's probably running as we speak on easily, order of magnitude, 100 million,
maybe even a billion CPUs.
So every instruction matters.
The impact, at least in terms of CPU, is massive for everything that we do.
So first you come because it's an interesting subject, then you stay because it's excellent.
And in the end, you're very proud of it because it's in the hands of everyone.
Like so many people like, oh, I'm working for whatever consulting company and I'm doing some
portal to download invoices for your PG&E.
Wow, great.
Like so many jobs are like that.
You're not going to tell that to your grandma.
But if you go to see your grandma and say, I do this so that you can play video on your
laptop, they understand.
And this is very important, right, because you're working on VLC, FFmpeg, x264,
it's in the hands of hundreds of millions of people
and you have an impact.
And so you can be proud of yourself.
And so I think that in addition to making a great resume,
all those things are why people contribute.
Yeah, those are side effects.
My favorite quote on this topic is from John Collison.
He said,
The world is a museum of passion projects.
You know, everything out there is a passion project.
And open source multimedia and open source in general,
you can just do that so much faster.
There's such a faster network effect,
you know, I can open a cafe and that can be my passion project, but I have to get building codes.
I have to build a building. I have to find a location. I have to do all the, you know, all sorts of
things. Well, in the software world, that passion project can move quickly. It can be amplified
by the network effect. And that amplification can be more than the sum of the parts. You know,
you can find people interested in extremely obscure things and have a network effect
make something that is truly amazing.
And on that topic of passion projects, Tim Sweeney actually said in a reply to a tweet that was
complimenting J.B.,
he said, quote, many things in the world only happen because an awesome person decides to do it.
This is the case with VLC.
And that speaks to something interesting to me.
It does seem that a small number of people, sometimes one person, can create
something incredible in the software world.
Like you said, it's over and over and over.
I think JavaScript is an incredible thing created by initially a single person.
Some of the programming languages like Python and C and Java,
like this one person has this vision, has this design,
and brings it sometimes over a weekend is the initial spark.
Yes, Linus built Git in two weeks.
Wow.
Yeah, it changed the world.
Git.
I mean, it really changed the world.
Linus's passion for, hey, uploading the tarball to an
FTP, like, deal with it.
But for me, it's not just in software, right?
And I believe in individuals that are going to change the world, right?
And it's with a good, as you said, vision, right?
I want to do that.
It is useful.
It will be useful.
And whether it's going to be building trains or cars or rockets or something, like, I believe
people who believe in themselves and have a vision can have a huge impact for humanity.
Let's actually zoom out before we zoom back in.
We'll just keep going up and down the stack.
So, you know, we've been talking back and forth, VLC and FFMPEG.
Kieran, you said that FFMPEG and VideoLAN's VLC coexist,
and there's no central point of importance.
It's kind of what you called a binary star system.
They succeed because of each other.
Can you explain the difference, how they interact?
What is the...
Are they competitive?
I don't think they're competitors.
I think the short answer, before I go into detail, is VLC is to FFMPEG as Android is to Linux.
So they depend on each other, but they coexist because of each other.
So they are a binary star system, is the analogy I used.
By the way, I feel horrible that I just recently learned that Alpha Centauri, the closest star system to us, is a triple star system.
And when you start doing the physics, it's a nightmare, right?
Yeah, hence the three-body problem.
But anyway.
So a lot of FFMPEG pipelines involve the X264 project, which is a VideoLAN project.
I would put a finger in the air and say 80-plus percent of those pipelines are dependent on a VideoLAN project.
VLC, obviously, we've discussed, a VideoLAN project, uses FFMPEG, gives it reach, exposure to weird files,
historically used some donation money to fund FFMPEG development,
and we'll talk a bit about some of the reverse engineering later.
So it's a binary-star system.
They work and feed off each other.
Many of the developers are shared.
There's no central location.
It's a virtuous cycle working together.
We should mention that X264 is the encoder for the H.264 video standard.
So H.264 is the standard.
Yes.
X264.
It's the open-source implementation of the standard.
That's used by basically everybody for everything.
Yeah.
It is the main driver of this.
When you think of an MP4 file that has an H.264 codec in it,
if it came from a software environment, like a data center or somewhere, the chances are it was created with X264.
And that's under the flag of VideoLAN.
That's a VideoLAN project.
So in the VideoLAN graphic, it sits in the VideoLAN world.
And VideoLAN has a bunch of stuff in it.
You go to the VideoLAN website.
There's a bunch of icons.
Like if you look, there are so many libraries, right?
libdvdcss, libdvdnav, libdvbpsi,
libVLC, of course, VLC, Unity, libbluray.
libbluray.
Yeah.
And there are so many more, right?
Like lately, the dav1d project that we might talk about is the latest project from VideoLAN.
It's everywhere, right?
And we have libspatialaudio lately that we now...
checkasm.
checkasm, which is like an insane project, but amazing.
And X264 is one of those VideoLAN projects.
And my opinion, for example, is that X264 was the most amazing encoder ever designed.
And this helped the adoption of FFMPEG.
A lot of people and large companies went through FFMPEG
because they wanted to use X264,
and X264 increased the popularity of FFMPEG.
But also, VLC had its popularity because it played so many files that were done by FFMPEG, right?
So it's many projects that are intertwined and work together.
Yeah, unfortunately, there's a thing on X where VLC is mentioned and there's people saying,
a quick reminder that it's FFMPEG inside doing the actual work.
And, like I said, that's not the case. We work together.
And to give you an idea, right, when I compile VLC for Windows, I compile
around 16 million lines of code, right?
One million of those are inside the VLC repository,
and FFMPEG in total is probably around two million, right?
But that means that so many dependencies are outside.
And if you also look at FFMPEG per se,
FFMPEG also is integrating third-party libraries,
like X264, but also libopus and so many others, right?
So we all depend on each other.
Yeah, that's why I was hoping to do this episode,
as we are doing, that just kind of joins FFMPEG and VLC,
because it's really two of the same, like you said,
a binary star system, and we're all just orbiting it.
Can we give a shout out to some of the people along the way?
We didn't really quite talk about the history of FFMPEG.
So maybe can you tell me about Fabrice?
Can you tell me about Michael Niedermayer?
Can you tell me about some of the key figures here,
the eras of FFMPEG? Because there's
key eras and key people that made
this possible.
Fabrice Bellard, as you mentioned, creating the concept.
And then, in what I would call the
Eras Tour of FFMPEG, the 2000s era was
Michael Niedermayer.
So key things he got done was exhaustive support for
DivX and Xvid at the time and all sorts of weird
variants of what's known as MPEG-4 Part 2.
So this predates
the MPEG-4 Part 10 that we're used to.
So this was 2000s era video codecs
where there were flavor
after flavor of weird, weird decoders.
At the time in the 2000s,
you needed a new player
to play every different type of file format.
So there was Windows Media Player
to play Windows Media formats.
There was RealPlayer
to play RealMedia formats.
And the other key thing
in FFMPEG at the time
was native decoders for those.
I actually do remember being a teenager.
I must have been
figuring out there was this one player that could play,
could decode these files without having separate bloated players.
Because at the time when you downloaded RealPlayer,
there was a ton of other stuff in there,
a ton of ads, a ton of other things,
and just having a simple library that was fast led to that.
And then I think 2008 onwards
was a big change,
because that's when H.264 reached maturity.
And I think, something hopefully we'll talk about a bit
more, this was the beginning of high-definition video. So H.264 was the key decoder of that.
So I'd call that the late 2000s and 2010s. And that's when the big reverse engineers came
along and really did astonishing work. In the beginning, a single player that could play
Xvid, DivX, Windows Media and RealPlayer was already a massive achievement in itself. Without
codec packs, without weird stuff you had to download that had weird ads and weird spyware.
VLC 1.0 came out around that time,
2009, 2010,
and this is like where it exploded.
Yeah, without codec packs,
it just works across all these different...
In fact, it's just like all the codec packs are FFMPEG
inside VLC, plus we have other modules for other types of codecs.
What was around at the time,
in the 2000s,
were weird codec packs with DLLs coming from this place,
DLLs coming from there,
with spyware, with, you know,
it wasn't really
reliable, you didn't know. And so having a single player that was open source, or a single playback
module slash player that could do this, that was open source. But I think the thing to emphasize
is this task in the 2000s that Michael did was Sisyphean, and really the number of edge cases
is beyond comprehension, in terms of you could have a Chinese CCTV system that did one weird
variant of MPEG-4 Part 2, which was known as MPEG-4 ASP. And that was a weird variant. And you had to fix that
without breaking everybody else times a million.
So you said that's where a lot of the reverse engineering was happening.
It started in the 2000s with the Windows Media stuff because that was proprietary.
It started with RealMedia, so with Benjamin Larsson,
Kostya Shishkov, that era.
Those were the key, that was the key groundwork.
And then in the 2010s was kind of the Paul Mahol, Kostya era, doing some of the most difficult codecs.
J.B., maybe you can talk about GoToMeeting 4 and GoToMeeting 5.
And what's GoToMeeting?
So, like, let's talk about this amazing Ukrainian guy called Kostya,
who was at the time living in Germany and who was in love with Sweden, right?
And the guy, like, a lot of the people in the community are very clever.
He's one of those who are like borderline geniuses, right?
He was able to reverse engineer extremely complex codecs.
And he does that, and we do a bit of reverse engineering with Kieran,
but clearly not at this level.
No, no.
He reverse engineered binary blobs, which are 20 megabytes.
Yeah, so just for reference,
one megabyte binary blob to reverse engineer is probably order of magnitude,
a month of work.
And this guy is doing 20, 30 megabyte blobs.
Maybe we'll talk about that in a minute about the subtleties of how you do that.
but this guy is doing it for very difficult and very obscure codecs.
And did that for fun, right?
And so GoToMeeting was a big problem with VLC
because that was, like, the number one feature request for a long time,
so I put a bounty.
And the guy at some point said, okay, J.B., I'm going to do it.
And in a matter of two months, and then he explained how he did it.
He was just like, oh, I looked at the code.
Like, this looked like the DCTs that I used to see in
WMV and so on. He did that.
And the funniest part is that the code he's written has a ton of jokes.
And there is a ton of J.B., right, my name, and Kempf and Kostya jokes inside the code.
The code is beautiful, right?
So one of the things I want to call out is I've gotten a chance to speak to some of the developers,
some of the assembly language level people.
And they all always make everything sound like it's kind of easy.
There's a kind of humility because maybe just the level of what's required to do this stuff is so high that everything else seems easy.
I guess is the lesson to take away from that.
So in the community, like, some of the most impressive people are the ones doing reverse engineering and the other ones doing the assembly, of course, right?
And both of those types of people are amazing.
X264, for example, became amazing
because of a guy called Loren Merritt,
who was from the University of Washington, I think.
At the time, yeah.
And who was, like,
who made everything great and fast
doing a ton of assembly.
Yeah, so this is like the golden era, I guess,
where so many things got done.
Yeah, if you look at Kostya, for example,
he looked at the world as a binary specification.
He didn't need documentation or anything.
I have a binary, and I can figure all of this out.
And he regularly used the phrase binary specification.
Ah, you know, it's not a problem.
And he went and he would go away and he would come back and he would do interesting stuff.
Can you actually speak to the details and add color and texture to what it takes to reverse engineer a blob?
Yeah, so GoToMeeting is a good one. Say I record a meeting on GoToMeeting, for example.
How do I play it back without needing this GoToMeeting player?
There may not even be a player.
I may need to send a recording of a meeting to someone
that doesn't have a player or whatever.
So first of all, there's a ton of other stuff there.
There's an actual video conferencing client.
You need to go and find, it may be easy,
it may not be easy to find the actual module
doing the decompression.
You need a way to actually dump the YUV data
from the module.
So often it involves opening it in a disassembler,
trying to guess where the hooks are
to incorporate that module
and run that module natively to decode a sample file.
So figure out where this module is doing the decoding process
and find a way to hook in and output the raw YUV data
because you will need that as a point of comparison
for when you actually do the reverse engineering
because you'll need to be bit exact or in some cases close to bit exact.
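The comparison step described here can be sketched in a few lines. This is only an illustration, assuming both the reference dump and the reimplementation's output are raw 8-bit planar YUV 4:2:0 with even dimensions. The function names `first_mismatch` and `locate` are hypothetical and do not come from any real tool.

```python
def first_mismatch(reference: bytes, candidate: bytes):
    """Return the offset of the first differing byte, or None if bit exact."""
    for offset, (ref_byte, cand_byte) in enumerate(zip(reference, candidate)):
        if ref_byte != cand_byte:
            return offset
    if len(reference) != len(candidate):
        # One output is a prefix of the other: it stops early.
        return min(len(reference), len(candidate))
    return None


def locate(offset: int, width: int, height: int):
    """Map a byte offset to (frame, plane, x, y).

    Assumes 8-bit planar YUV 4:2:0 (Y plane, then U, then V) with even
    width and height. Other pixel layouts need different arithmetic.
    """
    luma = width * height
    chroma = (width // 2) * (height // 2)
    frame, pos = divmod(offset, luma + 2 * chroma)
    if pos < luma:
        plane, stride = "Y", width
    elif pos < luma + chroma:
        plane, stride, pos = "U", width // 2, pos - luma
    else:
        plane, stride, pos = "V", width // 2, pos - luma - chroma
    y, x = divmod(pos, stride)
    return frame, plane, x, y
```

Knowing which frame, plane and pixel first diverges narrows down which stage of the reimplemented decoder to look at next.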
And then you open up your disassembler.
Use a lot of intuition to go and figure out, you know,
where the DCT is, where the entropy coding is.
There is a kind of, not a rulebook, but there's always a pattern of some sort.
For example, GoToMeeting, you know it will be a lot of screen codec tools.
There's also different variants.
So often, I think there's, what is it, GoToMeeting 4, 5?
Oh, 2, 3, 4, I think.
2, 3, 4.
Yeah.
So as you mentioned here, going to Perplexity, GoToMeeting uses its own
proprietary codec for older recorded sessions, historically stored in WMV files that require a special
decoder to play properly on Windows.
Without this decoder installed, Windows Media Player
and some editors cannot decode the video tracks
so you may only hear audio or see a black screen.
Boy, do I remember that.
But this is reverse engineering that.
This is key, right?
Because GoToMeeting is something that not many people know anymore, right?
You know about Zoom and Teams and so on.
But like, now let's fast forward 10 years, 15 years.
And like, this GoToMeeting decoder is for 32-bit Windows, right?
which is like, oh yeah, but I'm on Android,
I'm on an iPad, I'm somewhere else, right?
How are you going to do that?
I'm going to be on RISC-V, on ARM.
Those are blobs, but there are tons of files we need to support for the future.
And this is why this type of work is exceptionally useful for humanity.
I just have to say, though, that reverse engineering process is mind-blowing.
It's crazy.
It's like, it's kind of like, you know, I've been reading,
a lot and I interview archaeologists.
I mean, you just have so
little signal. Yes, yes, you know
over time you get so much
experience, you understand the structure of the
original code, so you can kind of start inferring
basics.
But you're like
archaeologists with a little
brush trying to reconstruct the entire
human civilization.
Kieran is too humble, but Kieran has done some
reverse engineering. Of CineForm, yeah, at the time.
CineForm, nice.
Yeah, at the time, which actually led to the
open sourcing of that work.
So in parallel to doing the binary side, you obviously have samples.
In many cases, you don't have many samples.
So you have to figure out what all the different flavors are.
And CineForm, for example, is actually a collection of different approaches and toolkits within that codec.
Because often it grows naturally.
And the hard part is finding a sample that gets you kind of somewhere to start without having to implement 10 different other things.
So start there.
I think thankfully at the time I found a sample,
by pure chance,
that had a lot of flat blocks.
It was animation.
So that really helped a lot
because it wasn't using
particularly complex coding tools,
etc.
And you could kind of get somewhere
and then build up and build up
until you figured out,
hey,
here's a few bits here.
I missed this,
I missed
this if branch that it does, and go,
oh.
So when we say samples,
you mean sample videos.
Yes.
And then you're tracking,
trying to infer,
like,
what is this code doing?
Yes.
By observing the sample
and then looking at what,
at the machine.
The machine code saying, if this byte is six, take this branch.
And then a different sample.
Oh, it's nuts, man.
That is nuts.
So you see, this is nuts.
Then you go to things like GoToMeeting.
Yeah, it's like.
Mine was easy, right?
Imagine two orders of magnitude more complexity.
A guy alone, somewhere in Germany, doing that.
Yeah.
And for a long time, you work, you're in a black box.
Because with a decoder, for a long time, because there are so many steps, from the
entropy decoding, the intra-prediction, the motion prediction, the IDCT and so on.
For a long time, you don't see anything, right?
So you're debugging purely in memory.
Debugging guesswork.
And you may have the buffer that the coefficients are stored in completely wrong.
And so you may be going down a complete rabbit hole thinking it's this and then, oh, damn, that's not, that's something else.
And you're doing that on binaries that are tens of megabytes, millions of instructions, right?
So you're stepping through the debugger, like one
by one, you know, instruction by instruction, going, hey, this instruction changes this, this,
this, pausing the program on the CPU level.
Pausing it there on the CPU level, watching what's going on, trying to figure out.
Sometimes you need to be in a VM, so yes, you can pause the VM.
Yeah, pause the VM, dump the memory, because some of the codecs could have encryption.
There could be, like, a DRM on there.
So you need to dump the memory from a virtual machine.
Like, when I joined the École Centrale Paris in 2003,
Jon Lech Johansen, who basically broke the DVD specification and created DeCSS,
showed us how he was breaking a DRM, which was MP4 FairPlay from Apple.
What he did on his laptop, and I was young, I was 21,
was just like mind-blowing because he was basically debugging Windows
inside a type of VM with, like, wow.
It's incredible. It's mind-blowing and inspiring.
Does it get, like, from your experience and from what you've seen in the community,
Does it get discouraging?
People help you, people send you samples, people are keen.
Sometimes you don't have access to an encoder,
so this is even more difficult because you just ask and you have to ask for samples.
I remember VideoLAN used to tweet for samples at one stage,
Hey, I need this obscure sample.
For a long time, I was, oh, I need this codec, and I need this codec.
And if you were unlucky,
you'd get nothing, or you'd get one or two.
And then sometimes you'd find a gold mine.
They were like, yeah, my company has 100,000 of these files because we depended on it for some reason.
And so those are kind of the best, because then you can test bit exactness across the huge range of coding tools.
Can you explain bit exactness?
Bit exactness.
So most but not all video codecs, certainly from about the 2000s onwards, have a bit exact definition.
So every implementation must produce exactly the same bits, bit for bit,
exactly the same data that comes out of a decoder.
For like a large number of samples.
For a given sample.
So Lex's implementation, JB's implementation,
and my implementation of H.264 must match bit exactly.
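A check for bit exactness between two implementations can be sketched as a per-frame digest comparison. This is a minimal illustration, assuming each decoder hands back its output as a list of raw frame buffers. The helper names `frame_digests` and `is_bit_exact` are hypothetical, and MD5 is used here only as a convenient fingerprint.

```python
import hashlib


def frame_digests(frames):
    """Fingerprint each decoded frame so outputs can be compared cheaply."""
    return [hashlib.md5(frame).hexdigest() for frame in frames]


def is_bit_exact(frames_a, frames_b):
    """True only if both decoders produced identical bytes for every frame."""
    return frame_digests(frames_a) == frame_digests(frames_b)


# Two made-up two-frame outputs that differ in a single bit of one sample.
decoder_a = [bytes([16, 128, 128, 235]), bytes([16, 16, 16, 16])]
decoder_b = [bytes([16, 128, 128, 235]), bytes([16, 16, 17, 16])]

same = is_bit_exact(decoder_a, decoder_a)
different = is_bit_exact(decoder_a, decoder_b)
```

Storing digests rather than full frames keeps a large sample collection cheap to re-test after every change.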
That wasn't the case in the 90s with MPEG-2.
Probably fair to say,
one of the biggest mistakes the video industry made.
And I think people who were in the room in '92,
when both of us were mostly in diapers, I suspect,
have acknowledged that.
I would give a shout out to Yuriy Reznik.
He's acknowledged that was one of the big mistakes of the era.
And you're saying the encoders needed to be able to run tests
and then the bit exactness.
I mean, that's a nice thing to guarantee.
Like, there's a parallel sort of development here
on the way the web browser works,
which is, you know, takes HTML and displays it
and there's no bit exactness there across the different engines.
I would point out, actually,
FFMPEG is unique in the sense that
it has been a winner-takes-all scenario.
Browsers are a good analogy
because it has to parse a lot of different content
and render it in a particular way like a decoder.
But there still are multiple browser engines.
There's Firefox's one, there's Chrome's one,
there's a few Japanese ones that are pretty decent.
That's not been the case in multimedia in general
across a wide range of codecs.
FFMPEG has kind of won it all, I suppose, in a sense,
because of the fact that
every new codec added is actually worth more
than the value of that codec itself
because it makes the whole thing better.
Man, this is really cool.
Going to Perplexity, Yuriy Reznik is a multimedia and signal processing researcher.
Got his PhD in computer science from Kyiv University,
with over 150 papers and more than 80 granted U.S. patents. Contributor to major multimedia standards,
including H.264/MPEG-4 AVC, H.265, MPEG-4 ALS, G.718.
G.718 is telco stuff.
And so he was more connected to
RealAudio, RealVideo, right?
Yeah.
Very important at a time.
Zencoder, Brightcove, context.
Man, I need to hang out with Yuriy.
He's legit.
And he's like one of the nicest people ever, right?
Like, for example, for my startup that I'm doing right now called Kyber, right?
I met Yuriy because I meet him every year at the Mile-High Video conference,
which is in Denver.
And he gave me like so many good ideas and good things.
He's like a really amazing person.
He tells us how, you know, how great it is to, you know, even know us, and you look at
that and I think it's the other way around, Yuriy.
That reminds me of a thing that you mentioned to me about FATE testing and, like, the insanely
rigorous process that's used to test everything that's incorporated into FFMPEG.
Can you take me through the testing process?
Yeah, so FFMPEG has a system called FATE, the FFMPEG Automated Testing Environment.
Because FFMPEG runs on so many different OSes
and can be compiled with so many different compilers,
there's a crazy number of configurations.
So you can see the absurd combination of compiler variants,
operating system variants, instruction sets.
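How quickly those combinations multiply can be shown with a small sketch. The axis values below are illustrative assumptions only, not the actual list of FATE machines, and `build_matrix` is a hypothetical helper.

```python
from itertools import product

# Illustrative axes only; the real FATE list is whatever volunteers host.
OPERATING_SYSTEMS = ["linux", "macos", "windows", "freebsd"]
COMPILERS = ["gcc", "clang", "msvc"]
ARCHITECTURES = ["x86_64", "aarch64", "riscv64", "ppc64"]


def build_matrix(oses, compilers, archs, unsupported=frozenset()):
    """Every (os, compiler, arch) combination except excluded (os, compiler) pairs."""
    return [
        combo
        for combo in product(oses, compilers, archs)
        if combo[:2] not in unsupported
    ]


# MSVC only targets Windows, so drop that compiler for the other systems.
UNSUPPORTED = frozenset(
    (os_name, "msvc") for os_name in OPERATING_SYSTEMS if os_name != "windows"
)
MATRIX = build_matrix(OPERATING_SYSTEMS, COMPILERS, ARCHITECTURES, UNSUPPORTED)
```

Even these three short lists leave 36 configurations, before adding compiler versions or instruction-set variants as further axes.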
You can see at the top macOS has tons of different variants
because it has iOS, it has tvOS.
I'm looking at a page, fate.ffmpeg.org,
81 minutes ago,
76 minutes ago, looking at the different architectures, the operating systems,
the different compilers, Apple Clang version.
Combinations are crazy.
These are all run by volunteers.
So these are all volunteer systems.
The ones at the top, for example, the Macs, I host in my office, for example,
host all sorts of different stuff.
Other people host other things.
So it's really there to make sure, because FFMPEG has quite complex C code, for example,
you do have miscompilations.
So the compiler will sometimes compile C code incorrectly, for example.
This happens once in a while.
There's a log of all the compilations.
Yeah, log of all the compilations, all the tests.
I think one of the other ones will show all the tests passing.
If you click, you can see all the tests.
Back, all tests successful.
In logs test, yeah.
So you see all those tests are passing of all the different codecs,
all the different filter transformations, all of them.
The level of scale is quite crazy.
On all the combinations, it's not just a matrix at this point.
It's like a pivot table of different combinations.
That's nuts.
And it's a key part of what we do because you may be able to test something locally.
You make a change.
But actually, that breaks GCC version 11 on Mac or something like that.
And you're able to then fix that.
We also have miscompilation.
So the C code, sometimes the compiler can have a bug in it where it creates the wrong output.
And that can have quite a big effect.
sometimes on a video because of the way frames have dependencies,
even a small change in the output can cascade to actually quite big glitches.
You see PowerPC, you see RISC-V, you see Arm.
There was RISC.
There was weird stuff in the past like DEC Alpha.
You see Visual Studio, different versions of Clang, Intel compiler, Apple Clang, you name it.
What are some of the pain points?
Like maybe do you have emotional triggers, maybe nightmares,
about a particular operating system,
a particular container, codec, a combination?
For me, it's really easy because...
So I have a day job.
My company builds...
The company I started builds equipment
for broadcasting sports matches
between stadiums and TV studios, for example.
We have to work with 10-bit video,
and 10-bit video has a set of challenges
in that you can't process 10-bit data natively on a CPU.
So that means you have to stick it in 16
bits. So that means you have six wasted bits. So there's different packing formats to actually
pack the data more efficiently, because when you send that over a network, you lose out, so you
need to save that 40% or so. For example, on PCI Express, you may only have the bus bandwidth for that.
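The idea of packing can be illustrated with one simple layout: four 10-bit samples stored in five bytes instead of eight. This is a sketch under stated assumptions, not any of the industry or in-house formats mentioned here. The bit order (big-endian, first sample in the most significant bits) and the names `pack_10bit` and `unpack_10bit` are made up for illustration.

```python
def pack_10bit(samples):
    """Pack 10-bit samples into 5 bytes per group of 4 (first sample in the top bits)."""
    if len(samples) % 4:
        raise ValueError("sample count must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(samples), 4):
        word = 0
        for sample in samples[i:i + 4]:
            if not 0 <= sample < 1024:
                raise ValueError("sample does not fit in 10 bits")
            word = (word << 10) | sample
        out += word.to_bytes(5, "big")  # 4 x 10 bits = 40 bits = 5 bytes
    return bytes(out)


def unpack_10bit(data):
    """Inverse of pack_10bit: recover the list of 10-bit samples."""
    samples = []
    for i in range(0, len(data), 5):
        word = int.from_bytes(data[i:i + 5], "big")
        samples.extend((word >> shift) & 0x3FF for shift in (30, 20, 10, 0))
    return samples


samples = [0, 1023, 512, 1]
packed = pack_10bit(samples)
unpacked_size = len(samples) * 2           # bytes if each sample sits in 16 bits
saving = 1 - len(packed) / unpacked_size   # fraction of bandwidth saved
```

With this layout the packed form is 5 bytes against 8, a saving of 37.5%, which is in line with the rough 40% figure quoted above.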
And so I think internally we have, some are industry ones and some are internal
to our own hardware that we build, we have, I think, a five by five or six by six matrix of
every single format to every single other format conversion.
In fact, one of them I sent you,
and they're all written in handwritten assembly,
and they all support different CPU generations.
So this is really traumatic,
handling all these different combinations times a million.
By the way, the company you're talking about is Open Broadcast Systems.
No relation to the free OBS streaming service.
Yeah, yeah.
But J.B and I have started companies,
broadly speaking, around the FFMPEG VLC ethos,
So that's really low-level work.
So in most companies, this wouldn't be written in assembly.
It would be accepted that C is fast.
As you can see from that, C is not fast.
So here it says 62 times faster than C.
Yeah.
So it's taking the ethos of doing low-level programming, real-time programming,
and using that for commercial applications.
And J.B. and I have started companies around that,
in many cases hiring developers from the open source community
to use that ethos.
And so that's a great example of some of the things we're doing.
In most companies, it would be,
I'll write this in C and it's fast and we're done,
but actually you can get a lot better.
For me, some of the headaches we have
is around some OSes that are difficult to support, right?
Because if you look at VLC, and thanks to FATE and FFMPEG,
the last version of VLC runs on Windows XP,
still runs there, and runs on Windows 11.
We work on macOS 10.7 to the latest macOS, whatever it is, right, 26.
We work on iOS since iOS 9, while we are actually at iOS 26, right?
We support many types of Linuxes, BSD, Solaris.
The last version still runs on OS/2, right?
Like there is maybe 10 users of OS/2 in the world
and one of them is maintaining VLC.
Then you realize that this very small team around VLC,
using FFMPEG codecs and all the other ones,
supports more OSes than Microsoft or Google or Apple,
and they have an infinite amount of power and resources.
But for example, the worst is iOS.
In order to build on iOS 9,
we need to do some very clever mixing of several versions
of the Xcode IDE and SDK from Apple,
from several versions, and do a type of Frankenstein version of that
so that we can still support iOS 9,
which is not supported at all by the compiler of Apple,
in order to still run on ARM32 on iOS 9.
And you've seen on FATE that it was still
supporting iOS 9, right?
So my headaches are mostly related to the support of so many OSes.
And it's important because, like, we receive so many people saying, hey, thank you.
I still have my iPad 2 to watch movies.
And it still works on iOS 9, right?
And it's also an impact of like not forcing people to buy new hardware when it works fine.
If you optimize it correctly, which brings us to what we were saying about assembly,
it's also fighting like the fact that you need to buy something new nonstop while you could
optimize more, which is a lost art.
You got to tell me about this lost art or this, the carriers of the flame of assembly.
What is assembly?
Why is it beautiful?
Why is it challenging?
How does it work?
So when you write assembly code, you write this using the instructions
the actual processor is using directly.
So most of the time you would write in a language,
C is a good example.
The compiler would use that to create assembly language
and machine code instructions for you based off your C code.
And there's a specific flavor of assembly
that we use in FFMPEG.
That's called SIMD, SIMD,
single instruction multiple data.
So this means, for example,
say I want to add five to a number
in scalar assembly.
So this is what's known as you work on an individual element.
So I have the number 10 and I want to add 5.
I use the add instruction.
And I add 5 to 10 and I get 15.
With SIMD,
I can have a whole vector of 16 different numbers.
They could all be different.
If I want to add 5 to that,
I can run one instruction.
And that one instruction adds to all 16 elements.
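The scalar versus vector contrast can be imitated in plain Python. This is an emulation for illustration only: a Python integer is not a SIMD register, and nothing here runs as a single CPU instruction. The sketch packs 16 byte-sized lanes into one 128-bit integer and adds to all of them with one expression, using a masking trick so carries never cross from one lane into the next. All names are made up for the example.

```python
LANES = 16
LOW7 = int.from_bytes(b"\x7f" * LANES, "little")  # 0x7f in every lane
HIGH = int.from_bytes(b"\x80" * LANES, "little")  # 0x80 in every lane


def scalar_add(values, constant):
    """One add per element, like looping over a scalar add instruction."""
    return [(value + constant) & 0xFF for value in values]


def pack(values):
    """Place 16 byte values side by side in one 128-bit integer."""
    return int.from_bytes(bytes(values), "little")


def unpack(register):
    """Split the 128-bit integer back into its 16 byte lanes."""
    return list(register.to_bytes(LANES, "little"))


def vector_add(a, b):
    """Add all 16 lanes in one expression, wrapping around within each lane."""
    low = (a & LOW7) + (b & LOW7)     # add the low 7 bits; cannot overflow a lane
    return low ^ ((a ^ b) & HIGH)     # fold in the top bit of each lane


pixels = list(range(0, 150, 10)) + [253]   # 16 values; the last one wraps around
five = pack([5] * LANES)
vector_result = unpack(vector_add(pack(pixels), five))
scalar_result = scalar_add(pixels, 5)
```

Both paths give the same 16 results; the difference is that the scalar path performs 16 separate additions while the packed path performs one wide operation, which is the property real SIMD instructions provide in hardware.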
And that, as you can imagine,
lends itself very well to video. Video is a pixel grid, so I can perform operations on multiple
pixels at the same time. The key thing that we do differently in FFMPEG is we don't use any
abstractions or any major abstractions on top of that. So there's a part of the world that
uses what's known as intrinsics. So these are C functions that behave very similarly, but not
quite the same to writing assembly by hand.
So the registers that data is stored in on the CPU,
the compiler allocates those for you.
And so the key thing to understand is that when we write SIMD, we have a 10x,
not a percentage, a 10x to 50x speed improvement.
That function is 62x.
That's nuts.
And the FFMPEG account, as you know, posts and tweets a lot about that to try and say,
hey, we're doing this stuff.
You are a person who sees the beauty in assembly,
but it's also extremely useful for these kinds of applications
to actually significantly outperform
even C, which is crazy.
It is necessary, right?
Because one of the projects that we need to talk about is called dav1d.
So dav1d is a decoder for the format that was done by the Alliance for Open Media,
which is a video codec called AV1.
So for people who don't know,
we've been talking about H.264,
AV1 is another hugely popular standard
and codec that is increasingly taking over the internet.
And when this format was launched,
many people said, especially even from the Alliance for Open Media,
which is Google, Netflix, Amazon, Mozilla,
said, well, this format is so complex,
it must be done in hardware to do decoding, right?
And, well, I arrived with a few other people,
mostly Ronald, Henrik and Martin,
and we said we need to have an extremely good software decoder
because it's going to take time to have hardware.
And so we wrote this project, which is beyond insane.
We are talking about 30,000 lines of C,
but 240,000
lines of handwritten assembly, right?
Handwritten assembly, 240,000 lines.
That's incredible.
I mean, some of the stuff we're talking about is probably the biggest assembly codebases.
To give you an idea, and Kieran can correct me, but I think FFMPEG has 100,000 lines
of assembly for all the codecs.
All codecs.
And just this one has 240,000.
It's a VideoLAN project, of course,
and it is optimized to the maximum
because the motto when we were starting the project
was every cycle matters, right?
Every cycle matters because dav1d is used in VLC
and in some software AV1 playback stacks.
We are talking about probably 3 billion devices
which are going to decode video nonstop
because, for example, 30% of the video from Netflix
is now in AV1, 50% of YouTube.
And you often don't have a hardware decoder because not many devices have a hardware decoder.
And with dav1d, we realized that with one or two cores, you were able to decode 720p correctly.
So it is like literally incredible, right?
dav1d, look at that.
Yeah, so there's another spicy tweet from you.
This is what peak video code should look like, 79.9% assembly, 19.6% C,
and 0.5% other.
And what's incredible is with those tweets,
which are factual, people get
crazy. They're unhappy, right?
For the last two years, they go crazy.
No, intrinsics are fine.
The compiler is...
Oh, they go.
You can optimize your compiler,
auto-vectorization.
It's your fault.
You don't understand.
And we've tried that forever, right?
For two years and two years later,
showing hundreds of examples
of handwritten assembly.
No, no, no.
You're doing it wrong.
the compiler can do this?
So we should actually just articulate a little clearer.
So the intuition there from the software engineering folks,
when you have code like, okay, let's just take an example,
C++, there's a compiler that's doing a lot of the optimization.
Yes.
And the presumption is if you have a good enough compiler,
if you continue to improve the compiler,
you're going to generate code.
Yeah.
That can perform, like, optimal performance.
You cannot possibly beat it.
And you're consistently challenging that thought that if you do it.
By orders of magnitude, handcrafted assembly can outperform C.
The two things that they tell us is, yeah, but modern compilers have auto-vectorization, right?
Because the SIMD that we're doing is vectorization.
And like, it's not even close, right?
It's not even close, right?
It's not like 5% or 10% slower.
It's multiple times slower.
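To make "vectorization" concrete, here is a minimal sketch, not code from dav1d or FFmpeg: a sum of absolute differences over 16 pixels, once as a plain C loop and once with SSE2 intrinsics. The function names are made up for illustration, and intrinsics are used only because they fit in a C listing; the projects discussed here write this kind of kernel as handwritten assembly instead.

```c
#include <stdint.h>
#include <stdlib.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar reference: one pixel per loop iteration. */
static uint32_t sad16_c(const uint8_t *a, const uint8_t *b)
{
    uint32_t sum = 0;
    for (int i = 0; i < 16; i++)
        sum += (uint32_t)abs(a[i] - b[i]);
    return sum;
}

/* SIMD version: all 16 pixels in one register when SSE2 is available. */
static uint32_t sad16_simd(const uint8_t *a, const uint8_t *b)
{
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* PSADBW yields two partial sums, one per 64-bit half of the register. */
    __m128i sad = _mm_sad_epu8(va, vb);
    __m128i hi  = _mm_srli_si128(sad, 8);   /* move the upper half down */
    return (uint32_t)_mm_cvtsi128_si32(_mm_add_epi32(sad, hi));
#else
    /* Portable fallback so the sketch still builds on other CPUs. */
    return sad16_c(a, b);
#endif
}
```

Both functions return the same value for the same input; the difference is only in how many pixels each instruction touches.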
So, I don't know if you can say something philosophically, because
there's a lot of great software engineers, great engineers, great machine
learning people.
Karpathy
will listen to this and say,
what's the intuition
he's supposed to get
from this?
What are we supposed to?
He learned assembly
because of the tweets,
by the way.
He's like,
oh, I think this is a move.
Let me figure out what's happening.
And you know the way
he documents his work and so.
Philosophically,
what's important to realize
is that we passed the time
where hardware was going
so much faster, right?
We have the end of Moore's law.
We have limitations
for AI, for memory.
You need to go down
in the stack
and optimize more to get more power from what you have
because our request for power, CPU power,
GPU power are exploding while the hardware is not exploding in speed, right?
So what people do is that they add more cores, right?
But that's basically like at some point you can have 250 cores, right?
So what we do is to take every inch of the machine.
Not just that, not just that.
We abuse the machine.
We go and use the machine in ways that the creator didn't expect.
Sometimes we use an instruction that's completely unrelated to what we do.
We use a cryptography instruction in video processing to do nothing related.
And one of the other things that we do, for example, in David, which is a bit crazy,
is that we don't use the function calling convention from the operating system.
We should explain that.
That is extremely complex.
But basically, usually when you move from one function
in code to another, there is a way to save the registers, the state of the CPU, to enter another function.
And this is like standard.
It's a bit complex. I would simplify this a bit.
So dav1d does things to abuse the calling convention.
You could define the calling convention as I've written a function and I want to call another function.
How is the data shared between the functions?
Because there's a convention, what's known as a calling convention.
And what dav1d does, for optimization reasons, is create its own calling convention sometimes.
So if I want to call Lex Fridman's library, we've got to agree on a convention so that I can share data with you in the assembly language space.
And one of the challenges in assembly is every operating, well, not every operating system, but there are at least four that I can think of on x86: Linux 32-bit, Windows 32-bit, Windows 64, Linux 64.
They all have their own calling conventions.
And so one of the amazing things Loren Merritt did, who we talked about before,
was create a very lightweight abstraction layer.
So you could write your assembly code once,
and it handled all the calling convention stuff for you,
which was always a problem because you had to manage four different variants.
But dav1d takes this even further,
for speed reasons, it does its own calling convention within itself
to bypass the kind of rules, the rules of functions and say,
okay, actually, I'm going to call a function this way,
Because I know it's within my library.
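As a rough illustration of what "different calling conventions" means in practice, the sketch below declares the same two-argument function under the two 64-bit x86 conventions mentioned above. It assumes GCC or Clang on x86-64, where the `sysv_abi` and `ms_abi` attributes exist; the macro and function names are invented for this example, and on other compilers or CPUs the attributes are simply dropped. It does not show dav1d's private convention or the abstraction layer described above.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  define ABI_SYSV  __attribute__((sysv_abi))
#  define ABI_WIN64 __attribute__((ms_abi))
#else
/* Assumption: elsewhere we fall back to the platform default convention. */
#  define ABI_SYSV
#  define ABI_WIN64
#endif

/* System V x86-64 (Linux, macOS): the first two integer arguments
 * arrive in rdi and rsi. */
static ABI_SYSV int64_t add_sysv(int64_t a, int64_t b)
{
    return a + b;
}

/* Windows x64: the same two arguments arrive in rcx and rdx, and the
 * callee must additionally preserve rdi, rsi and xmm6-xmm15. */
static ABI_WIN64 int64_t add_win64(int64_t a, int64_t b)
{
    return a + b;
}
```

In C the compiler hides the difference, so both calls look identical. In handwritten assembly the author has to know which registers hold the arguments and which ones must be preserved, which is the bookkeeping a shared abstraction layer takes over.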
Does that have to be specific to every single operating system?
Well, if it's custom, no, but the challenge is, in general, yes,
and in terms of each instruction set.
So the thing to also emphasize is we do this on every instruction set.
So every instruction set has its own handwritten assembly,
which is even more crazy.
And that matrix has got bigger in recent years because of RISC-V,
because of Arm64, because of the new SVE, there's SME,
x86 has AVX-512, AVX.
So we do runtime processor detection.
We see what the machine FFmpeg is running on or dav1d's running on is capable of,
because you could be on a laptop from 2008, where this isn't there,
runtime detection, we set function pointers accordingly,
and then from then on, off you go.
Or you could be on a machine with RISC-V.
Yes.
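The "detect once, set function pointers, off you go" pattern described above can be sketched as follows. This is an illustration under stated assumptions, not FFmpeg's or dav1d's actual code: the `DSPContext` struct, the flag name, and the `sum_u8_*` functions are hypothetical, and the "AVX2" routine is a stub that calls the C version where a real project would link a handwritten assembly function. The detection uses the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, which are only available on x86.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*sum_fn)(const uint8_t *src, size_t n);

/* Portable C version, always available. */
static uint32_t sum_u8_c(const uint8_t *src, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += src[i];
    return sum;
}

/* Stub standing in for a handwritten AVX2 routine (hypothetical). */
static uint32_t sum_u8_avx2_stub(const uint8_t *src, size_t n)
{
    return sum_u8_c(src, n);
}

enum { CPU_FLAG_AVX2 = 1 };

/* Ask the CPU what it supports; returns 0 on non-x86 builds. */
static unsigned detect_cpu_flags(void)
{
    unsigned flags = 0;
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        flags |= CPU_FLAG_AVX2;
#endif
    return flags;
}

typedef struct DSPContext {
    sum_fn sum_u8;
} DSPContext;

/* Start from the C fallback, then override with the best available version. */
static void dsp_init(DSPContext *dsp, unsigned cpu_flags)
{
    dsp->sum_u8 = sum_u8_c;
    if (cpu_flags & CPU_FLAG_AVX2)
        dsp->sum_u8 = sum_u8_avx2_stub;
}
```

A caller would run `dsp_init(&dsp, detect_cpu_flags())` once at startup and then call `dsp.sum_u8(...)` in the hot loop, so the choice of implementation costs nothing per call beyond an indirect jump.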
And in all that, we don't even respect the calling convention of the operating system
in order to be faster, because we know that we are going to be called from within our binary,
so we can share data without saving all the registers in the common way, because that can lead
to loading and saving registers through the L1 and L2 CPU caches. Skipping that makes us faster.
So that's why I said that understanding CPU architecture, computer architecture, is key.
And this is also why it's handwritten.
I don't know anyone.
I've never heard of any project other than dav1d doing that.
And this is why Kieran called it an art, right?
It is an art.
I think in the mass-market world, there isn't anything else like this on billions of devices.
I know there are some specialist industries.
I know in high-frequency trading, they take this really seriously,
where they're receiving feeds from a market
and they need to react within X number of microseconds.
And so the instructions matter.
But that's not a mass, you know, a mass-produced thing that's on a billion devices.
That's hyper-specialized running on hyper-specialized hardware.
We're running on all hardware from...
Sorry to linger on it, but, like,
that's a really counterintuitive,
almost like revolutionary idea here
that there's a huge amount of value to assembly.
Like what are we supposed to take away from that?
Like what, you know,
there's a bunch of people listening to this.
They're basically like, sorry, myself included,
you know, I programmed for many, many years
in C++, going up the standards of C++,
found love with C++,
and then transitioned more and more
because of machine learning, about 15
years ago, to Python.
And so like for me in this Python world, JavaScript world, now vibe coding where I'm just using
natural language sitting in my jacuzzi drinking a drink and just talking to the computer,
like record stops.
What is the value of going back all the way down to the low level?
Because you can get more power per dollar invested, right?
And sometimes it's going to be a problem that is limited by your
hardware. A good analogy is what you see in quantization in LLMs, right? And people are going,
"Oh, I'm going to do that in FP8 or FP4," or some crazy things like Microsoft, who did it in 1.5 bits,
because you're constrained by memory, because you're constrained by the machine you can run,
because at some point we are doing real time. And I believe this is going to happen on AI inference
also: at some point, you need to get faster. And you cannot always get
more powerful hardware, right?
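The memory side of the quantization analogy is simple arithmetic, sketched below. The helper name and the 7-billion-parameter model size used in the usage note are assumptions chosen for illustration; the estimate counts weights only and ignores activations, caches, and any per-block scaling metadata a real quantization format carries.

```c
#include <stdint.h>

/* Bytes needed to store n_params weights at bits_per_weight bits each,
 * rounded up to a whole byte. Weights only; everything else is ignored. */
static uint64_t weight_bytes(uint64_t n_params, unsigned bits_per_weight)
{
    return (n_params * bits_per_weight + 7) / 8;
}
```

For a hypothetical 7-billion-parameter model, `weight_bytes(7000000000ULL, 16)` gives 14 GB, 8 bits gives 7 GB, and 4 bits gives 3.5 GB, which is the difference between fitting on a given device or not.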
So you need to analyze code and see where, like, where is the mission critical?
Where are the things that are called nonstop?
And for example, dav1d is a good example.
It's going to be run billions of hours per day.
That makes sense.
It doesn't make sense to do that on the glue of the FFmpeg CLI.
It makes sense over there.
Yeah, this has to do.
Also, we'll talk about it more, but your new effort, your new company, Kyber,
is doing that kind of thing
for ultra-low latency,
so the slogan being
every millisecond counts.
So when you're actually extremely
highly constrained in some dimension.
We are also arriving at a point
where we've done
so many great things,
but the hardware is getting back to us,
right, because cost is increasing,
because we need more power,
and so you're limited by
either your CPU, your RAM,
or your networking,
and you need to optimize,
and this is where value is going to be,
especially because doing AI is going to help do the programming of business, right?
And so the core thing that you will not be able to vibe code
are optimizations for the hardware to be as fast as possible.
I'd love to talk to you about who and how should learn assembly,
but first, I think we need a bathroom break.
quick 10 second thank you to our sponsors
check them out in the description
it really is the best way to support this podcast
go to lexfridman.com/sponsors
and now back to the episode
all right and we're back
there's this nice repo
with assembly lessons
first of all do you think
developers should learn
how to program in assembly
and how would you go about learning it
and what is this asm-lessons?
So I personally wasn't happy with the way assembly is taught in books and online because it's very grammar focused.
And you don't in general learn a language from learning the grammar and the structure.
You learn a language by asking someone what their name is and you start from there.
And you go and solve real problems that you have when you want to communicate.
You don't learn sentence structure.
This is the interrogative and the adverb.
And all the assembly books seem to go through every instruction,
even ones that aren't really relevant, explaining what they all do, and that actually doesn't
really change much. And the other problem that we have in our community is assembly is taught
sort of hand to hand, like person to person, like blacksmithing, one by one. That's the only logical
sort of analogy, and that doesn't really scale online. It doesn't do the other thing.
So I started a set of assembly lessons in the way it's done in FFmpeg,
which is a little bit different to the way assembly is done in general.
I'm trying to think: the other big use case of assembly is embedded devices,
in really low power, cheap devices, and that's completely different to what we're doing here.
I think it would be good if you could highlight the requirements which are quite simple.
It's high school mathematics and C.
And actually not even C, really is pointers.
To emphasize, yes, we've talked about how brilliant this stuff is,
but high schoolers like Daniel Kang have written assembly in FFmpeg.
I think there's been contributions because of these lessons.
So it's really about trying to get this dying art to continue,
because we've shown it's possible with dav1d to produce something amazing.
There's still a lot of codecs in FFmpeg that are only maybe partially assembly optimized.
And so it really starts with basics and continues.
It explains a lot of the jargon, a lot of the syntax.
It doesn't really try and explain to you, you know, interrupt handlers and interrupt instructions
and all of these different jump targets.
It actually makes this really vector-focused.
And it describes all kinds of registers, general-purpose registers, vector registers,
really nice examples. This is cool. It's a classic example of the epic. But some of this
assembly language is really beautiful. And I think it's beautiful because it's kind of like flying a
Spitfire. It's really aviation at its purest, but also pushing the aircraft beyond what the
designer thought was possible. So we're abusing, for example, sometimes
cryptography instructions to do certain things.
And there's a level of beauty and art where it's really you and the processor.
There's nothing in between.
It's you and the joystick of the cockpit and you move that joystick and it's
physically connected to the ailerons.
And you can push that plane beyond what it can normally do.
And there's a level of, yeah, beauty and amazingness to that.
But I don't think the sort of person-to-person assembly teaching, where
someone taught me and I've taught multiple people, is going to work in the long run because of the particular flavor and the way that we do it.
It's literally, no, I was going to say wizards handing it down.
I realize I looked like a wizard wearing this hat.
But you're basically just like the sages, the wise sages handing down the craft.
Can I ask about LLMs?
Can they help?
They have more of an understanding than I expected, but they are still, I've asked it questions.
it still goes and starts hallucinating, not hallucinating, but making modifications.
And then I go, is it bit exact? No, fix it. And then it just goes and does the same thing.
And it's going, there isn't the corpus of information like Stack Overflow to work on.
There is not enough data to train on. And this is the biggest issue.
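The "is it bit exact?" question above can be made mechanical: run a trusted C reference and a candidate optimization on the same pseudo-random input and compare the output buffers byte for byte. The sketch below shows the idea under simple assumptions; the function names are hypothetical, the kernel (a rounded average of two pixel rows) is chosen only because it is short, and this is not the test harness the projects actually use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*avg_fn)(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n);

/* Reference: rounded average, written the obvious way. */
static void avg_ref(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* Candidate: bit trick that never needs more than 8 bits; same result. */
static void avg_fast(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] | b[i]) - ((a[i] ^ b[i]) >> 1));
}

/* Plausible-looking "modification" that rounds down: NOT bit exact. */
static void avg_wrong(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i]) >> 1);
}

/* Small deterministic generator so failures are reproducible. */
static uint8_t next_byte(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return (uint8_t)(*state >> 24);
}

/* Returns 1 if candidate matches ref on every byte of every round. */
static int is_bit_exact(avg_fn ref, avg_fn candidate, uint32_t seed)
{
    enum { N = 64, ROUNDS = 1000 };
    uint8_t a[N], b[N], want[N], got[N];

    for (int r = 0; r < ROUNDS; r++) {
        for (int i = 0; i < N; i++) {
            a[i] = next_byte(&seed);
            b[i] = next_byte(&seed);
        }
        ref(want, a, b, N);
        candidate(got, a, b, N);
        if (memcmp(want, got, N) != 0)
            return 0;
    }
    return 1;
}
```

A rewrite that is "almost right", like `avg_wrong`, fails immediately under this kind of check, which is why a yes-or-no bit-exactness test is a useful gate for both human and machine-written optimizations.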
I started my career actually doing some assembly for Itanium, right?
So the Itanium is a dead processor type, right, which was done by
Intel and HP a long time ago
when they wanted to do 64 bits.
Well, they lost, and then we got
AMD who did it,
AMD64, which became
x86-64.
But Itanium was extremely
interesting in the sense that
those were processors that had
a ton of computing power
to do floats, FMAs
which is similar to what we need
now for LLMs, right?
And you could pack
three operations per line
that could be loaded. So
basically you had an output of basically
6 billion of operation
per second, but the bus
the memory bus
only allowed 1.5, right?
So your CPU was
four times faster. So you had to do
crazy things to pack things
in memory, reuse the registers,
and those types of semantics,
no language could do that, right?
So like I
have the Itanium programming
book because Intel did
amazing books, but that's exactly
what Kieran says. If you don't know what you're going to do, it's impossible to read, right?
It's a ton of jargon and so on. While those lessons are amazing because they are targeted to a
real problem and you can do it yourself. People have. People have sent their patches and said,
oh, I studied your lessons and here's my first changes. That's amazing. And part of that in the
lessons is a framework called x86inc, written by Loren when he was working on x264. And it allows you
to abstract things,
to not care too much
about the different calling conventions.
And we had a lot of students
who contributed code to x264
using that a long time ago, right?
So it's really doable,
and I believe it's necessary
to understand assembly language,
even if you don't do it much,
to understand what's going on inside your computer,
and that will make you a better programmer.
And I assure you that because doing that, you will understand some of the architecture of the memory inside your computer, right?
Understanding registers, L1, L2, L3, RAM, SSDs, disks, and so on, which are very important because then you have a good programming culture that will make you a better programmer.
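One small way to see why the memory hierarchy matters, even from C: the two functions below compute the same sum over a frame-sized buffer but walk it in different orders. The buffer size and function names are arbitrary choices for this sketch. The row-major walk reads consecutive bytes and so uses each cache line fully; the column-major walk jumps a full row between reads and typically runs noticeably slower on real hardware, though the exact gap depends on the machine and is not measured here.

```c
#include <stddef.h>
#include <stdint.h>

enum { ROWS = 512, COLS = 512 };

static uint8_t frame[ROWS][COLS];

/* Fill the buffer with a simple deterministic pattern. */
static void fill_frame(void)
{
    for (size_t y = 0; y < ROWS; y++)
        for (size_t x = 0; x < COLS; x++)
            frame[y][x] = (uint8_t)(x ^ y);
}

/* Cache-friendly: consecutive reads are adjacent in memory. */
static uint64_t sum_row_major(void)
{
    uint64_t sum = 0;
    for (size_t y = 0; y < ROWS; y++)
        for (size_t x = 0; x < COLS; x++)
            sum += frame[y][x];
    return sum;
}

/* Cache-unfriendly: consecutive reads are COLS bytes apart. */
static uint64_t sum_col_major(void)
{
    uint64_t sum = 0;
    for (size_t x = 0; x < COLS; x++)
        for (size_t y = 0; y < ROWS; y++)
            sum += frame[y][x];
    return sum;
}
```

Nothing in the source code says one loop is slower than the other; the difference only becomes visible once you think about cache lines, which is the kind of intuition the speaker is pointing at.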
What do you think about the Rust programming language?
Because that's a bit of a meme.
We have very different opinions with Kieran.
I think it's valuable what they're doing in terms of memory safety as a concept.
Can it achieve some of the speed up that assembly achieves?
Not assembly by hand, no.
I think that that's a given, C, potentially.
But I see it very, it has a very big Esperanto vibe about it.
It's like, we're going to solve this and we're doing this in a particular way.
Meaning it's a bit too utopian?
There's a lot of focus on the self-importance rather than solving real world problems.
It reminds me of the Sinclair C5.
Sir Clive Sinclair of Sinclair computers built a car.
And he said, oh, everyone will be traveling around in one of these electric cars.
And Rust reminds me of that, where I think the community doesn't,
the community doesn't quite understand that in order to get people to move,
you have to build something that's as good as, if not better than what you have now.
Yes, people are doing Rust rewrites, but if they only do 85, 90% of the feature set of what we need,
in things like core utilities, that last 1% takes 99% of the time.
To use Elon's famous quote, prototypes are easy.
Like, this kind of stuff is easy.
But this, to get a real electric car, you have to make a car as good as if not better
than what we have now.
And Rust isn't in that stage yet.
I don't think anyone would object to seeing Rust code in FFmpeg,
but it needs to work as well and support the same unit testing as everything else.
It needs to be flawless.
It can't just randomly break.
They can't just randomly break things
when they want to. It needs to have, I think, more... I think it still has only one compiler
implementation.
So it's got to be as good as if not better and saying, hey, here's my utopia of memory
safety isn't enough, even though we probably all agree that that's the goal.
So I've done a ton of Rust.
One of the two major topics I had was adding Rust modules inside VLC.
One of the reasons VLC got popular,
which was one of the main architecture decisions,
is that VLC is a very small core and a ton of modules, right?
And so you can write modules in C++, in Objective-C,
and anything that is basically interoperable with C.
And so we did some Rust modules.
And so I have experience on that, and I wrote some of it.
And also, like, my new startup called Kyber is an open-source project,
mainly done in Rust.
Rust is extremely good in the sense that it's a better C++ that cares about memory
and allows you to do things about memory ownership that no one else can do so far.
However, it's great when you start a new project from scratch and you do everything in Rust,
but it's not very good when you interoperate with existing parts.
And some part of the Rust community
believes that they need to rewrite everything
and everything will be better with Rust.
And the answer is like, no.
Like, almost always,
in all my years of being an engineer,
manager, CTO of startups and so on:
don't rewrite.
Right?
That's the initial instinct for a lot of people
when they show up to a code base,
probably before LLMs,
is like probably because they don't understand
the wisdom of the way things have been done in the past,
they say, well, we need to rewrite it,
hence why there's a thousand JavaScript frameworks.
But the reason is the following,
and this is very important to understand,
it is an order of magnitude easier to write code than read code.
Yeah.
And you see that also with LLMs.
They can vibe code.
Analyzing is a lot more difficult.
And so when you arrive
at a very complex piece of code, right,
you don't understand it, right?
Because it's so much more effort to understand the code from someone else
because you don't have the thought process.
And often I joke about some languages,
mostly Perl, for example,
which has very complex syntax.
And imagine I am at my maximum intellectual efficiency in programming, right?
And I write the best code ever.
I will not be able to understand it myself six months later, right?
because reading code is more difficult.
So very often you arrive,
you don't understand all the wisdom,
all the business logic,
the reasons that were done,
that is maybe not documented.
And you say,
well,
I'm going to rewrite it.
And the thing is,
no,
you don't, right?
Because as Kieran said,
right,
I'm going to rewrite core utilities in Rust.
And then,
of course,
you arrive very quickly at 80%.
Then 90%,
takes a bit more time.
And then you got the last ones, right?
On the other side,
So for new projects, it's great.
Everything related to parsing files, networking,
because of the memory checker, the borrow checker,
it's amazing and there is nothing else.
To answer a bit differently for us,
imagine I take a piece of software like dav1d or x264, right,
which has a ton of runtime assembly, right?
I rewrite the C part in Rust, right?
So it's more secure.
Yes, but then you arrive in the
assembly, and you can jump anywhere in the memory because we are doing handwritten assembly.
So even if I rewrite the C part in Rust for security reason, you break all the security
when you write handwritten assembly because we can jump anywhere.
So in my opinion, we need to do something that is secure assembly, right?
So that means checking the assembly at compile time, which is similar to the checkasm project
that we're doing on dav1d and x264 with VideoLAN:
you start instrumenting your assembly at compile time
to check that it's not jumping anywhere in the memory,
because otherwise you might rewrite a part of C in Rust,
but if you want to have the same performances,
you're going to have in-line assembly,
and so you destroy your whole security model.
So that's a bit what I think about Rust.
I just want to say, on a personal level,
I'm so in awe about the assembly.
I actually want to note, it never gets old, the speed improvements, to show 62x.
On a personal level, I run our internal test suite at work and
I'm still in awe of the gains we have.
There's a source of joy and happiness programming for different reasons,
but I think one of the greatest happiness is in the optimization of code,
and it sounds like you're like at the cutting edge of that.
That was cool.
In the community, I want to speak about two people who are wizards of assembly, right? The two of them are
actually living in the north of Europe, in Sweden and in Finland. And Henrik Gramner knows so much
about Intel x86 assembly that when we ask questions to Intel about things, they tell us,
"Why are you asking us at Intel? You have Henrik. Henrik knows
better." He knows all the cycles of almost all the SIMD instructions on all the CPU generations.
"Oh yes, this is a P4, this is a Nehalem, this is a Core 2," et cetera. That person is like the
best person on assembly in the world, and he's the nicest person that you've seen. Like,
he arrives, you don't see him. He's amazing. And the other one is called Martin, Martin Storsjö. And he's
doing mostly the same on Arm, right?
So NEON, right?
And iPhones and Android and so on.
And he codes in assembly on his phone,
editing it with the crappy keyboard,
like virtual keyboard you have,
while watching his kids play in the playground.
Like, this is just, like, wizard level.
So those two people are like,
Yeah.
So, when you're programming assembly at that high level,
a part of that is knowing the architecture that you program on.
Yes.
On Arm in particular, yes.
Arm in particular.
But these are complicated architectures, right?
Yeah, but in some ways, it's more...
x86 with out-of-order execution.
It's not so bad.
Arm, you really need to understand all the different generations of Arm processor
because they're all different.
There's A72, et cetera, et cetera.
And there's the Apple variant.
There's this variant and that, and you need to write code that works efficiently on all of them.
x86, well, broadly speaking, you have Intel, AMD, you have sub-variants,
but generally speaking, something fast is going to remain fast on all of the variants,
whereas Arm is a much more complicated ballgame.
We're taking a non-linear journey through history here,
but we were talking about Michael Niedermayer.
I wanted to ask about this: for a time,
there was a split into FFmpeg and Libav.
Yes, so in open source projects,
sometimes you disagree, right?
You have such a nice way of putting it, yeah.
And the good thing is because of the license,
you're allowed to basically do your own, right?
And this is normal, and this has happened all the time, right?
At one point, there was GCC at the time of GCC 2 and EGCS, which then became GCC 3, right?
There is, what, KHTML with WebKit, with Blink.
It is the same process.
And also, like, when I want to do a new feature today in VLC, I fork, I do my thing on my own,
and then I merge back to the community.
So there was a split in the open source community on FFmpeg, which became
Libav and FFmpeg.
And after a few years, well, the community merged back and people moved on.
It's a bit of drama that is normal in open source communities.
But forks are even... they're important because they change the status quo of a community.
Not talking about FFmpeg and Libav here, but the GCC fork made GCC a ton better,
because some people wanted to change the architecture fundamentally to make it faster.
And of course, it's always question of people and so on.
But in the end, you realize that FFmpeg today is better than it was before the fork.
And now, well, we're back all together, right?
And I spent a lot of time, as Kieran can say, in the community.
It's not often, to be honest, very well explained,
because a ton of the reasons are not very public,
but I think that's normal and that's good.
Yeah, I mean, you're making it sound really nice,
but there's pretty heated battles inside open-source projects.
I mean, it is a very passionate community,
and you're kind of in a distributed way,
have to define the direction of things.
So here, looking at Perplexity:
FFmpeg and Libav split in 2011,
mainly over project governance,
leadership style and development processes,
not because of a fundamental
technical disagreement.
FFmpeg effectively absorbed Libav's work, while Libav
withered as most distributions and developers moved back to FFmpeg.
Yeah, that was weird.
From a user perspective, that was a weird experience because, you know, I'm a Linux user,
so, you know, whether it's Ubuntu and so on, all of a sudden, I think for a little
bit Ubuntu, I feel like, am I remembering correctly, switched to Libav?
12, 14, something like that.
Yeah.
And then they switched back.
I was like, what is happening?
So you get to feel the ripple effects of the different internal debates that are happening.
To be fair, on Apple, when you type gcc, you get Clang.
They did something like that as well.
Yeah, so to me, it's like the fork was like heated drama, but most of the development from Libav was merged back into FFmpeg, right?
So in fact FFmpeg became a superset of Libav,
and so that gave the user,
because in the end, we worked for the users,
a larger set of features.
And a ton of things that were discussed.
For example, the debates on reviews and on how we push
are something that now is completely settled in FFmpeg
and is following mostly what everyone in the community agrees, right?
So in fact everyone who was active on Libav
came back to work on
FFmpeg because the
disagreements were fixed, and
in the end FFmpeg is stronger
than it was before
right and I know people love drama
but um well
my main concern I understand
and I think looking at
the long history
it's all for the good
but I do
I am concerned because there's so few
humans that
are critical to the success of open source
projects that I have seen it take a psychological toll on folks and, you know, sometimes
lead to burnout. So you have these incredible people that are at the core of open source
projects there's a moment that happens because like what is the motivation of doing it
ultimately it's because you're passionate about it and it makes you happy then a certain
point you wake up and it's like, there's been a bit too much heat from the drama. So like at the
at the project level the project continues and
often flourishes. But sometimes there's these individual humans, they're just like, I've had enough.
Yeah, but it's not just about forks, right? So what you are referring to is
one of the most challenging and most interesting parts of open source today: maintainer burnout.
Right. And AI is a problem because of that. And Daniel Stenberg, who is the maintainer of
curl, who's probably one of the best promoters of open source in the world.
He's, by the way, a member of the European Open Source Academy with me.
So I'm very, like, humbled to be in the same communities as him, right?
He's against what he calls AI slop, right?
Because it gives a ton of fake reports or bad reports, bad patches.
And then a lot of maintainers have a lot of burden to maintain the software.
And this is straining the minds of open source developers much more than forks.
And for example, the XZ fiasco was because there was one guy maintaining it
and he got basically hammered by two attackers who were asking him questions nonstop
at weird times at night to block him.
And at some point he got fed up and says, okay, I can't do that and gave the commit access
to the attacker.
So burnout in open source community is something that exists, but mostly it's about maintaining things, right?
No, for sure.
I wonder how do we help that, because those people are so important.
The human beings are so important, the core of these projects.
So, for example, now I am maintaining a ton of multimedia and non-multimedia libraries as maintainer
because the maintainers got fed up, right?
Some on VideoLAN, some outside of VideoLAN,
because sometimes you need a tough skin, right?
Because you get, like, it's not really attacks,
but, oh, this is not working, this is not working,
and you feel it personally.
And this is also why researchers, or the Google fiasco, was a problem, right?
They don't realize that in the end you have, you know,
it's like the same graph where you see like everything
and it's just like one random open source project
that is maintaining the whole internet.
You see the one, right?
Yeah, this is the meme.
I mean, it applies to a lot of other open source projects,
but this is "all modern digital multimedia infrastructure,"
and then that thing at the very bottom,
that everything relies on is FFmpeg.
It's true.
And then there's usually a handful of folks
that are maintaining that.
And FFmpeg or VLC, right,
where you have a community of 10, 15 core developers,
are not the worst open source projects.
XZ, which is in even more installations, is one person, right?
There is one guy.
libxml is...
Yeah, libxml, right?
There was a big stop.
No one is maintaining libxml anymore,
which is like the only library that is able to parse XML everywhere.
All the crazy edge cases of XML under ridiculous circumstances
and they get attacked by security researchers
because there's one other crazy edge case that they haven't thought of.
I was like, yeah, but the body of knowledge to actually resolve that is massive.
There is one guy maintaining all the time zones for everyone who is in the middle of, I think, was it Nebraska or South Dakota?
Like the mental health of the open source maintainers is something that large corporations don't care or don't see, right?
It's just like, oh, yeah, I'm just doing an open source report and so on.
Some of it is financial, but some of it and people should definitely support open source financially all across the board.
but some of it is also like spiritual on a basic human level.
There's something that happens, like with this image of FFmpeg and so much of the
internet depending on it where people almost like talk down to the folks who are carrying
these projects forward and maintaining it.
In the security community they certainly did.
That was one of the things that came out of that argument: there was a portion of the
security community saying,
No, these guys write crap code.
They need to fix their crap code.
I'm like, no, no, no, no.
This is a guy's hobby project.
You have a security bot that's gone and found some AI-generated stuff.
That guy didn't write crap code.
It's just an edge case at the 99.99999th percentile
that he didn't think about, because it's his hobby project decoding Star Wars games.
I get the hobby project aspect of it.
It's just hard work and it's beautiful.
And it's like the right approach there is to celebrate people for doing incredible, incredible work.
It's just incredible that humans step up,
not getting really paid at first or maybe ever,
and then they're doing it out of the love of it.
And we need to, like, human civilization runs on people like that.
We need to celebrate them.
To give you an idea, I received death threats at VideoLAN, right?
And you mentioned that to me.
What is behind that?
So that must be, what, 2009, 2010, right?
Apple is moving from PowerPC to Core Duo.
that's probably in 2006.
And by 2009 or 2010,
I decided that we are not going to do new versions of VLC for PowerPC.
At that time, like VLC, we were close to the number 1.0 release.
We were four of us, right?
Like, just like, no, this is not possible.
So I received a death threat with some powder in it, right?
Remember there was some anthrax threats at that time, right?
And it was because I had,
taken the decision to not maintain the PowerPC port anymore.
And of course, it wasn't anthrax.
Of course, it was some type of flour and so on.
But I received that as a letter, like, "You're a piece of shit, you should die,
PowerPC forever," and so on.
And it was 2009 or 2010, right?
I was, I was young.
I was just like, why?
What did I do?
Right.
Yeah, they can break your spirit.
It's like why.
My mother freaked out, right?
We had to go see the police and so on.
And now, like, I'm going to say that I'm quite happy that this happened at that time.
It forged me a lot, right?
I am, I can take a lot of, of hate on me.
I'm okay with it, right?
It sucks that that's part of reality,
because all the people that love VLC, all the people that love FFmpeg,
like me, you know, I have legitimately,
probably thousands of times in
my life, had a smile on my face because FFmpeg made me happy. Period. And how many times did I get a
chance to say that? Zero. Until I realized there's a Twitter account. And every once in a while, I'm
messaging it. One of the things I like on the Reddit meme about me, which I don't like this
meme for a lot of reasons. And someone says, oh, J.B. is on Reddit, which I am, right? And I say,
hello, right? And then I got so many people who say, oh, thank you for VLC. And, like, I take pictures.
And then I share that on Signal, on IRC.
Yes, we use IRC.
I saw as a quick tangent, you mentioned IRC is like Slack for old people.
So you still use IRC?
Of course.
Yeah.
I have it on my phone as well.
Of course.
Every day.
Works fine.
Wow.
It works fine.
You have to power it with a crank?
No, but there's a while.
There's no ads.
There's no tracking.
There's nothing.
It's the biggest issue, to be honest, right, compared to Slack is that it doesn't have
threads.
That's annoying.
It doesn't have emojis for reaction.
Sometimes it would be nice.
IRCv3 has.
Yes, IRCv3, but no one does it.
And you cannot edit your messages.
Right?
And the rest, it works perfectly fine.
How do you communicate without emojis?
Well, that's why I said.
It's for old people.
Old people.
And we know emojis with, like, you know, colons and that.
Yeah, exactly.
Yeah, exactly.
That's right.
So anyway, you communicate on IRC.
What were you even talking about?
Yeah, we were talking about death threats.
But having people thanking you and something.
Sometimes I get people who send me a message and say, oh, thank you for VLC.
And I always answer because I want to validate the fact that you need to thank the open source community.
Yeah, please, everybody listening to this, celebrate, celebrate FFmpeg, celebrate VLC, celebrate all the incredible open source projects, Linux, everything.
There's so many.
There's so many.
And you know what?
I mean, even outside of open source, just celebrate.
companies that create software that you use a lot and love.
Celebrate human endeavor.
Celebrate the human effort to not just build something that's okay.
Build something that is damn good.
Yeah, this is important, right?
Because as we said, right, we work for technically.
We do something very complex for the normal people.
Like we want our excellence in tech to be useful for everyone.
And this is why, like, this is why
we work, right? This is why I wake up in the morning
is because I want people to use our stuff
because it's making everyone's life easier.
I want to solve hard problems.
Work on something interesting,
work on something, interesting technical challenges.
We're engineers. We love to build things, right?
When I was young, like very early, I knew I wanted to
be an engineer. I wanted to do cars, right?
Maybe at some point I will go back to cars, right?
But this is like we want to build things that are cool and useful.
And they need to be challenging, right?
because you want your brain to turn on.
When did the two of you first fall in love with programming,
with building, with engineering.
When is the first time you programmed, Kieran?
Microsoft QBasic.
I was on Windows 3.1 and Windows 95 with Microsoft QBasic.
Wow.
What did you build?
Like a multiplication, just counting loops like 10, 20, 30, 40.
Nice.
Then I thought I could do everything after that.
I jumped from doing that to, I want to create a soccer,
football, soccer, video game.
And I drew everything out.
It's like how I want to do it.
And I didn't quite grasp that actually...
I didn't grasp that it's actually a massive piece of work to jump from basics
and drawings and pictures to a video game.
But there we go.
I think I also did BASIC and then Turbo Pascal when I was...
Yeah, end of elementary school.
But mostly the first time I actually did some serious programming
was the first year of, you call that middle school when you're 11.
I lived in Italy for a year in Florence and it was amazing year.
And the math teacher told us to work in a programming language called Logo,
where you had a turtle that was drawing things on the screen and you'd turn left and right.
And in the end, we used that to do very complex programming because,
of course, you could do things.
And this changed.
I knew I wanted to do things with computers and programs.
I don't think we quite talked about x264 properly.
We talked about dav1d.
Can we return, backtrack a little bit, to x264,
this thing that powers basically all of the video on the internet?
So can you tell me the story of x264?
And Kieran, you're actually a contributor to x264.
So x264 is a video
encoder for the H.264 video standard. It dominates internet video, but also other areas such as
Blu-ray discs. And Blu-ray discs are interesting because the people that make them really want the
highest quality. And there are some really cool high-end films that have been encoded, and broadcasting
and all sorts of other areas. x264 was a big step change because it kind of happened at the right
time as well. A lot of the development took place when HD video was coming out. Intel Core 2 and
Nehalem CPUs were getting fast. You could do
real-time video, but the most important thing was a key sort of focus on visual metrics.
So industry and academia for 20 years before was obsessed with mathematical metrics.
So what's known as peak signal to noise ratio.
So mean squared error, logarithm of mean squared error.
And that led to tons of issues because mean squared error leads to blurring because
you actually want to, you want to add a little bit of error to everything to reduce
the mean squared error as opposed to having a big error.
And that led to loads and loads of blurring.
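To make the blurring argument concrete, here is a minimal sketch in Python. The sample values are made up for illustration and do not come from any real encoder. It compares two imperfect reconstructions of a fine texture: one blurred to flat gray, and one that keeps the texture but gets it slightly misplaced.

```python
import math

def mse(original, decoded):
    """Mean squared error between two equal-length lists of 8-bit samples."""
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    err = mse(original, decoded)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)

# A made-up row of fine texture (think grass): samples swing around mid-gray.
texture = [100, 156, 100, 156, 100, 156, 100, 156]

# Candidate A: blur everything to flat gray. All detail is gone.
blurred = [128] * 8

# Candidate B: keep the texture's energy, but shifted by one sample.
shifted = [156, 100, 156, 100, 156, 100, 156, 100]

psnr_blurred = psnr(texture, blurred)
psnr_shifted = psnr(texture, shifted)

print(f"blurred: {psnr_blurred:.2f} dB, shifted: {psnr_shifted:.2f} dB")
```

The flat gray version scores about 6 dB higher, even though a viewer would likely say the textured one looks more like grass. An encoder that only chases PSNR is rewarded for smearing a small error everywhere.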
But hobbyists bucked that trend.
It was for their own personal videos, mostly anime.
So there were two things they did differently.
And there was a big iterative feedback loop with the community.
They did some stuff differently.
Two big things. First, psychovisual rate distortion:
using block energy to try to compensate for human perception when making decisions.
So the psychovisual distortion, that's the critical thing.
That's the thing.
I mean, it's kind of revolutionary, like, that we can, like, rethink.
Don't make it like this kind of theoretic thing of compression.
Make it all about pleasing visually to the eye.
Yeah, yeah.
So compressing in a way that loses the least amount of information for the stuff that matters for us humans.
Yes, exactly.
As opposed to what industry, some parts of industry are still obsessed by this,
which is mathematical numbers that don't look good in reality.
And then adaptive quantization was the other big one where it was biasing bits against complex areas and redistributing them to less complex areas like grass.
Grass has some high frequencies, but it's kind of, it's less complex overall compared to more complicated things.
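A rough sketch of the adaptive quantization idea, in Python. This is a simplification under stated assumptions, not x264's actual formula: the function names, the `strength` parameter, and the sample blocks are all hypothetical. Each block's variance stands in for "complexity", and blocks above the average get a positive quantizer offset (coarser, fewer bits) while blocks below it get a negative one (finer, more bits).

```python
import math

def variance(block):
    """Population variance of a flat list of pixel values."""
    mean = sum(block) / len(block)
    return sum((x - mean) ** 2 for x in block) / len(block)

def qp_offsets(blocks, strength=1.0):
    """Per-block quantizer offsets, centered so they average to zero.

    Positive offset = coarser quantization (bits taken away).
    Negative offset = finer quantization (bits given back).
    """
    energies = [math.log2(variance(b) + 1.0) for b in blocks]
    average = sum(energies) / len(energies)
    return [strength * (e - average) for e in energies]

# Two made-up 8-sample "blocks": a nearly flat area and a very busy one.
flat_area = [120, 121, 120, 122, 121, 120, 121, 122]
busy_area = [20, 240, 35, 210, 60, 250, 10, 190]

offsets = qp_offsets([flat_area, busy_area])
print(offsets)
```

Because the offsets are centered on the average, the total budget stays roughly the same; bits are only moved from the busy block to the flat one, where quantization errors such as banding are easier to see.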
And this came around by Park Joy.
So Park Joy was really the canonical sample that was running around in the park.
Yeah, so this guy was really the...
So this was created by Swedish television in the beginning of HD and it was done on film and it was no expense spared in terms of production quality and it was given away for free.
This was really, and this is the sample really that sorts the men from the boys in terms of it has so many challenges with the trees, with the water, with the grass, with the motion, with the, I don't think there's still been any public test sequence as good as that.
these days. So for people who were just listening, we're looking at a bunch of humans running
along a river as you have the reflection, a lot of really high information textures everywhere,
the leaves and the lighting, playing with the leaves and all of this.
You could show clearly that encoders with high PSNR will blur everything.
And you could see, actually: I could turn on the psychovisual stuff, I could turn on adaptive quantization,
and it would just look so much better, but your metrics would get worse.
And these metrics were at the time considered so holy.
These are the holy metrics that are untouchable.
PSNR is the most important thing.
Can you speak to how you measure the psychovisual stuff?
Like, how do you turn how pleasing a compression is for a human eye into a number?
Is that even possible?
That's what Netflix have been trying to do with VMAF.
They said they've used a machine learning model.
That's the more recent thing.
But back when x264 was being developed, that was by eye:
it was developers on their laptops.
So it's not like even with big companies,
professional screens or anything.
And that was actually one of the goals, which was,
the developers at the time,
Loren Merritt in particular,
I don't want to test this on a $30,000 screen.
I want this to look good on someone's laptop at home.
Yeah.
Brilliant.
And there is another sample,
which is a sample that is a Planet Earth
killer sample that I absolutely love.
And you are going to see why.
Yeah, I love this.
It's a ton of birds, right?
Flying and the more it goes, the more there are birds.
And at the end, right, it's almost like you have millions of birds.
It's the most complex thing ever to encode, right?
And well, you're watching it on YouTube and you see how bad the YouTube encoding is actually, right?
And this is like phenomenal to optimize and get perfect quality.
in a constant bitrate.
There was a lot of optimization,
mostly by Loren also,
on anime. For a long time,
anime was very badly encoded
because there was a ton of banding,
right?
And you see those issues,
and there was a ton of things.
So x264 is like,
and today it's still the reference
for any encoder,
new encoder,
AV1, AV2, VVC, HEVC,
everyone compares to x264.
One of my favorite films,
Cinema Paradiso: I know the engineer who created the Blu-ray.
And he showed me the comparisons of x264 versus others.
It's completely different.
And I think a bunch of guys in the Blu-ray world started using x264.
I think the big one was Chris Henderson from Warner Brothers.
He did the whole French box set with it. That's quite a thing: something a person on the street
actually watches and wants to look good.
And so they kind of took a risk in their jobs doing that because they're in a big company.
That big company can buy whatever they want.
And they said, no, no, no, I want to use this.
free and open source thing so that things look good for my my customers and build the best.
And to this day, I personally still try and avoid watching the most cinematic films on streaming
services and buy the physical discs because they look, they look good without even having
to buy an expensive TV. I think that's the key thing. And x264 is yet another example of
an open source project. It was started by Laurent Aimar when he was at the École Centrale Paris,
where VLC was born. And then you got a generation of people
like Jason, like Måns, like
Henrik, and this is Anton,
and this is where the assembly thing
that we use now in FFmpeg, dav1d, and so on was born, right?
So x264 is like an amazing project with people
who were really all over the world
and I think most of them never met each other.
But all of them, according to Kieran,
or large percentage love anime.
There's several things I've never got into
and one of them is anime and I need to.
anime so much.
Especially at the time,
like, at the time
it was like a lot
of anime content doesn't
exist commercially, right?
We are before Crunchyroll, right? So what
happens is usually people who love
anime, who take some
DVDs in Japan
and rip them because
there is no commercial offering and
some of the people who are what we call
fan subbers are basically
translating themselves to make
subtitles, right? And at that time, you download
completely illegally. It was the only way to do that, right?
And so all of that was handcrafted and it fits
the open source community, right, because they needed tools
to encode, to do fansubbing, right? One of the most amazing
open source projects for subtitles is called Aegisub.
And it's a subtitle editor; it's done for anime, for
South Asian in Japanese languages.
There are weird textures in anime that
I don't think you get in real-life content.
I think that was a key one,
which was optimizing these weird textures that you get,
because anime is not done in a normal fashion.
Yeah, the way you produce it is not,
you mostly produce it on screens, right,
since a bit of time,
and you have all those gradients, right, in colors,
because they are very easy to produce digitally,
very complex to produce in real life.
And the subtitles also are very complex,
because you need to have often the Japanese,
and then,
you need to have the diacritics, right, what we call the ruby, right,
which is the hiragana and the katakana for the kanji.
And then because, of course, you, so that you have the official subtitling,
but you also need the English subtitles or the French subtitles because you want to learn that, right?
And there are so many things crazy on subtitles,
and we had, like, crazy samples on subtitles that we've seen around.
So this is an important part of the culture, but also because there was no official
offering. There was no way of doing that.
Can you speak to the difference between H.264 and AV1, and then x264 and dav1d?
This is this big step.
Can you help people understand: are some of the streaming sites moving more towards that
direction of AV1?
Let's be honest. All of those codecs since MPEG-2 video are the same concept,
the same concept about inverse transform,
about intra prediction,
motion compensation, entropy coding,
all of them.
However, each generation gives you
a bump of between 25 and 50%
more compression for the same quality.
And so you had MPEG-2,
you had the DivX era,
you had H.264, which was like a big change,
right?
H.264 improved
so much. And then you had more, right? You had HEVC, you had VP9 at the same time as HEVC. VP9
is a bit similar to HEVC in terms of quality and compression, but it's royalty-free, because in
multimedia there are a ton of patents, and the licensing after H.264 got out of hand, right,
and could cost hundreds of millions of dollars per year. So it made no sense.
So Google did this VP9, and the Alliance for Open Media did this new codec called
AV1.
So you can imagine that
AV1 uses between
40 and 60%
less bandwidth than H.264
for the same
quality, visual quality.
At a given bit rate. At a given bitrate, right?
So that's really like
you increase the quality. Either you
set the bitrate and you increase the quality
or you set the quality and you decrease your bitrate.
But because now you move from
SD to HD and HD to 4K and
4K to 4K HDR,
you're increasing the size by, like, a factor of two, three, four, right?
So you need to have better compression to keep it in terms of something that is manageable.
It's more coding tools, bigger blocks, lots more subpartitions in each block.
It's just exponentially more complex.
It's more complex because the encoder needs to search more possibilities, right?
So, for example, one of the things that is easy to understand is that, to predict
one color block from another, you have directions, right?
You can go left, right, bottom, up,
and then in terms of like the other quadrants, right?
What I call north-east, north-west, and so on, right?
But that's eight directions.
Then you can do more direction.
You can do 16 or 69 or 128, right?
You can, and every time your encoder is going to spend more time
to see, oh, well, this block is exactly this one,
and with those types of tools that you
can bring, the encoder needs to check which of the tools are going to compress better.
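The search described here can be sketched in a few lines of Python. This is a toy under stated assumptions: only three prediction modes, a 4x4 block, and a sum-of-absolute-differences cost, where real encoders weigh distortion against bit cost and have many more directions. The mode names and sample values are illustrative.

```python
def predict(mode, top, left):
    """Build an n x n prediction from the row above and the column to the left."""
    n = len(top)
    if mode == "vertical":      # copy the row above straight down
        return [list(top) for _ in range(n)]
    if mode == "horizontal":    # copy the left column straight across
        return [[left[r]] * n for r in range(n)]
    if mode == "dc":            # fill with the average of all neighbors
        dc = round((sum(top) + sum(left)) / (2 * n))
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")

def sad(block, prediction):
    """Sum of absolute differences: how much residual is left to encode."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, prediction)
               for a, b in zip(row_a, row_b))

def best_mode(block, top, left, modes=("vertical", "horizontal", "dc")):
    """Try every mode and keep the cheapest. More modes means more work."""
    costs = {m: sad(block, predict(m, top, left)) for m in modes}
    return min(costs, key=costs.get), costs

# Made-up neighbors and block: the block has vertical stripes.
top = [10, 50, 90, 130]
left = [12, 11, 10, 12]
block = [
    [10, 50, 90, 130],
    [11, 51, 89, 131],
    [10, 49, 90, 130],
    [12, 50, 91, 129],
]

mode, costs = best_mode(block, top, left)
print(mode, costs)
```

Every extra direction adds one more `predict` and `sad` pass per block, which is one reason newer codecs cost so much more to encode than to decode: the decoder is simply told which mode won.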
And so I guess that AV1 encoding is two orders of magnitude more than H.264 in terms of CPU cycles,
right? Orders of magnitude, right?
And as we discussed, CPUs are not getting faster. You're just throwing more cores at the problem.
But also, it's a fact that you encode once and you have hundreds of millions of users, right?
So, for example, YouTube, a very good example.
YouTube encodes almost everything in H.264,
but the popular videos get re-encoded in AV1
because it costs more, of course, to encode,
but you encode once and you send that to millions, right?
So it's a trade-off between encoding time and complexity
and CPU usage on the server side and on the client side.
Because at the end, if you're distributing a video to hundreds of thousands of people
and the size is half of the other, then it's better.
It's better for your batteries, better for your modem, et cetera, et cetera.
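The encode-once, serve-many trade-off comes down to simple arithmetic. The sketch below uses entirely hypothetical numbers in arbitrary cost units: an AV1 encode that costs 100 times an H.264 encode (the "two orders of magnitude" guess above), a file half the size, and a made-up delivery price per megabyte.

```python
def total_cost(encode_cost, megabytes_per_view, views, cost_per_megabyte):
    """One-time encoding cost plus delivery cost for every view."""
    return encode_cost + megabytes_per_view * views * cost_per_megabyte

def break_even_views(extra_encode_cost, megabytes_saved_per_view, cost_per_megabyte):
    """Number of views after which the more expensive encode pays for itself."""
    return extra_encode_cost / (megabytes_saved_per_view * cost_per_megabyte)

# Hypothetical figures, chosen only to illustrate the shape of the trade-off.
H264_ENCODE, AV1_ENCODE = 1.0, 100.0   # arbitrary cost units
H264_MB, AV1_MB = 100.0, 50.0          # file size per view
COST_PER_MB = 0.01                     # arbitrary cost units per megabyte

threshold = break_even_views(AV1_ENCODE - H264_ENCODE, H264_MB - AV1_MB, COST_PER_MB)

for views in (5, 1000):
    h264 = total_cost(H264_ENCODE, H264_MB, views, COST_PER_MB)
    av1 = total_cost(AV1_ENCODE, AV1_MB, views, COST_PER_MB)
    print(f"{views} views: H.264 costs {h264:.1f}, AV1 costs {av1:.1f}")
```

With these invented numbers the expensive encode only wins past a couple of hundred views, which matches the pattern described: cheap encodes for everything, costly re-encodes for what turns out to be popular.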
So we can lay out, let's say, the top five codec-container combos.
That would be H.264 inside MP4 containers,
AV1 inside MP4 or WebM containers,
ProRes for nonlinear editing,
inside MOV containers.
So for people who don't know, I guess ProRes is,
It's Apple's codec for editing,
originally for Final Cut Pro,
and it's designed to be fast to decode,
fast to seek,
because an editor will need to move very quickly.
So it's a different use case to the distribution element.
There's no or very minimal temporal compression.
There's none in ProRes.
So you can cut, so you can do cuts.
This is what we call intra-only codecs, right?
So I'm going to explain quickly what I-, P-, and B-frames are.
Yes, please.
So I-frames, often called keyframes, are complete frames.
It's like an image.
It's a JPEG.
You can start, you see everything, right?
And then the next image can be a P frame, which is a predicted frame.
So you take some part of the previous image saying, well, I need the block 5 and 7 and 42,
and you replace it, and then you just give the extra information, right?
But that means that in order to decode this P-frame,
you need to have access to the previous I-frame, right?
And then, of course, you have more complex ones,
which are B-frames, which are bi-predicted frames,
which can depend on different types of frames,
some in the past, some in the future.
And so, ProRes is an intra-only codec.
For the people who can see, this is a very good one, right?
So I-frames are complete frames,
P-frames basically depend only on I-frames,
and B-frames can also depend on frames in front.
And this GOP, group of pictures:
I think the default for actually FFmpeg
for H.264 is like 250 frames, something like this.
Yes.
And to me, it's just magic.
Like, you can predict that you can have a complete frame every...
Several seconds.
that means.
Several seconds.
And then you could still,
you could have this chain of predictions
you make.
And the fact that you can,
the fact that somebody like me
can use FFmpeg to compress something
and not notice that the result
still plays back smoothly,
it's like magic.
You can even have,
and we use that at Kyber,
what we call intra-refresh,
where basically
there are no I-frames.
You have no,
you have one at the beginning.
And you never send an I-frame again.
How does that work?
You build up an I-frame gradually as the stream continues.
So you refresh certain parts of the image.
But so you never have an I-frame.
This is intra-refresh that we use, right?
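A toy model of intra-refresh in Python, under a simplifying assumption: the picture is split into vertical bands and each frame re-sends exactly one band as intra data, cycling left to right. The function names and the band count are hypothetical, and a real encoder also has to stop motion prediction from reaching into bands that have not been refreshed yet, which is omitted here.

```python
def refreshed_band(frame_number, num_bands):
    """Which band is intra-coded in this frame (simple round-robin)."""
    return frame_number % num_bands

def frames_to_full_picture(num_bands, join_frame):
    """Frames a decoder must receive, starting at join_frame, before every
    band of the picture has been refreshed at least once."""
    clean = set()
    frame = join_frame
    while len(clean) < num_bands:
        clean.add(refreshed_band(frame, num_bands))
        frame += 1
    return frame - join_frame

# A viewer tuning in mid-stream, with the picture split into 8 bands.
wait = frames_to_full_picture(num_bands=8, join_frame=3)
print(f"full picture after {wait} frames")
```

The appeal for low-latency streaming is that every frame carries a similar, small amount of intra data, instead of one very large I-frame arriving periodically.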
But for me, the biggest mind-blown when I started was the B-frames.
Yeah.
B-frames means bi-predicted frames that can depend on frames that are coming in the future.
That means that in order to decode this B-frame, you need to wait
for the next frame that it depends on,
buffer that, decode that one so that you can decode the B-frame, right?
So the way you decode the frames, the decoding order,
is not the same as the display order, right?
That means the encoder needs to be very clever
and decide that, well, you know, I'm going to depend on things like in the future.
So this is like mind-blowing.
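A small Python sketch of why decode order differs from display order. It assumes the classic, simplified pattern where each B-frame depends on the nearest I- or P-frame on either side; modern codecs allow more flexible reference structures. Frame labels are a letter for the type plus the display position.

```python
def decode_order(display_order):
    """Reorder frames so every B-frame comes after the future frame it needs.

    Assumes each B-frame references the nearest I/P frame before it and the
    nearest I/P frame after it (a simplification).
    """
    ordered, waiting_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            waiting_b.append(frame)       # cannot decode yet: needs a future frame
        else:
            ordered.append(frame)         # the I/P frame goes first...
            ordered.extend(waiting_b)     # ...then the B-frames that were waiting on it
            waiting_b.clear()
    ordered.extend(waiting_b)             # trailing B-frames, if the input ends on them
    return ordered

display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(decode_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The decoder receives P3 before B1 and B2, holds it in a buffer, and only shows it after the two B-frames. That buffering is also where some of the latency in B-frame-heavy streams comes from.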
Yeah.
The fact it works so smoothly every day is kind of miraculous in some ways.
It works so.
You can have a stream that works across the world, on one decoder there versus one in the US versus one here, from different manufacturers.
And they produce bit for bit exactly the same material.
That's quite remarkable and do quite complex things and getting more and more complex and still be bit exact.
There's a lot of work that goes into that.
There's a lot of knobs you can control in this whole process.
There's a lot of really fascinating parameters that I've gotten to know more and more over the years that FFmpeg gives you complete
access to. Maybe you can speak to some of them. So first of all, like, obviously, we can lower
the resolution, we can lower the frame rate, we can use different kinds of codecs, as we mentioned,
from H.264 to AV1. There's ways to tune the trade-off between bit rate and quality, as we've
kind of spoken to. You know, you could do constant bit rate, you can do constant quality, say,
CRF, CQ, QP. We can do a longer or shorter group of pictures, the GOP that we mentioned.
All that kind of stuff.
It's crazy.
Number of B frames.
Yeah.
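For a sense of what those knobs look like in practice, here is a Python sketch that assembles an `ffmpeg` command line for the libx264 encoder. The flags used (`-c:v`, `-preset`, `-crf`, `-g`, `-bf`, `-c:a`) are standard FFmpeg options; the file names and the specific values are placeholders rather than recommendations, and the command is only built and printed, not executed.

```python
import shlex

def build_ffmpeg_command(src, dst, crf=23, gop=250, bframes=3, preset="medium"):
    """Assemble an H.264 encode command as an argument list.

    crf:     constant-quality target (lower = better quality, bigger file)
    gop:     maximum distance between keyframes, in frames
    bframes: maximum number of consecutive B-frames
    preset:  speed/compression trade-off of the encoder
    """
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-preset", preset,
        "-crf", str(crf),
        "-g", str(gop),
        "-bf", str(bframes),
        "-c:a", "copy",      # leave the audio stream untouched
        dst,
    ]

# Placeholder file names, for illustration only.
cmd = build_ffmpeg_command("input.mov", "output.mp4")
command_line = shlex.join(cmd)
print(command_line)
```

Passing the list to `subprocess.run(cmd, check=True)` would run the encode on a machine with FFmpeg installed. Keeping the arguments as a list avoids shell-quoting problems with unusual file names.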
What is crazy is that a ton of people's job is to optimize those parameters, right?
A ton of people that you see at YouTube, at Netflix, at Meta, and so on.
They're not writing codecs.
They're just like finding the right parameters for the file they have, for the format they have, right?
Because like something that is for a movie or something that is user-generated content from your
phone or a screen recording or something that you're going to video edit, you don't want the same
things. And there are thousands of people whose job is just to optimize all that. Yeah, they're wizards.
Hats off to them. YouTube, like, to deliver... all the streaming sites, actually, to deliver at scale.
And, like, YouTube is really magical because it's not just doing what Netflix does, which
is a one-way, broadcasting-type thing. It
also has to upload videos from all the places.
So they're also doing encoding at scale.
For videos that are going to be watched by like five people.
And it still has to deliver them, like,
on a moment's notice, no delay, nothing.
No late, I mean, very minimal latency.
And also serve it in all different resolutions.
Like, YouTube is basically the web version of VLC.
Well, actually, it's funny,
like Google Video, which was something they did before they acquired YouTube,
was actually using the VLC plugin so that you could run VLC inside the web browser
using the ActiveX plugin.
And so it worked in Internet Explorer.
And you were actually running VLC inside your browser,
which is funny because today we have the opposite, where we have VLC WebAssembly,
where we compile all of VLC and FFmpeg to decode, to run VLC
inside the JavaScript virtual machine with WebAssembly.
Okay, there's this legendary story that you pointed me to,
that it was discovered via the WikiLeaks release of Vault 7 documents
that the CIA was using a modified version of VLC to basically try and trick people,
what, to steal their data?
Yes, exactly.
So can you explain what the heck happened?
So this was a surprise, right?
Because at some point, WikiLeaks mentioned some documents.
There were a few with something related to Blu-rays and VLC.
But the most interesting one was the CIA Vault 7,
which, if I understand correctly,
was the CIA had like a custom version of VLC,
where they had a specific plugin.
Yeah, exactly.
This is like we had to write a press release on that.
VideoLAN wrote a press release saying the
only safe source for getting VLC media player is the official VideoLAN website.
I mean, I suppose that's a security vulnerability for basically any piece of open source software.
Somebody can trick you into downloading from a fake website or through a targeted advertisement.
That was a targeted advertisement: to watch a specific file, you need to watch it with this custom version of VLC.
And it was the normal binaries of VLC, except they added one DLL.
I think it was psapi.dll, which was basically reading your documents folder,
encrypting that and sending that.
And the thing is, this is very clever, to be honest,
because once you're watching a movie, right,
you're going to do that for two hours,
and you're not going to touch your computer.
And sometimes it's normal, because it's HD, that your fans are going up and so on,
and there is some CPU usage,
because you're using VLC, right?
That's normal.
But the thing is, what you don't see is that active.
a powered version of VLCs that is used by CIA.
We had exactly the same problem with Chinese hackers
that were targeting Indian people
and that got VLC banned in India
until I had to fight in courts in India,
the Indian government, to un-ban VLC.
They didn't use VLC.
They took just one DLL, because we sign the
DLL correctly.
And they used that DLL
to do another program.
So you had a vlc.exe
that was calling libVLC,
but it was calling into a fake one,
and they used that to target.
There is not much we can do,
actually, to block those types of hacks.
Yeah, I think people should,
for all open source software,
for all software in general,
people should pay attention
where they download the thing.
Yes, because that means
that they were not downloading it
from our website.
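One practical habit that follows from this is checking a download against a digest published by the project. The Python sketch below assumes the project publishes a SHA-256 digest for each release file and that you obtained that digest from the official site; a digest copied from the same fake page proves nothing. The "installer" here is a temporary stand-in file so the example runs anywhere.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large installers do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_digest(path, published_hex):
    """True if the file's SHA-256 equals the digest published by the project."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())

# Stand-in for a downloaded installer (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as fh:
    fh.write(b"pretend this is an installer")
    installer_path = fh.name

# Stand-in for the digest listed on the official download page.
published = hashlib.sha256(b"pretend this is an installer").hexdigest()
genuine_ok = matches_published_digest(installer_path, published)

# Simulate a repackaged download with something extra bundled in.
with open(installer_path, "ab") as fh:
    fh.write(b" plus one extra DLL")
tampered_ok = matches_published_digest(installer_path, published)

os.remove(installer_path)
print(genuine_ok, tampered_ok)
```

A digest check catches a repackaged installer, but it would not have caught the case described above where a correctly signed DLL was reused inside a different program; that needs the operating system or the user to notice what is loading it.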
Do the search engines help you?
No, they don't.
Just to clarify it,
Because you can, you know, to prevent threats from people manipulating SEO to get up there and links.
Absolutely not, right?
We've had a big issue for, like, more than 10 years.
There is a fake version of VLC in Germany that has been reported for 12 years now.
And Google basically decides to not, they know what's in it.
But the binary is too big for their virus analyzer to analyze it.
And so, well, if you're in Germany, you can go to a website.
that is a fake version of VLC with a custom installer,
and it's very popular in Germany because their website is in German,
and Google shows it before VideoLAN.
And the weirdest thing is that it doesn't do anything on your machine for three weeks,
because that's how they do the detection.
And after three weeks, there is a small program, a service that is
installed at the same time, that wakes up after three weeks,
and it starts downloading spyware and adware.
And Google knows about it.
They've decided not to do anything.
The guys use dark SEO in Germany to do that at some point.
And this is very damaging, right?
Because one of the things that they're downloading
is actually something that is replacing your ads inside your machine, right?
It's actually quite surprisingly effective.
Whoever is doing it with Twitter, with X.
I'll get emails about your X account has been hacked.
And however they phrase it, it gets me to,
like at least click on the email not to follow the thing.
And then you're like, man, whatever they're doing with the psychology to try to trick you,
they're quite good.
There is a security version of VLC, right?
You receive an email saying, hey, there is a security version.
Update VLC.
Think about updating right now because it can hack your computer.
You come, it's a website that looks decent.
And you download it; it's a new version of VLC.
Great.
You don't know.
A month later, you're hacked.
You have no idea.
You're part of a botnet.
Yeah.
So make sure wherever you're downloading stuff, it's legitimate.
I'm part of the bot.
Speaking of which, so you've mentioned that VLC sandbox is something you're working on,
and it's actually something quite challenging.
Why is it important?
Why is it hard?
So VLC is a core with around 500 plugins, right?
One of them is FFMPEG, but we support so many other formats.
We support new protocols.
We support new filters, we support weird architectures.
And in this release of VLC, you have modules that are going to call your drivers, right?
Mostly the hardware decoders, which are going to call your Intel, your Nvidia, your AMD driver,
or calling FFmpeg, right?
And there might be a security issue.
There might be a security issue in the shader.
There might be a security issue in VLC, in FFmpeg, that is going
to basically crash.
The issue is that you're running VLC,
like every other program, like Adobe, right?
You're running it on your machine,
and it has access to all your documents, right?
So the idea is to be sure that you do a sandbox
so that we can protect from ourselves.
Because inside the VLC process is running some code
that is not even ours,
either it's open source for other projects
that we integrate in VLC,
or it's your GPU driver
or something.
that is provided by someone else inside.
And so when we crash,
we want to not allow people to do bad things, right?
Because one of the common ways of hacking people
is to crash a program,
very often done with a web browser,
very often done with PDF files,
less often with multimedia, but that could happen.
And when you crash, you launch something
on the machine of the person.
Could be a ransomware, could be a botnet, right?
So security of desktop applications is important.
On mobile, it's a bit
different because most of the mobile applications are running inside their own sandbox.
But for VLC, we could run it inside one sandbox.
But the problem is that we need access to so many things that basically we would
have all the permissions.
And so if you have a sandbox and you put some holes everywhere, it defeats the purpose.
So what we are trying to do and we're actually doing is splitting VLC into several processes.
one is decoding, one is demuxing, one is filters,
and all of them run in their own sandbox,
so that if a part of VLC crashes,
like when Chrome crashes on some tab, right?
A crash is a crash, but it does not crash the whole program,
and this is what we're trying to do,
and it's difficult because it's a sandbox
that needs to sustain gigabytes per second of memory copies.
It's not a website, which is 5 megabytes or 10 megabytes.
We're talking about hundreds of megabits per second.
So this is why it is quite challenging.
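A back-of-the-envelope calculation shows where figures like "gigabytes per second" come from once video is decoded. The Python sketch below assumes 4:2:0 chroma subsampling (1.5 samples per pixel) and that 10-bit samples are stored in 16-bit words, which is common but not universal; the resolution and frame rate are just an example.

```python
def raw_frame_bytes(width, height, bits_per_sample=8, chroma="4:2:0"):
    """Size of one uncompressed frame in bytes."""
    samples_per_pixel = {"4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}[chroma]
    bytes_per_sample = (bits_per_sample + 7) // 8   # 10-bit rounds up to 2 bytes
    return int(width * height * samples_per_pixel * bytes_per_sample)

def raw_bytes_per_second(width, height, fps, **frame_options):
    """Uncompressed throughput for a stream of frames."""
    return raw_frame_bytes(width, height, **frame_options) * fps

uhd_8bit = raw_bytes_per_second(3840, 2160, 60)
uhd_10bit = raw_bytes_per_second(3840, 2160, 60, bits_per_sample=10)

print(f"4K60 8-bit:  {uhd_8bit / 1e9:.2f} GB/s")
print(f"4K60 10-bit: {uhd_10bit / 1e9:.2f} GB/s")
```

That is the cost of moving each decoded frame across a process boundary once. If the decoder, the filters, and the output each live in separate sandboxed processes and frames are copied between them, the total multiplies, which is why avoiding copies matters so much in this design.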
And this is a research topic that we are working on
in order to have a multimedia player that is secure.
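The process-splitting idea can be illustrated with a very small Python sketch. It only shows crash isolation: each stage runs as a separate child process, so a failure in one does not take down the parent. It is not a sandbox, since a real one would also strip each child's access to files, network, and system calls using operating-system mechanisms. The stage contents are made-up one-line scripts.

```python
import subprocess
import sys

def run_isolated(stage_source):
    """Run one pipeline stage in its own Python process.

    Returns (exit_code, stdout). A crash in the child shows up as a
    nonzero exit code instead of an exception in the parent.
    """
    result = subprocess.run(
        [sys.executable, "-c", stage_source],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.returncode, result.stdout.strip()

# Hypothetical stages: one behaves, one dies on malformed input.
healthy_stage = "print('demuxed 3 packets')"
crashing_stage = "raise RuntimeError('malformed stream')"

ok_code, ok_output = run_isolated(healthy_stage)
bad_code, bad_output = run_isolated(crashing_stage)

# The parent is still here and can decide what to do: restart the stage,
# skip the file, or show an error.
print(ok_code, ok_output)
print(bad_code)
```

The hard part described above is not starting the processes but feeding them: the data that crosses these boundaries is decoded video, so the communication channel has to handle very high throughput without becoming the bottleneck.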
This is all the kind of stuff you have to think about
when millions of people are using it.
You've mentioned something somewhere
where all the different features of VLC,
when you have that many people using it,
somebody will use every single feature
and they will tell you about it.
Best feature in VLC is called the puzzle filter, right?
So you click the puzzle filter and it transforms your video into a jigsaw puzzle, right?
And you can click and move the pieces, right?
Yep.
It's very, very useful when you're watching a French movie, right?
You're bored because it's like very long things or a love triangle, right?
We've seen that so many times, right?
But you need to watch it because someone, your wife, told you to do that, or your
boyfriend told you to do that.
So you're doing that, right?
And you can click and move the pieces around.
It's absolutely useless, right?
Like, who cares about that?
First, it was done by a math teacher in a high school in the south of France to teach his students about Bézier curves,
which is something that everyone should know about, right?
It's very useful.
But the code was clean.
So it got in VLC.
It was merged in 2010.
Five years later, I received an email and saying, hello, J.B.
I have a problem with VLC.
The puzzle is too simple.
And I was just like, what?
And yes, the puzzle was, in the UI,
16 by 16 at maximum, right?
Only 256 pieces.
And he says, I'm sorry, but in a movie,
I love puzzles.
This is too simple, right?
So there is a commit of mine.
You can check it online,
which is J.B. changing
the dimensions to 256 by 256.
My point is
so many use features.
are used by a few people, right?
There is a way to watch movies in VLC on the command line
without any UI, right?
I saw that. You can do ASCII.
ASCII art.
Is it useful?
Very useful.
Imagine you're debugging a multicast network, right?
You have thousands of devices,
a very complex networking stack, right?
You can SSH to all of the routers
and put VLC on it with no UI.
And you're going to see whether it's black
or it's not black, right?
So you see if,
oh, it's all green or not all green, right?
Amazing.
People don't realize there are so many things in VLC that are useful.
And they have users because once you have hundreds of millions of users,
you have people who use every feature.
I would love to sort of zoom in and talk a little bit more about the distinction
between kind of downloading a file,
watching it offline versus streaming.
So the complexity, the challenges of streaming.
Is there something we could say about what it takes to stream files?
Because we've been talking about codecs,
and I think a lot of that implies encoding and decoding
without having to communicate over the network.
Sure, sure.
So can you elaborate what's required to do over the network stuff?
Yeah, but it is less complex than it
seems compared to everything that we've talked about.
Especially because the most complex thing is not about streaming in terms of streaming
services, but it was what was done to actually broadcast through satellites.
Because in most of the modern streaming services, you can pause and you can resume.
But when you're sending a live stream, whether it's broadcast or live for streaming services
which are live, this is much more
difficult because you need to encode in real time.
When you go on a satellite, you have a specific size of the link, right?
You cannot have a burst of bandwidth even for a second, right?
Because you don't have the space for that in your total file.
However, there are different types of challenges, which are interesting challenges,
but I think they are less complex than the ones we saw in the late 90s and early 2000s
with broadcasting and streaming through satellite.
They're different.
There are control systems challenges,
whereas some are more mathematical.
I think there's a difference.
In the streaming world,
what you have is what we call adaptive streaming.
Because the difficulty,
and it's not really a video problem,
it's mostly a CDN problem,
is that you might have too many people watching the same thing
at the same time,
and it's a congestion of the network.
So your player has difficulty downloading things fast enough to play them.
So what happens is that locally,
the player is going to read
a lower resolution of it.
There are some very clever algorithms to do that,
but most of it is quite basic, to be honest.
Even the buffering side is pretty basic.
Yeah, you start to download a segment,
what we call a segment, and then you time, right?
And if it takes more than 50% of the segment's duration to download the segment,
you go down, right?
And the difficulty is more about when do you go up in bandwidth,
in quality.
But this is not very complex to do.
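The segment-timing rule described above can be sketched in a few lines of Python. This is only an illustration, not any real player's logic: the ladder values, the name `next_rung`, and the 25% threshold for stepping back up are assumptions. Only the 50% step-down threshold comes from the conversation.

```python
# Hypothetical bitrate ladder in kbit/s, lowest to highest (seven rungs).
LADDER_KBPS = [300, 700, 1500, 3000, 5000, 8000, 16000]

def next_rung(current, download_s, segment_s, down_ratio=0.5, up_ratio=0.25):
    """Pick the ladder index for the next segment.

    current     -- index of the rung used for the segment just downloaded
    download_s  -- how long that download took, in seconds
    segment_s   -- how much playback time the segment contains, in seconds
    down_ratio  -- step down if the download used more than this share (from the talk)
    up_ratio    -- step up if it used less than this share (assumed value)
    """
    ratio = download_s / segment_s
    if ratio > down_ratio:
        return max(current - 1, 0)                      # too slow: lower quality
    if ratio < up_ratio:
        return min(current + 1, len(LADDER_KBPS) - 1)   # plenty of headroom: raise quality
    return current                                      # otherwise stay put

# A 4-second segment that took 2.5 seconds to fetch triggers a step down.
print(LADDER_KBPS[next_rung(3, download_s=2.5, segment_s=4.0)])
```

The gap between the two thresholds is a simple way to avoid flipping up and down on every segment, which is the "when do you go up" difficulty mentioned above.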
When you encode, you're going to encode seven resolutions, right?
And you're going to give the bitrate.
The difficulty is to have your encoder
give the same bitrate,
but it's not as strict as it used to be.
So probably YouTube has to figure out the human psychology side of that,
like how pissed off do you get when it's a very low bitrate?
And how long should it wait before it increases the bitrate,
even though the connection is better,
because maybe the change in the bitrate is what, like, affects you psychologically.
I think, actually, the interesting one is the audio.
That's true.
You can kind of notice when they move from full-fat AAC to the,
there are compressed versions of AAC that use spectral band replication.
You can kind of see it goes a bit tinny,
and that up and down is very jarring.
The video side is a lot smoother and it's less noticeable.
It's really the audio.
You can definitely feel it when it has moved you from one audio profile to another.
I don't know.
We're surprisingly tolerant at skipping audio glitches.
I'm surprised how tolerant people I know who are not video engineers are,
how tolerant they are to watching sports at 30 FPS, for example,
whereas it should really be 60.
The world is a lot more tolerant to that, but with audio, people are very...
it's an immediate feedback mechanism.
If you hear a glitch,
you notice it directly. Yeah.
I get to fully realize that, I suppose,
one of the things I'm afraid of when I listen to audio
more and more, that I get to notice
every single tiny detail, and that
you can over-obsess when
people in general are able
to kind of blur their
consumption. They can
look past certain imperfections.
But then when you combine
like an event,
that is, for example, a sport event,
that is probably going through satellite
or somewhere else
and goes to a central place for encoding,
and then you need to encode all those resolutions
in real-time, you don't have time for QA,
you need to push that to CDNs,
you need to add probably DRM for protection,
you need to have that over a ton of different devices.
Then, yes, it is complex,
and also you're
in the web browser or on very different devices
from the ones you used for television,
where you had a defined set-top box
or cable box that you controlled end-to-end.
So it's a challenge, but it's less,
I think the networking part,
as long as you accept 10 to 20 seconds of latency,
I don't think this is very difficult.
Speaking of networking and latency,
so your new effort, as we mentioned, is Kyber,
which is aimed at ultra-low latency.
As you say, every millisecond counts.
And you're applying that to remotely controlled machines
like robots, drones, computers.
Can you tell me about it?
Sure.
If you start from where we used to be, right,
you used to use FFmpeg to encode files, right?
And then we used FFmpeg and VLC to encode in streaming services, right?
And then you need to go lower and lower.
And the question was, how far can we go?
And this question is very important because there are many use cases where you need to be fast.
And it's when you have feedback interaction, right?
You are not just listening to something.
You're actually controlling it, right?
And that's the biggest difference compared to what we've done so far: I need video to get feedback on something that is happening live, whether it's a drone flying, whether it's
controlling a humanoid robot from a distance,
whether it's controlling a rover,
whether it's playing a video game through cloud gaming,
because this is what I did in a previous job, right?
I was the CTO of a cloud gaming startup.
And this is a very interesting topic
because you push to the limit, the network.
You need to care not about the quality,
like we've done on video,
and as we've talked about with x264.
You care about latency, because a millisecond is meaningful when you're controlling a car, right?
Well, you've seen, you've used Waymos, right?
When Waymos don't work, and that happens, even if one percent of the time,
there is someone that is basically remote controlling that.
And this is exactly the stuff that we're building.
It's really an SDK platform to do end-to-end control of machines.
So this comes up
quite a lot in a lot of different contexts
in robotics, so obviously teleoperation.
Teleop is becoming more and more
important, including for
training robots
via machine learning.
Yes, and what we do is a bit different
from everyone else, is that
we take only one socket,
one connection, which is the QUIC
protocol, based on UDP,
which is interesting
because it's made for low latency. It doesn't have
what we call the TCP head-of-line problem or the HTTP head-of-line problem,
and it's encrypted by default. But on the same wire, we send multiple streams, like multiple
tracks. We send audio, we send video, but we also send the commands, right?
Mouse, keyboard, gamepad, and so on. And we do that while maintaining coherence, right?
Synchronization, because what people don't realize is that all the clocks actually drift.
And when you're controlling a robot, a robot is going
to have like two cameras, five cameras, 10 cameras,
a ton of sensors, GPS, and so on.
And if you want to train your robotic AI model correctly,
you need to have all of those in sync and coherent.
And what we've done, and it's all the stuff that we learned on VLC and in real-time broadcast,
and MPEG-TS, which Kieran knows well, is that we account for clock drift.
And so when I record a Kyber stream from a robot,
I am sure that it's going to be predictable in the way it plays back.
And so when you're going to do recording and training of your AI model,
you need to be sure that every time you retrain based on the data,
the data is going to stay coherent.
And clocks actually drift.
Like, the existing solutions work with one camera.
Once you have five or six, it's more complex.
So you want to make sure that the visual snapshot perfectly matches
the time it actually happened.
Exactly. And also, if you're going to control, right,
I do something on robot. I need to be sure
that it is actually happening
at that precise time, right?
And so we have on the server,
which would be a robot, a kind of
re-timestamping mechanism
accounting for clock drift for that, right?
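As a rough sketch of what accounting for clock drift can look like: if a device clock runs slightly fast or slow relative to a reference clock, a straight-line fit over paired timestamps gives a rate and an offset that map device time onto the reference timeline. The function names and the linear model are assumptions for illustration only. The conversation does not describe how Kyber actually does its re-timestamping.

```python
def fit_clock(pairs):
    """Least-squares fit of ref = rate * dev + offset.

    pairs -- list of (device_timestamp, reference_timestamp) in seconds,
             with at least two distinct device timestamps.
    """
    n = len(pairs)
    mean_dev = sum(d for d, _ in pairs) / n
    mean_ref = sum(r for _, r in pairs) / n
    var = sum((d - mean_dev) ** 2 for d, _ in pairs)
    cov = sum((d - mean_dev) * (r - mean_ref) for d, r in pairs)
    rate = cov / var
    offset = mean_ref - rate * mean_dev
    return rate, offset

def retimestamp(dev_ts, rate, offset):
    """Map a device timestamp onto the reference timeline."""
    return rate * dev_ts + offset

# Hypothetical camera clock: starts 2 s ahead and runs 100 ppm fast.
def camera_clock(ref):
    return 1.0001 * ref + 2.0

samples = [(camera_clock(t), t) for t in (0.0, 10.0, 20.0, 30.0)]
rate, offset = fit_clock(samples)

# A frame stamped by the camera one minute in lands back at ~60 s reference time.
print(retimestamp(camera_clock(60.0), rate, offset))
```

With several cameras and sensors, each device would get its own fit against the same reference, which is what lets the recorded tracks line up on replay.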
So that's one of the use cases of Kyber,
to control robots.
I also think, like, remote drones,
remote, whether it's defense or non-defense,
remote cars,
remote submarines.
There are many places in industry
or remote surgery
where the expert
cannot go everywhere the machine is
because it's either dangerous
or it's too costly, right?
So you allow people to have machines
next to you, right?
The goal of Kyber is to make distance disappear
because it's either projection of skills
or projection of power, right?
So imagine we are all
wearing the Meta Ray-Bans
and everything else, right?
You need to stream there, right?
Because you're not going to run anything over there, right?
So you need GPU power that's in the cloud, or on the phone, to stream that.
And so all of these use cases need to be not just about extremely low latency,
but real-time latency for video.
And so that means we're toying with the encoders
so that they encode a frame in four milliseconds.
And Kieran with his company also goes under those kinds of latencies.
Because you need to optimize at max the local latency, right?
Because it's decoder, the encoder, and so on.
Because this time is going to be added to your networking time.
And it's not just about low latency.
It's also about reliability.
We do clever things like forward error correction, right?
So forward error correction is that you over-transmit a bit of data, right?
A few percent, and because you over-transmit, you're allowed to lose some packets.
Because all of that is very difficult over an internet network,
where you're going to do things very far away.
And if you check that all packets are delivered, you add a ton of latency.
If you don't want latency, what we do is that we over-transmit some data
so that you can reconstruct on the client side when things are broken.
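The simplest form of this idea is a single XOR parity packet per group: send the group plus one extra packet, and any one lost packet in the group can be rebuilt without a retransmission. The sketch below is a toy under stated assumptions (equal-length packets, at most one loss per group, made-up function names). The conversation does not say which scheme Kyber uses, and production systems typically use stronger codes such as Reed-Solomon with much lower overhead than one in four.

```python
def xor_parity(packets):
    """XOR equal-length packets together into one parity packet."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)

def recover(received, parity):
    """Rebuild a group in which exactly one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("single XOR parity can repair exactly one lost packet")
    present = [p for p in received if p is not None]
    repaired = list(received)
    # XOR of all surviving packets and the parity equals the lost packet.
    repaired[missing[0]] = xor_parity(present + [parity])
    return repaired

# Sender side: four data packets plus one parity packet.
group = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_parity(group)

# Receiver side: the second packet never arrived.
damaged = [group[0], None, group[2], group[3]]
print(recover(damaged, parity) == group)
```

The trade-off is the one described above: a little extra bandwidth is spent up front so the receiver never has to wait a round trip for a retransmission.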
And a few weeks ago, we were doing a demo in Las Vegas for CES,
where we had a rover that is fully 3D printed.
It's very simple.
It's a car, right?
It's a small car with a telescopic arm.
And it was actually controlled from France, right?
And the video was with a webcam and a very small server, right?
A small PCB was basically running and sent that to someone that is on the other side of the class.
And so there are so many use cases.
You can also think about having AIs
that are going to control many drones and so on.
And technically, we need to be amazing in video.
We need to be amazing at networking.
We need to care about every millisecond in networking,
in encoding time, in decoding time.
And also you need to integrate very low level.
So sync everything together well, but what kind of latency can you get to?
Like when you say milliseconds, what's the goal?
So my goal is four milliseconds glass-to-glass latency.
What does glass-to-glass mean?
So it's easy, right?
You have a computer, which is running a program, right, probably a video game,
and this one is actually running, right?
It could be an example of a robot, right?
And you have the replicate that is done through the network.
And you want, if you take a 1,000-hertz camera,
you can take a picture and you want that to be at 4 milliseconds.
4 milliseconds means 240 hertz, right?
Yes, not.
So far, we achieve 7 milliseconds from Windows to Windows or Windows to Mac.
And if you look at the timing, most of it is around 3.5 milliseconds inside the Nvidia hardware encoder
and around 2 milliseconds on the Intel decoder.
So like the encoder plus the decoder is already 6 milliseconds, right?
So in order to go down, we need either to have some other type of codec
or some better encoders that are faster.
But four milliseconds would be the grail.
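The arithmetic behind these figures can be laid out explicitly. The numbers below are the ones quoted in the conversation (7 ms measured, about 3.5 ms in the encoder, about 2 ms in the decoder, a 4 ms target). The function names and the way the remainder is lumped together as "everything else" (capture, network, display) are assumptions made for illustration.

```python
def frame_period_ms(refresh_hz):
    """Time between two frames at a given refresh rate, in milliseconds."""
    return 1000.0 / refresh_hz

def latency_budget(total_ms, encode_ms, decode_ms, target_ms):
    """Split a glass-to-glass measurement into codec time and everything else."""
    codec_ms = encode_ms + decode_ms
    other_ms = total_ms - codec_ms            # capture, network, display, ...
    return {
        "codec_ms": codec_ms,
        "other_ms": other_ms,
        # What the encoder and decoder could spend together to hit the target
        # if nothing else in the chain got any faster.
        "codec_budget_ms": target_ms - other_ms,
    }

budget = latency_budget(total_ms=7.0, encode_ms=3.5, decode_ms=2.0, target_ms=4.0)
print(budget)
print(round(frame_period_ms(240), 2))  # one frame at 240 Hz, in ms
```

Under these assumptions the encoder and decoder together would need to drop from about 5.5 ms to about 2.5 ms, which is why the answer points at faster encoders or a different kind of codec. A 240 Hz frame lasts about 4.17 ms, so the 4 ms target is roughly one frame at that refresh rate.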
That's pretty nuts.
I love it, though.
I don't think anyone's ever achieved that, right?
That's fast.
You can achieve that with custom hardware, with SDI, with professional hardware.
But I want that to work over the Internet.
I want it to work with any
robot where you're going to have a small Jetson Nano in it or an N150, right?
I want that because there are going to be millions of robots. Drones are just rolling robots
or flying robots or swimming robots, right?
It's just a machine that you control.
And either you need to teleoperate them or when everything will be fully autonomous,
you need to tele-observe them, right?
You need to check what's happening.
And in my view in the future,
all those remote cars will be
tele-observed by an AI model
which is just going to say,
well, everything is good.
And when it's not good, say,
hey, there is a problem
and then you have an operator, right?
And this is going to be about safety, right?
When you have your humanoid taking care
of your grandma or my grandma,
I want to be sure that everything goes well
and I'm not in one of those horrible scenarios
where the robot is dangerous.
Or when I'm driving,
I want the
car to stop when it should stop.
And if needed, someone takes care of that, right?
And so there are so many cases, scenarios about real-time.
And so the goal of Kyber is real-time control of machines,
to make distance disappear.
It's incredible.
And some of the same technology, some of the same ideas we're talking about is connected
to what you're doing.
And for me, it's amazingly challenging, right?
Because I would say that on video I'm doing okay, but networking, I have
much more to learn, right?
It's about like congestion protocols,
bitrate adaptation in real time.
But it's quite funny.
And so I created this project and we have fundraised in the US, of course.
But it's open source, right?
This is important, right?
Like we've not said that, right?
But everything on Kyber is open source.
But how do you make money?
It's a dual license, commercial and AGPL, right?
You remember what you said about licenses.
Basically, if you want to use Kyber in your product,
you must have your full product open source.
If you want to use this amazing technology,
but not open source,
you pay the commercial license, right?
So the small people or the hobbyist
and the very small guys who want to do that,
they can use the technology,
they build something that is open source and cool.
And if you're a large company,
you're going to have the support,
all the IP, the right to modification and so on.
So, yeah, it's really cool.
and also I'm building robots
and I love that, right?
Like the rover we have is 3D printed.
We are finishing a demo where it's an actual wing, right?
Like a type of drone wings that is also fully 3D printed.
We are trying to do a sailboat that is 3D printed
and we'll work on some humanoid.
Of course, they are not going to be very good robots, right?
It's not our job, but we're here for everyone to make robots.
Cool.
You're talking to the right guy.
I love robots.
There's a bunch of them upstairs.
And teleop is going to be really, really important,
especially as the number of robots scales across the world.
So 100%.
Let's talk about the future of multimedia.
FFmpeg, VLC, but also some of the codecs.
We didn't really mention AV2.
So can we just lay out what is AV2?
What is the hope for it?
What is H.266?
So AV1 is this codec that is done by the Alliance for Open Media, right,
where there is Google, Netflix, Amazon, Apple, VideoLAN,
where we try to make a royalty-free, very good codec, right?
And now it's being deployed.
But actually, the codec was finished in 2018,
but a codec takes years to be used in wide scenarios, right?
So AV2 is the next generation of this codec.
It's 30% better.
right?
So if you keep the same quality,
you get a 30% bandwidth reduction compared to AV1.
What's the connection between dav1d and AV2?
We are going to do dav2d, right?
Which I pronounce "deux-vid", because deux is two in French.
Ah, well done.
And you have to know that dav1d is actually
what we call a recursive acronym, right?
Because it means "dav1d is an AV1 decoder".
Oh, nice.
Nice.
I didn't even think of that.
And people should know that dav1d is spelled with a one.
Yes.
And so dav2d.
It's going to be spelled with a 2.
It's going to be D-A-V-2-D.
Sorry, I don't know how you pronounce that.
And again, we did a demo at CES of VLC running the first demo of AV2.
So can you clarify to me the specification of AV2 and then the encoding and the decoding?
Sure.
So the specification
is the document
that explains how the codec is supposed to work, right?
And that's really AV2.
That is AV2, like H.264.
Then you have an encoder.
The current encoder is called AVM,
and there will probably be other encoders,
probably one called SVT-AV2,
and those are the encoders.
The same way x264 is an encoder for H.264,
the same way that x265 is an encoder
for the H.265 codec.
And the decoder for AV1 is dav1d.
The decoder for AV2 is dav2d.
The decoder for H.264 is ffh264 inside FFmpeg.
The decoder for HEVC is ffhevc inside FFmpeg.
And there is a next-generation codec from the MPEG world after H.264 and H.265:
there is one that is called H.266, also known as VVC.
So HEVC is H.265.
VVC is H.266.
Why is H.266 super sexy, so much better?
So the question we often get is, why are there two names?
Because most of the time, it is joint work between the ISO world
and the ITU, which is the International Telecommunication
Union.
There are these two regulatory bodies.
One is a private entity and one is the United Nations.
Which one is the private?
ISO is private.
In theory, H.264 is MPEG-4 Part 10, H.264/AVC.
And this is the full name.
Nice.
So it's the concatenation of the ISO name and the ITU name.
Yeah.
Even though they work together.
This is politics, historical, you know.
For HEVC, it's MPEG-H, H.265, HEVC.
Got it.
And there is H.266, which is also named VVC.
Is there a high-level thing to say about the improvement?
30% each generation is the best summary.
This is true both for the AV codecs and for H.264, H.265, and H.266.
So the professionals who are listening to us are going to kill us because they'll say, no, it's 35%, it's 25%, it's 50%, whatever.
But globally, you need to know that HEVC is 30% better than H.264, and H.266 is 30% better than H.265,
because there are so many cases and so many scenarios.
For example, there are cases, especially for screen recording,
where the gains are humongous because you have the right tool that is done for that.
And so for a specific video, a new generation is going to give you 70% gain or 80% gain.
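Taking the 30%-per-generation rule of thumb at face value, the savings compound rather than add. The tiny sketch below shows that arithmetic. The function name is made up, and the flat 30% is the speaker's rough summary rather than a measured figure for any particular video.

```python
def relative_bitrate(generations, saving=0.30):
    """Bitrate needed for the same quality after N codec generations,
    as a fraction of the starting bitrate, assuming a flat saving per step."""
    return (1.0 - saving) ** generations

# One generation (for example H.264 -> HEVC) and two (H.264 -> HEVC -> VVC).
for n in (1, 2):
    print(n, round(relative_bitrate(n), 2))
```

So two generations at 30% each bring the bitrate to about half of the original, not to 40% of it. The 70% or 80% gains mentioned for specific content such as screen recordings are well beyond this average.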
But there used to be a ton more codecs,
and now the two main codec families
for transmission are H.264,
H.265, H.266 on one side, and on the other,
AV1 and AV2.
And I guess the major difference would be
the cost of encoding.
Yes, and the royalty of the patents.
And this is the reason why you see
the AV line of codecs: it's because
they try to be as royalty-free,
which means no cost for the patents,
as possible.
because what you need to know, and we've not talked about that so far,
is that multimedia is what we call a patent minefield.
There are two areas where you have the most patents.
It's everything related to 3G, 4G, 5G, RF, and multimedia,
because it's very mathematical and you can get great gains and so on.
So Google and Meta and Netflix wanted something where it was royalty-free.
There are people who say that they have patents outside,
but they are fringe patents, right?
So it's mostly true that it is patent-free.
And extensive patent checking was done
as part of the standardization process for AV1 and AV2,
whereas patents are not even discussed in the MPEG world.
Patents are off-topic completely.
Can you educate me on the patent side?
So usually, MPEG does a format, right?
And then everyone comes around and says,
well, I have all those patents on this format.
And they usually form a pool, which was called MPEG LA, the MPEG licensing association.
And you put all the patents in, and then you ask everyone who's using this format to pay for it.
Wait, can you elaborate?
What does it mean to have a patent on a codec?
Why are there so many patents?
Imagine I'm doing something where I'm going to, instead of doing blocks which are square, I'm going to do rectangles, right?
Oh, so every idea.
Yes.
Somebody patents it.
Yes.
Oh, man.
Yes.
Oh, man.
People and how many lawyers are...
I mean, it pays for a lot of lawyers, right?
The biggest issue is not the following, right?
Because at the time of H.264, the patents were, let's call it, sane.
But there was so much money in that that for HEVC,
there were a ton of things that were pushed inside the specification
which are not useful 99.99%
of the time, just so someone could add a patent on it.
And so for HEVC licensing, there was MPEG LA plus another patent pool
called HEVC Advance, plus I think Nokia was outside of the patent pools.
Yeah, a few of them are outside, and some others.
And so it was impossible to license, right?
And I think that several months ago HP decided that they were going to remove support for
HEVC in their Windows laptops because the
cost of those patents was increasing.
And it arrived at a point where the fees were uncapped.
And so for YouTube or Netflix, we could talk about hundreds of millions of dollars of licensing
for patents per year.
And they said, you know what?
At 100 million per year, you know, I could create my own codec.
And this is what they did.
And so that's why we have the Alliance for Open Media, which we are part
of, which created
AV1 and is creating AV2.
We create also audio codecs.
But yes.
So the main difference would be that.
And because you need to work around the patents or go do some things that are not patented,
a lot of things are different, right?
The basic things that were done 30 years ago are, of course, out of patent.
So, for example, there are things like a golden frame, an S-frame, or different types of...
These are all patented ideas.
No, it's "I can't believe it's not butter."
"I can't believe it's not a B-frame."
It's kind of what it is.
In some ways, it's like a...
Oh, so it's a different variant of a B-frame.
Yeah, that's to try and sidestep.
Things like that.
And so you need to have double creativity, right?
Creativity in terms of being more efficient,
but also creativity in making sure that you don't infringe existing patents.
And so, for example, VVC has all the patents of HEVC plus new ones, right?
while AV2 tries to be as royalty-free as possible.
To what degree do FFmpeg and VLC have to think about this kind of stuff?
We don't.
And one of the reasons why VLC was in France is that France rejects software patents.
So most of those patents are illegal in France.
Because I once made the calculation that if I had to pay all the license fees for VLC,
I would need to pay more than 200 euros
per user, the same in dollars.
But most of those patents are invalid in Europe
because those are called,
it's basically mathematical patents or idea patents,
and they are not valid in Europe.
Let me just at a high level,
just out of curiosity.
So the meme online and the interwebs on X and Twitter and so on,
and my own friends in Europe,
the sense is that Europe is not friendly to
entrepreneurship. They overregulate. There's too much bureaucracy and so on. Is there anything positive
to say? Is there hope for entrepreneurship in the future of Europe? Is Europe over from a tech
perspective? Just look at the two of us, right? It's notable that there's two people from the
European continent on this podcast talking about video. It's fair to say, the community is weighted
heavily. What you probably don't see yet is that there is a new generation of entrepreneurs
in Europe and mostly in France. The UK has done it for a long time because it has a more
Anglo-Saxon outlook on business, but look especially at what happened in France,
and of course sometimes it's a bit overdone with everything called French Tech. But today,
most of the people who come on the market want to create a startup.
15 years ago, it wasn't the case.
Everyone wanted to work at big companies
because when you failed in France, for example,
20 years ago, 15 years ago,
and you destroyed your company,
which is normal for startup, right?
You were not allowed to create a new company, right?
There was a lot of stigma.
This stigma is gone.
There are so many things happening on AI in France and so on.
So there is, sure, over-regulations.
I know that, right?
I mean, that's up for now.
But it has some good things also.
I mean, is there some paralyzing aspects?
You know, if I look at the case of somebody
I've become close with, Pavel Durov,
you know, he was blamed directly
by the French government for the kind of things
his, quote, platform was hosting.
I could see the same kind of stuff,
basically, just as an example,
VLC being blamed for the kind of videos that people are watching.
But they tried, right?
Like, we had issues.
Like, we, like, I mean, that's the pressure that people worry about,
because if you have to think about that kind of stuff,
when you're kind of just obsessed about it.
No, you don't think about it, and that's, that's okay, right?
But what if they come in, what if they show up?
There is no office.
VideoLAN doesn't have an office.
I mean, what happened with Pavel, they arrested him, right?
So they arrested him for particular videos
or particular content that's being shared on the platform.
Sure.
I don't have any platform.
Everything is on the client side.
Yeah, but they can still arrest you.
On what ground?
I'm not sharing anything.
I'm not, the content doesn't go through my stuff.
For sure, but it's still lawyer fees.
That's the problem.
Yes, that's correct.
It's paperwork.
So, like, actually, if you had infinite trillions of dollars,
you would win easily because you're on the right side.
But the thing is,
there is a degree to which they suffocate you with paperwork.
That's the downside of bureaucracy through paperwork, through process.
Yes.
You know, it's the Kafka-esque thing.
You have to realize that one of the good things, for example, in France or most of Europe,
is that answering a court order does not make you bankrupt, right?
It's not like in the U.S. where it can actually bankrupt you, right?
The way the legal system works is that, like, I receive lawyers' letters every week, right?
And I can tell you that the cost of lawyer fees for VideoLAN is less than $10,000 per year, right?
Right. So that's not really scary.
I mean, similar to Pavel, the intelligence agencies tried to, like, say, can you put a backdoor in VLC?
Yes, two of them.
What do you say?
No.
I was a lot less polite.
I see you're basically saying hell no.
Like if we had to compromise our software, we would shut it down.
This is clear.
And what's the definition of compromise?
Allowing a government to do a back door.
There is no code that gets into VLC that we don't control.
And the way we compile VLC, you would call me completely paranoid.
Like we compile on boxes that are offline,
where we start by compiling the compiler.
We do everything offline on places
that have never been connected to the internet.
The way we do signing, there is double signature,
especially because, for example, we've seen,
and we believe it was a governmental agency
that is not from the Western world,
an attempt to push a fake binary onto our own servers,
and that scared us a lot.
And VideoLAN is open source.
How can you kill it?
Like, I move where, right?
I move to Malta.
I move to, I don't know, the Cayman Islands,
and I change the domain name, and I start again, right?
Like, VLC is a tool.
It's a tool that is going to help people doing things.
We are not a platform.
And for patents, well, I'm sorry, but most of the patents,
like, you shouldn't be able to patent math and matrices.
Like, this is wrong.
Does VLC ever censor the kind of videos it can play or not, based on the content of the video?
No, never.
We never do that.
Because VLC is completely offline. It doesn't talk to any server.
So we don't know anything that you're using the software for.
So again, there's no government that can say, you know, like the French government come in and say,
we don't want, I think, anime is destructive to society.
we don't want any anime, not a lot of people.
No, they cannot do that.
And also what they tried is to say,
hey, I want to know if that person
watched that type of video.
And the answer is like, no idea.
So no on that too.
So for surveillance, no.
No, no, because the only infrastructure we have
is a downloading infrastructure.
There is no telemetry in VLC.
It would be difficult because of the international nature.
It would be difficult for you to incorporate that code
because there would be someone in the UK
and someone in Germany
and somewhere in the U.S.
as part of VideoLAN who'd be able to see that.
It would be extremely difficult.
The only thing that we can do, which happened is like we had the issue,
we had the case with some police in the US who said,
we have a murder case, right?
And the file is corrupted or doesn't play in that version of VLC.
Could you help us, right?
We never have access to the video.
It's like a normal support, right?
Oh, it's really about playing the file.
Yes.
And like, I remember in the middle of the Afghan war, right?
I received an email from someone in the army, right?
I don't remember the rank, right?
It was just like, we have a big issue with the latest version of VLC
because it doesn't play correctly the file on an RTSP server that we have
where there is all the movies.
And he says, VLC is very important for the morale of the troops on the ground, right?
Because at night, I think it might be boring, right?
So they have a collection of videos to watch or movies over there, right?
And of course, I did an update and I broke some support of RTSP, right?
So I gave them another version just for them, right, because it was important.
And because VLC is completely open source, I think it is allowed on the U.S. Army laptops, right?
Because I guess someone in the U.S. military actually looked at it and said, well, okay, this is okay, right?
And the way we document how we process, that was okay, right?
So the only way we work with authorities is to give them support.
That's amazing.
That's a really amazing story.
Yeah.
We don't see anything happening on how people use VLC, and this is strong.
Do you feel the stress of this?
So first of all, millions of people using it.
Second of all, the military using it, maybe sometimes pressure from governments.
Does that, that's a small team, right?
Yeah.
How big is VLC, like the core contributors?
How many?
Six, eight?
And everything legally is only me.
Everything that is legal is only me.
You're not stressed about this.
I used to stress about that a lot.
Yeah.
But the thing is, we're doing what we can for everyone, for the greater good.
We work so that we make some extremely complex technology easy for everyone.
We are a tool and every tool is going to be used for great things and for bad things, right?
You cannot blame a tool, I think.
And this is like very important for us.
I used to be in a lot of stress.
I'm not anymore, right?
What's the secret to your Zen?
I mean, over and over, in the chats I've had with you in the conversation today
about every even tense topic, you're very Zen.
What's the source of Zen?
I have a way of thinking about what is the worst case scenario, always, right?
And the answer is, at the end, if I take like a chess play,
right? In the end, am I dead? Yes or no? Right? And I do that nonstop, right? And that's also how I do
my startups, right? Is that I'm here to get something great. What is the worst case? It goes bankrupt.
That's life. A company lives, a company dies. That's okay, right? And so my moral way is always like,
am I dying in the end? Am I hurting someone? If the answer is no, then too bad, right? Like,
oh, some lawyers are going to be unhappy.
What are they going to do?
Take all the money of VideoLAN?
Wow, they're going to get 50 grand.
Amazing, right?
What are they going to do?
It's that the source code is out there.
It's not stoppable.
Also, because what we do is good,
and it's done for everyone.
That's beautiful.
Kieran, you said that there's an active archiving
preservation community.
I think that's super fascinating.
You wrote that they're stretched on budget,
but they see the extreme importance of FFmpeg
as a Rosetta Stone,
so that multimedia can be played a thousand years from now.
I mean, that's a beautiful way to see FFmpeg and VLC
as a tool for preserving visual knowledge.
Yes, so one of the coolest communities in open-source multimedia,
mainly led by someone called Dave Rice,
I'll give him a shout-out, I think from City University of New York,
is the archiving community.
They've done so much stuff. They value open source, one, because yes, they lack budgets, but two, they see the fact that archiving video is important for the world.
But being able to play that is a big problem.
Famously in the UK, there was something called the new Domesday Book, and they archived lots of stuff on BBC microcomputers.
Within 10 to 15 years, no one had the right software to play that.
I think it was 20 years or something like that, and someone had to go and reverse engineer this.
And that was like 20 years.
Imagine that in a thousand years.
I think one of the great things about FFmpeg is it's written in C.
C is the closest to mathematics you're probably going to get.
The closest to logic is...
Do you think in 1,000 years we'll still have C compilers?
Yes, we have languages that exist that haven't changed too much.
We have mathematical notation that exists.
It will be like Latin.
C will be like Latin.
It will be a thing that you learn from the past,
but it will still be usable in certain contexts.
So the archiving community
are really great, practically.
They, again, have limited funds.
They funded the development of the FFV1 codec.
So that's a lossless codec.
So the archiving community is really scared about the act of compression losing things.
And they have a fair point in this, you know.
If they compress too hard, it could change the view of the material.
There could be something slightly different here and there.
So they're really concerned about things need to be not just compressed well, but lossless and be fast.
And so they worked with FFmpeg
to develop a whole new codec design for fast software-based encoding.
They're really concerned about resilience.
So if they're storing on tapes or other hard disks, I lose some bits.
I need to recover quickly.
I can't lose a whole GOP because I've lost a bit, something like that.
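The resilience idea described here (damage to one bit should cost only a small part of a frame, not a whole group of pictures) can be illustrated with a minimal Python sketch. It is not FFV1's actual bitstream or FFmpeg's code. It assumes a frame is just a byte string, cuts it into independent slices, and stores a CRC-32 with each slice, so that a flipped bit can be pinned to a single slice. The function names and the slice count are hypothetical, chosen only for illustration.

```python
import zlib


def split_into_slices(frame: bytes, num_slices: int) -> list[bytes]:
    """Cut a frame's bytes into roughly equal, independently checkable slices."""
    size = max(1, -(-len(frame) // num_slices))  # ceiling division, at least 1
    return [frame[i:i + size] for i in range(0, len(frame), size)]


def protect(frame: bytes, num_slices: int) -> list[tuple[bytes, int]]:
    """Store each slice together with its CRC-32 checksum."""
    return [(s, zlib.crc32(s)) for s in split_into_slices(frame, num_slices)]


def damaged_slices(stored: list[tuple[bytes, int]]) -> list[int]:
    """Return the indices of slices whose checksum no longer matches."""
    return [i for i, (data, crc) in enumerate(stored) if zlib.crc32(data) != crc]


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate storage damage by flipping a single bit."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# 4096 bytes standing in for one frame (hypothetical test data).
frame = bytes(range(256)) * 16
stored = protect(frame, num_slices=8)  # 8 slices of 512 bytes each

# Corrupt one bit inside slice 3 only, leaving its stored checksum untouched.
data, crc = stored[3]
stored[3] = (flip_bit(data, 100), crc)

# Only the damaged slice is reported; the other seven are still trustworthy.
print(damaged_slices(stored))
```

With a single checksum over the whole frame, the same bit flip would mark the entire frame as suspect. Per-slice checksums trade a little storage overhead for being able to keep the undamaged slices.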
So they're a really great bunch of people.
They funded GPU encoding in FFmpeg to make FFV1 encode faster.
And it's really about preserving the world's multimedia heritage in a way
that's usable.
And there's a lot of great teams
and a lot of archival groups
across the world
who've chosen FFmpeg and FFV1
as their archiving solution.
And they can really provide us also
super specialist advice.
They can explain,
ah, in the 1950s,
colorimetry was done like this
on this certain type of tape.
And so there is this special case
that you need to handle
and you'll never get this anywhere else.
They know things about
video that we don't.
Yes.
Like every time I talk to Dave, Dave Rice,
or the people from the British Film Institute, it's just like every time I just learn something
new.
And I've been doing video for 20 years.
They have, especially on colorimetry and colors.
Storage, these other things.
I mean, they have a deep, deep appreciation of the content itself or the video itself.
And like, especially when you're thinking of lossless, they're terrified of losing something
essential about the thing
and in so doing, they're deeply
understanding the thing that is to be preserved
which you sometimes might not be thinking
about when you're obsessing about
the actual technology of the encoding and so on.
And when you enter the rabbit
hole of film scanners,
right, so you take those
things to make them digital,
right? It's like a huge
topic that would take
another five hours of podcast
just on that topic. And there's a lot
of film that needs to be archived, film is
degrading. It's maybe not stored in the right environment. The other thing is they can,
what they also do is because it's open source, they give away their workflows to
countries who can't afford to have archiving institutions where archiving is done by volunteers. It's
done by other things. They go and teach, you know, in India, they teach children to do, to do
FFmpeg commands. They're really great. They're really, they're really the model community,
the model ethos of what we're trying to achieve. They are such a great bunch of people. So
interested in participating and being part of something much bigger because they realize the work
they're doing in a thousand years is going to tell a lot. In a thousand years, we may be drowning in
AI slop. This stuff needs to be important and, you know, archived: well, what was life like?
Yeah, it feels like capturing the 20th century and the 21st century is essential because it feels like
a transition point, where we went from scarcity of data to slop,
oceans of slop.
And that transition point is good to archive.
People don't realize, we are losing today a ton of films.
There is a ton of things from the 30s, from the 40s and the 50s that were, there is no value.
And tape, 70s and 80s, there's tape.
And there's not enough tape heads in the world.
To read all the tapes.
So they have to decide what they want to archive and throw away the rest of the tapes.
There's this huge moral hazard, I guess, for want of a better phrase, around this
topic because this is a digital record of human history and they have to make decisions.
And there's digital stewardship, I suppose, for want of a... I've made that phrase,
I've made that phrase up, that's not a real phrase, to make sure the world can have this information
in something that's playable by everybody, not playable on some device that doesn't exist anymore.
And then there's, like, realistically speaking, there's a needle in a haystack where there's a lot of
value in archiving all that footage and then over time finding the gems that we don't know
are there.
Hey, there was something in that corner that we just didn't.
Yeah.
And now that would have been compressed away because some little thing.
Oh, wow.
There's something there.
And they've made sure that it's lossless.
They can prove mathematically that it's lossless.
They can run different tradeoffs for if there's bit rot, they lose a bit, a single bit flips.
I can make sure that I only lose a portion of a given frame.
We can do error recovery on previous frames.
They can do all sorts of different things.
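The claim that an archive copy is lossless is commonly checked by decoding it and comparing the result with the original, frame by frame. The Python sketch below illustrates that round-trip check under loud assumptions: `zlib` stands in for a lossless video codec, each "frame" is a plain byte string, and the function names are hypothetical. It is not FFV1 or any FFmpeg API. Matching digests are strong practical evidence of a bit-exact round trip, and the frames could equally be compared byte for byte.

```python
import hashlib
import zlib


def frame_digests(frames: list[bytes]) -> list[str]:
    """One SHA-256 digest per frame, so a mismatch points at a specific frame."""
    return [hashlib.sha256(f).hexdigest() for f in frames]


def encode(frames: list[bytes]) -> list[bytes]:
    """Stand-in lossless encoder (zlib, not a real video codec)."""
    return [zlib.compress(f, 9) for f in frames]


def decode(packets: list[bytes]) -> list[bytes]:
    """Stand-in decoder matching encode()."""
    return [zlib.decompress(p) for p in packets]


# Three fake frames: mostly flat content plus a small detail that must survive.
original = [bytes([i]) * 1024 + b"detail in the corner" for i in range(3)]

# Lossless round trip: every decoded frame hashes identically to its source.
decoded = decode(encode(original))
is_lossless = frame_digests(original) == frame_digests(decoded)
print(is_lossless)

# A lossy stand-in (dropping the low 4 bits of every byte) fails the same check.
lossy = [bytes(b & 0xF0 for b in f) for f in original]
print(frame_digests(lossy) == frame_digests(original))
```

Keeping one digest per frame rather than one for the whole file means a later integrity check can say which frame changed, which fits the concern voiced above about losing only a portion of the material.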
Do you think VLC and FFmpeg will be here 100 years from now?
FFmpeg, yes.
Yep. VLC, maybe.
What's the future of, where is FFmpeg going?
Where is VLC going?
Like in the next, if you think about like five years, ten years, 20 years.
Five years, ten years is easy.
The question is after that, right?
The question is, do we arrive
at something called holograms, right?
Yeah, so will VLC and FFmpeg expand to whatever multimedia?
Yes.
So multimedia might become, sorry for the pothead expansion of topic,
but if you look at something like Neuralink with Brain Computer Interfaces,
it's very possible that we start to consume what multimedia means
is whatever codec, whatever data that our,
our brain wants to consume through the brain computer interfaces.
That's one.
And virtual reality, of course.
You will have VLC for Neuralink.
And you'll have ffmpeg -i, input format: human brain.
There's going to be codecs for the brain.
Sure, 100%.
Of course.
To compress neural information.
Today there is like, there are new codecs for, for example,
what we call point cloud, right, or volumetric videos, right?
There is a ton of research on what we call RGBD, right?
The codecs for depth that are useful for robotics and for 3D things.
There are tons of codecs for compression of 3D elements.
Compression for astronomy.
For example, on VLC, we also have already a VR and XR version of VLC.
And also on Kyber, right?
We talked about Kyber.
On Kyber, we also do streaming of XR content for the glasses that do not have enough power
or inside the Apple Vision or the Quest.
So we already work on streaming 3D XR,
interactive, low latency.
There is something called volumetric video,
point cloud videos.
So it's not stopping.
And yes, at some point,
we'll manage 3D data inside VLC and FFmpeg, right?
It's obvious.
So that's where it is moving.
Like the community is open.
Not everyone in the community sees that.
But like, as Kieran and I,
we are entrepreneurs,
we know where it's going.
We see that, right?
So I suppose that there is a tension,
probably inside of FFmpeg, it's like,
hey listen
folks
we're really good
at doing video and audio
so like
why expand
like let's do the thing
we're really good at doing
in order to answer that question
we need to answer
the definition
of what is multimedia
and multimedia
is a digital
representation
of several streams
for the human senses
and we will
do that, right? So imagine there is now a way to not have a mic, but have an odor sensor and a diffuser of odors.
It will get into FFmpeg. So your demuxer is, yes. Yes. Of course. Your demuxer has a new track type that is basically odors, right? And you already have...
Smell, touch. It's like audio. You'll have a left and right nose track. You have a left and right audio pair. It's easy. Yes. Of course. Stereo smell. So in VLC, for example, we already have a plugin for haptics.
It's mostly for what we call 4D cinemas, right?
You know, those ones on hydraulic, I don't know how you say that.
Hydraulic arms.
And where everything is moving.
Like you have in theme parks, right?
And there is a data feed synchronized, which is basically transporting this information.
Is there yet a standard?
There are many standards, right?
You make me so happy.
And so, of course, like, we have a plugin, which is not in the normal version of VLC,
that is basically transporting those types of movements,
which is physical movements, which is haptic movements, right?
It is a human sense, so it will get in.
That's such an exciting future.
Was it, I mean, it's a small community of developers.
How do you pull that off?
Like, if you're a contributor at FFmpeg or VLC, it feels stressful.
Like, just looking at Twitter, it's like, it's a huge amount of work
to make it work on all these different operating systems,
an incredible effort.
No, see it in the other direction.
We are not the contributors.
We are the maintainers, right?
So we maintain for everyone,
meaning that, for example, every year,
there is around 150 people who contribute to VLC
and maybe 300 on FFmpeg, right?
Our goal as the small team is to get all the contributions in.
So if there is more usage, there will be more contributions, and those people will write the modules, the new formats and so on.
We care about the architecture of VLC, the architecture of FFmpeg, right?
Now we're doing things in VLC, which is spatial audio, right?
We did the demo not long ago.
There were changes needed on the architecture, and we did the first spatial audio module.
When it's going to add the second one, it's going to be easy, or the third one is going to be easy, right?
That's the goal, and it's going to be the same for
odors or haptics, right?
We need to work the architecture
so that modules can be added
to add future capabilities.
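The architectural point (a stable core, with new track types handled by modules added later) can be sketched in a few lines of Python. This is an illustration only, not VLC's or FFmpeg's real module API. The `Packet` fields, the `register` decorator, and the track-type names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Packet:
    track_type: str  # e.g. "audio", "video", "haptic" (hypothetical names)
    pts_ms: int      # presentation timestamp in milliseconds
    payload: bytes


# Registry mapping a track type to the module (handler) that renders it.
MODULES: dict[str, Callable[[Packet], str]] = {}


def register(track_type: str):
    """Decorator that plugs a handler into the registry for one track type."""
    def decorator(handler: Callable[[Packet], str]):
        MODULES[track_type] = handler
        return handler
    return decorator


@register("audio")
def play_audio(p: Packet) -> str:
    return f"{p.pts_ms} ms: audio, {len(p.payload)} bytes"


@register("video")
def show_video(p: Packet) -> str:
    return f"{p.pts_ms} ms: video, {len(p.payload)} bytes"


def render(packets: list[Packet]) -> list[str]:
    """Core loop: order packets by timestamp and dispatch each to its module."""
    out = []
    for p in sorted(packets, key=lambda pkt: pkt.pts_ms):
        handler = MODULES.get(p.track_type)
        if handler is None:
            out.append(f"{p.pts_ms} ms: no module for {p.track_type!r}, skipped")
        else:
            out.append(handler(p))
    return out


# Supporting a new sense later means adding a module; render() is unchanged.
@register("haptic")
def move_seat(p: Packet) -> str:
    return f"{p.pts_ms} ms: haptic, {len(p.payload)} bytes"


stream = [
    Packet("video", 0, b"\x00" * 1200),
    Packet("audio", 0, b"\x00" * 400),
    Packet("haptic", 40, b"\x01\x02"),
    Packet("odor", 80, b"\x07"),  # no module registered yet
]
for line in render(stream):
    print(line)
```

The design choice being illustrated is that the core only knows about timed packets and a lookup table. A track type with no registered module is skipped rather than breaking playback, and registering one later requires no change to the core loop.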
So yes, we are
going, we are multimedia frameworks, so that's not
just audio and video. It's everything
that is timed
and
represents something that
you can sense. And if it's brain waves,
it's going to be brainwaves. I think that's
inevitable, sorry. I love this
on so many fronts, because
So FFmpeg and VLC are pushing companies and pushing the world to standardize.
So for example, to standardize brainwaves.
So standardizing it would push, like I hope Neuralink comes up with a standard for multimedia via brain computer interfaces.
Or for robots with haptic.
By experience, what happens is always the same, right?
You start, it's a new topic, there is like five
different standards because everyone starts to do this.
The hype goes down because every time the hype goes down,
then people start to say, well, you know what, we need to do a standard.
People, because two or three companies, usually not the leader,
but the two or three followers do a standard.
And then we implement the standard.
And then it's the end of the curve.
It starts to be more proper.
And then the leader's kind of pressured into it because it's better to do a standard.
Example, 3D audio, right?
Yeah.
Six or seven years ago, it was everything about
3D, you had the cardboard on Android, you had two audio formats.
They're all dead, right?
And now it's coming back with actual use cases and we learn from the mistakes of the past
standard.
So it will be the same everywhere.
And not try to avoid closed.
I saw somewhere you didn't have too many nice things to say about Dolby.
No, I don't.
Can you educate me on why, where they went?
What did they do bad?
That made you mad.
It used to be an amazing company doing tons of great things with amazing engineers.
They defined what sound was.
And now it's mostly lawyers and licensing things.
Oh, so they're, yeah, they're closing stuff off.
They're trying to get money and licensing.
They don't innovate as much as they did and so on.
It's a bit like, I'm sorry to say, right, like HP, right?
no truth
Oh, since we talked about
Twitter a bunch in a bunch of different
contexts, do you have a favorite,
you have a least favorite,
most embarrassing tweet
on either the VideoLAN
or FFmpeg Twitters?
My two favorites
are "Talk is cheap, send patches."
I think that embodies a lot of the,
as we've talked about,
stuff doesn't get built unless someone does it.
It doesn't just appear from the ether.
The other
one that I like is "FFmpeg:
nothing is beyond our reach."
I think that comes from a U.S. military
satellite patch where I think they invented
some kind of
monitoring system. They could see the whole world
and this was released. Wasn't there something where
FFmpeg was running on a rover on Mars?
Yeah, so FFmpeg is used by the Mars
rover, the Mars 2020
rover, to compress
pictures. They really wanted,
they wrote a paper about it, and they
really wanted to use as much commercial off-the-shelf
technology as possible. FFmpeg runs on Mars.
So we are a multi-planetary open source library.
Nice.
Very often we've seen tweets from people using VLC in weird places.
A lot of the people doing Formula One in all the paddocks.
They use VLC to play the live feed.
We've seen the European Space Agency.
We've seen SpaceX, like, monitoring the launches with VLC.
And it's like fills you with joy, right?
I've seen a particle accelerator.
Oh, yeah, yeah.
One of the most amazing things that I went for was to go to CERN, at the LHC,
because they were using VLC to monitor all the sensors on the ring,
because the ring is 27 kilometers.
And so they had some analog cameras,
and they were using some capture cards to go from analog to VLC.
So VLC could stream on their multicast network for the whole of CERN to have
access to that. And like I visited that in 2010 with Laurent and, like, we fixed their issue in
an hour or something like that, right? Because there was some parameters maybe not well documented
at that time. And he said, okay, for the whole day, what do you want to do? And we visited everything.
Like there were things with antimatter and colliders, and that was like one of the most amazing days for my
physics background.
Yes, it's used like everywhere.
Any tweets,
Kieran, you regret?
No tweets you regret?
Or is it like that?
How does the French song go?
Regret nothing?
Yes, that's very important for me, right?
Don't regret anything.
No, it's because regrets are
a tax on your mind, right?
So learn from your mistakes, but don't regret.
Because you've done it.
So except if you
have a time machine to go back in time, don't regret.
It's going to just tax your brain.
Learn from your mistake.
Sure, don't regret.
It's like, it reminds me, it's beautiful.
It's a tax on your brain.
It reminds me of the Johnny Depp quote I saw where he was saying,
hate, you know, I don't hate.
Hate is a very expensive emotion.
Are you comparing me to Johnny Depp?
Because that would be your first one.
Well, gentlemen, like I said, I'm eternally grateful
for the software that the two of you and the bigger community have been part of building with FFmpeg and VLC and everything else.
I'm eternally grateful for the spicy tweets, never stop, and I'm grateful that you would talk with me today and give me this sexy hat.
I feel like a wizard.
I feel special and I feel special to get a chance to talk and celebrate the piece of software that brought me so much joy over the years.
So thank you for everything and thank you for talking today.
Thank you for having us.
Thank you so much.
Thanks for listening to this conversation with Jean-Baptiste Kempf and Kieran Kunhya.
To support this podcast, please check out our sponsors in the description where you can also find links to contact me, ask questions, give feedback, and so on.
And now let me leave you with some words from the legendary Linus Torvalds.
Most good programmers do programming, not because they expect to get paid or get adulation
by the public, but because it is fun to program.
Thank you for listening, and I hope to see you next time.
