The Pragmatic Engineer - Live streaming at world-record scale with Ashutosh Agrawal
Episode Date: February 12, 2025
Supported by Our Partners
• WorkOS: The modern identity platform for B2B SaaS
• CodeRabbit: Cut code review time and bugs in half
• Augment Code: AI coding assistant that pro engineering t...eams love

How do you architect a live streaming system to deal with more load than it’s ever been done before? Today, we hear from an architect of such a system: Ashutosh Agrawal, formerly Chief Architect of JioCinema (and currently Staff Software Engineer at Google DeepMind.)

We take a deep dive into video streaming architecture, tackling the complexities of live streaming at scale (at tens of millions of parallel streams) and the challenges engineers face in delivering seamless experiences. We talk about the following topics:
• How large-scale live streaming architectures are designed
• Tradeoffs in optimizing performance
• Early warning signs of streaming failures and how to detect them
• Why capacity planning for streaming is SO difficult
• The technical hurdles of streaming in APAC regions
• Why Ashutosh hates APMs (Application Performance Management systems)
• Ashutosh’s advice for those looking to improve their systems design expertise
• And much more!

Timestamps
(00:00) Intro
(01:28) The world record-breaking live stream and how support works with live events
(05:57) An overview of streaming architecture
(21:48) The differences between internet streaming and traditional television
(22:26) How adaptive bitrate streaming works
(25:30) How throttling works on the mobile tower side
(27:46) Leading indicators of streaming problems and the data visualization needed
(31:03) How metrics are set
(33:38) Best practices for capacity planning
(35:50) Which resources are planned for in capacity planning
(37:10) How streaming services plan for future live events with vendors
(41:01) APAC specific challenges
(44:48) Horizontal scaling vs. vertical scaling
(46:10) Why auto-scaling doesn’t work
(47:30) Concurrency: the golden metric to scale against
(48:17) User journeys that cause problems
(49:59) Recommendations for learning more about video streaming
(51:11) How Ashutosh learned on the job
(55:21) Advice for engineers who would like to get better at systems
(1:00:10) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• Software architect archetypes https://newsletter.pragmaticengineer.com/p/software-architect-archetypes
• Engineering leadership skill set overlaps https://newsletter.pragmaticengineer.com/p/engineering-leadership-skillset-overlaps
• Software architecture with Grady Booch https://newsletter.pragmaticengineer.com/p/software-architecture-with-grady-booch

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Transcript
We used to run something called Game Day.
Now, this is basically a simulation of how an actual match is going to be.
We used to generate not just traffic.
We are simulating the entire operating protocol.
We'll say that, okay, the match is going to start at 7 p.m., let's say.
Okay.
Now, every system is supposed to scale beforehand.
There's a timeline to it.
There is a checklist to it and so on.
No matter whether the teams are ready or not,
we will start the live stream and it'll start sending traffic.
Wow, really? You did that?
They don't know what traffic is going to come,
what kind of pattern is going to come, or anything.
Because that's the exact setup in production.
You don't know what is coming your way.
Ashutosh Agrawal was a software architect and principal engineer at the largest streaming service
in India, JioCinema, now known as Disney+ Hotstar.
In the middle of 2023, the system he architected set a world record at the time with 32 million
parallel-connected live streams during the finale of the Indian Premier League cricket tournament.
But how did they do it?
In today's episode, we cover how live-streaming works behind the scenes and how large-scale
live-streaming systems are architected.
The trade-offs in system design and how server-side load, stream latency, and stream smoothness are all connected.
The importance of capacity planning and drills, including what the Game Day drill meant at JioCinema, and many more details.
If you're interested in large-scale systems or live streaming, this episode is for you.
If you enjoy the show, please subscribe to the podcast on any podcast platform and on YouTube.
Welcome to the podcast.
Thank you. Thank you, Gergely.
Can you help us imagine what it was like going back to the day this world record was set?
It was 32 million, roughly that many concurrent connections at the time. What was it like?
You know, on the day, you knew this big match was coming up. There was a system that you did architect,
you know, preparing it for scale. But then you're getting a larger than ever spike.
Were you in the office? Was it at home? Was it stressful? Was it chill?
Sure. So most of the world records
which have been set in the past are generally set on the finale of the event, right?
And by finale, you're probably settled into the event.
These events are not one-off or one-day events,
like how it happens on other platforms.
Indian Premier League is like a 70-day event.
So it's a 70-day continuous madness, right?
And every day there are more than 15 to 20 million viewers watching it,
depending on the game, depending on the players who are playing in it.
So it's not like that. To be very honest, the finale is relatively calmer than the opening week.
The opening week is where most of the madness is.
By finale, you understand the traffic pattern better.
You understand what your requirements are.
Your protocols are set in much more depth.
So in the initial part of the game, it is all chill,
as long as you're operating under the numbers which you've planned for.
The moment you start to see it getting into the yellow zone is when things become interesting,
because then you have to get away from the automated mode of running the platform, like a self-driving car,
to a manual mode,
which requires you to be looking at all the metrics,
looking at every issue which is coming in,
ensuring that you're triaging it
in a very short period of time.
Otherwise, what will happen is that they will become noise.
So you also have to ensure that the board of ongoing issues is clean.
And in terms of setup,
we typically prefer to be in the office, in the same location,
because if there is an incident,
it is much easier to get into a room with all the key people
and be able to take calls and make decisions.
We have done this setup remotely as well.
We had a strict protocol on how we operate remotely
to ensure that there's not much madness.
It's very easy for the Zoom call to go crazy.
The Zoom call on which all of us are connected has over 200 people,
with multiple partners across the globe present over there to support the event.
It's not just our team or my team.
It's multiple partners
who are present on the call, who are helping us scale through the event.
This episode is brought to you by WorkOS.
If you're building a SaaS app,
at some point your customers will start asking for enterprise features
like SAML authentication, SCIM provisioning, and fine-grained authorization.
That's where WorkOS comes in,
making it fast and painless to add enterprise features to your app.
Their APIs are easy to understand,
and you can ship quickly and get back to building other features.
WorkOS also provides a free user management solution called AuthKit
for up to one million monthly active users.
It's a drop-in replacement for Auth0 and comes standard with useful features like domain verification,
role-based access control, bot protection, and MFA.
It's powered by Radix components, which means zero compromises in design.
You get limitless customizations as well as modular templates designed for quick integrations.
Today, hundreds of fast-growing startups are powered by WorkOS, including ones you probably know,
like Cursor, Vercel, and Perplexity.
Check it out at WorkOS.com to learn more.
That is WorkOS.com.
This episode is brought to you by CodeRabbit, the AI code review platform transforming how engineering
teams ship faster without sacrificing code quality.
Code reviews are critical, but time-consuming.
CodeRabbit acts as your AI copilot, providing instant code review comments and potential impacts
of every pull request.
Beyond just flagging issues, CodeRabbit provides one-click fix suggestions and lets you define
custom code quality rules using AST Grep patterns, catching subtle issues that traditional
static analysis tools might miss.
CodeRabbit has so far reviewed more than 5 million pull requests,
is installed on 1 million repositories, and is used by 50,000 open source projects.
Try CodeRabbit free for one month at coderabbit.ai using the code pragmatic.
That is coderabbit.ai, and use the code pragmatic.
Let's go into the specific details.
We're talking about live streaming.
Can you explain to us how video streaming works behind the scenes at an architectural level?
Sure. So let me share my screen to talk about how it works. Just give me a sec.
So there's something called a source feed. Okay. This source feed is coming from the venue.
So from the venue, actually, the source feed comes afterwards.
From the venue, we have feeds like cameras.
So the venue is, in this case, the stadium
where the match is played, right?
Yes.
And then you've got a bunch of cameras there.
Yes.
So all the cameras are actually connected over fiber.
All of that comes to the PCR.
The PCR is a production control room where all the feeds from the stadium are coming.
And then there is a director.
So it's like a movie production where there is a director, there's a producer,
there are people who are operating the cameras.
The director is telling the person on the ground which camera to use,
what angle to show and which feed and so on.
So all of that is happening over there.
Lots of people, lots of yelling, lots of, you know, screens.
That's a place where we are not allowed to go in if it's on air.
Nobody is allowed to go in over there, to be very honest.
From the PCR, we get something called a source feed.
This is basically what the actual production feed is.
Now, this source feed is something which comes to... so I'm talking about the video part of it.
Then I'll come to the server side or the backend part of it.
The source feed is then fed into
something called a contribution encoder.
Okay.
This contribution encoder's role is to convert the source feed:
because the source feed is very high quality,
you need to compress it a little bit so that you can stream it to the cloud.
And then from there, the source feed goes into the cloud,
which is a cloud ecosystem, like AWS or GCP or whatever,
wherever the encoding is happening.
This is, again, a peer-to-peer private link.
So it's nothing over the internet, by the way.
So this is like your master video feed, high quality,
but already a little bit encoded or compressed to manage it.
Yeah, so the source feeds can be in the hundreds of Mbps, or 150 Mbps, right?
The contribution encoder will bring it to a standard profile of like 40 Mbps,
depending on whether the event is being streamed in 4K, 1080p, or something like that.
From there, it goes into the cloud.
And within the cloud, there is your distribution encoder.
Okay.
And this is the system which takes care of encoding it for HLS, DASH, or whatever formats
the user is going to consume.
And this distribution...
So with the HLS and other formats, these are like the formats that different players
will be able to support, right?
Yes, yes.
So the output is like an HLS stream or a DASH stream.
Different combinations of it.
You have a different type of stream for mobile.
You have a different type of stream for TV.
Because, again, the form factors are very different.
The network on which they are going to operate is very different.
So all those combinations, we output. In the past, we have output more than 100 variants of streams to the CDN.
And all of them, yeah.
Just so I understand.
So you are transforming it into multiple.
You create all of these different ones, right?
As you said, up to 100 of them.
Yeah.
So actually, if you think about it, there are multiple source
feeds for different languages.
So we used to stream in about 13-plus languages.
So there would be one source feed coming for each of the languages.
And with platform combinations and all of those, the output feeds get into a range of like
500-plus kind of a number.
This whole thing is becoming a lot more complex than you would have thought.
Yeah.
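To get a feel for how the number of output streams multiplies, here is a minimal sketch. Only the figure of about 13 languages comes from the conversation; the platform names, packaging formats, and bitrate ladders are made-up placeholders, so the resulting count is illustrative and is not JioCinema's actual matrix.

```python
from itertools import product

# Only the "about 13 languages" figure is from the conversation.
LANGUAGES = [f"lang_{i:02d}" for i in range(1, 14)]
# Everything below is a hypothetical placeholder, not the real matrix.
PLATFORMS = ["android", "ios", "web", "smart_tv"]
FORMATS = ["hls", "dash"]
LADDERS = ["mobile_ladder", "tv_ladder", "low_bandwidth_ladder"]


def enumerate_variants(languages, platforms, formats, ladders):
    """Return one entry per output stream the distribution encoders must produce."""
    return [
        {"language": lang, "platform": plat, "format": fmt, "ladder": lad}
        for lang, plat, fmt, lad in product(languages, platforms, formats, ladders)
    ]


variants = enumerate_variants(LANGUAGES, PLATFORMS, FORMATS, LADDERS)
print(len(variants))  # 13 * 4 * 2 * 3 = 312
```

Adding one more dimension, or one more value on any axis, multiplies the total, which is how a handful of source feeds can turn into hundreds of outputs.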
And then there is an orchestrator on top of it.
This is where our engineering comes in, right?
Mostly, whatever I've talked about so far is probably using some partner technology and stuff like that.
Then there is an orchestrator which is basically controlling all of these systems.
Not at the PCR, actually, not at the venue and so on.
It starts somewhere from contribution encoding to your cloud infrastructure.
What endpoint to push to,
what should be the config of the distribution encoder, and what CDN endpoints to use:
all of this is managed by the orchestrator.
So this is our engineering product which is orchestrating, because on a single day,
we are not just hosting IPL.
Like, we used to not just host IPL.
We used to host other events as well.
There's a lot of football games going on.
There's a lot of other esports events which are going on and so on.
So while our focus was on the Indian Premier League,
we also used to have 50 other events to take care of.
So this orchestrator's role is to ensure that all of those workflows are set up properly.
And from here, what happens is this orchestrator generates a playback URL.
So the orchestrator takes care of knowing what the final playback URLs are going to be,
which are your playback endpoints. Okay.
And then once the feed and everything is started,
it will send it to something called a content management system.
This is what our users are interacting with.
So the orchestrator will push all the generated playback URLs over here.
And the playback URL, so is this like the endpoint of a node or a machine that has the right format, you know, like the...
No, it's not an endpoint of a machine. It's an endpoint of the CDN where the video would be available.
It contains, it is...
So for every TV, mobile combination of Apple, Android, and whatever variants exist, for each of them, we generate a playback URL.
So the playback URL follows a certain spec, right?
And along with that spec, we push it to the content management system.
Your client apps are generally interacting with your content management system for rendering the browsing experience.
Whenever you open the app, you're seeing the list of content and everything.
So this is how it does that.
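The hand-off from orchestrator to content management system could look roughly like the sketch below. The URL layout, the `cdn.example.com` host, and every class, function, and field name are assumptions made for illustration; the conversation does not describe the real playback URL spec.

```python
from dataclasses import dataclass, field


@dataclass
class ContentManagementSystem:
    """Toy stand-in for the CMS that client apps browse (hypothetical)."""
    catalog: dict = field(default_factory=dict)

    def publish(self, event_id: str, playback_urls: dict) -> None:
        self.catalog[event_id] = playback_urls


def build_playback_urls(event_id, languages, platforms, cdn_host="cdn.example.com"):
    """Generate one CDN playback URL per (language, platform) combination.

    The path scheme is invented; a real spec would carry more fields.
    """
    return {
        (lang, plat): f"https://{cdn_host}/live/{event_id}/{lang}/{plat}/master.m3u8"
        for lang in languages
        for plat in platforms
    }


cms = ContentManagementSystem()
urls = build_playback_urls("match-42", ["hindi", "english"], ["android", "ios", "tv"])
cms.publish("match-42", urls)
print(cms.catalog["match-42"][("hindi", "tv")])
# https://cdn.example.com/live/match-42/hindi/tv/master.m3u8
```

The point of the sketch is only the direction of data flow: URLs point at the CDN, not at a machine, and the CMS is the place clients read them from.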
But that's not all, right?
This is just for a browsing experience.
And typically, to scale, you would put a layer of good enough caching in between to ensure that
things are scalable in nature, right? That's one of the secrets of making things work.
Now, this is about the browsing experience.
So far, what we have spoken about is that you can open the app, you can see the content,
you can click on the content, and you can see its details.
The moment you hit the play button, you call a much more complex system, which is generally a playback system.
I cannot talk about the exact internals of it, but this system is responsible for ensuring
the user is authorized to watch the video.
It will talk to a user system.
It will talk to encryption and DRM systems.
It will talk to the content management system.
And then it gives you a kind of encrypted URL,
which is what the client will use to play back.
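Since the internals are not disclosed, the following is only a generic pattern and not the system described here: an entitlement check followed by a time-limited, HMAC-signed URL. It swaps in URL signing for whatever "encrypted URL" means in the real system, leaves DRM out entirely, and every name, key, and parameter is hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET_KEY = b"demo-secret"  # hypothetical; a real system would use managed keys


def is_entitled(user_id, content_id, entitlements):
    """Stand-in for the call to the user system."""
    return content_id in entitlements.get(user_id, set())


def _signature(base_url, user_id, expires_at):
    payload = f"{base_url}|{user_id}|{expires_at}".encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def sign_playback_url(base_url, user_id, expires_at):
    """Append an expiry and an HMAC-SHA256 signature to the playback URL."""
    query = urlencode({
        "uid": user_id,
        "exp": expires_at,
        "sig": _signature(base_url, user_id, expires_at),
    })
    return f"{base_url}?{query}"


def verify_playback_url(base_url, user_id, expires_at, signature, now):
    """What an edge could check before serving the manifest."""
    expected = _signature(base_url, user_id, expires_at)
    return now < expires_at and hmac.compare_digest(expected, signature)


def start_playback(user_id, content_id, base_url, entitlements, now=None, ttl_seconds=300):
    """Authorize the user, then hand back a short-lived signed URL."""
    now = int(time.time()) if now is None else now
    if not is_entitled(user_id, content_id, entitlements):
        raise PermissionError(f"{user_id} may not watch {content_id}")
    return sign_playback_url(base_url, user_id, now + ttl_seconds)


BASE_URL = "https://cdn.example.com/live/ipl-final/master.m3u8"  # hypothetical
ENTITLEMENTS = {"user-1": {"ipl-final"}}
url = start_playback("user-1", "ipl-final", BASE_URL, ENTITLEMENTS, now=1_000)
print(url)
```

A short expiry keeps a leaked URL from being useful for long, at the cost of clients having to come back to the playback system more often.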
And are we talking about live streaming or is this playing back existing streams?
This is for live streaming.
The only difference: whatever I'm showing in the diagram is all about live streaming.
In a VOD system, you would not have this complex setup, like how over here you
have the contribution encoder, cloud, and distribution. It'll be a little bit simpler.
There will be a single encoder which will encode and put it into the CDN, and there would be a similar
orchestrator for VOD content which will take care of this.
So the workflows are almost similar, but there are more complexities in live streaming
than in a VOD system on how to select which URL to play back. Yeah.
One question I have is you mentioned CDNs, but CDNs are great for caching stuff and making sure, you know, spreading it out so that they're on the edge.
But this is a live stream, right?
Like every, I don't know, 100 milliseconds, you will have new frames arrive or every, I'm not sure how that's done.
But how does this square with the fact that, you know, like CDNs are great for caching, but they might not be the best for real time stuff.
Or are they good for real time?
So they are actually...
See, the secret sauce of how live streaming works is in the HLS and the DASH specs, to be very honest.
I can talk a little bit more about HLS so that you get a sense of how it works, right?
See, in HLS you have something called a master manifest.
Okay, so whenever you see a playback URL, it will look something like master.m3u8.
Okay, in this master manifest you have multiple pieces of information.
So it will contain information saying that this is a 240p video,
and it will point to a child manifest, child_240p.m3u8, and so on.
Then similarly there would be one for 480p and so on, right?
So you would have layers of manifests, right?
And what happens is, within each of the child manifests, you would have a list of segments.
Okay, segment1.ts...
Is a segment a certain, you know, time period?
Yes.
So how long are segments usually?
Yeah, so that is configurable. That is all subject to how we design, how we
look at what kind of latency we want to offer to the users.
So typically between 4 and 6 seconds is what is widely used in the industry.
So what happens is...
So like 4 to 6 second files, right, you know, one after the other.
Okay.
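As an illustration of the layered manifests just described, here is a small sketch that renders an HLS master manifest pointing at one child manifest per quality layer. The `#EXTM3U` and `#EXT-X-STREAM-INF` tags are standard HLS; the file names, resolutions, and most bitrates are assumptions chosen for the example.

```python
def build_master_manifest(renditions):
    """Render an HLS master manifest that points at one child manifest per layer."""
    lines = ["#EXTM3U"]
    for r in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r['bandwidth_bps']},"
            f"RESOLUTION={r['width']}x{r['height']}"
        )
        lines.append(r["uri"])
    return "\n".join(lines) + "\n"


# Bitrates, resolutions, and file names are illustrative assumptions.
RENDITIONS = [
    {"bandwidth_bps": 200_000, "width": 426, "height": 240, "uri": "child_240p.m3u8"},
    {"bandwidth_bps": 800_000, "width": 854, "height": 480, "uri": "child_480p.m3u8"},
    {"bandwidth_bps": 2_000_000, "width": 1280, "height": 720, "uri": "child_720p.m3u8"},
]

manifest = build_master_manifest(RENDITIONS)
print(manifest)
```

Each child manifest would in turn list the individual segment files for that layer.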
So what the client is doing, essentially, in the HLS protocol is: when you initialize
the video, it calls the master manifest, then it gets the list of all these child manifests.
And it is keeping track.
So based on the start bitrate, based on the network condition of the user,
it will pick, saying that, okay, this user should start from 480p or 720p or so on.
Okay.
And then the client is just calling this manifest again and again.
And if your segment duration is, let's say, four seconds, it will call the child manifest
every four seconds, asking for an update.
Okay.
And the update will be the next file, the next, you know, the next segment.
Yes.
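The polling behaviour can be sketched as a small simulation. It assumes a sliding-window child manifest and a player that joins at the newest segment; the window size, the segment naming, and the absence of any buffering logic are simplifications for illustration.

```python
SEGMENT_SECONDS = 4   # within the 4 to 6 second range mentioned above
WINDOW_SEGMENTS = 5   # hypothetical window size, kept tiny for readability


def child_manifest_at(now_seconds):
    """Simulate the live child manifest: the most recent segment numbers."""
    newest = now_seconds // SEGMENT_SECONDS
    oldest = max(0, newest - WINDOW_SEGMENTS + 1)
    return list(range(oldest, newest + 1))


def simulate_player(start_seconds, duration_seconds):
    """Poll the manifest once per segment duration and fetch unseen segments."""
    downloaded = []
    last_seen = None
    for now in range(start_seconds, start_seconds + duration_seconds, SEGMENT_SECONDS):
        segments = child_manifest_at(now)
        if last_seen is None:
            # First poll: join at the newest segment instead of replaying the window.
            new_segments = segments[-1:]
        else:
            new_segments = [s for s in segments if s > last_seen]
        for seg in new_segments:
            downloaded.append(f"segment{seg}.ts")  # stands in for a CDN request
            last_seen = seg
    return downloaded


print(simulate_player(start_seconds=40, duration_seconds=16))
# ['segment10.ts', 'segment11.ts', 'segment12.ts', 'segment13.ts']
```

Every loop iteration corresponds to one manifest request plus, usually, one segment request, which is why request volume scales with viewers and with how short the segments are.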
So this is like a windowed manifest which has like 200-plus segments.
Typically you select a duration which you want to keep in the manifest, and the
player is polling this manifest and getting the list of segments.
So as soon as it sees a new segment, it will go to the CDN, request it, get it downloaded, and
put it in the buffer. So that is how it works.
So the role of the CDN is... the CDN is not operating at 100 ms, is what I wanted to call out.
The CDN is operating at like a four to six second gap. But still, with so much caching, it becomes
tricky, and this is where our engineering work is: how we tune and fine-tune the
CDN configurations, what is the right TTL to use, and so on, right?
Because if you use too short a TTL, then everything is expiring and you're always missing the cache.
Yeah.
And if you use a very long TTL, then you have a chance of hitting stale data.
Yeah, yeah.
So it's a fine game.
But now I guess I now understand, you know, when there's a big game happening, may that be the World
Cup or something like that, and you're watching it on
the stream, every now and then you hear a massive scream from the neighborhood.
And then two or three seconds later, you know, you see the goal or see the team scoring.
So I guess this explains it, right?
Because there will naturally be roughly up to a few seconds of delay depending on which
endpoint you're hitting, depending on when you started the stream.
Yeah, good that you brought this in, right.
Like, I'll explain.
So any encoding system would be more efficient if it has a look-back period, right?
Compression technology is like saying
that, hey, I have a reference frame, and these next frames are deltas from this reference frame.
So for compression to be more efficient, if you can look back more and more, you would be
better, right? So there's something called a group of pictures, or GOP, right?
You say that my GOP is two seconds, or my GOP is one second or four seconds, which is to say that,
okay, I'm going to look at optimizing compression over this duration.
So every encoding stage will add to that latency which you're talking about.
So that is one place where your latency is added.
Then the other latency is added on the client side.
Now, when you're streaming on the internet versus a TV:
on TV, what will happen is that if you miss downloading a segment,
like a segment or certain frames,
you'll probably see a blackout and you'll move forward.
TV is not going to optimize for you to be at the right place,
right?
It is always the latest point you're seeing, because it's a broadcast.
It's real time.
It's real time, right?
So if you miss something, you miss something.
Like, if there is rain and you fail to get that packet on the TV,
you'll miss it.
But in internet streaming, every user is maintaining a buffer, right?
And to ensure that you get smooth streaming, you will always have some buffer
configuration on the client side, which will be taking care of your smoothness, right?
So what happens is, whenever you start a playback, you're always
starting the playback five to ten seconds behind the live point.
Because if I keep you at the live point and there is a problem for any reason, right,
it's distributed computing, it's a complex system, if for any reason you fail to
download that particular segment, you will end up seeing a loader and you will have a bad
experience.
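The way these delays stack up can be sketched as a simple sum. Every number below is an assumption picked from the ranges mentioned in the conversation (GOPs of one to four seconds, segments of four to six seconds, a start offset of five to ten seconds) or invented outright; the actual values were not disclosed.

```python
def estimate_latency_seconds(stages):
    """Sum per-stage delays to approximate venue-to-screen latency."""
    return sum(stages.values())


# Hypothetical budget; every number is an assumption for illustration.
STAGES = {
    "contribution_encode_gop": 2.0,  # look-back window of the first encoder
    "distribution_encode_gop": 2.0,  # second encoding stage adds its own window
    "segment_packaging": 4.0,        # a segment must be complete before it is listed
    "cdn_propagation": 1.0,          # invented allowance for manifest caching
    "player_start_offset": 8.0,      # player starts behind the live point
}

total = estimate_latency_seconds(STAGES)
print(f"{total:.1f} s behind the venue")  # 17.0 s behind the venue
```

Shrinking any single stage, for example the player offset, moves viewers closer to the live point but leaves less slack to absorb a slow segment download.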
So it's a fine choice when you're doing it, about scaling versus the number of devices you want
to support versus a better UX versus being
closer to the live point, right?
We found a sweet spot somewhere where it works for the users.
Again, I can't disclose the numbers, but that's the complexity behind it.
Yeah, but I realize, because in the sense of like, okay, well, this is the reality,
but, you know, this is what it's about, right?
Tradeoffs.
Like, you can choose between having, you know, some latency, okay, five to 10 seconds,
but then it works pretty smooth for most users and, you know, you don't have crazy infrastructure
requirements. Or if, you know, our requirement was, like, I don't know,
someone came up with, like, you shouldn't have more than 100 to 500 milliseconds, you would need
to architect a lot bigger system, a lot of, you know, different tradeoffs, right?
So I'll give you an example, right?
Let's say you have four-second segments, okay?
So you're downloading 15 segments every minute.
You're making 15 manifest calls.
You're making 30 calls to the CDN every minute.
Now, if I change the segment duration to two seconds, your volume of calls
will double, which means,
even though you're downloading the same amount of data, the number of requests hitting the CDNs,
which is the amount of compute cycles triggered on the CDNs, will increase, right?
And no matter how much we say the internet and cloud are infinite, they are not really infinite.
There is a finite amount of space and resources and everything.
So when we're designing for scale, we have to factor all of this in from the perspective of capacity, of CDN capacity.
So sure, your networks have gone to 5G and all of that, right?
But there are so many choke points within the infrastructure
that you have to account for, you have to be aware of.
And that is where the playback system becomes very crucial,
because it is aware of all these nuances
and decides the best experience for the user.
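The arithmetic in this example is easy to reproduce. The sketch below assumes each viewer makes exactly one manifest call and one segment call per segment duration, and ignores retries, separate audio tracks, and cache behaviour; the 32 million figure is the concurrency mentioned in the episode.

```python
def cdn_requests_per_minute(segment_seconds, viewers=1):
    """One manifest poll plus one segment download per segment duration."""
    polls_per_minute = 60 / segment_seconds
    per_viewer = 2 * polls_per_minute  # manifest call + segment call
    return per_viewer * viewers


print(cdn_requests_per_minute(4))  # 30.0
print(cdn_requests_per_minute(2))  # 60.0

# At the 32 million concurrent streams mentioned in the episode:
print(f"{cdn_requests_per_minute(4, 32_000_000) / 60:,.0f} requests/s")  # 16,000,000 requests/s
print(f"{cdn_requests_per_minute(2, 32_000_000) / 60:,.0f} requests/s")  # 32,000,000 requests/s
```

Halving the segment duration leaves the bytes delivered roughly unchanged but doubles the request rate the CDN has to absorb.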
Yeah, and I guess this just shows how different internet streaming is
versus traditional television.
You know, when you had an analog signal from the radio go out,
you send it one way, every device catches it.
If it catches it, it catches it. If it misses it,
it misses it, as you said. Or when you have traditional cable, you know, the cable connection to the TV,
again, you just push out the signal. It's the same real-time signal. So it just doesn't have this level of...
you know, you don't need to worry about the internet infrastructure, the internet providers,
your bandwidth, all of this changing. It's fascinating. I do have one question related to this.
Adaptive bitrate streaming is a term that I've heard, and, you know, I can kind of imagine what it means,
and you kind of already touched on it.
How does it work, and who controls the adaptive bitrate?
Is it the server?
Is it the client?
So it's a mix of it.
Largely the capability is driven in the player,
but it's all about fine tuning.
Adaptive bitrate, if I were to simply explain it:
you have a network speed of 4 Mbps.
Every layer which you see within the video,
which is 240p, 480p,
every layer has an average bandwidth.
So 240p would probably be at about 200 kbps,
720p would be at about an Mbps, and so on.
So what your player is doing is, as and when it is downloading a segment,
it is also measuring your bandwidth.
Because you're downloading, let's say, a 4 MB segment chunk,
and it knows how much time it took.
So it knows an average speed, right?
And when it sees that it is drifting apart,
like your download speeds are getting slow,
it will recalibrate and the player will jump to a lower layer.
So it goes to the manifest and says, like,
hey, could I have the 240p?
Because I can measure that I'm not fast enough to download all this.
Yes, exactly.
So that's where you see the blurring of the video, and all of that seamless switching is taken
care of by the player.
But that is the simplest implementation of ExoPlayer or the client-side player.
Now, if you do that and you try to run an IPL match on that, I don't think it will work.
You'll encounter numerous issues.
You'll get a bunch of customer complaints and so on.
So there are a bunch of parameters which the player allows you to control,
like what threshold to choose to switch between layers, what should be your starting bitrate, what should be your
threshold for errors,
what is the buffer duration which is left before you make a decision to switch.
So all of those parameters are something which is fine-tuned when we are doing engineering, right? And then some of these parameters are
governed from the server as well. So the server is also looking at how things are operating around the world,
like in that ecosystem, and then making some calls.
So if you have to do degradations,
the server can choose to do degradations by limiting the number of layers
you're accessing.
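A simplified version of this player logic, including the server-side cap on layers, might look like the sketch below. It is not ExoPlayer's algorithm: the ladder values (apart from roughly 200 kbps for 240p), the safety factor, and the buffer threshold are all assumptions standing in for the tunable parameters mentioned above.

```python
# Layer bitrates in kbps. Only the 240p figure (about 200 kbps) is from the
# conversation; the rest are illustrative assumptions.
LADDER_KBPS = {"240p": 200, "480p": 800, "720p": 2000, "1080p": 4500}


def measure_throughput_kbps(segment_bytes, download_seconds):
    """Estimate bandwidth from how long the last segment took to download."""
    return segment_bytes * 8 / 1000 / download_seconds


def choose_layer(throughput_kbps, buffer_seconds, current, allowed_layers,
                 safety_factor=0.8, min_buffer_to_upswitch=10.0):
    """Pick the highest allowed layer that fits within the measured bandwidth.

    allowed_layers models the server limiting which layers a client may use.
    """
    budget = throughput_kbps * safety_factor
    candidates = sorted(allowed_layers, key=LADDER_KBPS.get)
    affordable = [name for name in candidates if LADDER_KBPS[name] <= budget]
    target = affordable[-1] if affordable else candidates[0]
    # Only move up when the buffer is healthy; move down immediately.
    if LADDER_KBPS[target] > LADDER_KBPS[current] and buffer_seconds < min_buffer_to_upswitch:
        return current
    return target


ALL_LAYERS = list(LADDER_KBPS)
fast = measure_throughput_kbps(segment_bytes=2_000_000, download_seconds=4.0)  # 4000.0 kbps
print(choose_layer(fast, buffer_seconds=12.0, current="480p", allowed_layers=ALL_LAYERS))  # 720p
print(choose_layer(fast, buffer_seconds=3.0, current="480p", allowed_layers=ALL_LAYERS))   # 480p
print(choose_layer(500.0, buffer_seconds=12.0, current="720p", allowed_layers=ALL_LAYERS)) # 240p
# Server-side degradation: the same fast client is capped at 480p.
print(choose_layer(fast, buffer_seconds=12.0, current="480p", allowed_layers=["240p", "480p"]))  # 480p
```

The asymmetry (drop down at once, climb up only with a healthy buffer) is one common way to avoid oscillating between layers; real players expose more knobs than these two.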
So what you're saying is when my server is getting overloaded,
you also have the choice to start serving either like longer segments,
lower bit rate, et cetera, in order to, you know,
like you're going to make these decisions.
Yeah. So again, going back to where I was saying that we push
numerous combinations to the CDN:
some combinations are designed from a reliability perspective, such that we can offer the best experience up to a certain concurrency, but after a certain concurrency, we might have to switch users to a slightly more degraded experience.
Yeah, yeah.
It makes sense, right?
Yeah.
So all of this is part of the engineering which we used to do.
This is not something which a CDN offers, right?
And sure, the CDN is handling a lot of segment delivery and stuff, but that orchestration is where the tricky business is.
Yeah.
So I guess is it safe to say, you know, there's that thing where you're streaming something,
may that be a live video or video on demand.
And when it gets blurry, I guess two things can happen.
Either it's your client side deciding your bandwidth is not high enough, or if it's a live event,
it might be that, you know, the engineering team or the system decided
that in order to sustain all this high load,
it needed to switch back to conserve the server resources.
Yeah.
So the third aspect to it, right?
Your client, now this is where your Ikelbrim kind of concepts
coming to the internet, right?
As and when the congestion starts,
let's say you're connected to a tower.
Now, 100 people are watching.
The tower has a finite capacity.
You could stream downstream to, let's say...
By tower, you mean like a mobile tower?
Mobile tower.
Like a 4G, 5G tower.
4G or a 5G tower, right?
They have finite capacity on the number of clients they can handle, like an access point.
If they start to get congested, they will start to naturally throttle you or start bandwidth sharing,
which will also trigger slow download.
So this is not a parameter where I'm going in and controlling and saying that, hey, the user should get blurry, or anything like that.
It is something which will naturally happen, right, depending on the infrastructure layout and where you are connected.
There's so many layers to, you know, what we started off as just, you know, like, oh, just stream live video.
Oh, so it's funny, right? When we get user complaints, you have to look at so many places.
Like, was it our system, was it our client, or was it some intermediary network? And if it's an
intermediary network, how do you prove that there was network congestion? Because we do not
control those systems. We do not have visibility into the metrics of those systems. So we have to
build intelligence on the client to figure out all of this.
Yeah, it's very complex. When I used to work at Skype and, you know, it was on
my LinkedIn, people knew. And sometimes people would message me, or friends of a friend, saying,
oh, you know, you work at Skype. So I had this problem where I had this video call and it got really
blurry. Like, why was that? And I was like, you know, like you said, I didn't have as
deep an understanding of everything that happened. But you mentioned that you cannot monitor everything.
What is it that you can monitor?
Because I imagine, you know, just as an engineer, you need to have as much information as possible to, you know, figure out what is the best experience.
But there's going to be practical limits and also just kind of data collection limits.
So what is it possible to monitor and what is practical to monitor on the client side and then on the server side?
So the amount of data collection processing you can do is infinite.
There's no end to data collection and analytics.
What it comes down to is: in the case of an issue or an incident, what will help you figure it out?
And the way I describe metrics is that there are leading indicators and there are trailing indicators.
Leading indicators are things which will tell you ahead of time that a problem is about to happen.
So we would have three to five metrics which are defined as leading indicators,
and we would be very sensitive towards those metrics.
Like on the client's side, it would be the amount of buffer time you're seeing,
which is how many seconds did you see a buffer in a minute?
Or what is a playback failure rate, which is basically when you start a video,
did you encounter a failure?
Or while watching a video, did you encounter a failure, which is a fatal in nature?
Like not the app crash, but the video crash kind of a thing.
So there are some of these metrics which will be treated as a leading indicator.
and these metrics would be collected on the client side,
there would be corresponding metrics on the server side,
obviously you'll be monitoring bandwidth,
number of requests coming in,
latency response times,
a bunch of other things which will be monitoring on the server side.
The leading indicators get priority over the traffic.
You would always want to have them processed ASAP,
within a minute,
sometimes even under 30 seconds,
so that you can react,
and you're getting alerted before anyone complains to you.
So that's where you optimize for.
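As a rough illustration of the two client-side leading indicators mentioned above (seconds of buffering per minute and playback failure rate), here is a minimal sketch. The `ClientWindow` schema and the alert thresholds are assumptions made up for the example, not values from the conversation.

```python
from dataclasses import dataclass

@dataclass
class ClientWindow:
    """One client's report for a 60-second window (hypothetical schema)."""
    buffering_seconds: float  # seconds spent buffering in this minute
    play_attempts: int        # playback starts attempted
    fatal_failures: int       # attempts or sessions that ended in a fatal video error

def leading_indicators(windows):
    """Aggregate per-client windows into the two leading indicators."""
    attempts = sum(w.play_attempts for w in windows)
    failures = sum(w.fatal_failures for w in windows)
    total_buffer = sum(w.buffering_seconds for w in windows)
    return {
        "buffer_seconds_per_minute": total_buffer / len(windows) if windows else 0.0,
        "playback_failure_rate": failures / attempts if attempts else 0.0,
    }

def should_alert(indicators, max_buffer=2.0, max_failure_rate=0.01):
    """Thresholds are illustrative assumptions, tuned per event in practice."""
    return (indicators["buffer_seconds_per_minute"] > max_buffer
            or indicators["playback_failure_rate"] > max_failure_rate)
```

The point of keeping the set this small is that it can be aggregated and alerted on within the sub-minute window described above.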
Then you have trailing data, which is basically: let's say you figured out that there is an issue.
Now you need to get into the details of it, right?
And the funny thing with the live streaming is that your time is also moving.
Right.
It's not just that, okay, there is a system state or there is a state, right?
The state is also constantly evolving.
So you need to measure a lot.
You need to collect a lot of data and you need a lot of tooling to be able to visualize
what is happening and how it is being rendered.
So there's a lot of data collection.
It gets processed a little more slowly.
It's available on dashboards for us to consume and figure things out.
But that's how we look at metrics.
And then on the client side, obviously you're collecting all of this data,
but then you need to push it back to the server, right?
Like every, you know, again, I'm assuming this is an engineering tradeoff.
Like if it's every second, every 10 seconds, every minute, that kind of stuff.
I don't think every second would scale for us.
Imagine 30 million users watching concurrently, and every second they're sending data. I cannot scale that system, to be very honest.
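To make the scale concrete, here is the back-of-the-envelope arithmetic for telemetry uploads at 30 million concurrent users. The reporting intervals other than one second are illustrative assumptions.

```python
def telemetry_requests_per_second(concurrent_users, interval_seconds, sample_rate=1.0):
    """Average upload rate if each sampled client reports once per interval."""
    return concurrent_users * sample_rate / interval_seconds

if __name__ == "__main__":
    for interval in (1, 10, 30, 60):
        rps = telemetry_requests_per_second(30_000_000, interval)
        print(f"every {interval:>2}s -> {rps:,.0f} requests/s")
```

Reporting every second means 30 million requests per second for metrics alone; stretching the interval to 30 seconds brings that down to about one million.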
Yeah, but it's interesting because now we have clients downloading and then we also have the upload of the metrics, which is also a fine balance, I assume.
Even with this relatively simple example of how often a client should upload its leading indicators, how did you decide what that would be, maybe
every minute or so? Did you go to a whiteboard and figure out what this means for a server?
How much can we process?
How many devices? Was it a mix of prototyping and doing load tests, or was it a combination?
It's a combination, right?
It's not an easy decision.
In fact, during the event itself, the frequency keeps changing.
Really?
As and when you're scaling, you would be okay to compromise on certain aspects to scale the system and still have some sense of what's happening, right?
You will also use some form of sampling as well.
You'll use a balanced approach to figure out how to do it.
And you can collect all the data at a per-second level,
but you have to process it.
There are costs associated with it as well, right?
And not just cost.
Again, cloud is finite.
Just as CDNs are finite,
the cloud service providers also have finite capacity.
So you have to take into account: do you want to prioritize
your playback systems and content management systems,
or do you want to prioritize data collection?
So all of this has a priority, and an association that says which tier each system falls into,
and that tier design defines how the system degrades.
So there's always a degradation framework which is to say that okay if the match is not huge,
then yes, we will collect more optimistically at every 15 seconds or every 30 seconds or whatever that number is.
But as a game starts to become hot, and more and more traffic starts to come in, then at runtime you start to degrade
systems.
Degradation happens in different shapes
and forms. You will probably reduce
the data which is coming in. You will change
the interval at which you're collecting data.
You would start sampling.
All of those parameters
start to come into play. And all of this
is part of the design, right? This is all
the exercise we do before the game
on resource planning,
on capacity planning. Like capacity planning
is a huge exercise. And
that also decides what the cutoff
points are. Sometimes you will say that, okay, we
need to support up to, let's say, X million users, or 50 million users, or 100 million users.
That means you have to work backward from there and say: if I have to support 50 million
users, what are the P0 service requirements?
And then you say that, okay, this is the residual capacity.
So the P2 services have to operate in this residual capacity, which means that I can do certain
things.
I can't do certain things.
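A degradation framework of this kind can be sketched as a table of tiers keyed on concurrency, plus a residual-capacity calculation for lower-priority services. The thresholds, intervals and sample rates below are invented for illustration; the conversation only says that interval and sampling change as traffic grows.

```python
# Hypothetical tiers: (minimum concurrency, reporting interval in seconds, sample rate)
TIERS = [
    (0,          15, 1.0),
    (10_000_000, 30, 0.5),
    (25_000_000, 60, 0.1),
]

def telemetry_policy(concurrency):
    """Pick the interval and sample rate of the highest tier whose threshold is reached."""
    interval, sample_rate = TIERS[0][1], TIERS[0][2]
    for threshold, tier_interval, tier_sample in TIERS:
        if concurrency >= threshold:
            interval, sample_rate = tier_interval, tier_sample
    return interval, sample_rate

def residual_capacity(total_units, p0_units):
    """Capacity left for P2 services once P0 (for example playback) is fully provisioned."""
    return max(total_units - p0_units, 0)
```

For example, `telemetry_policy(30_000_000)` selects the most degraded tier, and `residual_capacity(100, 80)` leaves 20 units for everything that is not P0.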
So on capacity planning, you already kind of outlined how it works, but you said it's a big
exercise.
What does good capacity planning look
like, and why does it take so long? Because what you describe here, honestly, you know,
just do the math, do this, do that. But obviously it's more complex, right?
So, capacity planning is a very
complex exercise. In fact, for most of video streaming, the capacity planning for the next year
starts at the end of the previous year. Because you have to lay down infrastructure,
you have to work with providers to ensure that their data centers scale up, their networks
scale up, their power scales up. All of that has to be taken care of.
It's not that simple. There's physical infrastructure involved, right? Providers
will not just go and scale their infrastructure. Sure, they have a 10% plan, a 20% plan,
but if you're operating at a much higher intensity and adding more year on year, then you have to
work backwards from there. So that is one aspect of it. The other aspect is that during the tournament
there can be overlap with other events, so there might be reservations made by other companies.
Like, there's a big e-commerce sale going on.
Let's say if there's a Black Friday.
Like, Black Friday is not there, but it's...
Yeah.
So the point is you have to look at the resource.
You have to figure it out.
You have to ensure that you've blocked enough resources for the time period.
And the other complex piece is how much to ask for.
Every year, the nature of the game changes.
There is no set pattern.
How many people will come in?
So there are certain models.
We will look at it and see that,
okay, this is the user traffic which came in
in the previous year,
and we have seen the platform grow by X percent.
So we'll apply some modeling to it
and figure out what next year's optimistic number should be.
Actually, not an optimistic number.
It's a very pessimistic number.
We operate with a very pessimistic number.
And then work backwards from there.
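The simplest version of such a model is last year's peak, multiplied by observed platform growth, with a safety margin on top so the plan assumes more traffic than expected. This is only a sketch under that assumption; the actual models used are not described, and the numbers in the usage line are made up.

```python
def planned_peak(last_year_peak, platform_growth, safety_margin):
    """Concurrency to plan for: last year's peak, grown and padded.

    platform_growth and safety_margin are fractions (0.25 means 25%).
    """
    return round(last_year_peak * (1 + platform_growth) * (1 + safety_margin))

if __name__ == "__main__":
    # Illustrative numbers only.
    print(planned_peak(20_000_000, 0.25, 0.20))
```

Everything downstream (CDN reservations, compute, database sizing) is then derived by working backwards from this one number.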
So the reason it's complex is because you have to work with so many providers,
understand how much resourcing and all of that is there,
which means they have to go back within their engineering teams,
get those numbers,
share those numbers with us, and then we do all that.
Just to make it really concrete,
are we talking about how many virtual machines you'll need,
how much bandwidth you'll need, how much,
like what are the things, the things behind the resources
that you're going to have a list of like,
oh, we need this much.
Primarily, it boils down to the same core system metrics,
like compute, RAM, disk, and network.
All of it boils down to that.
Every tier has those numbers,
and you have to figure them out.
When you're running servers, it's going to be the number of vCPUs.
Generally, that is the limiting factor.
Disk and RAM are not limiting in the cloud.
Network becomes an interesting problem for the streaming part of it,
and actually it becomes interesting on the API side also.
Because, yes, for video you're doing tens of Tbps and probably more than that.
But the APIs are also consuming a significant amount of data download.
And those networks are designed differently from the video networks,
because video networks are designed more from a heavy caching perspective,
where the offload should be super high,
but API traffic is mixed:
there's security, there are different tiers of security,
there is PII data flowing in,
there are your firewalls, and so on.
So you have to look at all of those parameters.
When I say parameters, again,
it's network consumption and compute resources available at each hop.
Figure out what the lowest common denominator is
and work from there.
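Finding the limiting hop can be sketched as follows: for each hop, the scarcest resource decides how many users it can carry, and the weakest hop decides what the whole path can carry. The hop names, capacities and per-user demands are hypothetical.

```python
def users_supported(capacity, per_user):
    """Users one hop can carry; the scarcest resource decides."""
    return min(capacity[r] / per_user[r] for r in per_user)

def bottleneck(hops):
    """Return (hop name, users supported) for the weakest hop on the path."""
    supported = {name: users_supported(h["capacity"], h["per_user"])
                 for name, h in hops.items()}
    weakest = min(supported, key=supported.get)
    return weakest, supported[weakest]

# Hypothetical numbers for illustration only.
HOPS = {
    "cdn_edge": {
        "capacity": {"network_mbps": 40_000_000},
        "per_user": {"network_mbps": 2},
    },
    "api_gateway": {
        "capacity": {"vcpus": 20_000, "network_mbps": 500_000},
        "per_user": {"vcpus": 0.002, "network_mbps": 0.01},
    },
}
```

With these made-up figures the API gateway, limited by vCPUs, is the bottleneck even though the video path carries far more bandwidth.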
Yeah, and then you did mention before that the cloud is finite.
Now, I don't hear too many people saying this, because usually, say I'm a startup building something: I will be aware that the cloud is expensive, but I'm never going to think that it's finite, because as far as I'm concerned, I can always pay more to scale up.
In your case, that's not really the case, right?
And this is because you're so large, right?
This application is so large.
Yeah, yeah.
The use case is so large that like year on year the growth is so much that it's very hard to preempt.
And as I said that you're doing pessimistic planning because you want to support best or best.
The infrastructure exercise, the growth of infrastructure is relatively hard because I'll give you an example.
So let's say in a city like Bangalore or Mumbai, there is a finite number of Tbps available and a finite CDN capacity available.
Now, if you have to add more capacity,
the providers need to purchase real estate.
They need to add links to it.
They need to deploy servers to it.
The servers need to be procured, imported from another country.
They need to be configured, onboarded as part of the network, and so on.
So it's a slow process.
It's not, it can't happen overnight.
So basically what you're saying is, let's say you as, you know,
Jio or any other streaming service are like, okay, we think
about this game that's going to be in six months. Last year in Bangalore, five million people
watched, I'm just saying a number. This year we think it'll be 8 million because of our projections.
We think the current CDN capacity, based on our calculation, is not sufficient. So now we're
going to go to the CDN providers and try to make a case for them to invest. That's pretty hardcore.
It's not even engineering. It goes beyond engineering.
Yeah. And you
started the discussion saying, hey, why don't you code?
Because I feel this is a much harder problem than actually coding it, to be very honest.
Okay.
I'm starting to understand.
It's important to stay connected, but this seems just as important.
It's probably more important to do this thing.
And, you know, go to the vendor, you know, sit in a meeting with them and explain to them why you need to order those servers from wherever.
Yeah.
Yeah.
And another common question I'm asked on the subject is: hey, sure, Bangalore doesn't have capacity.
Why don't you go and get it from Delhi?
it's not that easy.
Anytime, like, if you understand, it's a caching infrastructure.
It's a tree, right?
So it's also edge, right?
Yeah, it needs to be close.
Yeah, so it needs to be close.
The more upstream you go in a branch to try to get capacity, the capacity
available over there is finite.
So if I were to say, hey, let's route this excess traffic of 2 million users,
let's say that's consuming over 2 Tbps,
there has to be that amount of backbone infrastructure present in the country to be
able to route it to another city.
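The numbers in this example imply roughly 1 Mbps per viewer (2 million users consuming about 2 Tbps). A small sketch of the check, where the per-user bitrate and the spare backbone figure are assumptions for illustration:

```python
def required_tbps(users, mbps_per_user):
    """Aggregate bandwidth for a group of viewers, in Tbps (1 Tbps = 1,000,000 Mbps)."""
    return users * mbps_per_user / 1_000_000

def can_reroute(excess_users, mbps_per_user, spare_backbone_tbps):
    """Rerouting to another city only works if the inter-city backbone has the headroom."""
    return required_tbps(excess_users, mbps_per_user) <= spare_backbone_tbps

if __name__ == "__main__":
    print(required_tbps(2_000_000, 1.0))     # 2.0 Tbps for the excess traffic
    print(can_reroute(2_000_000, 1.0, 0.5))  # not enough spare backbone in this example
```

This is why serving from the edge matters: the excess has to travel over shared upstream links that were never sized for it.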
So we've already touched on this.
This is, I think, one pretty good example of like an India slash APAC specific engineering challenge.
What other engineering challenges have you seen that are pretty specific to APAC?
You know, when you're building a system in Europe or the US, maybe you don't even need to think about it.
But in your case, you know, it gave you a lot of headaches.
So capacity is one aspect of it.
The other is mobility.
India is a country which is mobile-intensive.
People don't watch on TVs or laptops, right?
India is a country which kind of leapfrogged a generation, right?
People who were not using any computers or any mobile devices started onboarding to mobile devices directly.
They did not see laptops or tablets or connected TVs and so on.
Connected TVs are still picking up, but it's still not at the level at which it might be in Western countries.
With mobility comes the problem that when the device is constantly moving, it is changing radios and networks.
And you're saying cell towers and all that.
Cell towers and all.
A lot of the audience are taxi drivers or cab drivers,
or people who are on the move, who are getting off from the office.
The event happens in the evening.
People are getting off work.
Yeah, everybody's holding a little phone and watching it.
Exactly, right.
Like most of the ads used to run like that.
So people are watching on the move.
So that becomes tricky.
Sometimes you're connected to a 5G tower and then you go to a 4G tower and 3G tower and so on.
Like you're on a train and you're moving at speed, so your towers are switching. That is one aspect.
The other is battery. Most of the games happen in the evening, right? If you started your day in the morning, you probably charged your phone then. You're off work, you haven't been able to charge it, so your battery is low. So that becomes an interesting problem to tackle as well: how do you ensure that you're not doing so much computation on the device that you end up burning through the phone battery?
I would not have had this on my top
list of things to worry about, but I guess there you go. You need to worry about it.
You need to worry about it. Your streaming, your downloads, or your video consumption
cannot eat up the phone battery, otherwise the user will not watch.
And what do you do
around this? What kind of considerations do you have?
So, it's a lot of things: how much
background processing you're running, what phone brightness you're using, what volume
levels you're using, what the color intensity is. All of these parameters, the profile
of the video, kind of control how much battery consumption is going to happen. How frequently
are you polling, how frequently are you downloading, what kind of complex encoding algorithm
are you using?
So are we using H.264 versus H.265, the codecs, right?
HEVC versus AVC: each of the codecs has its own complexity.
If you use more compression, it will use more compute, which means it will consume
more battery.
So is it theoretically possible, then, to have the
client request a different type of stream based on the battery level?
You could do that,
technically, right? But the point I'm trying to make is that all of these parameters are taken into
account. These are probably not decisions that you make at runtime, because at runtime
you would want to have as few variables as possible.
Yeah, so
these would be considerations which you'll probably look at when you're releasing features, or when you're
building it.
Yes, when you're building it, because at runtime
I would want to have only one or two parameters
which I'm tweaking.
The more settings you have,
the more coordination, verification,
testing, and all of that it requires.
Sure, you can have automation, but trust me,
automation doesn't work after a point.
You still need to have some level of manual testing involved.
Okay, so an event is on, and, you know,
you knew it would be a big event,
and more and more people are joining in,
and the load just keeps going up
and eventually you set a world record.
But as this
happens, how can you scale things up?
How can you both plan to scale up a system, and what can you do on the spot?
I'm assuming horizontal scaling will be a thing.
But, for example, is vertical scaling, basically having bigger boxes, bigger devices,
also an option or a practical thing to do?
So I think the vertical scaling, I'm trying to think how much of it there is.
See, vertical scaling only happens on the database or the data layer, to be very honest, right?
Compute is largely horizontal in nature.
We try to, like, for example, if you're using a...
Actually, databases have also now become horizontal in nature.
You can add more boxes and you'll start to split and scatter data.
But you would want to avoid any kind of scaling at the data layer at all when you're operating under load.
So most of your databases and caches are preemptively scaled, pessimistically, to a higher number.
Compute is what you're controlling on the fly.
In fact, auto scaling doesn't work.
So you always have to...
auto scaling doesn't work for this kind of workload.
Okay, can you tell me more?
I would have thought, you know,
the whole point of auto scaling is that it should work,
but obviously under the hood is a bit complex.
Can you, like, give us a bit of a, you know, explainer,
a 101?
Sure, sure.
So why does the auto scaling not work?
Yeah.
See, auto scaling works by adding boxes at a certain
rate, right? And with the default auto
scaling which most providers offer, there are
cool-off periods and a bunch of
parameters which are not entirely flexible.
So what happens is that when the game starts,
or when there is an innings break and the audience comes back,
that's a sudden surge.
And in our experience, auto scaling has never been able to
respond to that in a very speedy manner.
So I'll give you a classic example,
right? Let's say before the innings break you were operating at 20 million concurrency. I'm
making that number up right now. The innings break happened, 15 million people chose to drop off,
and 5 million stayed back. Now when the innings resumes, you would be back to 20 million people coming in.
But auto scaling is not going to respond with that number in mind. It will be like, oh, I'm
seeing this much traffic, this much RPS, let me add a few more boxes. But you have to understand that
there were already 20 million people on the stream, so after the innings break you have
to go back to that, right? So if you use auto scaling, you will be screwed.
So that's why we used to have our own custom scaling systems and custom scaling providers.
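A toy model shows why a rate-limited reactive scaler lags this kind of surge. It assumes one box serves 10,000 users and that the scaler may add at most 50 boxes per cool-off interval; both figures are invented, and real provider policies are more nuanced.

```python
import math

USERS_PER_BOX = 10_000  # assumed capacity of a single server

def boxes_needed(concurrency):
    """Servers required to carry a given concurrency."""
    return math.ceil(concurrency / USERS_PER_BOX)

def intervals_to_recover(start_concurrency, target_concurrency, max_step=50):
    """Cool-off intervals a reactive scaler needs when it adds at most max_step boxes each time."""
    boxes = boxes_needed(start_concurrency)
    target = boxes_needed(target_concurrency)
    intervals = 0
    while boxes < target:
        boxes = min(boxes + max_step, target)
        intervals += 1
    return intervals

if __name__ == "__main__":
    # Innings break: 5 million stayed, 20 million expected back within minutes.
    print(intervals_to_recover(5_000_000, 20_000_000))  # intervals of lag
    print(boxes_needed(20_000_000))                      # what a concurrency-driven scaler provisions up front
```

Under these assumptions the reactive scaler needs 30 intervals to catch up, while a scaler that already knows the 20 million figure can provision all 2,000 boxes before the innings resumes.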
And those will look at concurrency.
Concurrency is a metric which we would align to and say that this is the golden metric
to scale against.
We will build models to translate that to every system and service.
Because it's easier to talk in that language than in any other.
Every system having a different metric to scale against is more complex
than saying that everyone in the company is scaling against a common metric.
And we are very solid and sure about this metric.
So you can compute and say that, okay,
I have X amount of concurrent users,
which means that this is the traffic I would expect on XYZ services.
This is my user journey.
100 people open the app.
Y number of people go on to the playback page,
then Z number of people open this.
So there's an entire formula which you can build out.
And then you can from there map back and say that,
okay, this is the amount of request you would see.
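Such a formula can be sketched as a per-service table of calls per user per minute, multiplied by the concurrency number. The service names and call rates here are hypothetical; the real model would be derived from the actual user journey.

```python
# Hypothetical user-journey model: calls each concurrent user makes per minute.
CALLS_PER_USER_PER_MINUTE = {
    "playback_manifest": 10,  # e.g. one manifest refresh every 6 seconds
    "heartbeat": 2,
    "ads_decision": 1,
}

def expected_rps(concurrency, calls_per_user_per_minute):
    """Translate the golden metric (concurrency) into requests/s for each service."""
    return {service: concurrency * calls / 60
            for service, calls in calls_per_user_per_minute.items()}

if __name__ == "__main__":
    for service, rps in expected_rps(6_000_000, CALLS_PER_USER_PER_MINUTE).items():
        print(f"{service}: {rps:,.0f} requests/s")
```

Because every service derives its target from the same concurrency figure, teams can plan in one shared unit instead of each scaling against its own metric.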
And there will be certain systems.
I'll give you an example.
When the innings break happens, again, an innings break
or some other kind of event,
let's say a key bowler or
a key batsman gets out, right?
What people do is, people have a tendency to press the back button.
The moment they press the back button,
they go on to the...
Press the back button: do you mean they exit the stream,
or they want to just rewatch it again?
No, they just want to go back,
like close the stream and go back to the homepage.
Oh, gotcha.
So exit, like either people will just swipe up and close the app,
but a lot of people will press a back button.
The back button will take these users to the home page.
And the home page has a different set of APIs to be called,
which were not seeing traffic at the rate they're seeing now.
And they're going to have a big spike.
Exactly.
So all of that intelligence can't be captured by auto scaling. Answering your question on why auto scaling doesn't work:
it's because of these UXs and user journeys which cause problems.
So we have to ensure that.
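The back-button spike can be estimated with simple arithmetic. The fraction of users pressing back, the time window and the number of home page API calls per visit are all assumptions chosen for illustration.

```python
def back_button_burst_rps(concurrency, back_fraction, window_seconds, home_calls_per_visit):
    """Requests/s hitting home page APIs if a fraction of viewers press back within a short window."""
    return concurrency * back_fraction * home_calls_per_visit / window_seconds

if __name__ == "__main__":
    # 20 million viewers, a quarter press back within 10 seconds, 4 API calls per home page load.
    print(f"{back_button_burst_rps(20_000_000, 0.25, 10, 4):,.0f} requests/s")
```

With these made-up inputs the home page APIs go from near-idle to about 2 million requests per second in seconds, which is the kind of jump a traffic-reactive scaler cannot anticipate.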
Basically, you're saying that the whole company or the team will know that this is our
concurrent user target, and every system just needs to be
prepared to potentially get that many requests in a single second if, you know,
people, like in your example, are hitting the back button and then fetching the home
page.
Yeah. So we have custom scaling, and all of this intelligence is present. So
that's how the systems are taken care of. These custom scalers are constantly
watching the concurrency number. Whether the concurrency number being
reported is accurate or not is a big question. So you need to have triangulations and
proxies to figure out whether the concurrency number coming from the system
is correct or not. Because if that gets messed up, your entire scaling gets screwed.
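One way to triangulate is to derive independent concurrency estimates from proxies and refuse to trust the reported number when they disagree. The specific proxy used here (CDN egress divided by average bitrate) and the 15% tolerance are assumptions; the conversation does not say which proxies were used.

```python
def estimate_from_bandwidth(egress_gbps, avg_bitrate_mbps):
    """Proxy estimate: total CDN egress divided by the average per-viewer bitrate."""
    return egress_gbps * 1000 / avg_bitrate_mbps

def concurrency_is_trustworthy(reported, estimates, tolerance=0.15):
    """True only if every independent estimate is within tolerance of the reported value."""
    return all(abs(reported - estimate) <= tolerance * reported for estimate in estimates)

if __name__ == "__main__":
    bandwidth_estimate = estimate_from_bandwidth(20_000, 1.0)  # 20 Tbps at 1 Mbps per viewer
    print(concurrency_is_trustworthy(20_000_000, [bandwidth_estimate, 19_000_000]))
```

If the check fails, a cautious scaler would hold its current capacity rather than scale down on a number that may be wrong.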
So you've given us already a very deep dive into video streaming and we probably just only
scratched some of the surfaces. But how did you learn all of these things outside of on the job?
Are there additional resources that you used, or did you kind of educate yourself?
Basically, if someone wants to understand a lot more about live video streaming,
what are recommendations you would give them?
And they don't have the opportunity to work at a company like you did just yet.
Yeah.
So this is, I think, the nice part and the sad part.
You cannot Google our issues.
There's no Stack Overflow for our issues or the things we experience, to be very honest.
As for video itself, there are plenty of good resources around video.
There is a GitHub link.
I'll probably find it and share it with you.
That is a very detailed explanation of how video works.
It starts from text and then goes on to how an image is represented, how image compression is done,
and then how images become a video, a frame-by-frame description.
There's a very good document around it.
And so I can share that.
So that is something which I learned.
The rest of the stuff is all on the job.
I don't think these problems are something which are available on the internet anywhere.
On the job, what were things that helped you become a lot more experienced,
reliable, senior? What kind of approaches or mentality helped you to soak it in really
quickly?
So I think for me it has been planning, and the entire exercise of going through drills.
So we used to run something called a game day.
Now this is basically a simulation of how an
actual match is going to be. We used to...
So you were simulating the highest-traffic event ahead of time.
And not just traffic. We were simulating the entire operating protocol.
So we'll have... Oh, wow. Really? You did that?
Yes. Yes. So we'll say that, okay, the match is going to start at 7 a.m. let's say.
Okay. Now, every system is supposed to scale.
Beforehand, there's a timeline to it. There is a checklist to it and so on.
No matter whether the teams are ready or not, we will start the live stream and it will start sending traffic.
Okay? Because in a live event scenario,
you do not have a green light, you're not carrying a flag to say that, okay, the train or the traffic can come in, right?
The game is going to start at 7.30 p.m. It's going to start at 7.30 p.m. no matter whether you like it or not.
So we used to do those kinds of drills.
And in this drill, just so I understand, were you simulating, you know, synthetic traffic?
Or was it more that every team had to scale up, you know, do whatever?
Synthetic traffic. We would generate terabytes of data. Yes, we would generate it.
So there was a team that was spinning up, you know, clients pretending to be users.
So we used to have a framework to simulate that kind of traffic.
There is a platform called Flood.io, which we used, which would actually
fire it.
And we will not give teams access to that dashboard.
Yeah.
So, like, they wouldn't know what's hitting them.
They don't know what traffic is going to come, or what kind of pattern is going
to come in, anything.
Because that's the exact state of production.
You don't know what is coming in your way.
You need to have your systems being able to tell you whether they are in a healthy state or
a bad state.
So all of this is learnings from there.
And I've been in this industry for about seven plus
years, right?
And it's not like we reached that 32 million number in a single year.
We've done smaller numbers and figures.
So over the course of time you learn, and while building the systems we learned
from that.
Well, and one thing that I kind of appreciated that you told us is, you know, I was very focused
And I think everyone's focused on like, oh, when you set the record.
But you said something interesting, which was, it was a 70-day marathon.
And the beginning was actually a lot more challenging, you know,
and it wasn't setting records, but it was already the traffic patterns,
the, you know, dealing with all of those things.
So I think it's a good reminder that, you know, like sometimes it's not the thing
that gets the most publicity that is the hardest work.
It's the work that you put in before.
No, the first week.
So it's a 70 to 80 days, by the way, not 7 to 8 days.
Oh, wow.
Those of us who are not following cricket are just used to football or other games where it's just a lot shorter.
It's a very long time period.
It's a very long time.
And it can get very intense on people as well.
Like it's very stressful.
Every day there is a match.
Some days there are two matches, back to back, and two matches is a whole complicated scenario.
Because how do you switch from one match to another? Traffic shifting and all of that becomes super complex.
it's not that straightforward from an operational point of view.
So yeah, the first week is very intense.
The days before the first week is very intense.
But after that, you start to settle in.
We also try to aim that every year on year,
we have improved on our operational protocol, automated a lot more.
So I think someone was telling me,
so I moved from Hotstar to JioCinema, right?
At Hotstar, the level of automation they've reached is such that people can actually be on a coffee break
and a live match of up to 20 to 30 million can just run without people.
Oh, well, you know, over time, it gets easier.
Yeah.
So for engineers who would like to get better at architecting systems,
what were things that helped you in between, you know, building them,
coding them, being in these meetings?
Were there, you know, and I'm talking not really about live streaming.
Now, this is more like generic software engineering.
Would you have advice on activities they can
do, behaviors they can pick up, that help you just learn faster?
So there are certain things which I have followed. I can share what I do, and probably that
can help.
So I think one of the things around scaling and building complex systems which I've learned
the hard way, I'll say, is that you must understand that anything and everything which
can fail will fail.
Murphy's Law.
Okay.
People are very overconfident about their systems. They're like, hey, we have done everything,
we have scaled databases, we have done transactional systems, and so on and so on.
They underestimate. And I can tell you, I have countless examples
where people have come to me and said, hey, our systems are scaled up, everything is fine-tuned,
this is good. And I have broken their systems in load testing.
You're going to make friends, I can see.
Yeah, so
people don't like me doing those things.
I'm just pulling your leg.
Yeah, but I'm just saying that that's one core principle,
which I feel is super important.
The moment you get deeper into that principle,
you start looking at everything within your system,
every decision, every configuration, everything,
because you know that that configuration at some point can miss.
Even the smallest thing as database connection, pool sizes,
the number of connections you're making to database,
the number of servers.
Like, for example, someone said, hey, our connection pool sizes are fine-tuned, we can scale infinitely.
But there's a cap on the number of database connections you can make to a database, right?
So you need to ensure that.
So when you're scaling, when you're scaling compute, this compute can scale up to a finite number
because if it goes beyond that, it will start choking databases, then your request and queries
start to get queued up, which will increase waiting times and so on, right?
That is one, right?
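The ceiling described here follows directly from two numbers: the database's connection limit and the pool size each application instance opens. The figures in the usage line are illustrative.

```python
def max_app_instances(db_max_connections, pool_size_per_instance, reserved=0):
    """How far compute can scale out before it exhausts the database's connection limit.

    reserved covers connections kept aside for admin tools, migrations and monitoring.
    """
    return (db_max_connections - reserved) // pool_size_per_instance

if __name__ == "__main__":
    # A database capped at 5,000 connections, 20 connections per instance, 100 held back.
    print(max_app_instances(5_000, 20, reserved=100))  # instances, not "infinite"
```

Scaling compute past this number does not add throughput; new instances either fail to connect or queue behind the database.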
The other is metrics and measurement.
It's very important to have very detailed measurements in place for your systems to be able to figure out where the problem is.
Kind of a water flow and the same.
There's a common problem I've seen with every engineer.
So you know how you measure response times, right?
Now, people go and measure response time by looking at the load balancer: this is my request and this is the response time.
Now, even in the load balancer, the way a load balancer works is, there's a request
queue. Your request gets queued, and then there's a cron or a worker which will pick up the
request and process it. Now the way you should measure response latency is from the time you
queue the request to the time you return the response. That is your response latency. Now what
people will do is, in the processing function which is listening for the request, they'll
put the timing calls at the start of the processing function and the end of the processing function,
which will give them a response time. And they're like, oh, my server is working all fine,
everything is good.
Yeah, you don't know that it's sitting in the queue.
That's sitting in the queue.
And this is the same problem which can happen down the stack.
So that is the level of detail you have to go into when you're designing metrics and measurement.
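The gap between handler time and true response latency can be sketched with a small deterministic simulation in Python. This is an illustration under stated assumptions, not how any particular load balancer is implemented: a single worker, a FIFO queue, a fixed processing time, and hypothetical names (`Timing`, `simulate`). Times are in integer milliseconds.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    queue_wait_ms: int    # time the request sat in the queue
    handler_ms: int       # what a timer inside the processing function reports
    end_to_end_ms: int    # enqueue-to-response: what the client experiences


def simulate(arrivals_ms, service_ms):
    """Single worker draining a FIFO queue; arrivals_ms must be sorted."""
    timings = []
    worker_free_at = 0
    for enqueued_at in arrivals_ms:
        # The worker picks the request up only once it is free.
        started_at = max(enqueued_at, worker_free_at)
        finished_at = started_at + service_ms
        worker_free_at = finished_at
        timings.append(Timing(
            queue_wait_ms=started_at - enqueued_at,
            handler_ms=finished_at - started_at,
            end_to_end_ms=finished_at - enqueued_at,
        ))
    return timings


# Five requests arrive at the same instant; each takes 100 ms to process.
timings = simulate([0, 0, 0, 0, 0], service_ms=100)
for t in timings:
    print(t)
# The last request reports handler_ms=100 but end_to_end_ms=500.
```

A timer placed only inside the processing function reports 100 ms for every request here, while the last caller actually waited 500 ms, which is why the measurement has to start at enqueue time.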
Yeah.
And I guess what I'm hearing is it also really helps for you to just, you know, go deep and understand exactly how it works.
Because, you know, for a load balancer, like, it's kind of easy.
It's like, oh, it just balances the load a bit.
But once you go in there and understand, like, how is it implemented, what kind of queue does it use,
how big can it be, all of the internal latency, etc.
Yeah.
That's good advice.
Thank you.
Yeah.
So that's one.
Third, third thing is, I hate APMs.
I've never used APMs in my life.
Application performance management, yeah.
Yeah.
So a lot of people say that, hey, why don't we put APM?
We can figure out stuff on there.
Now, APMs have a, like, at least this is my biased opinion.
I feel that it makes you lazy.
Sure, it can help you get all the metrics and everything in place,
but people become lazy.
People don't get into the depths of their system.
They're not measuring.
They're relying on APM to do the right things for them.
And sure, APMs have evolved over the course of time.
You can argue, I'm sure if this video goes out,
people will have a lot of views on it.
But the point is that I feel like it makes you lazy.
Not having APM forces you to go into the details,
forces you to look into every corner,
forces you to measure every aspect of your code,
measure every performance of your code,
and fine tune it.
Well, thank you.
And to close it off, we'll just do some rapid questions.
I'll fire a question and then you just go with what comes to mind.
Yeah.
What programming language are you the most productive with and which one is your favorite?
So for building systems and services, I use Java.
Otherwise, for scripting, I'm a script junkie. I'm very familiar with Ruby, and Ruby's my go-to.
Like, it's like English.
Like, you're writing English and it just works.
So Ruby is my go-to for any scripting work and kind of stuff.
Nice.
It's a friendly language.
I mean, most people will use it because of Rails or together with Rails.
But by itself, it's a nice language, no?
Yeah, I started with Rails, actually.
My career started with Rails.
But then I got the hang of Ruby.
Like, it's very easy.
Like, if you have to process some data, like, you have to analyze something,
you write some models.
You just write in Ruby.
You get all the gems, start working, you just get them.
What are some blogs or websites that you read to keep up to date?
So I read a lot of Hacker News.
Hacker News has a lot of fun content and the latest content over there.
Then LinkedIn is a good source, to be very honest.
Like, somehow the network of people I have is such that they keep posting a lot of interesting stuff.
So I get a lot of good content from there.
Apart from that, I have a keen interest in the latest technology.
So, like, CS events or some of the events which are happening, what are the talks being presented,
what are the things being talked about,
is something I keep tabs on.
Awesome.
Well, this has been just very, very interesting
to go into all the complexity
of something seemingly so simple as live video streaming.
Thank you very much for sharing all this
and for being on the podcast.
Yeah, thank you.
Thank you for inviting me.
It's been a wonderful conversation, sir.
Thank you to Ashutosh for taking us
behind the scenes of building a large-scale live streaming service
and one that set a world record at the time.
Check out the show notes below for references
we mentioned in this podcast
and to see related deep dives you can read in The Pragmatic Engineer.
If you enjoyed this podcast, please do subscribe on your favorite podcast platform and on YouTube.
Thanks, and see you in the next one.
