In The Arena by TechArena - Riding the Visual Edge with Varnish

Episode Date: February 27, 2024

TechArena host Allyson Klein chats with Varnish CTO Frank Miller about his company’s innovation in delivering media at the edge....

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena. Welcome to the Tech Arena. My name is Allyson Klein. We're coming to you from the Mobile World Congress event in Barcelona, and I am so delighted to be joined by Frank Miller, CTO of Varnish. Welcome to the program, Frank. Hello, Allyson. Glad to chat with everybody. So Varnish has been on the show before, but for those who may have missed episode one of the Varnish series, can you just introduce the company and your role?
Starting point is 00:00:48 Yeah, so Varnish has been in the content delivery software game since 2005 with open source. And then in the 2010s, we decided to kind of follow the Red Hat route into enterprise. So we do support right now some of the largest content delivery folks in the world. And we have many value props, right? But I think the main one, once again, is just the robustness of the product. We're just phenomenally efficient.
Starting point is 00:01:15 Right. So from a cost perspective, we have a significant advantage over the other folks in the current arena. Like, you know, you're optimizing the cost basis. But we're phenomenally open. So from a Lego block perspective, it really is easy to build unique intellectual property for a lot of the enterprises. But my role as a CTO, yeah, I've been a CTO for, goodness gracious, maybe nearly 20 years
Starting point is 00:01:38 across various industries. Now, the content delivery space has obviously gone through a tremendous amount of transformation during the pandemic, when everybody was sitting in their homes waiting for content to be delivered. I don't think that content delivery was ever as exciting before that as it was when we actually realized the criticality of the use case. Tell me how that influenced technology development during that time, and where are we today with tech? It's really interesting, because technology has always done two things. Number one, it's trying to reduce the cost basis, and then also it's trying to drive top line, right, to create new opportunities and generate revenue. Over 20 years, I've watched technology go from single-user box things, right? Because I've built mobile networks.
Starting point is 00:02:29 Right. All the way to everything's just abstracted compute, where you can do many things, from broadband all the way to delivering different software as a service from the same invested platform. So you have one thing that has many sources of revenue generation. That's where content delivery has gone. Content delivery started in boxes, and now it's just another workload that you run. And this is more personal: I used to be an embedded systems engineer in electronics for seven years. We used to develop lean and mean, phenomenally efficient code bases.
Starting point is 00:03:13 And I've actually built two public clouds, and I was always challenged with the mindset that public cloud resources are just unlimited. Well, guess what? It's a new age, right? We don't have unlimited stuff, and there's an opportunity to optimize. And that's, I think, what we'll discuss more during the interview. And you heard some of the value prop last year.
Starting point is 00:03:35 Right. Focus. Right. Yeah. Now, when I look at the solutions that you offer, you know, I read about web app performance acceleration, content delivery, telco edge solution delivery, reverse proxy and HTTP acceleration caching. Make sense of this for me in terms of the end-to-end solutions that you're delivering and the customers that you're serving. It's all about, think of us like Lego blocks that deal with, you know, we even do a forward proxy, a forward and reverse proxy, for anything HTTP. Luckily, over-the-top content, of course, is delivered via the HTTP protocol. So we can deliver streaming content, on-demand content, software downloads, right?
Starting point is 00:04:18 But what's interesting is we can also accelerate APIs. So the APIs like you have on your application, everything underneath is still going through HTTP, through a web server. There are different web-based API protocols, and there's still an origin server serving up those requests, those API requests. So we help with that too.
Starting point is 00:04:39 So anything HTTP, we can help. The last one is, you'll see that we've released some stuff around Artifactory. Right. That's also HTTP-based. So we can help with JFrog Artifactory, for instance. We can help on API gateways, single sign-on, auth. If it's HTTP, we can help.
Starting point is 00:04:59 Okay, nice. Now, you know, I know that you guys like to talk about performance. Every time I talk to you, you're setting some sort of performance record. And you're also looking at performance efficiency. Yes. Tell me where we are today with performance. What drives it and how do you improve it? So from a performance basis, at least last year, that was low-hanging fruit. We worked with our partners at Intel Labs and really looked at how to get the most out of the CPU, how to get the most out of memory, the most out of disk.
Starting point is 00:05:31 At the end of the day, if you look at a perfect system, you shouldn't see any locks or blocks anywhere. It should just be water flowing down a pipe, with content being delivered to the network interface. That's a beautiful analogy. Unfortunately, I don't think reality works like that all the time. I try to keep it to a water hose and other piping systems. But yeah, so that's really what we did. And we were fascinated: how much could we get by just applying good software engineering, you know, kernel to
Starting point is 00:05:59 operating system and hardware knowledge, which is almost like a lost art these days. Right. It's really hard to find people who know this very well. Luckily, we have a crew in Oslo and Stockholm with absolute wizards. So some of the original wizards at Varnish from the early days are still here. But what we discovered first, last year, is that we were able to use about one-fifth or one-sixth of the power. Well, that depends on the workload.
Starting point is 00:06:23 That's amazing. I would have been happy with, you know, a half, you know, a double increase. We were astounded. That's number one. But what's more important this year: so we set the record in a single system for how many bits you could put out, which was 1.2 to 1.8 terabits. We've got a little bit above that, 1.4.
Starting point is 00:06:44 You know, we keep on playing with things. But what was more ambitious was looking at the power. So if you go back to 2022, we hit the record at about maybe 386 megabits per watt. We sat there in the room and said, let's hit the top-line mark. Let's do a moonshot. Let's see if we can do above a gigabit per watt. Wow. Let's see what happens.
Starting point is 00:07:05 And we hit that. We hit 1.18 gigabits per watt. Keep in mind, these are not science fiction systems. These are systems you buy off the shelf. No super hardware accelerators. This is just good software engineering practices. So we did that. But then we saw, you know what?
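The efficiency metric in this exchange is simply delivered throughput divided by wall power. A minimal sketch of the arithmetic — the wattages below are made-up values back-solved to reproduce the two figures quoted in the conversation (386 Mbit/W in 2022, 1.18 Gbit/W now), not measured numbers:

```python
def bits_per_watt(throughput_gbps: float, power_watts: float) -> float:
    """Delivery efficiency in gigabits per second per watt of wall power."""
    return throughput_gbps / power_watts

# Hypothetical wall-power readings chosen to reproduce the quoted figures
baseline_2022 = bits_per_watt(400, 1036)   # ~0.386 Gbit/s per watt
record_2024 = bits_per_watt(1400, 1186)    # ~1.18 Gbit/s per watt

print(f"improvement: {record_2024 / baseline_2022:.1f}x more bits per watt")
```

On these assumed inputs the jump from the 2022 record to the current one works out to roughly a threefold gain in bits delivered per watt.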
Starting point is 00:07:21 What's the value of this? How much money can you save? So over the last year, and it'll be published, it should be published by next week, we did a study of literally, what does this mean in watts? So we took legacy systems and 100-gigabit-per-second systems, and we looked at a one-terabit workload over five years. And with those variants,
Starting point is 00:07:44 from a megawatt-hour perspective, how many megawatt hours do you see? So we looked at this. It was fascinating. If you unfortunately have legacy systems with 10-gig interfaces, you'll save 90% in megawatt hours if you just build a 400-gig system.
Starting point is 00:08:00 That's crazy. That's nuts. Well, what if you have a 100 gig system? Most modern networks you see are 100 gig. You save a little bit above 60%, maybe about 64%. And that's still amazing. On the system we would have bought last year. Yeah, 60% is crazy. It is crazy stuff.
Starting point is 00:08:18 Then we turned it into dollars. We took a look at the cost, and we got the same results. So we converted it, and it's pretty significant. I think, and I know this from memory, please don't quote me, it's to be published next week, but looking at the cost for that one terabit, a 10-gig system came to about 120-some thousand per year in power. If you do it with a 400-gig system, it's like 12,000 per year. That's in a power bill. That's significant. It's a study we did on purpose, to just say, hey, we've got this incredibly fast piece of software, what does it mean? So it's, okay, look, you know, most
Starting point is 00:08:59 major folks, like the Paramounts or the Disneys, are, you know, at that one-terabit level. And so if you have brownfield stuff, and you use today's systems at 400 gigs, it's still not that expensive. How much money are you actually spending? And that's enough to do a good return-on-investment calculation for anybody. So the full analysis should be published within a week. I can't wait to dig into that,
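As a back-of-the-envelope sketch of the kind of power-bill comparison Frank describes: size a fleet by NIC speed, multiply by per-system wattage and hours, and price the energy. All of the inputs here (the per-system wattages and the $0.10/kWh rate) are illustrative assumptions, not the study's numbers; the real figures are in the forthcoming paper:

```python
import math

ELECTRICITY_USD_PER_KWH = 0.10  # assumed flat electricity rate
HOURS_PER_YEAR = 8760

def fleet_power_kw(workload_gbps: float, nic_gbps: float, watts_per_system: float) -> float:
    """Power draw (kW) of a fleet sized so its NICs can carry the workload."""
    systems = math.ceil(workload_gbps / nic_gbps)
    return systems * watts_per_system / 1000.0

def annual_power_cost(power_kw: float) -> float:
    """Yearly electricity cost in dollars for a constant draw."""
    return power_kw * HOURS_PER_YEAR * ELECTRICITY_USD_PER_KWH

workload = 1000  # a 1 Tbit/s delivery workload, as in the study

legacy = annual_power_cost(fleet_power_kw(workload, nic_gbps=10, watts_per_system=400))
modern = annual_power_cost(fleet_power_kw(workload, nic_gbps=400, watts_per_system=600))

print(f"10G fleet:  ${legacy:,.0f}/yr in power")
print(f"400G fleet: ${modern:,.0f}/yr in power")
print(f"saving:     {1 - modern / legacy:.0%}")
```

Even with rough inputs, the shape of the result matches the conversation: consolidating a fleet of 10-gig boxes onto a few 400-gig systems cuts the annual power bill by an order of magnitude.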
Starting point is 00:09:21 because what you're describing, I've always thought about the move to 400 gig as something about performance. But this is really getting to the heart of the economics. Yeah, it is. And keep in mind, we're being conservative. Actually, in a one-U system, you can do 800 gig. In two U, you can put, you know, 1.6 terabits. Yeah.
Starting point is 00:09:42 So we said, okay, let's just cut that for now, do the analysis, and see what happens. But those numbers will be out. The other thing we did is we wanted to show that you just don't have to run the largest system. So we did an analysis of running multiple families, all the way down from the medium to the smallest family, to see if we could stay above the one-gigabit-per-watt boundary.
Starting point is 00:10:08 It did stay there, I think, down to about a 200-gig system. It starts to decline a little bit until you get to something like a Xeon D. Right. But if you look at the future, if you go to something like Sierra Forest, a Xeon D will have an embedded 100 gig. That's fantastic.
Starting point is 00:10:24 So we'll test that too. So we're going to focus over the next 12 months on trying to keep that efficiency down to that last bit, those entry-level processors. When you talked earlier about moving from just serving content providers to moving into enterprise, what has changed in terms of the types of solutions that you're delivering? And do enterprises have any different requirements that are notable? I think it comes down to availability and customer experience. So what we see there is high availability, which is one of the reasons we support Varnish Enterprise. Performance,
Starting point is 00:11:01 keep in mind, these performance metrics are not from Varnish Cache. In Varnish Enterprise, we actually took a look at the bottlenecks in Varnish Cache, which are in memory management and file system management in Linux, and actually improved that. So, you know, performance matters, availability matters, and also the ability to have intelligence. So we added a controller reference on top of this that allows you to easily manage complex global caching configs, multi-federated caching configs. We created this reference architecture to perform vertical scaling, horizontal scaling, awareness, easy push-out, and integration into any type of engineering, integration, and testing.
Starting point is 00:11:54 Fantastic. Now, it's MWC 24, and I would be totally not a great host of my podcast if I didn't ask you: when you look at this landscape, and you look at what your technical teams are doing, how does AI enter into this? Do you see AI helping with automation of this? Or, you know, is there any link? What's interesting is there are two links here. So link number one, you're exactly right, is the operational efficiency, keeping stuff going,
Starting point is 00:12:25 from root cause and predictive analysis. We're actually working on this. We're looking at models of how to run Varnish as best as possible, because we have all these knobs we can work in real time. We're looking at being able to see if there's a problem, what is the problem,
Starting point is 00:12:42 and automatically make the change, but also predict it coming. If we can see it coming, there are other opportunities in AI and predictive caching. Because what's in the cache changes over the day, based on what people watch. So if we can observe that, we can prefetch and put
Starting point is 00:12:57 these things out there and actually change the dynamics. It could be soap operas, it could be sports, whatever it is. Now, what I'm really interested in, too, is security. So it's anomaly detection. I used to be a security wonk too, I did CERT advisories back in the day, but zero-day exploits really bother me. So being able to catch something that there's no signature for, and being able to adapt and, you know, take care of
Starting point is 00:13:26 it, and raise a flag and say, there's something going on, we're going to dampen this now, you know, come take a look at the logs, we haven't seen this before. The other one that's interesting from a security perspective is, you know, even AI coming at you, or, you know, people doing scraping of content. You should be able to stop this too. So there are various AI use cases that we're building now and looking to productize in 2025. So it's a customer-experience-type operation: you know, we want to protect the customer experience, we don't want churn. But it's also from a security perspective, mainly tied to, you know, things that you just can't put a signature around.
Starting point is 00:14:05 You really have to look at anomalous and statistical behavior and other AI frameworks to make a change. What's interesting is the reference architecture we're doing this in is different. Most AI platforms go to the mothership, and the mothership makes
Starting point is 00:14:22 the decision to kind of do stuff. That's not good enough. We really feel that AI should also be federated. Models should be light and tight, actually pushed to the edge, right to the edge CDN, with policies that say, hey, when you see these things, you can do this, this, and this. So we'll create a framework where you have AI models and you can have policies: whether you want it to act on its own, whether you want, you know, a person in the middle, right, or whether you just want notifications. We're working on this framework now.
Starting point is 00:14:52 It's interesting. AI at the Edge keeps coming up in interviews and both from a standpoint of folks who are looking to accelerate it and folks that are looking at it as something that needs to be developed more quickly in their solutions. So the fact that you're saying this is just another data point of this is a broader trend. Talking about broader trends, we're at MWC. It's day two. What have you seen from the industry that's interesting across the MWC landscape?
Starting point is 00:15:18 And what do you think we're going to be talking about in 24? It's interesting. I've seen two things. I've seen more acceptance of O-RAN, which, you know, I expected. You know, O-RAN is complicated. You need a testing reference to, you know, finally get all the vendors to come in and agree to get together and certify their end-to-end 5G, and soon-to-be 6G, systems. Hopefully O-RAN, and I know that's just the RAN side, right, takes the lessons of what didn't work with MANO. Right.
Starting point is 00:15:45 Because we went through that phase. Right. And still, because I built 4G and 5G networks with discrete components, I did it with MANO. But I couldn't get support from the vendors unless I bought their complete box. The stovepipe solution. Yeah. And it just cost just as much to buy the boxes. So we're pivoting now to more containerized references, at least O-RAN, right, on that side of the network,
Starting point is 00:16:13 and end-to-end common certification. I'm hoping that changes the cost dynamics, number one. Number two, MEC, right? It's now multi-use. And that's another workload. I think a unique use case on the CDN side is, if you have extra cores at the edge for MEC that aren't doing your, you know, your baseband work and stuff like that, run a CDN. An edge CDN right next to the edge packet gateway is perfect. Yeah, it makes a lot of sense. Yeah. You want to talk about cutting down latency,
Starting point is 00:16:46 you're right there. That's going to pull tens of milliseconds of latency out of the equation. So I think there is an opportunity for, you know, edge CDNs now, at least with folks who have MNO assets, to be on the network, sorry, the network framework,
Starting point is 00:17:02 as close as possible to the customer. Because that'll be better for the customer experience. So I think that's a change too. Yeah. Well, Frank, I'm really looking forward to seeing this paper. Are you guys talking about anything else at MWC this week that's notable? I think that's the big one. It's really about the value prop.
Starting point is 00:17:21 And I think the other notable thing, I'm going to toot our horn a little bit, is what happened at the Super Bowl. So at the Super Bowl, you can check out the LinkedIn post, but really, it was the load for the Super Bowl. There were problems
Starting point is 00:17:37 ongoing in the Bowl, and with our partners at Paramount Plus, we were in the war room with them. By the end of the Super Bowl, we basically had 75% of the content, to protect the customer experience. So I think that's a demonstration. And they were actually, they weren't surprised. We'd already finished this.
Starting point is 00:17:55 But we were there to help. And I think at the end of the day, you know, we're just there for customer success, for Paramount and their customers. That's a great example of enterprise resilience right there. Absolutely. So for those of you listening online, definitely read that Super Bowl paper. It's a good one. And then I'll be sending out a link to your power efficiency study on 400 gig as soon as it's available.
Starting point is 00:18:19 Where else can folks go to get information about Varnish solutions? Oh, you could just go to, you know, varnish-software.com. I think we have a pretty good web presence that actually goes through all the solutions, goes through the history, code examples, you know, very deep examples. And if you really want to go crazy, oh, I don't have the book here, we actually have an 800-page, paper-based Varnish book for programmers. Because I think that's the other unique thing about
Starting point is 00:18:49 Varnish that we don't discuss a lot: we have an engine. All these features, you can glue on. But it's also an open language, and a proper language. And our Varnish Configuration Language, if you write a unique piece of IP that
Starting point is 00:19:06 you want to do, it's actually compiled and linked into the runtime. So it's not some slow interpreted language; you're actually building your own software framework. That's why we're so popular with the big folks, right? Because they can create unique IP and it runs at execution time. That's fantastic. Thank you so much for being on the program today. It was so good to get to know you and learn a little bit more about what's going on at Varnish. Thank you, Allyson. Appreciate it. And like I said, I'm available if anybody has any questions. You can just check out my LinkedIn page and message me. I'll answer. Sounds great.
Starting point is 00:19:41 Thank you.
