In The Arena by TechArena - How Verge.io is Rethinking Compute, Storage & AI Readiness
Episode Date: February 21, 2025
Tune in to our latest episode of In the Arena to discover how Verge.io's unified infrastructure platform simplifies IT management, boosts efficiency, and prepares data centers for the AI-driven future.
Transcript
Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators
and our host, Allyson Klein.
Now let's step into the arena.
Welcome to the Tech Arena Data Insights series.
I'm Allyson Klein.
Because it's Data Insights, it means I'm joined by Solidigm's Jeniece Wnorowski.
Welcome to the program.
Janice, how are you doing?
Oh, I'm great.
Thank you so much for having me back on.
Allyson, how are you?
I'm doing great.
It's fantastic to be doing another Data Insights episode, and I am delighted with the topic today.
Why don't you introduce the topic to the audience?
Yes, I'm super excited as always about these episodes, but in particular today because we
have George Crump, who is the Chief Marketing Officer of Verge.io, and they have a really
interesting perspective, not just on storage, but the overall solution
for IT infrastructures around the world and different cloud solutions.
So we're just excited to get the perspective from George and just have a deep conversation
on this topic.
Hey, thanks for having me guys.
I appreciate it.
George, I was reading up on Verge.io in advance of the show. And I loved the fresh perspective that you guys offered in terms of IT infrastructure
oversight and really creating a unified, efficient platform.
Why don't you tell me a little bit more about the solution to get us started and what drove
your team to create this new approach?
Yeah.
And you really nailed the fundamental difference.
We believe that if you look at most solutions on the market today, they
lump together different components, and we say they integrate via the GUI.
So they're still different behind the scenes, but you have hopefully
one GUI to manage everything.
We don't know if that's the right approach.
What we did that's entirely different is we took all the different aspects
of infrastructure, the networking, the storage, the virtualization layer, and integrated it into a single code base.
And then of course, as part of that is the GUI, right?
So everything's all integrated into one thing.
And as part of that, that delivers a massive amount of
efficiency for our customers.
Right.
And so we can get more performance on less hardware.
And we're also very portable in the hardware we support.
I've got customers that are running on five, six, seven year old servers, right.
And they're mixing in brand new storage right into those old servers, right.
And so it gives a lot of flexibility.
The genesis of the company was it started with our CTO, a
gentleman named Greg Campbell.
And he was actually developing a search engine, if you will, to compete with Google and Amazon.
I guess he just liked competing with really big companies.
But what was frustrating him was that he just always was spending so much time managing
the infrastructure.
And I guess if you're a super smart guy like Greg is, you decide, I'll just write my own infrastructure software.
And so that's what he did.
And so it took a little while to get the whole thing done, but we've come out
the other side with this highly integrated, very stable piece of code
that, for the people in the audience who would get into this, the entire code
base is less than 300,000 lines of code.
That gives us efficiency, of course, but it also means that things like bugs and things
like that are less likely to creep in because we're not dealing with 35 million lines of
code, which is really what our competition deals with.
Yeah, George, many of your customers, and organizations in general,
are all seeking alternatives right now, right?
Especially in this space and looking at solutions that
integrate not only just storage, but compute and network.
What verticals are you seeing as early adopters and other specific
workloads that drive a more urgent need for this consolidation?
Today, it's a very horizontal need.
As you said, a lot of people are rethinking and relooking at their infrastructure. Costs have really gotten out of control.
The way software models work today, you can't price things like you used to.
The common practice in this space is to charge by processor or, worse,
charge by core or charge by capacity.
As we've spoken about in the past, Solidigm just came out with a 122-terabyte drive. The licensing cost on one drive now is
pretty significant, right? So you have to change the way you think, and so that's
what we've done: we have a very simple licensing strategy. It's by the server. We
don't care what's in the server, so it's very predictable, and that appeals to
a large swath of IT professionals.
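The per-server versus per-core pricing contrast George describes here is easy to see in a quick back-of-the-envelope calculation. A minimal sketch, with all prices and core counts purely hypothetical (not Verge.io's or anyone's actual rates):

```python
# Hypothetical licensing-cost comparison: per-core vs. per-server pricing.
# All numbers are illustrative assumptions, not real vendor prices.

def per_core_cost(servers: int, cores_per_server: int, price_per_core: float) -> float:
    """Licensing that scales with core count: denser servers cost more."""
    return servers * cores_per_server * price_per_core

def per_server_cost(servers: int, price_per_server: float) -> float:
    """Per-server licensing: a flat, predictable price per node."""
    return servers * price_per_server

# A 4-node cluster of 32-core servers, at hypothetical list prices:
core_based = per_core_cost(4, 32, 100.0)    # grows every time core counts rise
server_based = per_server_cost(4, 2500.0)   # stays flat no matter what's inside
```

The point of the sketch is the shape of the curve, not the numbers: upgrading to denser CPUs raises the core-based bill while the server-based bill stays constant, which is the predictability being described.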
Where we've seen some early emergence though,
is in the cloud service provider,
managed service provider type of space
where they're dealing with supporting hundreds,
if not thousands of customers.
And what they're looking for there
is not only that efficiency, the predictable cost model,
but also the ability to support multi-tenancy.
So if you're a cloud service provider, the ability to create what we call a virtual data
center or a tenant per customer gives you a lot of capability because you can build
this gigantic compute infrastructure and storage infrastructure, but share it across thousands
of customers and do so very easily.
That's really cool.
And I think about the workloads that are being deployed in those types of
environments and wanting to dial in not just compute, but compute storage
and network for the workloads.
And I think one of those areas that's really the case is with the coming
integration of AI into applications and needing just
really well-balanced infrastructure.
What are some of the unique requirements that you're seeing coming into play this
year for workload consolidation, workload optimization, and how your
solution marries with that?
Yeah, I'll tell you, we could almost do a whole podcast on those two letters, AI,
right?
So what's very interesting is what we're seeing right now is almost a sea change where people
are driving to a new infrastructure, primarily because of these cost concerns, efficiency
concerns, things like that.
But they know on the horizon, there's this AI thing, right?
And I think corporately, a lot of people are sort of AI curious, but they're not necessarily AI active.
And there's a challenge there because the example I always use, if you're Coca-Cola,
you're not going to load your secret recipe for Coke Classic into ChatGPT or Gemini or any of
these guys, right? Because it's your stuff, right? So the concept of building a private AI where you can house the whole thing
internally and ask it questions becomes very interesting. But again, it's a crawl, walk,
run sort of thing. You're not going to probably do this tomorrow. You've got to solve today's
infrastructure problem. But while you're doing that, make sure that that infrastructure is,
for lack of a better word, AI ready.
And so that's where a lot of our customers are focused. And so some of the tenets of that, at a very basic level, would be:
can you virtualize GPUs, right?
Can you carve those GPUs across multiple workloads and do so dynamically?
Because if there's one thing that's going up in price,
it's the cost of a GPU.
Can you access those GPUs remotely?
If you look at the latest innovations from Nvidia and others, the GPUs are really big.
They're not going to go on a card anymore.
Right.
And so can you get across a wire to get to that GPU performance?
And can you do so efficiently?
And then finally, can you help customers deploy AI models? Because there's
a serious skills deficiency when it comes to AI. And so can you help customers deploy
these different models? So essentially with maybe three or four clicks of a button, they
can fire up their equivalent to, for lack of a better word, a private chat GPT.
So why don't I kind of set the stage a little bit as we follow up with some
additional questions here. I would really like to know, George, what are
some of the pitfalls that your customers are having with alternative
solutions in the market, and how has Verge.io addressed this?
So, when you're talking about switching infrastructure, the big
thing is going to be migration.
It doesn't matter how inexpensive you are.
If I can't get from point A to point B in a relatively quick and easy fashion, learn it quickly,
and have all my data come over, plus most of my settings and things like that,
then it's, if you will, cheaper to pay the more expensive price and just stay where you are.
And so where we've invested a lot of our technology is integrating in that
migration process.
And so again, we're very true to form.
Everything has to live within the core code.
We don't build different modules for things.
So we've built migration support directly into the core code of the product.
And so when you log us into your prior infrastructure software, it essentially
sees us as a backup product.
And at that point assumes that it's allowed to give us everything
and we just pull it all in.
And so we've had situations where we've converted thousands of
virtual machines in a weekend.
The cut over time for a huge data center is a weekend.
The downtime during that cut over time is a few minutes.
So we can prep the environment, we can allow the customer to do all their testing.
And then that final cutover is just, essentially, if you will, a quick sync and they're fully
converted to us.
So that's probably the big thing.
The other one, of course, is the knowledge or skills gap.
The current infrastructure, most of the people that I talk to, they've been
running it for not years, but decades.
And so how do I take that knowledge that I've ascertained over that period of
time and transfer it to this new thing?
Obviously it's going to be different, but the way I describe it, you want to
be different, but not weird, right?
And so if a VM is still a VM, if a network is still a network, that makes
that learning and conversion process mentally much, much easier.
Now you talked about a really elegant solution that is lightweight from a code
perspective, comprehensive from a management oversight perspective, but one
of the things that we know enterprise really cares about is reliability.
How have you taken on the challenge of ruggedness
and reliability for the solution
and how has the market responded?
Yeah, so one of our core philosophies is
that everything's gonna break at some point.
You hear a lot about zero trust.
The software has zero trust of the environment.
And so we are constantly prepared for something to go wrong.
And there's a bit of AI, I would call it narrow AI.
I don't want to be guilty of AI washing, but there's a bit of narrow AI in the
product that is constantly analyzing all the conditions in the environment,
understanding what could go wrong.
And if something did go wrong, what would it do to solve that problem?
I spoke to a customer two days ago as an example where they had the worst case scenario.
They had a multi-site environment.
The two environments under the former infrastructure were connected together and they didn't have
a failure.
They had an intermittent failure where the network connection, their ISP essentially,
the connection would go up and down throughout the day, every four or five
minutes, it would go up and down.
And so that caused problems in their former environment, because one of their peer
sites they hadn't converted yet.
As the connection went up and down, it was losing data.
It was becoming corrupted.
The site that was running our software, so it's almost a perfect apples-to-apples comparison, had no issues, because our software
was prepared for failure.
And when the network was down, all the operations between the two sites stayed active.
And then when the network came up, even for a few minutes, that's all we needed to sync
data back and forth between the two sites.
The worst thing you want to do when you take your car in to get fixed is say it breaks
sometimes and the mechanic always looks at you like you're crazy.
That's the worst case scenario.
And so our software was able to really elegantly work its way through that until the ISP was
able to fix the issue.
So ease of integration.
Let's talk a little bit about that.
Ease of integration is a key consideration when users are evaluating new products.
But can you discuss how you work with customers to deploy?
Yeah, sure.
So I think part of it goes back to that portability
of the code, the fact that we can,
I would say in 99% of the cases,
run on the existing hardware.
So the hardware that's running the infrastructure software
that we're replacing, we can, nine times out of 10,
or probably 9.9 times out of 10, run on it. The way I
describe it is, if that server was built in the last six years, we'll probably
run on it.
And so that's that first layer of integration.
The second is what if I need to add more storage or more network
capabilities or anything like that?
We have the ability to, even though that server might be four or five, six years
old, you could put brand new storage in it, for example, or brand new network
connectivity, and the software will automatically understand that you did
that and use that storage to its fullest capability.
So we don't need to, for example, rewrite the code when a new version of a drive
comes out or a new version of a network card
comes out; the software automatically adapts to those situations.
So it makes that hardware layer much simpler.
Now, I know that you've worked with Solidigm to integrate their drives into an overall
solution.
Can you talk about the collaboration between the companies?
Yeah, absolutely.
There's a couple of areas there. So we have built into our product, the ability to create multiple tiers of
storage, different classes of storage, if you will.
And it's interesting when SSDs first came out, there was a lot of discussion
about, for example, tiering, because SSDs are really, really expensive and
hard disk drives were relatively inexpensive.
So tiering between those made a lot of sense, but especially over the last
five or six years, that curve has changed, where a lot of customers started to go,
Hey, I'll go all flash or all SSD, however you want to say it.
And so all of a sudden tiering wasn't that popular.
Now we're seeing it swing back around because we have these different classes
of SSDs, which Solidigm of course makes.
And so our software is able to leverage those different types of technologies
correctly so that you get the maximum lifespan out of them.
What's interesting is with our software, we were able to run on Solidigm's newest stuff
with no changes at all, really, to the software. And we were able to fully support that.
In the lab tests, we were getting a phenomenal 1.5 million IOPS using 64K blocks, which is more of a real-world test.
If you translated that to 4K, you would be up around 30 million IOPS.
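The 64K-to-4K translation mentioned here is a bandwidth-equivalence calculation: hold the total throughput constant and scale the I/O count by the block-size ratio. A minimal sketch of that arithmetic (the strict math gives about 24 million IOPS, so the "around 30 million" in the conversation is a rounded ballpark):

```python
def equivalent_iops(measured_iops: float, measured_kb: float, target_kb: float) -> float:
    """Convert an IOPS figure between block sizes, holding total throughput constant."""
    return measured_iops * measured_kb / target_kb

# 1.5M IOPS at 64K blocks is roughly 96 GB/s of throughput; the same
# throughput expressed in 4K I/Os works out to 24 million IOPS.
iops_4k = equivalent_iops(1_500_000, 64, 4)
```

This kind of conversion is only an equivalence for sequential, throughput-bound work; real 4K random-I/O results depend on the drives and the software stack, which is presumably why lab numbers are quoted at the block size actually tested.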
And these were data center class systems.
They were good, but they weren't, as I always say, crazy.
They weren't the stuff that nobody would actually pay for.
These were servers that you would find in a typical data center and we were getting
massive amounts of performance, both read and write performance.
Now clearly in the way the technology works, the QLC class drives would deliver very similar
performance to the TLC class drives in reads, which is exactly what you would
expect, but the write performance on the TLCs was better. But again, the way the software works,
we're able to adjust and make sure we get the maximum potential out of any of those drives.
And then the other thing that's important is we can live migrate virtual machines between tiers.
So if you have a workload that you put, for example, say on TLC, because it's
going to be a high write application and then for whatever reason it sunsets, or
it doesn't have the amount of write that you expect it would or whatever, you
can live move that to the lower cost tier.
The other thing that's kind of interesting is we have this capability
that we call ioGuardian, which is the ability to provide the data inline if there are
multiple simultaneous hardware failures. And so it's sort of a separate box that sits there. It's
almost like a parity server for the entire environment, instead of, as in RAID, where you would
have a parity drive. Well, that's an ideal use case for these high-density, high-capacity QLC
types of drives. And so it works really well in our environment.
And so we were able to leverage that technology as well.
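The "parity server" analogy can be illustrated with the XOR parity scheme that RAID itself uses: XOR all data blocks together to get a parity block, and XOR the survivors with the parity to recover a lost block. This is a single-parity sketch of the general concept only, not ioGuardian's actual implementation, which per the conversation tolerates multiple simultaneous failures:

```python
from functools import reduce

def parity_block(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte column by byte column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing block: XOR of survivors and parity."""
    return parity_block(surviving + [parity])

# Three equal-length data blocks (hypothetical payloads for illustration).
data = [b"node", b"dat1", b"dat2"]
parity = parity_block(data)

# Lose the middle block, then rebuild it from the other two plus parity.
recovered = rebuild([data[0], data[2]], parity)
assert recovered == data[1]
```

Because XOR is associative and commutative, any single lost block falls out of this identity; surviving more than one simultaneous failure requires additional, independent parity (as in RAID 6 or erasure coding), which is beyond this sketch.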
Awesome.
Thank you, George.
That's amazing insight.
Really appreciate the kudos there.
We touched on this a little bit, but with the rise of high-density storage needs,
how do you drive performance in mission-critical workloads?
Yeah.
It's interesting when you guys first announced the 122 terabyte drives, I
thought, wow, that's really cool.
And then I started thinking about, wait a second, that's going to cause you problems.
And so if you think about it, most storage technology today counts on a high level
of parallelism to generate performance.
And by that, I mean, lots of drives.
The normal increment has sort of been, you've gone from two to four to eight
to 16 to 32, and then all of a sudden you jump to 122, right?
So it's almost exponential in the increase that changes things, right?
Because now all of a sudden, what am I going to do?
I have to tell a customer to buy 25 or 30 of these 122-terabyte drives when they only need a
third of that capacity.
With our software the way it's designed, we still like parallelism, don't get me wrong,
but we don't need the high quantity of drives.
We're able to deliver that on just a few drives. In the example that I gave, with the performance testing that we did
with you guys in your labs, as
I recall, there were only either four or eight drives per server in that environment, right?
So we can deliver very high performance without a lot of drives.
And I think that's going to be a key reckoning point for storage solutions going forward
is what type of performance can you deliver without a high degree of parallelism? And it's been a dirty little secret, if you will, in the industry and hasn't
really shown up yet because generally speaking, we've had relatively
low capacity drives.
Now, the flip side of that, at least for today's standards: the exception
would be AI, going back to that one more time, with the models.
If customers continue to bring LLMs in-house and use
their data to train them, obviously those are going to have the opposite problem.
They're going to have a lot of capacity, right?
They're going to need a lot of capacity, I should say.
And so in that aspect, the ability to support that and move that
data in there is also critical.
So we kind of benefit from either situation: for customers that will
need fewer drives and for customers that will still have lots of capacity demand.
This has been fantastic, George.
Thank you so much for sharing about what the team at Verge.io has been able to accomplish.
And I can't wait to hear more about solutions deployment and how customers are taking advantage
of this.
One final question for you, where can folks find out more about the solutions
we talked about today and engage your team?
The company name is the website.
So verge.io is the place to go and you should be able to figure
out where everything is there.
And we have a webinar that we just recently did with the team at Solidigm
where we showed some of the performance numbers I'm speaking of, and
you'll see that there as well.
Awesome. That brings us to an end of another Data Insights podcast. Jeniece, always delightful to chat with you, and George, thanks so much for being with us today.
Thanks for having me.
Thank you, Allyson. Thank you, George.
Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net.
All content is copyrighted by the Tech Arena.