In The Arena by TechArena - MemryX Delivers AI Performance at the Edge
Episode Date: February 23, 2024
TechArena host Allyson Klein talks to MemryX VP of Product and Business Development Roger Peene about how his company is transforming AI at the edge with its silicon and how we're sitting in an AI revolution.
Transcript
Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome to the Tech Arena. My name is Allyson Klein. I am delighted today to be joined by Roger Peene, Vice President of Product and Business Development with MemryX. Welcome to
the program, Roger. Thank you, Allyson. Hey, I appreciate you asking me to be on your podcast.
It's great to be here. Well, you guys are actually fitting into a topic
that I've been covering quite a bit. We've been talking on the Tech Arena about the
incredible breadth of innovation that's been going on with the growth of AI. And, you know,
AI has moved past the large cloud service providers into every industry on the planet,
looking for ways to adopt the technology. And with that move, we see this
diverse customer base looking for unique solutions. How does that represent an opportunity for
companies like yours? Great. Well, first, I think we can agree that we're in the middle of an AI revolution, not unlike the industrial revolution; I believe AI will become pervasive in our everyday lives. If you're a technologist or a strategist for a company out there and you haven't yet identified how to harness the power of AI in your business, then you're quickly falling behind. You know, today, a lot of AI processing is done in the cloud, which requires a ton of hardware and power to execute.
For example, training an LLM or a generative AI model can take thousands of GPUs and enormous amounts of power and cost.
Also, queries and inferencing can be very expensive. It's been reported that querying AI in the cloud costs 10 times more than a traditional search query, which really creates a financial discontinuity, if you will, and a burden for cloud providers. So I believe, and I think
others believe, that it makes sense to take
a lot of this AI processing that currently is running in the cloud and push it out to the edge
and closer and closer to where data is created, just from the perspective of better latency,
better performance, and much lower cost. Now, MemryX has been coming up a lot in industry conversations that I've been having.
Can you introduce us to the company and your solutions and tell us a bit about why you decided to focus here at the edge? Sure. So MemryX was founded in 2019. It's a startup that was created out of the University of Michigan in Ann Arbor with the task of creating an edge AI accelerator that can uniquely run
these edge AI models. So what we've created is a dataflow processor. And what's really neat about the company and the architecture is that it was created from a blank sheet of paper. We didn't try to force-fit legacy architectures or technologies like DSP engines or microprocessor engines into AI. It was really created from scratch. And not only that, we created the silicon, the hardware, together with the software, the SDK. So they work hand-in-hand for ease of deployment
and for optimizing the execution of these AI models.
And so you ask me, why did we decide to focus here on the edge? As I mentioned earlier, running all AI models in the cloud can be expensive.
And the challenge with moving AI processing from the cloud out to the edge is that there's a lack of optimized processing capability.
Sure, you can run AI on CPUs and GPUs, but they're still pretty inefficient from a power versus performance versus cost
perspective. So having a dedicated data flow processing engine that was created to optimize
running the AI models really creates value for edge AI solutions. Today, our processor is 100 times more efficient than a CPU and delivers 10 times more inferences per watt than a GPU.
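To make the dataflow idea concrete: in a dataflow architecture, an operation executes as soon as its input operands arrive, rather than waiting its turn in a sequential instruction stream. The toy scheduler below is a minimal sketch of that general concept only; every name in it is invented, and it is not MemryX's actual architecture.

```python
from collections import deque

# Toy dataflow scheduler: a node "fires" as soon as all of its input
# operands have arrived, rather than waiting on a program counter.
# Purely illustrative; all names are invented, not MemryX's design.

class Node:
    def __init__(self, name, fn, num_inputs):
        self.name = name
        self.fn = fn
        self.num_inputs = num_inputs
        self.inputs = {}        # input slot -> operand value
        self.consumers = []     # (downstream node, slot) fed by our output

    def receive(self, slot, value, ready):
        self.inputs[slot] = value
        if len(self.inputs) == self.num_inputs:
            ready.append(self)  # all operands present: eligible to fire

def run(external_inputs):
    ready = deque()
    # Inject external inputs; nodes become ready as operands arrive.
    for node, slot, value in external_inputs:
        node.receive(slot, value, ready)
    while ready:
        node = ready.popleft()
        result = node.fn(*(node.inputs[i] for i in range(node.num_inputs)))
        print(f"{node.name} fired -> {result}")
        for consumer, slot in node.consumers:
            consumer.receive(slot, result, ready)

# (3 * 4) + 5 expressed as a two-node dataflow graph
mul = Node("mul", lambda a, b: a * b, 2)
add = Node("add", lambda a, b: a + b, 2)
mul.consumers.append((add, 0))   # mul's output feeds add's first operand

run([(mul, 0, 3), (mul, 1, 4), (add, 1, 5)])  # prints mul -> 12, add -> 17
```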
Those are incredible statistics. And you know, I think that for those of us who may not have
designed for the edge, can you talk a little bit about how edge is different and what kind of constraints you have to design around to land equipment at the edge?
Sure. So the one thing about the edge is that the software is much, much more diverse than, let's say, a vertically integrated solution or even a market like the standard PC segment.
One of the things that we really focused on was ease of use.
The intention is to get hardware out of the way of software to enable developers to focus on
what's important, and that is the AI models themselves and tuning them for that
specific edge workload. And so to that end, we developed a software development kit that can take
AI models from any major framework, and we require no quantization of that model,
no pruning, no compression, and no retraining. We can take an AI model as-is and compile it directly down into our hardware. So the time to be up and running is hours instead of days, weeks, or even, in some cases, months.
And so getting that ease of use is really important because you're going after this very large swath of applications and different usages.
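To picture what that "compile the model as-is" flow might look like from a developer's seat, here is a minimal sketch. The package, function, and parameter names below are hypothetical placeholders for illustration only; they are not MemryX's documented SDK API.

```python
# Hypothetical sketch of the "model as-is" workflow described above:
# take a trained model from any major framework and compile it directly,
# with no quantization, pruning, compression, or retraining.
# The package and API names are illustrative placeholders,
# NOT MemryX's documented SDK.

import numpy as np
import edge_ai_sdk  # hypothetical SDK package name

# 1. Compile a trained model exported from any major framework
#    (e.g. an ONNX file produced by PyTorch or TensorFlow), as-is.
artifact = edge_ai_sdk.compile_model(
    "detector.onnx",   # pre-trained model, taken unmodified
    quantize=False,    # no quantization pass
    prune=False,       # no pruning or compression pass
)

# 2. Load the compiled artifact onto the accelerator.
accel = edge_ai_sdk.Accelerator(artifact)

# 3. Run inference the same way you would have in the framework.
frame = np.random.rand(1, 3, 416, 416).astype(np.float32)  # dummy input
outputs = accel.run(frame)
print(type(outputs))
```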
When you look at that diverse landscape you just described, it made me think about the plethora of target markets you're going after. You talk about them on your website; I was reading about it in the prep for the interview. You've been engaging with customers on the products. Can you shed some light on where you're seeing early traction in the market and
where you expect rapid movement in 2024? Sure. So I think that some of the usages
that we're really focused on right now
are in the PC and edge server segments, for applications like video management servers, imaging, and things like that.
We're also focused on industrial PCs
for obviously industrial applications.
So think manufacturing, defect detection, these types of usages, and also smart city applications.
Now, we're also very applicable to a bunch of different usages.
For example, automotive: driver monitoring and assist platforms, around-vehicle monitoring, smart mirrors. These are all things that we've worked on and done some POCs on. But the challenge for a startup like ours is simply time to revenue. So while we're very applicable there, we don't have the luxury of waiting three or four years for revenue to come in on those. What we're really doing is supporting sponsor customers in those segments, but they're not really our target priority for 2024. Got it. Now, you talked about the ease of use and getting
workloads up and running in hours versus days or weeks. How do you navigate the rapidly evolving
software stacks and frameworks in this space when you're landing an accelerator across so
many different environments and potential platforms? Great question. There are all these different models out there.
So I should be clear that our accelerator, while it can run audio models and some other models, is very focused on computer vision specifically.
So that's kind of our sweet spot.
And that's where we think we can offer the most value to the customers.
Now, within computer vision, obviously there are new models and new operators coming out.
And so through our compiler,
we have the ability to inject techniques
to basically comprehend new technologies
and new operators as they come to market,
but then also move towards integrating them
into the hardware.
So having the flexibility of general-purpose MAC units and general-purpose computing elements within the silicon really helps our ability to accept new technology and new operators as they come to market. But obviously, we're always looking to optimize and to work very well with the new stuff coming out.
It's in a very rapidly evolving segment.
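One way to picture that flexibility: a compiler keeps a table of operators with native hardware mappings and routes anything it doesn't recognize onto the general-purpose compute elements, so the model still runs while a native mapping is developed. The sketch below is a toy illustration under that assumption; all names are invented, and this is not MemryX's compiler.

```python
# Toy illustration of operator coverage in an AI-accelerator compiler:
# recognized operators map onto dedicated hardware blocks, and new or
# unknown operators fall back to general-purpose compute elements so
# the model still runs. All names here are invented.

NATIVE_OPS = {"conv2d", "depthwise_conv2d", "relu", "add", "maxpool"}

def lower_operator(op_name: str) -> str:
    """Decide where a single graph operator should execute."""
    if op_name in NATIVE_OPS:
        return f"{op_name}: dedicated hardware block"
    # Unknown operator: route to general-purpose MAC/compute units
    # until a native mapping is added in a later compiler release.
    return f"{op_name}: general-purpose compute fallback"

for op in ["conv2d", "relu", "gelu", "layernorm"]:
    print(lower_operator(op))
```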
Now, I've got to ask,
because we've been talking about chiplets a lot on Tech Arena.
Do you see MemryX pursuing chiplet designs with other vendors?
And how do you view, I know you've been in this industry for a really long time. How do you view the chiplet technology maturing
in the market? So I have been in the industry for a while. I've been in semiconductors for about 30
years. And what I've noticed is that expectations for the front-end ramp of a technology usually run ahead of reality, and the back-end curve tails off much more slowly than expected. So while chiplets is a buzzword,
and chiplets is something that's getting a lot of attention, I think, from a productization
perspective, we recognize it'll be a little bit of time before all the standards are
in place, all the elements come together, where they can be developed and validated and the
interoperability is there. With that being said, from day one, we expected our product to be a
chiplet. So if you go to our website, we actually talk about this. And so we
can sell our accelerator as a discrete piece of silicon that can go into an MCM package.
We can package it ourselves or we can sell it in the form of modules. Now, moving forward,
to be able to support the chiplet ecosystem, looking at interfaces like UCIe and those types
of things is something that we will consider
on next generation of products. But bottom line is we do believe that
chiplets and disaggregation within a package is eventually going to come to market and we're
going to be ready to support it when it does. You talked a little bit about your prioritizations for 2024. What does success look like for you this year?
Well, we recently got our production silicon back and were able to test with it. So within 30 days, we brought that silicon up and running.
We built up some M.2 modules that contained four of our chips on a single module.
And then we converted nine demonstrations to the production silicon in time to showcase them at CES in early January.
Since then, we've been going through the validation and qualification process to bring a high-volume chip to the market. We expect to do that in the second
quarter such that in the second half we can start to ramp production, work with customers,
solve their problems, and then ultimately ramp our business throughout the rest of 2024.
This has been a fascinating interview, and I can't wait to hear more from the MemryX team as we head through the year and see these products start hitting the market. Where can folks find out more about MemryX and connect with your team, Roger? Well, I think the best place is our website, memryx.com. And what we've done with
the website is we've tried to be very transparent with our approach and our architecture, because I
think that in this space, there's a lot of hype. There's a lot of,
I hate to say it, but broken promises. And so our approach is to try and be really forthright with
regards to what we can do, the value of our chip, the value that we bring, such that when customers
actually get the product and put it into a POC, there are no surprises.
And I'm going to go back, Allyson, really quickly to the ease of use. When I first went out to
customers with PowerPoint, I didn't have the chip and I shared with them what we can do.
The level of skepticism was through the roof. In fact, I did have one person tell me they basically didn't believe a word I was saying,
which is really interesting.
That was true until we got the parts back and were able to provide them. Then the proof was in the pudding, and all of a sudden we got a really, really good response from the customers' experience.
So again, the website's the best place.
And if you'd like more information,
you can reach out to us via the website or feel free to subscribe to our monthly newsletter
where we're kind of putting out the latest and greatest in terms of our deliverables and our
accomplishments. Roger, the fact that you told me what your use case targets were was shockingly refreshing in the AI startup world.
So thank you. And I appreciate that. And I can't wait to hear more from MemryX.
Thanks for being on the show today.
All right. Thank you very much, Allyson. I really appreciate being here.
Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net.
All content is copyright by The Tech Arena.