In The Arena by TechArena - MemryX Delivers AI Performance at the Edge

Episode Date: February 23, 2024

TechArena host Allyson Klein talks to MemryX VP of Product and Business Development Roger Peene about how his company is transforming AI at the edge with their silicon and how we’re sitting in an AI... revolution.

Transcript
[00:00:00] Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena.

Allyson Klein: Welcome to the Tech Arena. My name is Allyson Klein. I am delighted today to be joined by Roger Peene, Vice President of Product and Business Development with MemryX. Welcome to the program, Roger.

Roger Peene: Thank you, Allyson. Hey, I appreciate you asking me to be on your podcast. It's great to be here.

Allyson Klein: Well, you actually fit into a topic that I've been covering quite a bit. We've been talking on the Tech Arena about the incredible breadth of innovation that's come with the growth of AI. AI has moved past the large cloud service providers into every industry on the planet,
[00:01:01] looking for ways to adopt the technology. And with that move, we see a diverse customer base looking for unique solutions. How does that represent an opportunity for companies like yours?

Roger Peene: Great question. First, I think we can agree that we're in the middle of an AI revolution, not unlike the Industrial Revolution, in that I believe AI will become pervasive in our everyday lives. If you're a technologist or a strategist for a company out there and you haven't yet identified how to harness the power of AI in your business, then you're quickly falling behind. Today, a lot of AI processing is done in the cloud, which requires a ton of hardware and power to execute. For example, training an LLM or a generative AI model can take thousands of GPUs and an enormous amount of power and cost.
[00:01:59] Also, queries and inferencing can be very expensive. It's been reported that an AI query in the cloud costs ten times more than a traditional search query, which really creates a financial discontinuity, if you will, and a burden for cloud providers. So I believe, and I think others do too, that it makes sense to take a lot of the AI processing currently running in the cloud and push it out to the edge, closer and closer to where data is created, for better latency, better performance, and much lower cost.

Allyson Klein: Now, MemryX has been coming up a lot in industry conversations that I've been having.
[00:02:45] Can you introduce us to the company and your solutions, and tell us a bit about why you decided to focus here at the edge?

Roger Peene: Sure. MemryX was founded in 2019. It's a startup created out of the University of Michigan in Ann Arbor with the task of creating an edge AI accelerator that can uniquely run these edge AI models. What we've created is a dataflow processor. And what's really neat about the company and the architecture is that it was created from a blank sheet of paper. We didn't try to force-fit legacy architectures or technologies like DSP engines or microprocessor engines into AI. It was created from scratch. And not only that, we created the silicon, the hardware, together with the software, the SDK, so they work hand in hand for ease of deployment
[00:03:46] and for optimizing the execution of these AI models. So you asked why we decided to focus on the edge. As I mentioned earlier, running all AI models in the cloud can be expensive, and the challenge with moving AI processing from the cloud out to the edge is a lack of optimized processing capability. Sure, you can run AI on CPUs and GPUs, but they're still pretty inefficient from a power-versus-performance-versus-cost perspective. So having a dedicated dataflow processing engine that was created to optimize
[00:04:36] running the AI models really creates value for edge AI solutions. Today, our processor is 100 times more efficient than a CPU and delivers 10 times more inferences per watt than a GPU.

Allyson Klein: Those are incredible statistics. And for those of us who may not have designed for the edge, can you talk a little about how the edge is different and what kinds of constraints you have to design around to land equipment there?

Roger Peene: Sure. The one thing about the edge is that the software is much, much more diverse than in, say, a vertically integrated solution, or even a market like the standard PC segment. One of the things we really focused on was ease of use.
[00:05:40] The intention is to get hardware out of the way of software, to let developers focus on what's important: the AI models themselves, and tuning those models for the specific edge workload. To that end, we developed a software development kit that can take AI models from any major framework, and we require no quantization of the model, no pruning, no compression, and no retraining. We can take an AI model as is and compile it directly down onto our hardware. So the time to be up and running is hours instead of days, weeks, or in some cases even months. Getting that ease of use right is really important because you're going after a very large swath of applications and different usages.

Allyson Klein: When you look at that diverse landscape you just described, it makes me think about the plethora of target markets that
[00:06:45] you're going after. You talk about them on your website; I was reading about them while prepping for the interview. You've been engaging with customers on the products. Can you shed some light on where you're seeing early traction in the market and where you expect rapid movement in 2024?

Roger Peene: Sure. Some of the usages we're really focused on right now are in the PC segment: edge servers used for applications like video management servers,
[00:07:18] imaging, and things like that. We're also focused on industrial PCs for, obviously, industrial applications. Think manufacturing, defect detection, these types of usages, and also smart city applications. Now, we're also applicable to a bunch of other usages. For example, in automotive: driver monitoring and assist platforms, in-vehicle monitoring, smart mirrors. These are all things we've worked on; we've done some POCs on them. But the challenge for a startup like ours is simply time to revenue. So while we're very
[00:08:00] applicable to those markets, we don't have the luxury of waiting three or four years for revenue to come in. So what we're really doing is focusing on sponsor customers in those segments; they're not our target priority for 2024.

Allyson Klein: Got it. Now, you talked about ease of use and getting workloads up and running in hours versus days or weeks. How do you navigate the rapidly evolving software stacks and frameworks in this space when you're landing an accelerator across so many different environments and potential platforms?

Roger Peene: Great question. There are all these different models out there, so I should be clear that while our accelerator can run audio models and some other models, it's focused specifically on computer vision. That's kind of our sweet spot.
[00:08:58] And that's where we think we can offer the most value to customers. Now, within computer vision, obviously there are new models and new operators coming out. Through our compiler, we have the ability to inject techniques to comprehend new technologies and new operators as they come to market, and then move toward integrating them
[00:09:23] into the hardware. So having the flexibility of general-purpose MAC units and general-purpose computing elements within the silicon really helps our ability to accept new technology and new operators as they come to market. But obviously, we're always looking to optimize and to work very well with the new stuff coming out. It's a very rapidly evolving segment.

Allyson Klein: Now, I've got to ask, because we've been talking about chiplets a lot on the Tech Arena. Do you see MemryX pursuing chiplet designs with other vendors? And, since I know you've been in this industry for a really long time, how do you view chiplet technology maturing in the market?

Roger Peene: So I have been in the industry for a while. I've been in semiconductors for about 30
[00:10:17] years. And what I've noticed is that expectations on the front-end ramp of a technology are usually much stronger than reality, and the back-end curves are much slower than expected. So while chiplets are a buzzword and are getting a lot of attention, from a productization perspective we recognize it'll be a little while before all the standards are in place and all the elements come together, where parts can be developed and validated and the interoperability is there. That said, from day one we expected our product to be a chiplet. If you go to our website, we actually talk about this. So we can sell our accelerator as a discrete piece of silicon that can go into an MCM package.
[00:11:12] We can package it ourselves, or we can sell it in the form of modules. Now, moving forward, to support the chiplet ecosystem, looking at interfaces like UCIe and those types of things is something we will consider on next-generation products. But bottom line, we do believe that chiplets and disaggregation within a package are eventually going to come to market, and we're going to be ready to support it when they do.

Allyson Klein: You talked a little bit about your prioritizations for 2024. What does success look like for you this year?

Roger Peene: … and be able to test with it. So within 30 days, we brought that silicon up and running.
[00:12:10] We built some M.2 modules that contain four of our chips on a single module, and then we converted nine demonstrations to the production silicon in time to showcase them at CES in early January. Since then, we've been validating the chip, going through the validation and qualification process to bring a high-volume chip to market. We expect to do that in the second quarter, such that in the second half we can start to ramp production, work with customers, solve their problems, and ultimately ramp our business throughout the rest of 2024.

Allyson Klein: This has been a fascinating interview, and I can't wait to hear more from the MemryX team as we head through the year and see these products start hitting the
[00:13:05] market. Where can folks find out more about MemryX and connect with your team, Roger?

Roger Peene: Well, I think the best place is our website, memryx.com. What we've done with the website is try to be very transparent about our approach and our architecture, because in this space there's a lot of hype and, I hate to say it, a lot of broken promises. So our approach is to be really forthright about what we can do and the value our chip brings, such that when customers actually get the product and put it into a POC, there are no surprises. And I'm going to go back really quickly, Allyson, to ease of use. When I first went out to
[00:13:53] customers with a PowerPoint, before I had the chip, and shared what we could do, the level of skepticism was through the roof. In fact, one person told me they basically didn't believe a word I was saying, which was really interesting. Once we got the parts back and were able to provide them, the proof was in the pudding, and all of a sudden we got a really good response
[00:14:18] from the customers' experience. So again, the website is the best place. And if you'd like more information, you can reach out to us via the website, or feel free to subscribe to our monthly newsletter, where we put out the latest and greatest on our deliverables and accomplishments.

Allyson Klein: Roger, the fact that you told me what your use-case targets were was shockingly refreshing in the AI startup world. So thank you, I appreciate that, and I can't wait to hear more from MemryX. Thanks for being on the show today.
[00:14:54] Roger Peene: All right. Thank you very much, Allyson. I really appreciate being here.

Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by the Tech Arena.
