SemiWiki.com - Podcast EP279: Guy Gozlan on how proteanTecs is Revolutionizing Real-Time ML Testing

Starting point is 00:00:00 Hello, my name is Daniel Nenni, founder of SemiWiki, the open forum for semiconductor professionals. Welcome to the Semiconductor Insiders podcast series. My guest today is Guy Gozlan, proteanTecs' Director of Machine Learning and Algorithms, overseeing research, implementation and infrastructure of machine learning solutions. Prior to proteanTecs, he was a project lead at Apple focusing on ATE optimizations using embedded software and machine learning, and an embedded software engineer at Mellanox. Welcome to the podcast, Guy.

Starting point is 00:00:40 Nice to be here. Thank you for having me. First, can you tell us how you got your start in semiconductors? Yes, certainly. Actually, looking back at my life story, I think I didn't have any other option. As a kid, I was very enthusiastic about physics and mathematics. And I think, like many eighties kids, I was enthusiastic about computers too, but not only the applications on top of them; I also liked to take those computers apart

Starting point is 00:01:17 to really understand how things work inside, taking out the RAM and hard disks and everything. And in addition, my father was one of the first employees at Intel in Israel. And I remember as a kid, you know, during summer vacations going to dad's work, putting on this white suit, going into the fab and getting an amazing explanation of this magic that is happening

Starting point is 00:01:49 in those fabs. For me as a kid, it was an amazing experience to see how those machines are telling the story of everything that is around us, all this technology. And I think that after that, majoring in physics and mathematics in high school just led me directly to this field. So, you know, I took a detour of about a decade in my life, joining the Israeli Air Force. But right after my service, I started my studies in electrical engineering, and later on a master's degree in computer science, focusing on machine learning. So I think this whole thing led me to be in the semiconductor industry.

Starting point is 00:02:48 That's a great story. So what brought you to proteanTecs? So actually I started my career, as you said, at Mellanox as a firmware engineer. And this was the first time I started to understand the combination of software and hardware. And this combination was very interesting for me. And after that, I joined Apple. And I was leading a very interesting project there where we tried to optimize the test time

Starting point is 00:03:28 and the calibration time of Apple Silicon using firmware. So we did it for RF chips, but we also implemented it on many other solutions at Apple. And then it was very obvious to me at the time, even though machine learning technology was very immature back then, how we could leverage this new technology in silicon testing. And it was a major success. And I really fell in love with this combination of machine learning together with silicon.

Starting point is 00:04:17 And I heard about proteanTecs actually from a friend at Apple. And when I started to look into this company, I must say that I was amazed because I told myself, oh my God, this technology is actually what I needed for the Apple project. If I had this technology back then, I would have done amazing things

Starting point is 00:04:45 even better at my project at Apple. So I told myself, I must join this company. Through friends and connections, I just jumped in and got an interview at proteanTecs, and I've been here for the last four years. Yeah, I agree with you completely. We've been working with proteanTecs for I think three years now, an amazing experience, a very new approach.

Starting point is 00:05:15 So what do you see as the big challenges in high volume testing today? So, you know, as the technology improves and we go to more advanced process nodes and the chips become more and more complex, that means that testing them becomes much more complex. So when you are going into such complex processes and such complex chips, there are several factors that you need to take into account when you're going to high volume testing. The first is the test time, or what I call the test cost, because the cost is not only the test time, but also the time and effort it takes you to develop this kind of testing.

Starting point is 00:06:06 It's also an amazing technology, something that you need to invest time in. The second factor is the yield. So the market today is very competitive as you know and customers today cannot risk a low yield. Low yield means that your product will cost more for the end customer and you cannot be competitive. And the third one, which I think is one of the most important factors, is the quality. Because today, chips are going into our day-to-day life.

Starting point is 00:06:45 They are going into vehicles, they are going into sensitive sites, medical areas, and many more. We cannot risk low quality. Quality today becomes a key factor in the consideration of testing. I don't think you will find someone who will agree to compromise on quality when going

Starting point is 00:07:13 into an autonomous car or when going into an airplane. And even in our cellular phones, we cannot risk quality. So I think those three factors need to be balanced somehow. And I think this is one of the major challenges here. It's the balance. But keep in mind that it's not only because we're going to more complex process nodes, like three nanometers and two nanometers,

Starting point is 00:07:49 where you can find many challenges, you will have more variability there, and it will be harder to catch defects. But it's also due to the fact that we're going to more complex packaging. We're going into multi-chip packaging, and those things create even bigger challenges in the testing phases.

Starting point is 00:08:15 So I think we can all agree today that the traditional methods are not enough to balance those three factors in an optimal way. Got it. What is proteanTecs' strategy and application to address the market needs? So as you said, proteanTecs is actually taking a very different and interesting approach. I think the first thing here is that you can make better decisions by actually looking at the physical health of the chip. Okay.

Starting point is 00:08:52 And even better, to compare expected behavior with what you actually measure. So what proteanTecs is doing is actually integrating IP that was created by proteanTecs into the customer chip. So this IP is very, very small. You can consider it like small sensors, what we call agents, and they're integrated into the customer chip. Because they're small and because they have almost zero impact on power, performance, and area,

Starting point is 00:09:28 the customer can actually put tens of thousands of those kinds of agents inside their chip. And now you actually get amazing visibility of many physical factors of the chip, looking at the temperature and the voltage, and looking at the clock integrity and power integrity of the chip. And even tracking the actual margins of millions and millions of logical paths inside the chip.

Starting point is 00:10:02 So now we have created the first thing that is needed in order to go to the next phase. And this is the basics of the data layer. So we have deep data, tons of data from each chip, spatial view for each chip. And now what you can do with that is endless. So for example, you can take, let's say IDDQ measurements, for example,

Starting point is 00:10:32 and you can create a model that is based on those agents to predict the IDDQ measurement. And once you have this kind of machine learning model, you can run it on any new chip that arrives. You get the predicted IDDQ measurement, and you can also do the actual measurement. And now you can compare between them. So why is this so important?

Starting point is 00:10:58 Because from the model, you are saying, OK, according to the DNA of the chip, from the core of the chip, I know that the leakage should be something. But once you're comparing it to the actual measurement, now you can see if you have a discrepancy. And if you have, for example, higher leakage than what you expected according to the chip DNA, it might be that you have some kind of short inside your chip.
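
The predicted-versus-measured comparison described here can be illustrated with a minimal sketch. It assumes a single agent-derived leakage indicator per chip and a plain least-squares line; the data values, the `fit_line` and `check_chip` helpers, and the 4-sigma limit are all hypothetical and are not taken from the proteanTecs product.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical training data: one agent-derived leakage indicator per chip
# (arbitrary units) and the IDDQ actually measured on the tester (mA).
agent_reading = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measured_iddq = [2.1, 3.9, 6.1, 8.0, 9.9, 12.0]

slope, intercept = fit_line(agent_reading, measured_iddq)

# Spread of the training residuals sets the scale for "unexpected" leakage.
residuals = [y - (slope * x + intercept)
             for x, y in zip(agent_reading, measured_iddq)]
sigma = stdev(residuals)

def check_chip(reading, iddq, k=4.0):
    """Return (predicted IDDQ, excess over prediction, suspicious flag)."""
    predicted = slope * reading + intercept
    excess = iddq - predicted
    # Only leakage well above the prediction is flagged (possible short).
    return predicted, excess, excess > k * sigma

pred, excess, suspicious = check_chip(3.5, 9.5)
print(f"predicted={pred:.2f} mA, excess={excess:.2f} mA, suspicious={suspicious}")
```

A chip whose measured IDDQ sits close to the prediction passes, while one with clearly higher leakage than its agents suggest is flagged for a closer look.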

Starting point is 00:11:30 Okay? So this can provide you a different angle of looking at those measurements. The second concept that we are implementing is the shift left strategy. So as we all know, shift left is one of the holy grails in testing, because the cost of a single RMA, for example, is a few orders of magnitude higher than screening devices in early stages. So for example, an RMA can cost up to $50K, while a chip at wafer sort, when you screen it ahead of time, can cost you less than $10.

Starting point is 00:12:13 So you see the difference between them. So this is another approach that we are taking, because proteanTecs can actually measure the agents, the DNA of the chip and all those measurements at wafer sort, and create a machine learning model that predicts what will happen to this system or this chip in the later stages. So this is actually magic, but it happens, and we help a lot of our customers using those approaches.
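
To make the cost gap quoted above concrete, here is a small arithmetic sketch. It uses the two figures from the conversation (up to $50K per RMA, under $10 per chip at wafer sort) as upper bounds; treating the $10 as the cost of each chip screened out early, and the `net_saving` helper itself, are assumptions for illustration.

```python
import math

RMA_COST_USD = 50_000      # upper figure quoted for a single RMA
WAFER_SORT_COST_USD = 10   # upper bound quoted for a chip screened at wafer sort

ratio = RMA_COST_USD / WAFER_SORT_COST_USD
orders_of_magnitude = math.log10(ratio)

def net_saving(rmas_avoided, chips_screened_out):
    """Money saved by early screening, under the assumptions stated above."""
    return (rmas_avoided * RMA_COST_USD
            - chips_screened_out * WAFER_SORT_COST_USD)

print(ratio, round(orders_of_magnitude, 1))  # 5000.0 3.7
print(net_saving(1, 200))                    # 48000
```

Under these assumptions, a screen that rejects 200 chips at wafer sort still comes out far ahead if it prevents even one RMA.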

Starting point is 00:12:49 Because our customers now have these monitoring agents, we can actually provide the solution to create those machine learning models that can predict RMAs, that can predict failures in final test or in system level test, but using the data from early stages like wafer sort. If you can do that, customers can save a lot of money in production, but also increase the quality and the reliability of their devices.

Starting point is 00:13:25 And how are you employing ML in your solution? So actually we have a lot of uses for machine learning at proteanTecs. The outlier detection that I just mentioned is just one example. We're using machine learning for everything from optimizing performance to tracking systems in the field, and many other applications. But as you probably know, the basis of a good machine learning model is actually data.

Starting point is 00:13:59 So high quality data, high resolution data, this is the basis for creating a good machine learning model. And proteanTecs deep data is a perfect fit for that. For a machine learning engineer, this kind of data is pure gold. So the agents are embedded into the customer design and monitor throughout all the testing phases. So this data, together with customer-relevant data like the IDDQ we mentioned, can create amazing machine learning models.

Starting point is 00:14:35 And we actually take this data from production and this data is being sent to our platform in the cloud. And it's being pre-processed there. In the platform, we have developed very strong machine learning tools for our customers that they can create their own models, but also we have automatic flows to identify anomalies or to create generic machine learning models.

Starting point is 00:15:03 So basically, those powerful tools are both general and specific for specific tasks, like the example we gave for IDDQ, and we provide solutions for many angles of machine learning. We use supervised and unsupervised machine learning models. And once you have this kind of model, like we said for IDDQ, this machine learning tool can actually suggest the actual

Starting point is 00:15:38 thresholds that you need for deciding if a sample, or in this case a chip, is far from the normal behavior and you need to mark it as an outlier. I think one thing that is very unique about proteanTecs is that it's not just a model and not just an IP, it's a full vertical solution. That means that the models can be built in the cloud,

Starting point is 00:16:10 can be easily downloaded and compiled to SIP code, and those can actually go to the test floor and run in real time on the tester. But not only that, those results go back to the cloud for retraining and reevaluating your model. So the model is actually evolving over time. So this is one of the strengths of proteanTecs, this full vertical solution.
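
The idea of a tool suggesting outlier thresholds, mentioned a little earlier, can be sketched in a few lines. The conversation does not say how the thresholds are derived, so the median/MAD rule, the `suggest_thresholds` helper, the `k=3.5` multiplier and the sample scores below are all assumptions chosen for illustration.

```python
from statistics import median

def suggest_thresholds(scores, k=3.5):
    """Suggest (low, high) limits from the median and the scaled MAD.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data.
    """
    med = median(scores)
    mad = median([abs(s - med) for s in scores])
    scale = 1.4826 * mad
    return med - k * scale, med + k * scale

# Hypothetical per-chip scores, e.g. measured minus predicted IDDQ (mA).
scores = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.05, 4.0]

low, high = suggest_thresholds(scores)
outliers = [s for s in scores if s < low or s > high]
print(f"limits=({low:.3f}, {high:.3f}), outliers={outliers}")
```

A robust statistic is used here so that the one extreme chip does not widen the limits that are meant to catch it.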

Starting point is 00:16:43 So you build a model with this powerful data, those powerful IPs, and amazing machine learning tools. And you can take it to any test stage that you want. You can run it on ATE. You can run it on system level test or system test, and you can actually use those models to screen devices in real time. This is a very strong capability. And this retraining is also a powerful method, because tests are changing, the environment is changing, the material is changing, and the data shifts with them.
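
One simple way to notice the kind of shift described here is to compare recent model residuals on the tester against the residuals seen at training time. The sketch below is only an assumption about how such a check could look: the `drifted` helper, the z-score limit of 3 and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev

def drifted(reference, recent, z_limit=3.0):
    """Flag drift when the recent mean residual moves too far from the
    reference mean, measured in standard errors of the recent window."""
    ref_mean, ref_std = mean(reference), stdev(reference)
    standard_error = ref_std / math.sqrt(len(recent))
    z = (mean(recent) - ref_mean) / standard_error
    return abs(z) > z_limit

# Hypothetical residuals (measured minus predicted) from training time.
reference = [0.1, -0.1, 0.05, -0.05, 0.0, 0.08, -0.08, 0.02]

recent_ok = [0.03, -0.04, 0.06, -0.02]       # looks like the training data
recent_shifted = [0.4, 0.5, 0.45, 0.38]      # e.g. after a process change

print(drifted(reference, recent_ok))       # False
print(drifted(reference, recent_shifted))  # True
```

When the check fires, the model would be sent back for retraining on fresh data rather than trusted as-is.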

Starting point is 00:17:25 So we can just go back to the cloud and retrain the model easily. You can download those models back to your test environment. And this full cycle is very powerful for our customers. What types of challenges do you see with this type of ML model? Well, actually many, but I'll try to identify what I believe are the three most important ones. So first, in order to create a good model, you need a massive amount of data, but not only a massive amount of data,

Starting point is 00:18:07 it needs to be in high resolution and it needs to be very good data, high quality data. And this is the most important basic part of any machine learning model. The second is you need to understand your data. You cannot just take a huge amount of data and put it into a black box and hope for the best. You need to understand the physics of it. You need to understand how it's being affected by temperature, by voltage, by process, for example. Otherwise, you will end up with garbage in, garbage out. So we have data.

Starting point is 00:18:42 We need to understand the actual physics of the data. And the third one is that creating a very good model offline or in the cloud or wherever you're creating your model is not enough. You need to productize it. You need to put your model to actual use, as we said, on the tester or in the field. So for example, you need to track the model in the cloud to be able to understand who created this model,

Starting point is 00:19:17 what material was used for this model, what was the performance of it, and you need to be able to compare between different experiments that you are doing and so on. You also need to be able to take this model and use it, actually use it in the tester. So it should run smoothly. You should be able to track the versions.
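
The tracking requirements listed here (who created a model, which material it was trained on, how it performed, which version it is, and how experiments compare) map naturally onto a small registry. This is a minimal in-memory sketch; the `ModelRecord` and `ModelRegistry` names, their fields and the sample metrics are hypothetical and do not describe the proteanTecs platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class ModelRecord:
    name: str
    version: int
    created_by: str
    training_material: str   # e.g. lot or wafer identifiers used for training
    metrics: dict            # e.g. {"recall": 0.94}
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

class ModelRegistry:
    """Keeps every version of every model so experiments can be compared."""

    def __init__(self):
        self._records = {}

    def register(self, name, created_by, training_material, metrics):
        versions = self._records.setdefault(name, [])
        record = ModelRecord(name, len(versions) + 1, created_by,
                             training_material, dict(metrics))
        versions.append(record)
        return record

    def best(self, name, metric):
        """Return the version with the highest value of the given metric."""
        return max(self._records[name], key=lambda r: r.metrics[metric])

registry = ModelRegistry()
first = registry.register("iddq_outlier", "alice", "lots A1-A5", {"recall": 0.90})
second = registry.register("iddq_outlier", "bob", "lots A1-A9", {"recall": 0.94})
print(registry.best("iddq_outlier", "recall").version)  # 2
```

Keeping the records immutable means a version deployed to the tester can always be traced back to exactly what was trained.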

Starting point is 00:19:41 You need to make sure that the transition from the cloud to the tester or to the edge is smooth. And also, this is very, very important: machine learning models reflect the data they were trained on. So if there is a shift in the process in production, for example, the model's behavior shifts as well. So you need a good mechanism to track those models, to track their behavior in the field or on the ATE, and to make sure they are still aligned, and if not, alert and retrain,

Starting point is 00:20:20 because otherwise your models are good enough only for a very short period of time. And I think those three factors are covered entirely in the proteanTecs solution, because we deliver a robust application that allows not only state-of-the-art models, but also clear visibility and the ability for the customer to run the trained model on the tester and circle back from the tester to the cloud to track drifts or misbehavior of the models. Interesting, and what are the different techniques you are deploying? So actually we provide many outlier detection as well

Starting point is 00:21:06 as shift left applications. So some of them are specific to particular tasks, like we said, IDDQ or VDD mean or calibration-based outlier detection. Some of them are based on the spatial view, the spatial signature of the chip or even the spatial signature of the whole wafer. We have many applications of outlier detection, and some of our models are supervised models.

Starting point is 00:21:38 As I said, some of them are unsupervised, and we're even using semi-supervised mechanisms in order to provide the best state-of-the-art outlier detection mechanisms. And why is this so important? Look, at the end of the day, we see the impact on our customers. I think we can see the satisfaction and this is what drives us.

Starting point is 00:22:01 So we provide them a real solution, an end-to-end solution, from the IP to the cloud tools that provide them real actionable insights, to full model monitoring in the proteanTecs platform, and all the way back to the edge or the tester. We see that some of our customers using our technology have managed to improve reliability by up to 10x in DPPM.

Starting point is 00:22:33 They reduced false positives. They were able to increase their product quality, which, as I said at the beginning, is very important for our customers. From my point of view, that can only be done if you have a full vertical solution, the IP, the cloud, the platform, and the edge application, and only if you have a very good understanding of your data

Starting point is 00:22:59 and you create state-of-the-art machine learning solutions. Great conversation. Thank you, Guy. Thank you so much for having me. That concludes our podcast. Thank you all for listening and have a great day.
