SemiWiki.com - Podcast EP279: Guy Gozlan on how proteanTecs is Revolutionizing Real-Time ML Testing
Episode Date: March 28, 2025
Dan is joined by Guy Gozlan, proteanTecs director of machine learning and algorithms, overseeing research, implementation, and infrastructure of machine learning solutions. Prior to proteanTecs he was project lead at Apple, focusing on ATE optimizations using embedded software and machine learning and embedded software…
Transcript
Hello, my name is Daniel Nenni, founder of SemiWiki, the open forum for semiconductor
professionals.
Welcome to the Semiconductor Insiders podcast series.
My guest today is Guy Gozlan, proteanTecs' Director of Machine Learning and Algorithms,
overseeing research, implementation and infrastructure of machine learning solutions.
Prior to proteanTecs, he was project lead at Apple focusing on ATE
optimizations using embedded software and machine learning, and in embedded software
engineering at Mellanox. Welcome to the podcast, Guy.
Nice to be here. Thank you for having me.
First, can you tell us how you got your start in semiconductors?
Yes, certainly. Actually, looking back at my life story,
I think I didn't have any other option.
As a kid, I was very enthusiastic about physics and mathematics.
I think, like many eighties kids,
I was also enthusiastic about computers,
and not only the applications on top of them.
I also liked to take those computers apart
to really understand how things work inside,
taking out the RAM and hard disks and everything.
And in addition, my father was
one of the first employees at Intel in Israel.
And I remember as a kid, you know,
during summer vacations going to dad's work,
putting on this white suit, going into the fab
and getting amazing explanations of the magic that is happening
in those fabs. For me as a kid it was an amazing experience to see how those machines are telling
the story of everything that is around us, all this technology. And I
think that after that, majoring in physics and mathematics in high
school just led me directly to this field. So, you know, I took a shift of about a decade in my life,
joining the Israeli Air Force. But right after my service, I started my studies
in electrical engineering, and later on a master's degree in computer science, focusing on machine
learning. So I think this whole thing led me to be in the
semiconductor industry.
That's a great story. So what brought you to proteanTecs?
So actually I started my career, as you said, at Mellanox as a firmware engineer.
And this was the first time I started to understand the combination of software
and hardware.
And this combination was very interesting for me.
And after that, I joined Apple.
And I was leading there a very interesting project
where we tried to optimize the test time
and the calibration time of Apple Silicon using firmware. So we did it for
RF chips but we also implemented it on many other solutions of Apple. And then it was very obvious for me at the time
to see how machine learning technology, which was at that
time very immature, but to see how we can leverage
this new technology into silicon, silicon testing.
And it was a major success.
And I really fell in love with this combination
of machine learning together with silicon.
And I heard about proteanTecs
actually from a friend at Apple.
And when I started to look into this company,
I must say that I was amazed because I told myself,
oh my God, this technology is actually
what I needed for the Apple project.
If I had this technology back then,
I would have done amazing things
even better at my project at Apple.
So I told myself, I must join this company.
Through friends' connections, I just jumped in
and got an interview at proteanTecs,
and I've been here for the last four years.
Yeah, I agree with you completely.
We've been working with proteanTecs for I think three years now, an amazing
experience, a very new approach.
So what do you see as the big challenges in high volume testing today?
So, you know, as the technology improves and we go to more advanced process nodes and chips
become more and more complex, testing them becomes much more
complex.
So when you are going into such complex processes and such complex chips,
there are several factors that you need to take into account when you're going to high volume testing.
First of all is the test time or what I call the test cost,
because the cost is not always only the test time, but also the time and effort it takes you to develop this kind of testing.
It's also an amazing technology, something that you need to invest time in.
The second factor is the yield. So the market today is very competitive as you
know and customers today cannot risk a low yield.
Low yield means that your product will cost more for the end customer
and you cannot be competitive.
And the third one, which I think is one of the most important factors,
is the quality.
Because today, chips are going into our day-to-day life.
They are going into vehicles,
they are going into sensitive sites,
the medical areas, and many more.
We cannot risk low quality.
Quality today becomes a key factor
in the consideration of testing.
I don't think you will find anyone who will agree to compromise on quality when getting
into an autonomous car or an airplane.
And even in our cellular phones, we cannot risk quality.
So I think those three factors need to be balanced somehow.
And I think this is one of the major challenges here.
It's the balance.
But keep in mind that it's not only
because we're going to more complex process nodes,
like three nanometers and two nanometers,
where you can find many challenges,
you will have more variability there,
and it will be harder to catch defects.
But it's also due to the fact that we're
going to more complex packaging.
We're going into multi-chip packaging
and those things create even bigger challenges
in the testing phases.
So I think we can all agree today
that the traditional methods are not enough
in order to balance those three factors in an optimal way.
Got it. What is proteanTecs' strategy and application to address the market needs?
So as you said, proteanTecs is actually taking a very different and interesting approach.
I think the first thing here is that you can make better decisions by actually looking
at the physical health of the chip.
Okay.
And even better, you can compare the expected behavior with what you actually measure.
So what proteanTecs is doing is actually integrating IP
that was created by proteanTecs into the customer chip.
So this IP is very, very small.
You can consider it like small sensors, what we call agents,
and they're integrated into the customer chip.
Because they're small and because they have almost zero
impact on power, performance, and area,
the customer can actually put tens of thousands
of those kinds of agents inside their chip.
And now you actually get amazing visibility
of many physical factors of the chip,
looking at the temperature and the voltage,
and looking at the clock integrity and power integrity of the chip.
And even while tracking the actual margins of millions and millions
of logical paths inside the chip.
So now we have created the first thing that is needed
in order to go to the next phase.
And this is the basics of the data layer.
So we have deep data, tons of data from each chip,
spatial view for each chip.
And now what you can do with that is endless.
So for example, you can take,
let's say IDDQ measurements, for example,
and you can create a model that is based on those agents
to predict the IDDQ measurement.
And once you have this kind of machine learning model,
you can run it on any new chip that arrives.
You get the predicted IDDQ measurement,
and you can also do the actual measurement.
And now you can compare between them.
So why is this so important?
Because from the model, you are saying,
OK, according to the DNA of the chip,
from the core of the chip, I know that the leakage should be something.
But once you're comparing it to the actual measurement,
now you can see if you have a discrepancy.
And if you have, for example, higher leakage than what you expected
according to the chip DNA, it might be that
you have some kind of short inside your chip.
Okay?
So this can provide you a different angle of looking at those measurements.
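As an illustration of the predicted-versus-measured comparison described here, the following is a minimal sketch and not proteanTecs' actual model. It assumes a single hypothetical agent-derived feature (called `agent_leakage_proxy` in the comments), a plain least-squares line, and an arbitrary limit of three residual standard deviations. All names and numbers are invented for the example.

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def flag_outliers(train, new_chips, n_sigma=3.0):
    """Return ids of chips whose measured IDDQ exceeds the prediction
    by more than n_sigma standard deviations of the training residuals."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    a, b = fit_line(xs, ys)
    residuals = [y - (a * x + b) for x, y in train]
    sigma = pstdev(residuals)
    flagged = []
    for chip_id, x, measured in new_chips:
        predicted = a * x + b
        # Only higher-than-expected leakage is flagged (possible short).
        if measured - predicted > n_sigma * sigma:
            flagged.append(chip_id)
    return flagged

# Hypothetical training data: (agent_leakage_proxy, measured IDDQ in mA)
train = [(1.0, 10.1), (2.0, 19.9), (3.0, 30.2), (4.0, 39.8), (5.0, 50.0)]
# Hypothetical new chips: (chip id, agent_leakage_proxy, measured IDDQ in mA)
new_chips = [("chip_A", 2.5, 25.1), ("chip_B", 3.5, 42.0)]
print(flag_outliers(train, new_chips))
```

With these made-up numbers, only `chip_B` is flagged: its measured leakage is far above what its agent reading predicts, while `chip_A` sits within the normal residual spread.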
The second concept that we are implementing is the shift-left strategy. So as we all know, shift left is one of the holy grails in testing, because the
cost of a single RMA, for example, is a few orders of magnitude higher than the cost of screening devices in early
stages. So for example, an RMA can cost up to $50K,
while a chip at wafer sort,
when you screen it ahead of time,
can cost you less than $10.
So you see the difference between them.
So this is another approach that we are taking
because proteanTecs can actually measure the agents,
the DNA of the chip and all those measurements at wafer sort,
and create a machine learning model that
predicts what will happen to this system or this chip in the later stages.
So this is actually magic,
but it happens and we help a lot of our customers using those approaches.
Because our customers now have these monitoring agents,
we can actually provide the solution to create
those machine learning models that can predict RMAs,
that can predict failures in final test or in system level test,
but using the data from early stages like wafer sort.
If you can do that,
customers can save a lot of money in production,
but also increase the quality and the reliability of their devices.
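The cost argument for shift left can be made concrete with simple arithmetic. The sketch below uses the two figures quoted in the conversation (an RMA costing up to $50K, a chip screened at wafer sort costing less than $10) as upper bounds. The function names, the flagged-chip count, and the 1% precision are hypothetical, and the calculation deliberately ignores other effects such as lost revenue from good chips that are screened out.

```python
RMA_COST = 50_000   # "up to $50K" per RMA, as quoted in the conversation
SCREEN_COST = 10    # "less than $10" per chip screened at wafer sort

def break_even_precision(rma_cost, screen_cost):
    """Fraction of screened chips that must be true future RMAs
    for early screening to pay for itself."""
    return screen_cost / rma_cost

def net_saving(n_flagged, precision, rma_cost, screen_cost):
    """Avoided RMA cost minus the cost of the chips screened out."""
    return n_flagged * (precision * rma_cost - screen_cost)

print(break_even_precision(RMA_COST, SCREEN_COST))
# Hypothetical scenario: 1,000 chips screened, 1% of them would have become RMAs.
print(net_saving(1000, 0.01, RMA_COST, SCREEN_COST))
```

Under these assumptions the break-even precision is tiny (10 / 50,000), which is why even an imperfect early-stage predictor can be worthwhile.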
And how are you employing ML in your solution?
So actually we have many uses for machine learning at proteanTecs.
The outlier detection that I just mentioned is just one example.
We're using machine learning for everything from optimizing performance
to tracking systems in the field
and many other applications.
But as you probably know, the basics
of a good machine learning model is actually data.
So high quality data, high resolution data,
this is the basics of creating a good machine learning
model. And proteanTecs' deep data is a perfect fit for that. For a machine learning engineer,
it's pure gold, this kind of data. So the agents are embedded into the customer design and monitor it
throughout all the testing phases.
So this data, together with customer-relevant data
like the IDDQ we mentioned, can create amazing machine learning
models.
And we actually take this data from production and this data
is being sent to our platform in the cloud.
And it's being pre-processed there.
In the platform, we have developed
very strong machine learning tools for our customers
that they can create their own models,
but also we have automatic flows to identify anomalies
or to create generic machine learning models.
So basically, those powerful tools are both general
and tailored to specific tasks,
like the example we gave for IDDQ,
and we provide solutions for many angles of machine learning.
We use supervised
and unsupervised machine learning models.
And once you have this kind of model, like we said for IDDQ,
this machine learning tool can actually suggest the actual
thresholds that you need for deciding if a sample or
in this case a chip is far from the normal behavior,
and you need to mark it as an outlier.
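One common way a tool could suggest such a threshold without labels is a robust rule based on the median and the median absolute deviation (MAD). The sketch below is an assumption about how this might look, not a description of the proteanTecs tooling. The multiplier `k=3.5` and the sample values are arbitrary, and 1.4826 is the usual constant that scales MAD to a standard deviation for normally distributed data.

```python
from statistics import median

def suggest_threshold(values, k=3.5):
    """Suggest an upper limit as median + k * scaled MAD.

    The median and MAD are barely affected by the outliers themselves,
    unlike the mean and standard deviation.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + k * 1.4826 * mad

# Hypothetical per-chip scores (for example, measured minus predicted IDDQ).
scores = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 5.0]
threshold = suggest_threshold(scores)
outliers = [s for s in scores if s > threshold]
print(threshold, outliers)
```

In this made-up sample the suggested limit lands a little above 1.5, so only the chip with a score of 5.0 is marked as an outlier.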
I think one thing that is very unique about
proteanTecs is that it's not just a model and not just an IP,
it's a vertical solution,
a full vertical solution.
That means that the models can be built in the cloud,
can be easily downloaded and compiled to SIP code,
and those can actually go to the test floor
and run in real time on the tester.
But not only that, those results are going back to the cloud for retraining
and reevaluating your model.
So the model is actually evolving during the time.
So this is one of the strengths of proteanTecs,
this vertical full solution.
So you build a model with this powerful data,
those powerful IPs, and amazing machine learning tools.
And you can take it to any test stage that you want.
You can run it on ATE.
You can run it on system level test or system test, and you can actually
use those models to screen devices in real time. This is a very strong capability.
And this retraining is also a powerful method because tests are changing, the environment
is changing, the material is changing, and things shift.
So we can just go back to the cloud
and retrain the model easily.
You can download those models back to your test environment.
And this full cycle is very powerful for our customers.
What type of challenges did you see
with these types of ML models?
Well, actually many, but I'll try to identify what I believe are the three most important ones.
I would say that first, in order to create a good model, you need a massive amount of data, but not only a massive amount
of data. It needs to be in high resolution and it needs to be
very good data, high quality data. And this is the most important basic part of any machine learning model.
The second is you need to understand your data. You cannot just take a huge
amount of data and put it into a black box and hope for the best.
You need to understand the physics of it.
You need to understand how it's being affected
by temperature, by voltage, by process, for example.
Otherwise, you will end up with garbage in, garbage out.
So we have data.
We need to understand the actual physics of the data.
And the third one is that creating a very good model offline or in the cloud or
wherever you're creating your model, it's not enough.
You need to productize it.
You need to leverage your model for actual activity,
as we said, on the tester or in the field.
So for example, you need to track the model in the cloud
to be able to understand who created this model,
what material was used for this model,
what was the performance of it,
and you need to be able to compare
between different experiments that you are doing and
so on.
You also need to be able to take this model and use it, actually use it in the tester.
So it should run smoothly.
You should be able to track the versions.
You need to make sure that transition from cloud to the tester or
to the edge is smooth.
And also this is very, very important.
Machine learning models reflect the data that they were trained on.
So if there is a shift in the process in production, for example,
the model is shifted as well.
So you need a good mechanism to track those models, to track their behavior in the field or in the ATE,
and to make sure they are still aligned, and if not, alert and retrain,
because otherwise your models are good enough for a very short period of time.
And I think those three factors are being covered entirely in the proteanTecs solution
because we deliver a robust application that allows not only the state-of-the-art models,
but also clear visibility and the ability for the customer to run the
trained model on the tester and circle back from the tester to the cloud to
track for drifts or for misbehavior of the models.
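The tracking-and-retraining loop described here can be illustrated with a very simple drift check. This is a sketch under stated assumptions, not the mechanism proteanTecs uses: it compares the mean of a recent production window of some monitored quantity (for instance, model residuals) against the training baseline with a z-score. The function names, the limit of 4, and the sample values are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def drift_z_score(baseline, window):
    """z-score of the window mean relative to the baseline distribution."""
    mu, sd = mean(baseline), stdev(baseline)
    return (mean(window) - mu) / (sd / sqrt(len(window)))

def needs_retrain(baseline, window, z_limit=4.0):
    """Raise a retraining alert when the recent window has shifted too far."""
    return abs(drift_z_score(baseline, window)) > z_limit

# Hypothetical model residuals observed at training time.
baseline = [0.1, -0.2, 0.0, 0.15, -0.05, 0.05, -0.1, 0.05]
# Hypothetical residuals from two recent production windows.
stable = [0.0, 0.1, -0.1, 0.05]
shifted = [0.5, 0.6, 0.45, 0.55]

print(needs_retrain(baseline, stable))
print(needs_retrain(baseline, shifted))
```

A production system would likely watch many features and use more robust statistics, but the structure is the same: keep a baseline from training, compare what comes back from the tester, then alert and retrain when they diverge.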
Interesting. And what are the different techniques you are deploying?
So actually we provide many outlier detection as well
as shift-left applications.
So some of them are tailored to specific tasks,
like we said, outlier detection based on IDDQ or VDD mean or
calibration.
Some of them are based on the spatial view, the spatial signature of the chip or
even the spatial signature of the whole wafer.
We have many applications of
outlier detection, and some of our models are supervised models.
As I said, some of them are unsupervised, and we're even
using semi-supervised mechanisms in order to provide the best state-of-the-art
outlier detection mechanisms.
And why is this so important?
Look, at the end of the day,
we see the impact on our customers.
I think we can see the satisfaction
and this is what drives us.
So we provide them a real solution,
an end-to-end solution,
from the IP to the cloud tools to provide them real actionable insights
and back for the full model monitoring in the proteanTecs platform
and all the way back to the Edge or the tester.
We see that some of our customers who are using our technology
have managed to improve reliability
by up to 10x in DPPM.
They reduced false positives.
They were able to increase their product quality,
which as I said at the beginning,
it's very important for our customers.
From my point of view, that can only be done
if you have a full vertical solution,
the IP, the cloud, the platform, and the Edge application,
and only if you have a very good understanding of your data
and you create state-of-the-art
machine learning solutions.
Great conversation. Thank you, Guy.
Thank you so much for having me.
That concludes our podcast.
Thank you all for listening and have a great day.