Semiconductor Insiders - Podcast EP335: The Far Reaching Impact of UCIe with Dr. Debendra Das Sharma
Episode Date: March 13, 2026. Daniel is joined by Dr. Debendra Das Sharma, a Senior Fellow and Chief I/O Architect in the Data Platforms and Artificial Intelligence Group at Intel. He is a member of the National Academy of Engineering (NAE), a Fellow of the IEEE, and a Fellow of the International Academy of AI Sciences. He is a leading expert on I/O subsystem and interface architecture.
Transcript
Hello, my name is Daniel Nenni, founder of SemiWiki, the open forum for semiconductor professionals.
Welcome to the Semiconductor Insiders podcast series.
My guest today is Dr. Debendra Das Sharma, a Senior Fellow and Chief I/O Architect in the Data Platforms and Artificial Intelligence Group at Intel.
He's a member of the NAE, a Fellow of the IEEE, and a Fellow of the International Academy of AI Sciences. He is a leading expert on I/O subsystem and interface architecture. He co-invented the chiplet interconnect standard UCIe and is the chair of the UCIe Consortium, which is what we're going to talk about today.
Welcome to the podcast, Debendra.
Thank you, Daniel.
It's an honor and a pleasure to talk to you.
It is also an honor and a pleasure to talk to you.
First, can you tell us a little bit about your career journey with Intel and what you're doing today?
Sure.
I did my undergraduate in India, at the Indian Institute of Technology, IIT Kharagpur. Then I came to the US to do my PhD at the University of Massachusetts Amherst. After graduating with my PhD in '94, I joined HP, Hewlett-Packard, now HPE. We were doing server chipsets at HP. Of course, I/O is part of that, as is memory and all those good things. In 2001, our group got acquired by Intel, so I've been with Intel since then, and technically, counting the acquisition, since 1994.
I started off working on the server chips at Intel. That was the time we were making the transition from the PCI bus-based architecture to PCI Express, which is link-based. So I've been involved in driving PCI Express since then, and that's how I started on the I/O side. Well, I was involved on the I/O side from the HP days, so I've been pretty much an I/O guy all throughout my career. I did multiple generations of PCI Express and am still very involved in that. I did Compute Express Link and am still very much involved in that. And then, of course, the latest one is the chiplets, which is Universal Chiplet Interconnect Express, UCIe.
That's great. So let's start off. Can you share a quick introduction to UCIe?
Sure. So as I said, UCIe basically stands for Universal Chiplet Interconnect Express. It's an open industry consortium. We formed it in March of 2022, and our goal was to drive an open industry standard that would result in, basically, an open chiplet ecosystem. What we have seen from our experience with PCI Express and CXL and USB is that these kinds of open standards unleash innovation in whichever area you open them up to. So the goal of UCIe was to unleash innovation on the on-package landscape.
Currently UCIe has more than 140 member companies across the globe, and these are basically the who's who in the fields of foundry, semiconductors, packaging, IP, cloud service providers, automotive, you name it. As I said, the model we had in mind was similar to that of PCI Express, USB, and CXL. Effectively, we have seen the goodness that open standards bring. For example, at a board level you can today build any kind of networking device or SSD or accelerator with PCI Express, and this has been true for decades now. You don't have to worry about, hey, who is going to be my attach point? Almost all the platforms today have PCI Express. So you just focus on doing the work in which you are innovating, and you have an attach point. The same concept was in our minds when we formed UCIe.
You can build chiplets, right? Chiplets are becoming the way to deliver the required power and performance in today's systems. And it is not a niche that people are in; it's mainstream today. If you look at any company's offerings, you are primarily going to find multiple chiplets that are effectively stitched together. So our goal was to democratize that, to make sure that there is a standard with which everybody is going to connect.
And then, as a system-in-package provider, or SIP provider, you can decide to choose, hey, whose accelerator am I going to use? Okay, so company X, Y, or Z has an accelerator that meets my need; I'm going to put that in. Well, it needs to speak the same language as the rest of the chiplets you are putting in that particular package. And that's what UCIe is.
Effectively, the goal of UCIe was to have an open standard incorporating heterogeneous integration of chiplets, right? Those can be accelerators, compute, I/O, memory, you name it. And there are, again, multiple reasons why we are in the chiplet era today. Those are things like everybody being up against reticle limits, and smaller dies being better, so you want to have lots of smaller dice that you put together for better integration. So we decided in 2022 that the time was right for us to go and standardize UCIe. That way, while all these innovations are happening, it's also going to open up and give us an open chiplet ecosystem.
Yeah, I agree. I remember when PCIe came out years and years ago, and it was a struggle. But UCIe, the consortium, I mean, it's such a fast-growing organization. And it's important, right, for AI; certainly all the big AI chips are based on chiplets. So can you walk us through the different specification features, you know, how the spec evolved from 1.0 to, I think, 3.0?
Sure, happy to.
So UCIe 1.0, which again came out in March of 2022 when we formed the consortium, was a fully developed specification. There we addressed connectivity for what I would call planar chiplets. So basically the idea is that you put chiplets next to each other on the package and you connect them. They're not sitting on top of each other; they're sitting next to each other, right? So think of it as a planar connection, and that's basically what UCIe 1.0 was.
And basically there are two types of planar connectivity. I'm going a little bit technical here. At that planar level, there is the regular 2D kind of connectivity, which is known as standard packaging. You basically have a little higher bump pitch, similar to a regular package once you put things on the package, where the bumps are about 100 to 130 microns minimum distance from each other. And these 2D connections can be a little far off, in the sense that you can place chiplets up to 25 millimeters from each other and connect them. That's very good from a cost-effective packaging standpoint. There's another type of planar connectivity that is used in what we call advanced packaging. It's called 2.5D: it's neither 3D nor 2D, it's denser than 2D, so 2.5D. You basically have very high-density connections. Most of the time you are using some kind of an interposer. So, for example, you might have a passive silicon piece which just has wires that are used to connect between the dies. So the bump pitch is very much reduced, 25 to 55 microns. That bump pitch reduction enables the high-density connection, and the distance has to be short, about less than 2 millimeters.
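To make those pitch numbers concrete, here is a rough, illustrative sketch of how bump pitch translates into connection density along a die edge. The square-grid geometry and the row count are simplifying assumptions for illustration, not values from the UCIe specification.

```python
# Illustrative only: assumes a simple square bump grid. Real UCIe bump
# maps are defined precisely in the specification.

def bumps_per_mm_of_edge(pitch_um: float, rows: int = 4) -> float:
    """Estimate bump count per mm of die edge for a square grid.

    pitch_um: minimum bump-to-bump distance in microns
    rows:     how many bump rows deep the interface extends (assumed)
    """
    columns_per_mm = 1000.0 / pitch_um   # bump columns along 1 mm of edge
    return columns_per_mm * rows

# Standard (2D) packaging: ~100-130 um pitch, up to 25 mm reach
print(bumps_per_mm_of_edge(110))   # ~36 bumps per mm of edge
# Advanced (2.5D) packaging: ~25-55 um pitch, under 2 mm reach
print(bumps_per_mm_of_edge(45))    # ~89 bumps per mm of edge
```

The two-to-four-times finer pitch is what lets advanced packaging deliver far more wires, and hence far more bandwidth, per millimeter of die edge.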
So basically what we did with UCIe 1.0, again talking about planar connectivity, is we defined a complete specification. The goal was: you build chiplets based on UCIe 1.0, you decide whether you want a 2D or a 2.5D connection, but once you have decided that, it's going to interoperate with any other chiplet that is designed to that UCIe 1.0 specification. So we defined things in a layered approach, like any successful protocol. We have a physical layer that does things like: how do your analog circuits work, what is the voltage level, how is the channel done, how do you train the link, if there is an error in a lane how do you repair it, which order the bumps are in, how do you connect. It basically deals with all the nitty-gritty details that you need in order to make things interoperate. We also had things like the die-to-die adapter: you have the link running and you get an error; how do you handle that? All of those are handled there. Then we also talked about how this interfaces with the underlying protocol, how software discovers things, what the software-hardware interface is, how you manage these chiplets, and how you do compliance testing. All of these things got addressed with UCIe 1.0. We did a 1.1, which basically addressed automotive usage along with some optimizations.
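As a rough mental model of the layering he describes, here is a hedged sketch. The class names and responsibilities paraphrase the transcript; they are not the specification's actual interface definitions.

```python
# Simplified mental model of the UCIe layering described above. These
# are illustrative placeholders, not the specification's interfaces.

class PhysicalLayer:
    """Analog circuits, voltage levels, channel definition, link
    training, lane repair, and the bump map and ordering."""

class DieToDieAdapter:
    """Sits between the physical layer and the protocol: link
    management and handling of errors on a running link."""

class ProtocolLayer:
    """The underlying protocol mapped over the link, plus the
    software-visible discovery, management, and compliance hooks."""
```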
So that was UCIe 1.0 and 1.1. With 2.0, there were two major things we did. The first one is we introduced vertical connectivity. In other words, you have chiplets that are stacked on top of each other, so they are connected vertically, right? It's known as 3D connectivity. Think of these as multi-storied buildings: you build up. That gives you much higher density, and in the packaging world higher density is better, because you want to deliver a punch within a small package. And you can have multiple vertical chiplet stacks that you interconnect using planar connectivity. The 3D kind of connectivity is also prevalent in the marketplace, and it is characterized by very low bump pitches, which helps a lot in terms of getting more bandwidth and reducing power. So the first aspect of UCIe 2.0 is that, in addition to the planar connectivity of UCIe 1.0 and 1.1, we introduced the notion of vertical connectivity. You're building up, in addition to building out, at the package level.
Now, the second aspect that we addressed with UCIe 2.0 is around things like testability, manageability, and debug, things that you need to address not just at the interconnect level but holistically at the package level, chiplet level, and platform level. We addressed those aspects with UCIe 2.0. Things like how you do firmware download, how you do testing of the die once you put things on the package, how you debug: all of those kinds of aspects we addressed with UCIe 2.0. So two things we addressed, as I said, in UCIe 2.0: one is vertical connectivity, and the other is testability, manageability, debug, all of those aspects.
3.0, the third generation, we introduced a few months back. This is March of 2026; I believe that was sometime in August or September of 2025. We doubled the planar bandwidth density, and that is to address the ever-demanding AI, machine learning, and HPC needs. So we doubled the bandwidth density that you get. And we made a lot of enhancements around power optimization. We also, by the way, addressed a new market segment, which is around continuous transmission protocols. So think of it like a DSP chip that you have, which is basically a bunch of analog components that you want to connect to an SoC. A lot of DSP providers wanted to migrate that interface to UCIe, right? So we made some minor enhancements, and that made it possible. We also made some enhancements towards manageability. And by the way, all of these specifications that we are evolving, we evolve them in a fully backwards-compatible manner. So that's probably more than what you were asking for in terms of the specification evolution, but that's a very short history of it. And by the way, for the people that are listening in, if they want to delve deeper, there is a very detailed, publicly available webinar that walks through each of these. You can go to the UCIe website and download and listen to the webinars.
No, that was a great description. So one of the things I want to ask about is key target metrics. What are they? I mean, why does UCIe provide key target metrics, and maybe you can give us some examples?
Yeah. So let's talk about the key metrics, right? Effectively, when I'm putting together something, whether I'm an architect or a designer or a validator, it doesn't matter. If you think about an open system, think of it as providing clear expectations and guidance on what you expect each component in the SIP, which is a system-in-package, to do, each chiplet to do. Because you may not know that chiplet; the one you might be putting in might still be getting designed. So at the interface you need to know what to expect.
So let's talk specifics here. We have multiple sets of metrics. One of them, for example, is linear bandwidth density. Basically what it tells you is: hey, if I give you one millimeter of shoreline on my chiplet, how many gigabytes per second of bandwidth will I get for a given frequency? Now, that's a very reasonable expectation, right? It can't be, hey, I don't know; you have to know exactly how much bandwidth you are going to get. Now, of course, in this particular case, given that we in UCIe provide the bump map and we define the frequencies, it follows really naturally; you can do the calculation. But that's basically what the specification does. It sets clear guidelines saying: if you're at this frequency, this is how many gigabytes per second per millimeter you're going to get for this kind of link. Then as a SIP provider or as a chiplet designer you can say, hey, I need two links, or I need four links, whatever your case may be depending on your need. You can then go and choose that in order to get the desired performance.
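To show how such a number falls out of the bump map and frequency, here is a hedged back-of-the-envelope calculation. The lane count, per-lane data rate, and module width below are hypothetical round numbers for illustration, not values from the specification.

```python
# Hypothetical round numbers; the real values come from the UCIe bump
# map and data-rate tables.

lanes = 64               # data lanes in one module (assumed)
gbps_per_lane = 32       # data rate per lane in Gb/s (assumed)
module_width_mm = 1.0    # shoreline the module occupies in mm (assumed)

module_gb_s = lanes * gbps_per_lane / 8      # module bandwidth in GB/s
density = module_gb_s / module_width_mm      # GB/s per mm of shoreline
print(f"{density:.0f} GB/s per mm")          # 256 GB/s per mm here
```

A SIP designer can then multiply by the shoreline they are willing to spend, or stack multiple modules, to hit a bandwidth target.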
A second metric, for example, that we provide is power efficiency. Now, this particular one does depend on your implementation. It depends on the process technology node, it depends on how you have done your design, all of that, right? Power efficiency is dependent on some of that. But by providing the key metric, it basically gives you a zip code around which you will land. So you might be better, you might be slightly worse, but it gives you roughly the ballpark of what to expect. It's a very important parameter, because as a package designer, you need to know how to do power delivery and what kind of thermal solution you need to provide. So you need to get some idea as to how much power this thing is going to consume.
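For instance, a package designer might do a quick estimate like the one below. Both inputs are assumed placeholder values, not numbers from the specification; power efficiency for die-to-die links is conventionally quoted in picojoules per bit.

```python
# Back-of-the-envelope link power estimate; both inputs are assumptions.

pj_per_bit = 0.5          # power-efficiency ballpark in pJ/bit (assumed)
bandwidth_gb_s = 256      # sustained link bandwidth in GB/s (assumed)

bits_per_s = bandwidth_gb_s * 8e9          # convert GB/s to bits/s
watts = pj_per_bit * 1e-12 * bits_per_s    # energy per bit times bit rate
print(f"{watts:.2f} W")                    # about 1.02 W for these numbers
```

Even a zip-code-level pJ/bit figure like this is enough to start sizing power delivery and the thermal solution.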
Latency is another zip-code type of metric that we provide, right? Of course, it can vary a little bit depending again on your process node, how you have done your design, and all of that, but it can't vary too much. You are expected to be within a particular range. We provide, for example, through the die-to-die adapter to the bump and back through the entire stack, guidance that you need to be at about two nanoseconds. Now, why is that important? Well, if I'm designing to something, I don't know who I'm necessarily going to interface with, if I'm a chiplet designer, that is. So when I'm doing my queue sizing and all of that, I need to figure out how deep to make my queues, because there is a credit loop that I need to take into account. So if you told me it's two nanoseconds, I'll say, okay, fine, maybe somebody will be four; maybe I double the number. Not that I'm expecting people to be worse by 2x, but if you want, you can decide how much safety margin you want to have and then decide on your queue sizes accordingly. If we give nothing, then there is no way to really design to anything, so your design becomes either too conservative, in which case you are unnecessarily wasting die area and power, or you will fall short in terms of the actual bandwidth that you're going to get. So these are the reasons why we provided the metrics. Some of them, like I said, are very quantitative; given that we have fixed everything else in the spec, like with the bandwidth density, you're going to get exactly what we say you're going to get. Some of them are a little bit of a zip-code kind of number, but at least it helps set the right expectations. The fundamental reason is that you don't want to be in a situation where you are surprised. This eliminates those negative kinds of surprises.
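The queue-sizing reasoning he sketches is essentially a bandwidth-delay product: the receive queue must cover the credits in flight during the round trip. A minimal sketch, where the bandwidth and flit size are assumptions and the two-nanosecond figure is the guidance quoted above:

```python
# Credit-loop sizing sketch: the queue must cover the bandwidth-delay
# product of the credit round trip, or the link stalls waiting on credits.

gb_s = 256                # link bandwidth in GB/s (assumed)
round_trip_ns = 2 * 2.0   # ~2 ns each way through the adapter stack
flit_bytes = 64           # flit/credit granularity in bytes (assumed)
safety_margin = 2.0       # e.g. allow for a partner at twice the guidance

bytes_in_flight = gb_s * round_trip_ns   # GB/s * ns works out to bytes
queue_depth = bytes_in_flight * safety_margin / flit_bytes
print(f"{queue_depth:.0f} flits")        # 32 flits for these numbers
```

With no latency guidance at all, that margin factor would have to be a guess, which is exactly the over- or under-provisioning problem he describes.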
Makes complete sense. So how does UCIe support a seamless experience, you know, with plug-and-play chiplets, beyond just the interconnect?
Sure. Of course, like we talked about, at the interconnect we have specified everything, right? I talked about the layers, how software interacts with it; those are very well specified. So you design to something and you're going to interoperate with another component. The question you're asking is, if you step back from there, how do you get plug-and-play chiplets, right? So this is something we addressed in 2.0. And we continued on that theme with 3.0, and we will continue with that theme going forward, in addition to the other things. I did briefly talk about things like testability, debug, and manageability, those aspects that we introduced with 2.0 at the chiplet level, the package or SIP level, and the platform level.
Now let's talk about some specific examples and see how we addressed those. So, for example, think about testing, right? You make your die, and you want to test it to see whether things are working or not. So you use a tester, and the way testers work is that they're going to probe the bumps. Now with UCIe, for example, if you are doing micro-bumps with advanced packaging, or with 3D where the pitch is even lower, you really cannot probe those micro-bumps or the hybrid bonding pads with a tester. So you need to be innovative and use other bumps: you might have a PCI Express bump, you might have a USB bump, right? So you can come through that indirectly and test your entire chiplet, including this interface. Now some other questions arise with chiplets, like: how do you deal with repair once you have packaged something? How do you repair something that might fail in the field? There are questions around, hey, how do you debug something that you have assembled on the package? On external interconnects like PCI Express or USB, you can put on a logic analyzer, you can put on a scope, you can see what is going on. Here you really cannot; once you package something, I can't really go in and probe something there. How do I handle manageability, right? I mean, how do I do firmware download? How do I upgrade firmware? All of those things between the chiplets. How do we do all of these things securely? And especially, you have to also consider that some of the chiplets may not be directly accessible from the package pins; it might be the way the SIP is constructed. Then the question also arises: okay, if I figure out a way to debug or test, how much debug bandwidth do I need? Well, as you would expect, the answer varies, right?
So we looked into all of these aspects and said that we need to define something that works seamlessly at the chiplet level, the SIP level, and the platform level, and that works with existing solutions that involve other standards. If you have a manageability solution with MCTP, we don't want to ditch it; we want to work seamlessly with that. And things also need to be scalable, which addresses, for example, how much bandwidth you need. So these are all the things you asked about that are beyond the interconnect. And what we did with UCIe was decide to take a holistic view and come up with some innovative solutions that are applicable, that are portable, and at the same time very lightweight. So what we did was define a common infrastructure. We said, okay, you have existing IP blocks that do test, for example. Maybe you've got a memory array that has a test harness or a test infrastructure built in, right? We're not asking you to change that. That's fine; keep that. But what we did was provide a wrapper, think of it that way, that is going to work with any of your existing IP building blocks and with any of your existing external interfaces. So for things like how you bridge to an external interface, we defined that for off-package connectivity. And then, especially for the chiplets that are not directly connected, we defined mechanisms to transport those things seamlessly across chiplets. So we defined all of that. It's like a small wrapper, and then we defined some packets that identify, hey, I'm using the common infrastructure, and it's going to then work seamlessly across the chiplets for people that deploy UCIe 2.0.
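As a conceptual illustration of that wrapper idea, here is a hedged sketch: a uniform packet carries test, debug, or manageability traffic to an existing, unchanged IP block, even on a chiplet with no direct package pins. All names and fields are invented for illustration and are not the spec's actual packet or interface definitions.

```python
# Conceptual sketch of the common-infrastructure wrapper described above.
# Everything here is illustrative, not the specification's definitions.

from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class MgmtPacket:
    target_chiplet: int   # destination, possibly reached over UCIe hops
    service: str          # e.g. "test", "debug", "fw-download"
    payload: bytes        # opaque to the transport; consumed by the IP block

# Each chiplet registers its existing IP blocks behind the wrapper, unchanged.
handlers: Dict[Tuple[int, str], Callable[[bytes], None]] = {}

def deliver(pkt: MgmtPacket) -> None:
    """The wrapper's job: hand the payload to whichever existing block
    (test harness, debug hook, firmware loader) serves this request."""
    handlers[(pkt.target_chiplet, pkt.service)](pkt.payload)

# Example: chiplet 2 keeps its built-in memory test harness as-is.
handlers[(2, "test")] = lambda data: print("running built-in test with", data)
deliver(MgmtPacket(target_chiplet=2, service="test", payload=b"pattern-7"))
```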
And again, if you want to delve deeper, please do check the webinars that are available on the UCIe website. But those are some of the things that we did, starting with UCIe 2.0, so that we look at these aspects not just from the point of view of the interconnect; we step back and look at it from the point of view of the chiplet and the package, and it needs to work seamlessly at the platform level as well.
Right.
Very interesting.
So where are we at today?
I mean, how are UCIe Consortium members deploying UCIe in the ecosystem?
Okay, so there are two sets of things I'll talk about. One of them is the usage model, and then I'll give you some examples. So from a usage-model point of view, there are two broad categories of usages. First of all, of course, the first usage would be constructing an SoC at the package level, right? Basically, things that used to be at the platform level are now coming to the package level. So think of processing chiplets like CPUs, GPUs, accelerators, on-package memory, I/O connectivity for external I/O, memory connectivity, analog chiplets, optical interconnects, all of those things. Those are all packaged together, right? So that's the first problem statement for which UCIe exists: to use chiplets to form a SIP, or system-in-package.
That's, of course, the most obvious one. The second type that we addressed is off-package connectivity, things like co-packaged optics. So, for example, you construct a package, you have a really cool co-packaged optics solution, and you want to connect all of these packages using UCIe at the rack level, or even at the pod level. This allows you to create things like dynamically composable systems at the rack level. For example: hey, I need some extra memory; I can go off to this other location that has memory, using my optical connection through UCIe co-packaged optics, and get that extra memory while I need it. And it's not memory as in your RDMA kind of memory; these are direct load-store, low-latency memory accesses, like you would have connecting DRAM next to your SoC. So you can go access that while you need it, and then later on you can release it back to the pool. So that's what I mean by dynamically composable systems at the rack level. It's a very powerful construct if you think about it. And at the same time, we also wanted to support effective distributed processing. Once I have these things with UCIe, with load-store, message-passing, low-latency accesses, I can do distributed processing very effectively at the rack and pod level. So that's the second type. The first type is, of course, construction of the SIP; the second type is off-package connectivity, things like co-packaged optics.
So that's the usage-model point of view that the consortium members are very interested in, and they keep driving it. And then within those there are different classes, like, you know, automotive needs this, or, hey, my DSP chiplets need these. So we are constantly increasing our footprint in the different segments where things are applicable, and people are ready to move to UCIe from their proprietary stuff. Now, from a real-implementations point of view, I'll definitely talk about things that are in the public domain, right? Of course, I can't talk about things that people are not ready to announce publicly.
So the first real demonstration that I saw was the interop demo between Intel and Synopsys. That was in fall of 2023. So, two test silicons: Intel's was done using Intel process technology, of course, and Synopsys's was done using TSMC. It was Intel's EMIB-based packaging. And we showed that the two chiplets were interoperating with UCIe at speed. So think about it: in March of 2022 we announced the consortium, and that's when the spec became available. About a year and a half later, people are doing interop. The speed has been very impressive, to say the least, right? You got the chips back, you're testing. Then shortly after that, at Chiplet Summit 2024, you had Synopsys, Cadence, and Alphawave, which is now part of Qualcomm, showing silicon running UCIe.
There is, of course, widespread availability of VIPs, which are verification IPs. You need those to make sure your design works, right? Last year, at OFC 2025, Ayar Labs demoed their UCIe-based co-packaged optics. This was the second type of usage that I was talking about, disaggregation at the rack and the pod level. So that was the target usage, and Ayar Labs did a public demo of UCIe-based co-packaged optics. Again last year, which was just a few months back, at Hot Chips 2025, NVIDIA had a presentation in there on NVLink Fusion through UCIe chiplets. Also, when we did the announcement of UCIe 3.0, Synopsys released a customer survey that they do, which we also used during that announcement. So they did a survey, and I'll talk about a couple of the metrics they reported. The first thing, based on their customer survey, was what fraction of designs are based on UCIe. Well, UCIe-A, the advanced packaging, was 39%; UCIe-S, which is the standard packaging, was 58%. So 39% for advanced, 58% for standard, a total of 97%, which is an impressive segment of the market. The second aspect they reported was by application. Server and AI applications were 53%, automotive was 19%, networking was 10%, consumer was 8%, and the rest were storage and miscellaneous segments, right? So a very wide range of usage models, and a predominant use of UCIe, which is an extremely powerful indicator of how healthy and strong the ecosystem is. So clearly the need was there, we came up with the right technology, the right open-standard-based thing, and the ecosystem is using it, which is always a good thing to see.
Great.
Last question.
We know where we are today, but what about the future? How is UCIe technology evolving to meet some of the next generation of performance demands?
So, as I said earlier, I personally have been driving standards for more than two decades now, fortunate enough to be doing that, with PCI Express, CXL, and now UCIe. One of the things to keep in mind is that these are multi-generational and multi-decade standards, and also that it is one standard that we drive but with multiple applications. It covers the entire compute continuum, right? It's not for only this segment or that segment; it's meant for the entire compute continuum. If you look at PCI Express, it's ubiquitous, right across everywhere. And that's also the goal for UCIe, where we have been very successful, based on some of the statistics I just talked about. So one of the things that we have learned is that the compute landscape keeps changing. It goes through evolutions, it goes through revolutions, right? And what you need to do is to keep innovating and stay ahead. Sometimes you might fall behind; that's okay. But in that case, you just acknowledge it and then move quickly to make up wherever you are lacking. So that's what basically drives successful standards for multiple generations, for multiple decades. And we are at the start of that journey. As you can see, we are quickly addressing any usage-model needs that the industry has, and we are addressing at the same time the performance needs that they have. So we will continue to drive UCIe as the ubiquitous standard for on-package connectivity, and this will be applicable across the entire compute landscape for decades to come. So we are always working on a set of very interesting problems; there is no lack or dearth of those. So stay tuned. It's a very exciting time to be in chiplets and to be in UCIe.
I agree completely. It's an exciting time to be in the semiconductor industry, and it's people like yourself that made this all possible. So I really appreciate it, and it was great speaking with you, and hopefully we can catch up and talk sometime. Unfortunately, I missed you at the Chiplet Summit recently, where you did a keynote, but maybe at the next conference. It would be a pleasure for me.
Absolutely, yeah, yeah.
If you are here, let me know.
That concludes our podcast.
Thank you all for listening, and have a great day.
Thank you.
