SemiWiki.com - Podcast EP317: A Broad Overview of Design Data Management with Keysight’s Pedro Pires
Episode Date: November 14, 2025
Daniel is joined by Pedro Pires, a product and technology leader with a strong background in IP and data management within the EDA industry. Currently a product manager at Keysight Technologies, he drives the roadmap for AI-driven data management solutions. Pedro's career spans roles in software engineering and data science at Cadence, Cliosoft, and Keysight.
Transcript
Hello, my name is Daniel Nenni, founder of SemiWiki, the Open Forum for Semiconductor
Professionals.
Welcome to the Semiconductor Insiders podcast series.
My guest today is Pedro Pires, a product and technology leader with a strong background in
IP and data management within the EDA industry.
Currently a product manager at Keysight Technologies, he drives the roadmap for AI-driven
data management solutions. Pedro's career spans roles in software engineering and data science
at Cadence, Cliosoft, and Keysight. Welcome to the podcast, Pedro. It's great to be here.
Thank you for having me. So Pedro, let's start off with what brought you to IP and data management.
Do you have an interesting story you can tell? Sure. So I started my career at Cadence. I was working
on the front end and analog and mixed-signal tools, so Virtuoso, Spectre, and Incisive.
And I was working as an application engineer and part of my job was often to build custom scripts and solutions to address one problem or another that a customer might have.
And I quite enjoyed that part of my work, being able to do data engineering and scripting and visualizing the data in interesting ways.
So I took a liking to data engineering.
And right about that time, I found out about the opportunity to work at Cliosoft.
They were doing data management, so I jumped in.
Great.
Yeah, I know you from Cliosoft.
So what do you think are the biggest challenges in the industry right now in regards to data management?
Good question.
So right now all the challenges, one way or another, revolve around two things that are coming together.
First of all, there's this decades-old trend in which we see projects getting more and more complex.
So complexity here comes in the form of more files, bigger files, but also different domains and disciplines that come together.
So different ways of working, different workflows.
So this means that in a single end-to-end workflow, you have many tools coming together.
So you have to provide a way for the data to flow through all these different tools, all these different teams, different people that have different ways of working.
So all this complexity definitely piles on.
And this is all about data management, not just data warehousing, but connecting all these different tools together.
And beyond that, we're also at this junction point where everything is about AI.
And before even getting into AI/ML workflows, we have to make sure that all our data is properly organized, sorted, cataloged, and labeled, even before we can start thinking about AI.
So we have these two things that are coming together, right?
So one is the complexity of projects, and this is something that we've been observing for a long time,
but also this growing demand to have an AI/ML roadmap, which creates this pressure to have a properly lined-up data infrastructure.
This is where Keysight comes into play.
Can you elaborate a little more on how your product SOS addresses those challenges?
Yes, so there are a couple of things that SOS provides.
First of all, it provides a centralized point of access
for all your knowledge.
So you have all these different flows coming together,
all these different people collaborating,
and we're talking about people that are collaborating
in different parts of the world.
But no matter where they are, all their data,
one way or another, goes to a centralized entity,
which is SOS.
Now, the way the data is consumed from all these different workflows into SOS is via its portfolio
of integrations with the different tools that are available in the industry.
So we don't really force our way into engineers' day-to-day.
We provide integrations that seamlessly bring data management into the workflows that they
already use day to day.
So you have people that keep working just like they do, but the data is being fed into
this centralized knowledge architecture, which is SOS.
Then of course, SOS is organizing the data,
cataloging it and exposing it, essentially creating
what we like to call organizational knowledge,
which is nothing more than making the data
and knowledge that exists in a company available
and visible to everyone.
But in practice, what this means is that we're
opening avenues for people to consume data,
which then creates the demand
for governance and security and safety, which SOS also provides.
There's a lot of granularity in how SOS serves your data through a secure channel.
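To make the idea of granular, governed access concrete, here is a minimal Python sketch of per-object access control on design data. It is purely illustrative: the names (DesignObject, can_read, the group labels) are hypothetical and do not reflect SOS's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of per-object access control on design data.
# This is NOT the SOS API; names and structure are invented for clarity.

@dataclass
class DesignObject:
    path: str                        # e.g. "libA/cellB/layout"
    allowed_groups: set = field(default_factory=set)

@dataclass
class User:
    name: str
    groups: set

def can_read(user: User, obj: DesignObject) -> bool:
    """Grant read access only if the user belongs to at least one
    group that is explicitly allowed to see this object."""
    return bool(user.groups & obj.allowed_groups)

layout = DesignObject("libA/cellB/layout", {"analog_team"})
alice = User("alice", {"analog_team"})
bob = User("bob", {"digital_team"})

assert can_read(alice, layout)       # same team: allowed
assert not can_read(bob, layout)     # different team: denied
```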
And then it's all about performance as well.
So SOS scales and distributes your data very efficiently.
And then between the user and the SOS infrastructure,
there is also a high-performance technique that allows the data,
which is typically large, to flow in and out of SOS without creating constraints.
And because we are building this catalog of knowledge that is well organized and identified,
we're also preparing our customers for this new era that is coming about, which is all about
AI.
So eventually this data that has been warehoused for decades now needs to feed into AI/ML pipelines,
and we are preparing our users to do that.
Right.
You know, when I started, version control was open source.
There were all these open source tools.
Why did that go away?
Why do we use your type of tools today?
Well, the tools are still out there.
And if you want to use them, I mean, you can.
I would advise against it, however,
because it's going to bring you
untold suffering and frustration.
And the key aspect here is integration, is EDA awareness.
So even though SOS can act as a general-purpose data manager
and manage any type of data,
the problem with these generic tools is that they
don't really have the awareness and know-how
to integrate with these specialized EDA workflows.
So they can still manage their data,
but the user would have to do it in a way that becomes intrusive.
They would have to probably minimize their Virtuoso window
and then run some Git commands to push their data
into some repository.
And these extra steps might seem meaningless,
but when you pile them on, especially in global teams, they start to add up and create attrition.
So SOS comes in with these integrations into EDA workflows.
Then, you know, the second bottleneck or the second major factor here is performance.
Tools like Git or SVN simply aren't built to handle the data that our users work with day to day.
This data is typically large, which means that all these I/O operations, all this data coming in and out of user workspaces, become heavy.
If you think of a layout that is a couple of gigabytes, it gets heavy quickly if you need to push and consume that data repeatedly.
So SOS has a few capabilities that make this transfer of data seamless and painless.
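The conversation doesn't detail SOS's transfer mechanisms, but one well-known technique in this space, and one Cliosoft has long described for SOS, is populating workspaces with links to a shared site cache instead of copying large files. Here is a rough Python sketch of that idea; the paths, function names, and revision scheme are hypothetical.

```python
import os
import shutil

# Sketch of a "links to cache" populate: large, read-only files are
# symlinked from a shared site cache instead of being copied into
# every engineer's workspace. Paths and names are hypothetical.

CACHE_ROOT = "/site/cache"  # shared, read-only file store

def populate(workspace, files):
    """files maps workspace-relative paths to cached revisions,
    e.g. {"libA/cellB/layout.oas": "libA/cellB/layout.oas@7"}."""
    for rel_path, cached_rev in files.items():
        src = os.path.join(CACHE_ROOT, cached_rev)
        dst = os.path.join(workspace, rel_path)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if os.path.lexists(dst):
            os.remove(dst)
        # A multi-gigabyte layout costs one symlink, not one copy.
        os.symlink(src, dst)

def check_out_for_edit(workspace, rel_path):
    """Break the link and take a private copy only when the user
    actually needs to modify the file."""
    dst = os.path.join(workspace, rel_path)
    if os.path.islink(dst):
        target = os.readlink(dst)
        os.remove(dst)
        shutil.copyfile(target, dst)
```

The point is simply that checkout cost stops scaling with file size for data the user only reads.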
So to wrap up, it's mostly two points: one, integrations into these EDA workflows, and two, performance.
So these open source solutions simply cannot compete.
If you broaden the question a little bit and you ask me, would it be possible one way or another to use open source solutions?
Yes, it would, but you would quickly cross a threshold beyond which this approach would no longer be feasible.
So you would probably have to have a dedicated IT team maintaining these tools.
You would have to build your own integrations with EDA workflows, and you'd have to work through the scalability of these tools yourself.
So if you account for all the investment that would be necessary to maintain this infrastructure based on open source solutions, it quickly becomes unwieldy and no longer cost-effective.
So, in fact, even though it might be counterintuitive, the total cost of ownership of SOS is lower than open source solutions.
Right. Yeah, that was my experience, too. The open source was very slow, by the way. And our open source expert left the company, and, boy, we really had a couple of challenging months trying to figure everything out. But they actually went to SOS, a commercial package, and did quite well. Yeah.
Can you share another customer's success story or a use case or something like that?
Yes, let me think about something with a few figures that I can point out.
Okay, so I have a good one, right?
So this is about a big design house that is working on analog and mixed-signal projects.
I'm not sure of their stance with respect to me using their name in such a situation,
so I will keep the company unnamed, but it's a big, well-known company that is
working on analog mixed-signal projects. So just to give you a rough idea of the size, we're talking
about teams of 150 to 200 engineers scattered across multiple sites worldwide. They probably work on
something like 50 to 60 projects a year, and this is big analog, small digital. So their main
problem was around promoting a workflow that encouraged and made it easy for IP to be
reused. So they had a lot of IP that was being stored in their legacy databases. This
IP was useful and obviously they didn't want to reinvent the wheel every time they wanted
to reuse what was already available. Secondly, it was about being able to rebuild legacy
projects that they had in their infrastructure.
So, typically speaking, whenever they wanted to go back to an old project and rebuild it,
this would take a week's worth of work or something like that.
And then they wanted to minimize the inefficient communication between engineers.
So they estimated that per project they had probably 100 email threads going back and forth
to discuss this and that about the project.
So the main goals of this deployment were: one, to promote IP
reuse. Second, to eliminate silos, to make sure that everyone was aware of which IP was available in the company, to have a process to consume that IP, and to normalize these IP development processes across the company. And also to minimize attrition from inefficient communication and the bloated processes that they were following.
So what we did was deploy SOS in their environment, the two tiers of SOS: the
core tier, which brings the day-to-day version control, data management, and overall lower-level
capabilities, and then our enterprise collaboration tier, which brings this layer of IP management
into their infrastructure. So we built an IP catalog where they can publish all their IP,
and then others could go in and search for IP for this or that application and easily
consume it into their own projects.
So the results were very positive.
So the key return on investment was that there was a jump of about 50% in IP reuse.
So this was 50% more IP that was consumed rather than either acquiring IP from a third party or recreating the IP from scratch.
Project bills of materials were five to seven times faster to build.
And they reported, and I don't have hard figures on this,
that they had 80% fewer emails going back and forth during these projects.
And as a bonus on top of all of this, the IP producers also gained visibility over where their IP was being consumed.
So they could generate what we call an IP consumers report, where they could see that a particular version of a given IP was being used in X, Y, or Z projects and by which people, so they gained much more visibility
and traceability over their IP across the company.
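To make the consumers report concrete, here is a small Python sketch of the kind of aggregation such a report performs. The record format is invented for illustration; the real report comes from SOS itself.

```python
from collections import defaultdict

# Hypothetical usage records: (ip_name, ip_version, project, user).
usage = [
    ("bandgap_ref", "2.1", "chipX", "alice"),
    ("bandgap_ref", "2.1", "chipY", "bob"),
    ("bandgap_ref", "1.4", "chipZ", "carol"),
]

def consumers_report(records):
    """Group consumption by IP and version so a producer can see
    exactly where each release of their IP is in use."""
    report = defaultdict(list)
    for ip, version, project, user in records:
        report[(ip, version)].append((project, user))
    return dict(report)

for (ip, version), uses in consumers_report(usage).items():
    used_in = ", ".join(f"{proj} ({user})" for proj, user in uses)
    print(f"{ip} v{version}: {used_in}")
```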
That's impressive.
Last question, Pedro.
How do you see the role of data management evolving in EDA?
And talk a little bit about AI as well.
Wow, that's such a good question.
So I would say that data management will no longer be deployed in workflows as a point
tool to just do version control.
It will be more of a platform approach, where data management is a substrate that needs to be there and needs to connect to all the tools and all the workflows that exist in the environment.
Because all this data needs to be warehoused and cataloged in an organized manner.
This is no longer a nice-to-have.
It's actually a necessity for this new world.
And because we're opening avenues for data to be consumed, security and, generally speaking,
governance of this data will also no longer be a nice-to-have. It will be
a mandate coming from the board level, because the data needs to be secured not just for
the sake of security, but also because all of this will tie into this new era of AI/ML pipelines,
where you're now suddenly feeding this data as context, or as the main knowledge, into entities like
copilots and agents that are going to learn about all this data that has been produced over decades.
And you definitely want to make sure that you do not have leakage of IP or information shared
in some prompt that it shouldn't have been. So you need to have all these guardrails around your data
to make sure that you leverage the capabilities that AI brings in a safe way.
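As a purely hypothetical illustration of such a guardrail, the sketch below filters retrieved documents by access label before they are assembled into a prompt; none of these names come from SOS or any particular AI stack.

```python
# Hypothetical guardrail: only documents the requesting user is
# cleared for may be injected as context into a copilot or agent.

def build_context(user_groups, retrieved):
    """Each retrieved document carries the access labels it was
    cataloged with; drop anything the user cannot see."""
    allowed = [
        doc for doc in retrieved
        if user_groups & set(doc["allowed_groups"])
    ]
    return "\n\n".join(doc["text"] for doc in allowed)

docs = [
    {"text": "PLL design notes...", "allowed_groups": {"analog_team"}},
    {"text": "Unreleased SerDes IP spec...", "allowed_groups": {"serdes_core"}},
]

# An analog engineer's prompt context excludes the SerDes spec.
print(build_context({"analog_team"}, docs))
```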
So it's definitely going to evolve into this data engine that feeds into these new workflows
that are coming into our industry and these AI/ML pipelines that we're observing being
deployed more and more. And we need to be AI-ready in order to be able to leverage the value that these
workflows bring.
I agree completely. Great conversation, Pedro. Nice to speak with you again.
And thank you for your time.
Thank you, Daniel.
It was great to be here.
That concludes our podcast.
Thank you all for listening and have a great day.
