Disseminate: The Computer Science Research Podcast - High Impact in Databases with... Moshe Vardi

Starting point is 00:00:00 Hello and welcome to Disseminate the Computer Science Research Podcast. I'm your host, Jack Wardby. The podcast is brought to you by Pometry. Pometry are the developers behind Raftree, the open source temporal graph analytics engine for Python and Rust. Raftree supports time traveling, multi-layer modeling and comes out of the box with advanced analytics like community evolution, dynamic scoring, and temporal multi-use mining. It is blazingly fast, scales to hundreds of millions of edges on your laptop, and connects directly to all your data science tooling, including Pandas, PyG, and Langchain. Go check out what the Pometry guys are doing at www.rafty.com, where you can dive into their tutorial for the new 0.80 release. So it gives me great pleasure to say that I'm going to be joined today by Moshe Vardy. Moshe is the Karen George Distinguished Service Professor in

Starting point is 00:01:12 Computational Engineering at Rice University. Welcome to the show, Moshe. Thank you for having me. Let's start off with your story then. So kind of, I guess, walk us through your journey from, I guess, day one really when did when did you decide to become a researcher what was it about computer science and databases that attracted you well you have to go back i have to go back to my childhood sometimes in the early teens my parents bought me like a 12 volume series it series. It was called The Young Technologist. It was really, and it used to be a magazine by that name in Hebrew,

Starting point is 00:01:51 was published in the 50s and the 60s. And then somebody did, the magazine stopped, but somebody took all the issues and put them together as a series of 12 volumes. And they gave it to me, and God knows how many times I've read that series. And this is when it become very clear to me that I want to become a scientist. So I had no idea exactly what scientists do

Starting point is 00:02:15 other than they made these amazing discoveries. How does science work? I had no idea. I really don't have any, I didn't have any mentor. I grew up on a kibbutz. So I was mostly focused on farming, hospitality. My father, however, became a teacher. So I had a little more kind of intellectual influence.

Starting point is 00:02:37 But my father went to college just a couple of years before me. He went to college as an adult. But I mean, to me, going to college, I want but i mean to me going to colleges i want to become a scientist that was very very clear so you go to your undergraduate degree and there you get a good sense in fact i major in fees my manager major was physics but somewhere we're talking about now we're talking about you know the 19th early 70s by the time I went back to, I finished my military service, it's now we're talking late 70s, and I had this

Starting point is 00:03:09 inkling that computers are going to be a big deal. That's a good foresight. Now, you know, now you think of course computing is a big deal. You go back to the 70s, you have to remember that the microprocessor is, you know, 1974.

Starting point is 00:03:27 I mean, early 70s. So there was nothing very clear at the time. Computers are a big deal. The information processing, they were all in the glass house. You access them using punch cards. You know, terminal was a big innovation. So there was no inkling that computers are going to be such a big deal. Looking back from where we are now,

Starting point is 00:03:49 it's so hard to kind of believe that was once the reality, but yeah, I mean, and, and yet, yet I had this intuition that computing is the wave of the future. I kind of tried to go back to my, I wish I could go to a time machine and interview,

Starting point is 00:04:02 interview the young Vardy and ask him, how did you have this intuition? What was the base of your intuition? But, you know, I don't know. I enjoyed programming. So I decided to go, when I go back to graduate school, I will leave physics behind and I will go to computer science. And, you know, there is a phrase that says that all the big decisions in life you make when you're young and inexperienced. Yeah, that's very true. And you hope that you're making good decisions, but this was a great decision.

Starting point is 00:04:32 So young Vardy made a good decision. Computing is the future. How did you start with databases? You know, as many things in life, serendipity is a huge factor. There was a graduate seminar. And what the graduate seminar was, the professor assigned papers to read. And you had to read a paper and give a talk on the paper. That's it. And so he gave me a paper that he found interesting.

Starting point is 00:04:59 It was in database theory. And really, my task was to give a talk on that paper. But the very end of the paper, they posed a research problem. And I was intrigued and I told him, no, this sounds really interesting. Maybe I'll think about it. And the rest, as they say, is history.

Starting point is 00:05:16 So unlike computing, which I started with some intuition about the future, database was just pure serendipity. Here is a problem i find it interesting mathematically let's think about it that's really how it started what was the paper i think it was by meyer sagiv and yanakakis and this was a topic that is less active today but but at the time it was people realize that the data that you get, you know, the issue we're struggling with today, you know, you fill in your relation with data. How do you check for errors?

Starting point is 00:05:55 Maybe there's some errors in entering the data or where you got the data. So people came up with what we call now integrity constraints for example if a person work in organization they probably only have one salary they have two salaries something is fishy there okay this is one of the effect of life you only have inside organization you have a single salary okay that's the way they like it so this is the fact that every employee should have a unique salary that's what we call integrity constraints but the way you describe it. So this is the fact that every employee should have a unique salary. That's what we call integrity constraints. But the way you describe it now is there is a dependency between the employee number and salary. So it's not just a dependency. It's what we call a functional dependency. There should be a function from employee number to salary. It's a function.

Starting point is 00:06:39 So this became known as functional dependencies. And all the way from the beginning, the early days of the relational model, Ted Codd, who had the idea of the relational model in 1970, introduced functional dependencies as a way to also suggest to you how to organize your data. And then people started introducing other types of dependencies. And so for a while, it was part of what are dependencies? What kind of other dependency we have? So there was something called functional. There was a function. But then somebody introduced multivalue dependency, which is, in some sense, it's a function from, again, employee number,

Starting point is 00:07:21 but to a set of things. For example, maybe if the database has also children, it's a function from an employee number to the set of, that's a dependence that this employee has. So this became known as functional, as multivalue dependencies. And then people start realizing that once you have these dependencies, you can learn more about the data by making inferences, reasoning about these dependencies. So you mentioned, I'm interested in automated reasoning. This was kind of the first example. So let's suppose that there is a mapping between a functional dependency between employee and department.

Starting point is 00:08:03 Every employee works for a unique department. And every department has a unique manager. Now you can infer that every employee has a unique manager. So people started having reasoning about data dependencies. And so the question that was posed in that paper was, is there a system of rules by which you can reason about a class of dependencies called joint dependencies. And joint dependencies are generalization of multivalue dependencies. I won't get too

Starting point is 00:08:33 technical here. But now it's a question, you know, now one of the things that people who are interested in computing, usually these people love to solve puzzles. So why do we like to solve puzzles? That's an interesting question. Who cares about the puzzle? But somehow the puzzle, it's your brain against the puzzle. Okay. And one of the things that I'm very often, the people who are attracted to computing have kind of a bit of brain yaks.

Starting point is 00:08:59 They like to think about challenges. They enjoy kind of intellectual challenges. And they also like a world where the rules are very, very clear. So when you write a program, there are clear rules. The program has to compile and run correctly. Okay?

Starting point is 00:09:15 So the first time you run a program and you get, I remember the very time you get the program and you get, you think maybe there was a compiler error. And then you realize, no, it's not a compiler error. And you get wrong result.

Starting point is 00:09:28 And eventually you learn all the mistakes are yours. I mean, this is a fantastic, this is a fantastic, I look at it and say, what a fantastic training in humility. You come and you, you know, nobody likes to admit their mistake. That's very true. And so you write a program, it doesn't work. You think work you think you know something must be wrong the computer is buggy the compiler is buggy something is wrong and the answer is you go to your professor said look i wrote a completely correct program i get wrong result the answer is go check it again okay the bug is in the you made it but you're fully responsible for your own bugs yeah i've never encountered a bug where you could say,

Starting point is 00:10:06 well, it's a compiler error. I've never yet. I mean, it's not impossible because a compiler is a piece of software. But it goes through such extensive testing that I have never run into a case where the program was right, but the compiler was in error. So it's also a fantastic training in humility. These are the rules. Live by the rules.

Starting point is 00:10:26 Live and die by the rules. Also, it has, I think, many people that attract to computing, especially when I was 16 when I started playing with computers. And I would say it's probably the people like me who get attracted to computing are not always people with the most elaborate social skills. skills and the nice thing about computers the rules are clear on the other on the other hand when you're age 16s the rules about girls are nothing nothing but clear okay yeah very puzzling very puzzling it's very hard to figure out what are the rules so here is a place where you can challenge yourself and the rules are clear.

Starting point is 00:11:07 So you kind of, you get attracted to just, it's you against the difficulty of the puzzle. And programming by itself is like puzzle solving. But when you come to doing more, what we call especially theoretical research, again, it's just a mathematical problem. You're pitting yourself against a mathematical problem. And so I read this problem.

Starting point is 00:11:28 I said, wow, this sounds like a very interesting problem. And you start thinking about it and thinking about it and thinking about it. Ultimately, that gave rise to my master's thesis and eventually also to my PhD dissertation. Just think about this problem, trying to solve it. And usually when you see thing you say well you know what maybe if we generalize it it will be either easier to solve sometimes because you it's more abstract you can you find a way to easier to solve or by showing that you cannot solve it okay so partly what my dissertation was about was generalizing this class of functional

Starting point is 00:12:07 and multivalued and joint dependencies and coming up with a general logical formulation of such dependencies. Today they're called TGDs and EGDs, and I don't think we need to dive into that. And then I could say, okay, when can we solve this problem? And when is it not solvable? So one of the things that I found when I was a master's student, I was in a class where we learned about unsolvability. There are certain problems that are not solvable. The most famous one, of course, is the halting problem for Turing machines, okay? Unsolvable problem. And I learned it for the first time it's an amazing result negative result in mathematics usually much more difficult and quite amazing to prove that something

Starting point is 00:12:50 cannot be done to prove that something can be done you just show how you do it but to show that something cannot be done that's an amazing intellectual intellectual thing you know it's going back can you square the circle okay can you trisect an angle with a ruler and a compass? Okay. These kind of negative results are some of the most amazing results. Can you solve quintic equations by means of polynomials and radicals? All these are negative results and usually require a lot of advance to show, no, this cannot be done. So when I learn about unsolvability or undecidability,

Starting point is 00:13:27 it's mind-blowing. Now, I remember going to my advisor, Katriel Be'eri, also a very well-known database researcher, and I tell them, I just learned about undecidability today, maybe, or this week, maybe this problem I'm working on, maybe it's undecidable. And he said, oh, oh come on this is this is not some fancy theory this is database this is we are practically practically systems people

Starting point is 00:13:50 but eventually it was shown that i was able to show that this problem was undecidable do you remember the moment when you when you finally managed to demonstrate it how did that what did that feel like and was it the classic eureka moment or was it more kind of it happened slowly over time and then it was done? Well, the way we prove undecidability is today we show if you can solve this, you can solve the halting problem. That's how we show it. Since we know already the halting problem is undecidable,

Starting point is 00:14:21 to show that something else is undecidable, you show if you can solve it, you can solve the halting problem. But usually it's not directly. It's what we call, this is calledecidable. To show that something else is undecidable, you show if you can solve it, you can solve the whole thing problem. But usually it's not directly. It's what we call, this is called a reduction. You saw if I can solve one problem, I can solve another problem. But now we can have a chain of reduction.

Starting point is 00:14:37 You say, what, you know, if you solve B, you can solve A. And if you can solve C, you can solve B. And if you can solve D, you can solve C. So solve B and if you can solve D you can solve C so I was able to show something like well if you can solve Z then you can solve Y and Y was already connected

Starting point is 00:14:54 by a chain of reductions and so partly you kind of have to find the right problem you have to find the right Y you have to find the right problem that on one hand it is already known to be undecidable. And on the other one, it's close enough to your problem that you can do this reduction. So partly

Starting point is 00:15:15 it's really kind of a bit fishing in the dark. You get some flavor, what kind of problem I'm looking for. And I spend lots of hours in the library hunting for problems. And remember, there's hours in the library hunting for problems. And remember, there's no Google search. You want to search, you go to the library, you pull up volumes, and you look through papers. There was one advantage of this. It sounds like today, it sounds like practically medieval. But it has one advantage today.

Starting point is 00:15:39 You do a Google search, and you get your result immediately. There, you had to actually go through many, many journals. And you make a lot of serendipitous discoveries along the way. It's the same thing that when you look for a book in the library, you go to the card catalog and you kind of look through it. But you have to go through the cards. It's called a card catalog. There are cards for each book. As you go through the cards, you see called a card catalog. There are cards for each book. As you go through the cards,

Starting point is 00:16:06 you see other books. Well, that would be interesting. And then off you go. This might be interesting. When you go to the shelf, you look for your book, but there are other books around it. So you make, you know,

Starting point is 00:16:16 we have become, this is another theme of mine. I'll go in a soapbox that computing as a discipline is very focused on efficiency. Let's do things in the most efficient way. And the answer is that, yes, it is efficient. But efficiency is a short-term optimization.

Starting point is 00:16:36 But it's not always a long-term optimization. And in fact, we made this big realization. I got this big insight during COVID. And you can ask, you know, there were lots of warnings about the global pandemic. You can go through the 2010s and you find a book from 2011, The Coming Global Pandemic. And you ask yourself, how come we're caught so unready? And the reason is because it costs money to get ready for a pandemic. That's true. Now, it costs way more money to money to get ready for a pandemic. That's true.

Starting point is 00:17:05 Now, it costs way more money to be caught unready for a pandemic. There's no comparison. It may have cost billions to get ready. But today we know that cost of a pandemic, people have tried to compute the economic cost of the pandemic economically in terms of loss of life. Just the United States, we're talking something like, I don't remember the precise number, but somewhere maybe around $2.5 trillion. So investing $10 billion in pandemic preparation would have been an enormous bargain.

Starting point is 00:17:35 But $10 billion, who wants to spend $10 billion? Maybe a pandemic would not happen. So we tend to be very cheap when it comes. We want to be efficient and also spend money when we don't need it. So again, we've become very efficient with computing, but we're not counting the loss. Okay. And part of the loss is that when I would look for a book in the library, I would discover

Starting point is 00:17:58 other books. When I look at the catalogs, I would discover other books. When I go look for articles, I will discover other articles. And as technology gets better and better now, this is part of my worries. Are we again becoming more and more efficient? So still, when you do a Google search, at least you see the first screen. Nobody look beyond the first screen, but at least you see the top 10 links or something on the first screen. Now, when you go to ask a child GPT, that's it.

Starting point is 00:18:28 You only get the answer. They are never the second and the third and the fourth and the fifth hit. So we're becoming more efficient. What are we losing? Anyway, going back there, eventually I found problem Y that I could reduce problem Z to. And I was able to do it. And I mean, let me kind of regret, digress a little more. When I was about 10 years old, another formative experience, I mentioned this

Starting point is 00:18:54 encyclopedia, The Young Technologist. Another formative experience was I had a math, this is elementary school, I'm in fourth grade, and I had a math teacher who really would like to have been a mathematician. But at this point, he's just teaching math to elementary and high school students. Eventually, he went back to university, finished his PhD, and became a mathematician. But at the time, he was just teaching, but he wanted to more than just teach. So he decided to create a math club to take the kids who are a little better in math and then maybe better than the average and have special

Starting point is 00:19:30 advanced math classes to teach them in the afternoon and the club did not last very much not enough kids were interested in math but i remember the very very first the very first or couple of of classes he talked about matrices and determinants. And my mother told me later, I came home, and she asked me, so how was it, Moshe? And I said, oh, it was beautiful, I said. And my mother had very little formal education. Her math meant she could do arithmetics.

Starting point is 00:20:01 Okay, that was her math. She could add and multiply numbers. That was her math. And so the concept of beauty in math was completely foreign to her so when i said it was beautiful she said do you mean interesting she told me the story later i have no memory she told me later she said you mean interesting i said no i said beautiful i mean beautiful so i found the beauty in math and you can't explain beauty in math the same way you cannot explain beauty in other things. Okay? I mean, most beauties are mystified by the phenomenon of beauty.

Starting point is 00:20:34 Because beauty in people, I think, is somehow one or another related to sexual attractiveness. So it's kind of we're biologically primed to be attracted to or realize attractiveness of other people. If it's someone of the gender you're attracted to, then you're attracted. Or you see someone of your gender, you say, well, that's a competitor. So you are somehow making an adjustment, okay?

Starting point is 00:21:00 But I look at mountain scenery and I'm awed by mountains. And I found a mountain scenery and I'm awed by mountains. And I found a mountain scenery beautiful. And I asked myself, why do I find it beautiful? I don't know. You look at a bevy of butterfly. We have here in Houston, Museum of Natural Science. And they have a butterfly pavilion.

Starting point is 00:21:23 And you go there and it's full of butterfly. And I said, oh, this is so beautiful. And I asked myself, why do I find butterflies? They're colorful, okay, but why do I find them beautiful? And I cannot explain it. It's just, I find it beautiful. And I find math beautiful. And I have no explanation.

Starting point is 00:21:39 It's just the sheer elegance of the abstract structure and the hidden structure you find, you know. Think of something like set theory or graph take graph theory set theory is even more what is set theory what is a set a collection of objects and then you start saying okay if there are two sets i can define a relationship between them one set is contained in another set Every element here is also element there. All apples are fruits. It's a containment relationship. And then you start building on that, and you form mathematically one incredibly deep theory with deep, deep mathematical questions that are still open.

Starting point is 00:22:19 And it's an amazing phenomenon. It's just absolutely amazing phenomenon. And again, there are people who look at math, and they say, well, all you see here is, excuse me, flies making, excuse me, shit on the page, right? All these symbols, okay? And I look at them and say, wow, look at this beauty. So I always find beauty in math. And theoretical computer science is, I don't like to call it is a branch some people call it just math to me it's more a particular type of applied math because we have

Starting point is 00:22:52 an application at the end so math mathematician gets permission from society it's an amazing provision of society it might be we don't know what it's good for, but do it anyway. Why? Because history has taught us that you can do something which seems completely useless, and then later it will become useful. So Hardy famously thought about like analytic numbers. I mean, number theory, prime numbers, totally useless. And he enjoyed the useless beauty of it. And he was very proud. This is of no use to anyone. And

Starting point is 00:23:28 then prime numbers became the foundation of cryptography today. So we are telling mathematicians, go ahead. Go ahead. Do the math. Maybe one day it will be useful. We're just giving you a very long rope. Just do it. We don't know what it's good for. Go ahead with it.

Starting point is 00:23:45 The same thing is true in some theoretical computer science. Just do it. We don't know what it's good for. Go ahead with it. The same thing is true in some sense for theoretical computer science. Just do your theory, but we hope it will be useful. My critique here, and I think that many theoreticians think, okay, I have, just like mathematicians have permission to do just whatever they want to do, we should have permission to do the same. But that mathematician can point out a number of huge success of theory. And I think computer science doesn't have such a long list of theoretical results that end up being very useful.

Starting point is 00:24:19 There are a whole bunch of them. So in fact, we have today a special award, the Kanellakis Award for theory and practice. And this is for theoretical advances that did prove to be ultimately useful. Which were exactly? So there are many such like that. I received one for developing for another line of work that have nothing to do with databases, which look at formal verification. How do we prove properties of systems, hardware or software systems?

Starting point is 00:24:49 And there was a bunch of ideas people came before, which is that, you know, how do you, first of all, how do you prove correctness? What is correctness? So the idea was that to specify correctness, you should completely stay away from how it's being implemented.

Starting point is 00:25:04 You just should talk about how does it behave. So if you're talking about, for example, about a vending machine, if I push the coin and I punch the button for what kind of drink I want, I should get the drink. How it's implemented, most of us have no idea what's inside the machine. Put the coin, push the button of what you want, okay? I'm a black tea drinker, okay? Okay.

Starting point is 00:25:32 Without any milk, okay? Is it Yorkshire tea, though? Because that's the important thing. Yeah, yeah, Yorkshire tea. No milk, just black tea. I like Earl Grey, it's my favourite. Okay, yeah, yeah. I should try Yorkshire, Yorkshire. I'm writing down Yorkshire tea. I like Earl Grey. It's my favorite. Okay. I should try Yorkshire.

Starting point is 00:25:46 Yorkshire. You put it in my list. I'm writing down. Yorkshire tea. So when you say, what does it mean for the vending machine to be correct? It's completely at the interface level. If you put the coin and push the button, and probably in that order, first push the coin, then push the button, then the drink will come out.

Starting point is 00:26:09 Exactly the drink that you ordered. Black box. Yeah, it's a black box. So we said that's the way you specify Grunhoff system. It has to be a black box. Okay? It's just at the level of the interface. And an amazing discovery that was made by Amir Pnoel in the 1970s was to borrow a logic that was developed by philosophers. And by the way, he told me the story was that he was reading a book by a logician. And then as he finished the book and he decided the book was actually

Starting point is 00:26:43 not relevant to what he was interested in. As he closed the book, on the back cover finished the book and he decided the book was actually not relevant to what he was interested in. As he closed the book, on the back cover of the book, it says other books by the same author. And another book was recommended. He said, oh, this sounds interesting. And that book was called Tense Logic. Tense Logic was a bit old name, but today it was called Temporal Logic. And he found the other book and he said, that's what I was looking for. So now

Starting point is 00:27:10 we have a formal logic that can talk about this abstract behavior, the interface level. And that's the logic he proposed, temporal logic. And ultimately he received the Turing Award for this discovery that you can use. And this was, until that time, it was a logic pursued purely by not even mathematicians,

Starting point is 00:27:30 philosophers, because the goal was to try to understand how people use time when they talk about time. How do people reason about time? And so the philosopher did what also mathematicians do. They try to abstract it by creating a temporal logic, a logic just about time. And, you know, there was partly the debate was debated later actually came also to be a debate inside computer science. Is time linear or branching? So linear means, you know, at the end of the day, you know, let's imagine at some point you live your life and then you get to the pearly gates and they say what did you do in your life and the answer would be well you did this and

Starting point is 00:28:10 then you did that and then you did this and you did that will be a one line this is the story of my life on the other hand if we believe that we have free choice then as we look forward we see a structural forking path i may do this or i may do that and if i do this then it will be another choice so as you're comprehending your life now you see lots of possible paths so should we think of of time at the conclusion so to speak so it's just one path or should we think of time as a trio forking path. And this seems a very abstract philosophical problem. Is time linear or branching?

Starting point is 00:28:53 And I use it sometimes. I find myself at some forum in which somebody wanted to debate whether there's really human-caused climate change. And I said, before we discuss climate change, let me pose another question. Is time linear or branching? And I explained the two choices. What do you think about that? And of course, he didn't have the dimmest idea about this. And I said, what do you think?

Starting point is 00:29:21 You can't even answer a very simple question in time. You experience every day. What do you think? You can't even answer a very simple question in time. You experience every day. What do you think you could reason about climate change? But interestingly, this debate about the nature of time ended up being a very pragmatic debate inside computer science. But the point is, I can put it on the stack, that story. But the point was, Pnueli said, this logic developed by philosophers is exactly the logic that I need to describe precisely. What does logic give you?

Starting point is 00:29:54 Logic is a formal language to describe properties of structures. So logic was, for its first more than 2,000 years, was philosopher trying to describe what is correct reasoning. That was the purpose of logic, to try to formalize correct reasoning. If I have an argument, am I, you know, two people arguing and we have to subject the arguments

Starting point is 00:30:17 to the test of logic. Does the conclusion follow from the premises, right? An argument start, this is my premises and I will show you from my premise how the conclusion follows. follow from the premises, right? An argument starts, this is my premises, and I will show you from my premises how the conclusion follows. And so logic was the study of arguments, of correct reasoning. Then something amazing happens in the late 19th century.

Starting point is 00:30:37 First, there is a foundational crisis in mathematics because what happened is that we talk about set theory, and then one of the operations that happens in set theory, you can take a set, and you take the collection of all subsets of that set. So you start, let's say, the set of natural numbers. That seems very intuitive to us. We all know what the natural numbers are. Now let's look at the set, the collection of all sets of natural numbers.

Starting point is 00:31:06 This is called the power set. And then in the 1870s or 18, I forgot the precise year, but sometime in the second half of the 19th century, Georg Cantor shows that the infinitude of the power set is larger than the infinitude of the natural numbers. So there are infinitely many natural numbers. Surely there are infinitely many sets of natural numbers. But he proved that it's a bigger infinity. Yeah, which is always mind-bending. When everyone ever says that to me, it's mind-bending.

Starting point is 00:31:37 It's mind-bending. In fact, he proved there are infinitely many distinct infinities. Okay. So it's really mind-banging. I remember once, try to explain it to my, by now departed brother-in-law. And he just could not comprehend it.

Starting point is 00:31:54 I mean, he is a lay person. And he just could not comprehend it. There are different type of infinities. It's either infinite or not. Come on, what is this? It's like, this is not asking how many angels can dance on a pinhead. What is the infinitude of angels that dance on a pinhead?

Starting point is 00:32:16 Okay. And later on, in fact, we get into paradoxes. The Russell's paradox is seemingly a contradiction in said theory. And people say, look, we have to have put mathematics on solid foundations. What could the solid foundations be? And the answer was it was proposed, and it was good while it lasted, eventually did not succeed,

Starting point is 00:32:38 was logicism. Logic is the absolute investigation of truth. So let's base mathematics on this absolute investigation of true and false. So suddenly logic become, from just talking about reasoning, just arguments, become a language to describe all mathematical structures. And so suddenly we have a language to talk about. And so the answer was, people

Starting point is 00:33:05 ask, what is the language of mathematics? And if you look at, you open a math paper, it's going to be a mixture of English and formulas. And Georg Grotlob said in a book in the 1870s, everything can be formalized in logic, in first order logic. So now we have logic. So what the advantage is, now we have logic so what the advantage is now we have a language unlike english which is incredibly ambiguous just think of the example if i tell you i only bought your car i only bought your car i only bought your car i only bought your car you see just by changing the stress, I'm changing the meaning of the

Starting point is 00:33:47 sentence. The most famous case, the most famous of the ambiguity of natural language was stated by Bill Clinton, who was challenged because he stated about his relationship with Monica Lewinsky, he said, there's no

Starting point is 00:34:03 sex in that relationship. And when he was challenged later, his answer was, it depends on the definition of is. And most people thought that he was some smarmy lawyer. But the word is in English is ambiguous. Turns out it's ambiguous. There are many different meanings for the word is. But logic forces us to be incredibly precise mathematical precision and so i'm really take this attitude you can take

Starting point is 00:34:30 logic from its initial use and you can kind of borrow logic to describe thing instead temporal logic is exactly the language we need to describe abstract behavior of vending machines. Because now I can say very precisely, if you push a button, if you put the coin, and then you push a button, then a drink will come out. So now once you have that, you said, okay, you can now take, you can specify properties of systems. But how do I check that the system satisfied that property? And that was an important progress

Starting point is 00:35:09 of the 1980s that people came up with algorithmic methods to do it. Not just methods to do it, algorithmic methods. That is to say, we now have a system where you give it,

Starting point is 00:35:22 you have tools, but you give it a system. For example, if its hardware will be a Verilog program, Verilog is a hardware description language. You can give it a Verilog program and you can give it a property written in temporal logic. What you'd use today is some industrial variation of temporal logic. For example, there is a language which is in the standard called SVA

Starting point is 00:35:46 for system very log assertions. So system very log is a hard-to-description language and a simulator for it. And now you can write an assertion in SVA and then a tool that would say the system satisfied or doesn't

Starting point is 00:36:02 satisfy. And the question was okay, there was debate, what is a good algorithm for it? And Pierre Wolper and I made a connection through automata theory because automata theory

Starting point is 00:36:15 is also, automata theory define what we call formal languages. And formal language looks at formal words and words are just sequences of characters. So you can think of such as a word as an abstract behavior of a system just a sequence of observations

Starting point is 00:36:32 and in fact the the course if you've taken are you what's your what's your undergraduate degree i i did maths and economics was my maths and economics mainly statistics right because it was like econometrics and then stats. So yeah, that was my major. So if you take a formal languages course in computer science, you learn about automata, which are abstract machines that recognize formal languages. But the typical focus there is on languages that consist of

Starting point is 00:37:01 finite behaviors, finite sequences. But it turns out, already going back to the early 60s, so the automata theory, the foundation was laid out in the mid-50s. But in the early 60s, people said, well, suppose I'm interested in infinitary behavior.

Starting point is 00:37:18 So it's a world that starts at some point, but it just goes on and on forever. Goes on and on forever. Like our system, system you know if you look at the system today in principle you think this laptop of mine should be able to run forever i know that it won't at some point you know the something will happen mechanically will happen to it it will start failing but at least we'd like to think that it's capable of running forever let's imagine that the behavior what is the the behavior? All the inputs it gets, all the

Starting point is 00:37:46 inputs are the keystrokes and the packets that arrive on the network ports and the output is all the pixels on the screen. So now the behavior is an infinite world. And so it turns out that people already

Starting point is 00:38:02 in the 60s came up with an automata theory, what you call omega automata, because it ran on a word is of the same length as the natural numbers. There is a zero slater, the first slater, the million slater, the billion slater, the trillion slater. It just goes on and on forever. And in 1986, Pierre Wolper and I made a connection to this problem of verifying temporal properties via a detour through automata theory. And it was mathematically very elegant. Beautiful.

Starting point is 00:38:36 Beautiful, beautiful, very beautiful. In fact, people say, wow, this is so beautiful. You go through automata theory. And it never becomes very simple if you go through automata theory. And it never becomes very simple if you go through automata theory. In fact, the paper was first rejected because the reviewer said it's too simple. And so Pierre and I added another section

Starting point is 00:38:55 that was complicated. And the paper was accepted. And nobody ever reads the complicated section. The whole point of the paper was it made complicated things simpler. But scientists sometimes would like to see it. It must be,

Starting point is 00:39:07 if it's simple, then anybody could have done it and don't realize that the brilliance is sometimes coming up with a simple solution. The theory in simplicity, right? That's so true. It is simple after you had the idea,

Starting point is 00:39:20 not before you had the idea. Yeah. So, but it turned out in this case, the simplicity meant that this is actually end up being also very practical. So it took some years, I would say, you know, the paper we started working, our first paper was written in 1983

Starting point is 00:39:39 and it took almost 20 years before it gave rise to industrial tools. But today that theory of Omega Automata, that sounds incredibly abstract, is the basis of industrial tools. It's industrial reality. Is that the thing you're most proud of? Kind of all the things you've done?

Starting point is 00:39:58 Has that had the most impact, you think, over all of your research? I would say, okay, so that's the thing that has most impact yeah okay it's you know it was a i kind of remember when when we came up with this observation wow we can use automata and suddenly everything become very simple and both pierre wolper and i thought wow what a what a beautiful idea okay and then the paper get rejected because it's too simple. But ultimately, in this case, simplicity, mathematical simplicity, I mean, it is simple and not simple because it is simple after you laid out an abstraction.

Starting point is 00:40:35 Abstraction is omega words, omega automata. I mean, you kind of have to be very abstract. I mean, I'm explaining to you because you're a little bit of a mathematical background but if i try to take a person that has even that's a college education but not in math that would be very very hard to explain yeah a finite machine running on infinite words it you know the machine has to be finite state that's very very important but the world itself is infinite and it will be very very hard to explain but for people that look at math remember the 10 year old who said this was

Starting point is 00:41:10 beautiful yeah it is i think it's a beautiful idea and i mean it turns out that the connection to automata was essentially followed up from earlier work but the earlier work would have made it was completely impractical in some sense. And we found a very practical way of doing things. And so that ultimately led all the way to industrial tools. And in fact, just last year, Pierre and I received an award for a high-impact paper in electronic design automation because this paper that was written almost 40 years ago is now the basis of industrial tools.

Starting point is 00:41:52 So theory can have lots of industrial impact, but partly the impact happened not just because it was possible, but because, for example, I spent much of the following years working on making it practical. It didn't just happen. So a colleague approached me, Doron Peleg, deserved the credit for coming to me, and said, why don't we try to reduce it to practice? And this was more than a decade after the initial paper. And I said, yeah, let's try that. And he did the coding also.

Starting point is 00:42:29 He took one to do the coding. And it proved to be, while he was exponentially in the worst case, it proved to be actually well-behaved in what we call in the typical case. And so that, first of all, then start other people start doing research on on improving the performance of the tool and by the late 90s i started consulting with intel and at that point academic research showed that this tool can be quite practical so i was able to propose to intel let's implement this approach so in this case i would say say I accompanied the paper all the way from the mathematical conception to the empirical evaluation and then through the tech transfer process. So it didn't just happen because, you know, I could have published a paper and just say, OK, I'm done.

Starting point is 00:43:19 OK, which I have to say, which was the first decade after I did it, I said, prove the theorem, I'm done. Okay. But then I was inspired by, you know, I mean, a good scientist is inspired by people around, lots of smart people around you all the time. So you should take advantage of it and be open to good ideas. And so, I mean, sometimes, you know, I was, you know, as I said, Ron Peleg proposed it to me and I said, let's work on it. And then Intel asked, would you like to come to consult for us? And I said, yeah, that sounds interesting. And then when you're consulting, you have to, I went there with an attitude of, I'm going to go and listen to their problems.

Starting point is 00:44:01 I know people who try consulting, selling, well, I have a hammer. I'm going to sell my hammer to industry. But you have to come and listen to their problem and see, no, I think I can help you with your problems. So that was a very, very productive line of work. I said it took

Starting point is 00:44:17 to get it from conception to practice took about 20 years. It takes some work and willingness to accompany the work along different stages of the work. So that is probably my most influential work in terms of all the way to industrial tools. And now the lesson from that is, and especially today,

Starting point is 00:44:41 when I do even most of my, when I have students who do, I wouldn't call it theoretical work, but they do algorithmic work. And most of them, I tell them, develop the algorithm, then implement it and evaluate it and see how well it works. And very often we find that this is not just a way to make it closer to practice, Kulesil does doing research. But we actually, sometimes we look at the performance and sometimes we implemented something and the performance is pretty bad. But then we ask, why is the performance so bad?

Starting point is 00:45:16 Then we go back and we improve the theory. So this feedback, I mean, this is if you think of science, but if you look very often how natural science is made, okay? Physicists, there are theoretical physicists, and they will just do theory. But everybody understands at the end of the day, the theory has to be subject to experiment. And the experiment will either be consistent with the theory, which will reinforce the theory, or it will say, no,

Starting point is 00:45:45 the theory doesn't match what we are observing. And we go back to that again. And if you have a theory and you say, well, there's no way to test it, then we're saying, well, it's a nice story, but it's not science. I think the same kind of thinking can reply to not everything in theoretical computer science. I think there is room for theory that just, you know, people who prove lower bounds

Starting point is 00:46:06 or undecidability, it's also useful. But I think a lot of theory could benefit from this feedback loop of theory, experiment, back to theory, back to experiments. And sometimes the theory community,

Starting point is 00:46:21 I find, is a bit too elitist and to say, no, no, I just want to prove theorem, you know. And people use the the phrase gets my hand dirty, which I think it's an obnoxious phrase. There's nothing, you know, there's nothing dirty about writing code. You know, in fact, even writing proof is like writing code in some programming language. okay so I think computer science could benefit from more cooperation between theory and experiments

Starting point is 00:46:49 and even in some cases tech transfer fantastic well that's a great message to end on there Moshe thank you thank you so much for this chat

Starting point is 00:46:56 but yeah fantastic I think where can we find you on socials for the listeners if they want to go I mean if you go

Starting point is 00:47:03 at Vardy that works on all social media perfect well i'm on facebook and twitter and mastodon and linkedin great okay thank you very much and um yeah we'll see you next time for some more awesome computer science research Thank you.

Disseminate: The Computer Science Research Podcast - High Impact in Databases with... Moshe Vardi

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.