ACM ByteCast - Edward Y. Chang - Episode 50

Episode Date: March 20, 2024

In this episode of ACM ByteCast, Rashmi Mohan hosts 2021 ACM Fellow Edward Y. Chang, an Adjunct Professor in the Department of Computer Science at Stanford University. Prior to this role, he was a Director of Google Research and President of HTC Healthcare, among other roles. He is the Founder and CTO of Ally.ai, an organization making groundbreaking moves in the field by using Generative AI technologies in various applications, most notably healthcare, sales planning, and corporate finance. He is an accomplished author of multiple books and highly cited papers whose many awards and recognitions include the Google Innovation Award, IEEE Fellow, the Tricorder XPRIZE, and the Presidential Award of Taiwan. Edward is also credited as the inventor of the digital video recorder (DVR), which replaced the traditional tape-based VCR in 1999 and introduced interactive features for streaming videos. Edward, who was born in Taipei, discusses his career, from studying Operations Research at UC Berkeley to graduate work at Stanford University, where his classmates included the co-founders of Google and where his PhD dissertation focused on a video streaming network that became the DVR. Later, at Google, he worked on developing the data-centric approach to machine learning and led development of parallel versions of commonly used ML algorithms that could handle large datasets, with the goal of improving ML infrastructure and accuracy to power Google's many functions. He also shares his work at HTC in Taipei, which focused on healthcare projects, such as using VR technology to scan a patient's brain, as well as his current interest, studying AI and consciousness. He talks about the challenges he is currently facing in developing bleeding-edge technologies at Ally.ai and addresses a fundamental question about the role of humans in a future AI landscape.

Transcript
Starting point is 00:00:00 This is ACM ByteCast, a podcast series from the Association for Computing Machinery, the world's largest educational and scientific computing society. We talk to researchers, practitioners, and innovators who are at the intersection of computing research and practice. They share their experiences, the lessons they've learned, and their own visions for the future of computing. I am your host, Rashmi Mohan. If you've been using the popular generative AI tools in the market only to craft perfect
Starting point is 00:00:35 emails or grammar check your documents, you're selling yourself short. Having been in this field for much longer than the recent popular wave of interest, our next guest can tell us a thing or two about large language models and their applications. Professor Edward Y. Chang has been an adjunct professor in the Department of Computer Science at Stanford University since 2019. Edward has been a director at Google Research in the past, and President of HTC Healthcare, amongst many other roles. He is the founder and CTO of Ally.ai, an organization that is making groundbreaking moves in the field, using generative AI technologies in various applications, most notably healthcare, sales planning, and corporate finance. An accomplished author of
Starting point is 00:01:23 multiple books and many highly cited papers, he has also won numerous awards, including the Google Innovation Award, the coveted Tricorder XPRIZE, and the Presidential Award of Taiwan for his work. He is a fellow of ACM and IEEE, recognized for his work and contributions to scalable machine learning and healthcare. We are so lucky to have the opportunity to speak with him. Edward, welcome to ACM ByteCast. Thank you, Rashmi, for your invitation. It's my great pleasure to join the podcast.
Starting point is 00:01:58 Likewise, we're really excited to speak with you. I'd love to lead with a simple question that I ask all my guests. Edward, if you could please introduce yourself and talk about what you currently do, as well as give us some insight into what drew you into the field of computer science. Okay. I was born in Taipei and came to the US to attend college. I first attended UC Berkeley, majoring in operations research, which is essentially optimization techniques. Then I worked in a software company and got intrigued by programming. But I considered my background insufficient, so I went back to school and joined Stanford to receive my MS and PhD. And the timing was perfect. When I was at Stanford, I was a classmate of Larry and Sergey, the founders of Google. And after I spent about seven years at UC Santa
Starting point is 00:02:47 Barbara and received my tenure, I joined Google, because they have so many machines and can do parallel processing. At the time, I worked with Fei-Fei Li to annotate ImageNet. And once we had collected so many images, my lab started parallelizing some mission-critical machine learning algorithms, including support vector machines and LDA and so on and so forth. And it has been a very exciting journey. And recently I started to study consciousness, because as Yoshua Bengio mentioned about four or five years ago, current AI pretty much focuses on computation, which is modeling humans' unconsciousness. To be able to do reasoning and planning, we have to think about how to model human consciousness.
Starting point is 00:03:34 And generative AI seems to be getting into the realm of human consciousness. So this is really a very exciting era. That's great. I mean, I think our audience would be super excited to hear about this topic. I don't think we've covered this in great detail, and it's so relevant in today's day and age. But I want to go back a little bit, Edward, to what actually was the most driving force
Starting point is 00:03:56 for you to pick computer science, even in your undergraduate education? Were you exposed to it when you were younger? I think initially I had just gotten a job, so I needed to start coding. So that was pretty straightforward. And I really got so interested in coding because computing is really the foundation of science. In many different sciences, we need to collect a lot of data and analyze the data to be able to get some insights. So information processing is a big application, which drove me to dive even deeper into computer science methodologies. Yeah, I mean, I hear you. I think many of us at the time when we probably picked computer science,
Starting point is 00:04:36 early exposure and definitely the excitement of an up-and-coming field that had a lot of jobs was the motivator to sort of get into it. But it sounds like you found a lot of very interesting areas to delve deeper into. As I was reading about your previous work, I was very surprised to learn about one of your very, very early innovations: you're credited with inventing the DVR, the digital video recorder, which really transformed the way in which, you know, we created and saved content. So I would love to hear more about that phase of your work. Okay.
Starting point is 00:05:08 When I was in the PhD program at Stanford, my advisor, Hector Garcia-Molina, asked me to work on infrastructure very similar to Netflix, pretty much just streaming video. And this was about 1995. And the bottleneck is not really on the server side. The bottleneck is really the internet to the home, the last mile. At the time, in 1995, we were accessing the internet using telephone lines.
Starting point is 00:05:36 So even though I had developed these kinds of media server technologies, it was not practical at the time. So for the second part of my thesis, we said, well, if we really cannot do real-time information streaming, can we buffer some data in the local devices? And the best device to do buffering would be the disk, the hard disk, right? So if we buffer some information on the disk with some initial latency, we can do streaming at home. And for real-time TV, we can pause the TV, and the TV program is saved on the local disk. Then we come back to resume, and we can fast-forward. So that was a very simple idea.
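[A minimal Python sketch of the time-shift buffering described here: frames of a live broadcast spool into a bounded local buffer (a deque standing in for the hard disk), so playback can pause, resume, and fast-forward independently of the broadcast. The class and method names are illustrative assumptions, not the original design.]

```python
from collections import deque
from typing import Optional

class TimeShiftBuffer:
    """Buffer live broadcast frames on 'disk' so the viewer can time-shift."""

    def __init__(self, capacity_frames: int = 100_000):
        self.buffer = deque(maxlen=capacity_frames)  # oldest frames drop off when full
        self.play_pos = 0      # index of the next frame to play
        self.paused = False

    def ingest(self, frame: bytes) -> None:
        """Called for every incoming broadcast frame, whether or not we're paused."""
        if len(self.buffer) == self.buffer.maxlen:
            # the oldest frame is about to be evicted; keep the play cursor valid
            self.play_pos = max(0, self.play_pos - 1)
        self.buffer.append(frame)

    def pause(self) -> None:
        self.paused = True   # the broadcast keeps spooling to disk meanwhile

    def resume(self) -> None:
        self.paused = False

    def next_frame(self) -> Optional[bytes]:
        """Next frame to display, or None while paused or caught up to live."""
        if self.paused or self.play_pos >= len(self.buffer):
            return None
        frame = self.buffer[self.play_pos]
        self.play_pos += 1
        return frame

    def fast_forward(self, frames: int) -> None:
        """Skip ahead through the buffered backlog toward the live edge."""
        self.play_pos = min(len(self.buffer), self.play_pos + frames)
```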
Starting point is 00:06:18 And the implementation itself was not extremely hard. Basically, we replaced tape with disk to revolutionize the VCR technologies. And Professor Pat Hanrahan was also a major force behind this, because I took his course and he encouraged us to come up with these kinds of interesting devices to help the world. At the time, multiple companies and participants came to Stanford to see the demo. And one of the visitors later started a company. I think people probably still remember TiVo, which was started about two years after I published my paper.
Starting point is 00:06:57 Yeah, no, of course. I certainly remember TiVo and what a revolution that was. What is also interesting, Edward, is that as somebody who was in academia and, you know, working on your thesis and your PhD program, did they just take your paper and kind of, you know, run with it at that point, think about how they could productionize it and make it into a product? Were you involved in that process at all? Did you have to collaborate with companies outside that were trying to make a product out of your idea? I think, unfortunately, at the time, because we didn't have the sense to file a patent, those founders just took the idea and they started the company.
Starting point is 00:07:30 So I really didn't get heavily involved in the development of the product. But subsequently, after I joined UC Santa Barbara as a faculty member, Sony was very interested in collaborating. They planned to enhance this digital VCR, which only supported one TV, to be able to support multiple TVs at home. And we worked on a prototype to be able to support
Starting point is 00:07:51 up to 20 devices at home. But because the device is very cheap, families today end up with one digital VCR for every TV. But technology-wise, one VCR can actually support 20 TVs. So that's the situation. Then after that, I think the internet bandwidth became really, really high, increasing
Starting point is 00:08:15 faster and faster. And this bandwidth problem got resolved, and the research issue was no longer challenging. So then I switched my focus to machine learning. Got it. I was going to ask you, how did that transition happen? So the next phase of your career was your work that you did at UC Santa Barbara and then at Google. Is that right?
Starting point is 00:08:36 Yes. When I joined Santa Barbara, I said, well, to pursue my tenure, I need to have a very exciting research topic. And the digital VCR was definitely something in the past. And I said, well, maybe using machine learning to identify photos or to process video will be interesting. And after working on the application for some time, I considered that I really needed to get into machine learning, because that's the foundation of object detection and object recognition.
Starting point is 00:09:08 And then I started working on the topic for some time. As I mentioned, I knew Fei-Fei pretty well. And when I joined Google, I collaborated with Fei-Fei and sponsored the project with about $250,000. The reason I joined Google at the time was that they had so much data, right? In a university, you just have no machines, no devices, no resources to process that data. And Google has so many machines, so many CPUs. At the time, we hadn't started using GPUs. So we had MapReduce, and MapReduce ran on so many CPUs at the same time. So I joined Google to start developing parallel machine learning algorithms.
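[A toy Python sketch of the data-parallel pattern this work pioneered: a map step computes a gradient on each data shard, and a reduce step combines them into one update. Logistic regression stands in for the actual algorithms mentioned above; this shows the shape of the MapReduce-style approach, not Google's implementation.]

```python
import numpy as np

def shard_gradient(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Map step: one worker computes the logistic-loss gradient on its shard."""
    p = 1.0 / (1.0 + np.exp(-X @ w))  # predicted probabilities
    return X.T @ (p - y) / len(y)

def parallel_train(X: np.ndarray, y: np.ndarray,
                   n_workers: int = 8, lr: float = 0.1, epochs: int = 100) -> np.ndarray:
    """Reduce step: combine per-shard gradients, as a MapReduce job would."""
    w = np.zeros(X.shape[1])
    shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))
    for _ in range(epochs):
        # in a real cluster, each shard's gradient is computed on a separate machine
        grads = [shard_gradient(w, Xs, ys) for Xs, ys in shards]
        w -= lr * np.mean(grads, axis=0)
    return w
```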
Starting point is 00:09:46 And that was a transition in about 2006. And between 2006 and 2012, my major focus was making those mission-critical machine learning algorithms able to run via MapReduce on Google's infrastructure. Got it. What better place? And at least at that time, the scale of data as well as the infrastructure that Google had could not be rivaled anywhere else. So was your primary focus at that point improved performance of the machine learning algorithms? Is that what you were sort of focused on, or was it accuracy? I'm trying to understand what are the sort of key problems that you were trying to solve? Yeah, my focus was to improve the machine learning infrastructure and accuracy to try to power Google's different applications.
Starting point is 00:10:31 And I myself focused on working on the Q&A system. So we need to do semantic parsing, natural language processing, natural language understanding. So a robust machine learning algorithm would be extremely helpful. But at the time, this is about 2008, most of the algorithms Google employed used a linear algorithm or sublinear algorithm, which means they only want to process the data once. And my colleague told me, you don't want to use a much more complex algorithm. Like support vector machines,
Starting point is 00:11:03 the computational complexity is n squared. And n squared means that if you can process 1 billion training instances in one second with a linear algorithm, you now need to spend 1 billion seconds to process all the data. And you just cannot do that at Google. Google has a lot of data. So when I started to do parallel machine learning on these quadratic machine learning algorithms, my colleagues actually advised me not to do it, because they said, well, you cannot work on something which is so time-consuming. But really, machine learning with big data was a trend. So eventually, AlexNet was extremely successful. Then people said, well, the accuracy is so drastically improved, so now we are willing to put in a lot of money to parallelize our computation and improve accuracy.
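[The back-of-the-envelope arithmetic behind this point, as a short illustrative Python snippet; the one-billion-operations-per-second rate is the hypothetical from the conversation, not a measured figure.]

```python
n = 1_000_000_000     # one billion training instances
rate = 1_000_000_000  # basic operations per second (hypothetical machine)

linear_seconds = n / rate           # O(n): about 1 second
quadratic_seconds = (n * n) / rate  # O(n^2): about 1 billion seconds

years = quadratic_seconds / (60 * 60 * 24 * 365)
print(f"linear: {linear_seconds:.0f} s; quadratic: {quadratic_seconds:.1e} s (~{years:.0f} years)")
# linear: 1 s; quadratic: 1.0e+09 s (~32 years) -- hence the push to parallelize
```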
Starting point is 00:11:45 GPUs then started to be utilized, and the cost was not as high as using a lot of CPUs, and the entire field of using GPUs to process big data just took off around the year 2014. And if you look at NVIDIA's stock price, you can see that after the ImageNet paper was published, AlexNet was very successful.
Starting point is 00:12:14 Then about two years after that, NVIDIA's stock increased tenfold because of this paradigm shift. Yeah, no, absolutely. It's been on a rocket ship since. But it's amazing that you were literally at the inception of this whole transformation and the use of GPUs for processing. It's pretty fascinating that you were there.
Starting point is 00:12:36 What was the next transition like for you, Edward? What took you to your next role after Google? Yeah, for the next role, I considered it was time for me to maybe contribute to my birthplace. So I said, well, maybe I should go back to Taipei and try to educate or maybe mentor the local students, the youngsters, and at the same time improve Taiwan's computing infrastructure. So I went back to Taipei to join HTC. At the time, HTC was a very good cell phone manufacturing company. I recall HTC was a
Starting point is 00:13:13 manufacturer of the Pixel, maybe the Pixel 2 or Pixel 3. Of course, later the competition became extremely tough. HTC was no longer competitive, and we sold our entire cell phone division to Google. So during the time when I was at HTC, initially, I contributed to these mobile phone applications. One notable application, which even today the iPhone still doesn't have, was the 360-degree kind of panorama. You just take 20-some photos and capture this entire sphere in 3D. And the innovation was that we used some sensors on the camera to capture the movement of the cell phone. And then we instruct the user on the direction
Starting point is 00:14:02 they need to point to, to take the picture. So this way, when they take a spherical picture, there wouldn't be holes in the middle, right? Because we know exactly in which direction each photo has been taken, and we know exactly where the information needs to be acquired. So with about 20-some shots directed by the gyroscope and accelerometer, we can capture precisely the entire sphere. And that was extremely well done. The iPhone doesn't have that feature today because we filed a patent, so we haven't seen it there. And then I moved on to work on healthcare. The motivation for healthcare was that Taiwan has a very good healthcare system. And in about 30 years,
Starting point is 00:14:46 they have collected so many medical records. So as we had already learned at the time, with big data, right, you can really improve the accuracy of many things. And healthcare diagnosis is one application that I later focused on working on. And that was my major focus at HTC during the second half of my tenure over there. And then we can probably discuss the IoT devices I worked on at the time.
Starting point is 00:15:15 And then we entered a competition called the Tricorder and won second place in the world. Wow, that's amazing. I mean, I like how your work around image processing and machine learning led you to HTC. I mean, and of course, the motivation sort of to do more in Taiwan, but really led you to improving the quality of the camera and pictures that an HTC phone could take. But also the transition to healthcare was a very interesting one, driven mostly by what you're saying, the record-keeping in the overall healthcare system in Taiwan. That sounds fascinating.
Starting point is 00:15:55 So moving into healthcare, what was the work like when you were, I mean, did you do more healthcare-related work while at HTC? Or by then, had you sort of started to think about doing things on your own or getting back into academia? Yeah, I started the healthcare project by accident, because at the time, a professor at Harvard University would like to join the competition hosted by XPRIZE. XPRIZE is a foundation that encourages blue-sky innovation. One well-known competition they hosted in the early days was the self-driving vehicle. And the latest competition was sending humans to Mars or sending robots to Mars. So in about 2010, they started a project saying, well, you know, there are a lot of remote areas that are lacking medical devices, and the doctors cannot do precise diagnosis.
Starting point is 00:16:42 So can you put together a device which is very, very light in weight? Their constraint for us was about five pounds. And within five pounds, you need to be able to detect or diagnose about 15 diseases, including HIV, liver problems, diabetes, and those kinds of diseases. And the challenge at the time, of course, was that the weight is a big problem. And the second challenge is that if you want to do those kinds of diagnoses at home, definitely you need to have some machine learning algorithms, collect data to do supervised learning, and then you can do classification in a remote area. And once you have done the classification, a doctor, let's say, does a remote diagnosis on a patient in Africa,
Starting point is 00:17:28 and once the diagnosis has been completed, the data can be sent to the cloud. Any doctor in the world can review the diagnosis and assess the quality. So we considered that a very formidable kind of project.
Starting point is 00:17:42 So we started working on that. And since at HTC we are really good at making devices light, right? Like you make a cell phone very light. So we had the edge in putting devices together. And also because of my machine learning background, this Harvard professor, Professor Peng, was delighted to work with us.
Starting point is 00:18:02 So he built a consortium with all the hospitals in Taiwan. So we got a lot of data. And on my side, I focused on machine learning and also device manufacturing. And in the end, although we won the second prize, we consider that actually we should have won the first prize. And the reason we won the second prize was that the first prize winners came from a family of five. And they used very rudimentary kinds of devices and methods. They came up with their tricorder at home, on their dinner table, using 3D printers. And we spent so much money and effort. I actually raised a lot of money
Starting point is 00:18:39 from the Taiwan government. So in the end, the foundation was maybe saying, well, this is Goliath fighting with David, so that doesn't make any sense, so they couldn't give us the first prize. But anyway, through the whole process, we gained a lot of good experience. And that paved my way back to academia. We had a good collaboration with UC Berkeley at the time. And HTC, in about 2020, started working on virtual reality. Using virtual reality, we can scan a patient's brain. And a surgeon, before the surgery, can fly into the brain
Starting point is 00:19:15 to see the detailed structures. And you know, in brain surgery or any kind of surgery, a surgeon wants to remove tumors and at the same time wants to keep the benign tissues intact. And in the past, without this virtual reality visualization, a surgeon needed to kind of imagine what the 3D structure is like by taking a look at 2D MRI images. And oftentimes, they could make some mistakes or do suboptimal surgery planning. And suboptimal means you have a path you go in, but you have to destroy some benign neurons. So after surgery, the patient may have impaired speech or other functions.
Starting point is 00:19:55 We definitely want to avoid that. So that was the kind of collaboration with Berkeley on virtual reality. And Stanford invited me to host a panel. And eventually I moved to Stanford to start my adjunct professorship. Oh, amazing. I mean, I love the story
Starting point is 00:20:13 that you talked about, David and Goliath. Congratulations on the Tricorder victory, because it sounds like an amazing innovation. And I completely hear you, right? I mean, when you talk about somebody who has built a very homegrown solution to a problem in comparison to, like, a corporation that does it, I can see how that vote may have swayed,
Starting point is 00:20:30 but it sounds amazing. So going into a little bit more about the device that you built as a part of the Tricorder, I mean, was that used for more commercial usage as well, or was it mostly just a POC? I think at the time it was a POC, but the challenge for us to make it commercialized was twofold. One was that we have to go through the FDA, right? Every single device has to go through the FDA to get approval. And the second interesting thing is that whenever we have a software update, like, say, on our cell phone, we can just upload a new version without any trouble. But in medicine, you cannot do that. For every new version, even a minor revision, again, you have to go to the FDA.
Starting point is 00:21:10 So this entire process is really not scalable. So we worked with the FDA to try to have this kind of regulation changed, but it was very, very slow. So we couldn't commercialize in time. But at the time, the Bill Gates Foundation, and also some colleagues from India, said, well, India's FDA was not as strict. So we actually transferred the technologies to other entities, and they started working on those devices. I know China has a company called iHealth. They also have these kinds of healthcare IoT devices and try to do diagnosis and treatment in the rural areas. Got it.
Starting point is 00:21:52 Yeah, no, and that's great. I mean, and certainly, you know, I know countries like India obviously need it as well, because there is a lot of rural population and access to healthcare is a challenge. So I can imagine that this would be widely used and appreciated. So yeah, it's great that you were able to find a home for it, even if the FDA process was cumbersome. ACM ByteCast is available on Apple Podcasts,
Starting point is 00:22:16 Google Podcasts, Podbean, Spotify, Stitcher, and TuneIn. If you're enjoying this episode, please subscribe and leave us a review on your favorite platform. But the part which is almost like the next phase of your career, when you were back at Stanford and back in academia, is that when you started to get into looking at artificial intelligence and LLMs? Yes. In the beginning, we didn't really pay attention to LLMs, right? We worked on natural language processing kinds of technologies, but we always ran into a lot of corner cases, because during semantic parsing or language understanding, even if we have a perfect model, we often encounter exceptions, right? So let's say we have an airline reservation system,
Starting point is 00:23:03 but the customer may say something we didn't expect. Then the robot just cannot continue. So we always ran into limitations until GPT-3 launched last year. And we said, wow, GPT-3 functions much better. And since at the same time I was working on this consciousness modeling, and GPT-3 was able to do reasoning, to me, it was a really remarkable situation. And then I looked at GPT-3 and now GPT-4, and we know a lot of people are saying, well, they still have some limitations, especially that the model has some biases because of training data. Suppose all the training data we input is from CNN instead of Fox News. Of course, the answers would probably
Starting point is 00:23:45 tilt to the left, right? So the model has inherited biases because of training data. That has to be addressed. And the second issue people understand is hallucination. Hallucinations can be random or can be kind of non-logical expressions. So how do we mitigate those? I think a lot of methods have been devised, like chain of thought and tree of thought. But I think we came up with an interesting breakthrough. We named the platform SocraSynth. The idea was initially very simple.
Starting point is 00:24:18 We say, well, the LLM is so powerful, so knowledgeable. And also, its knowledge representation is polydisciplinary. When we import data into LLM training, we don't say this is a physics kind of book and that's biology. We don't. When the LLM was trained, there were no boundaries of knowledge, right? They just put everything together. So even today, when we ask a question about computer science, the LLM doesn't know this is a computer science problem. It just synthesizes the answers to answer
Starting point is 00:24:51 our question. So this kind of representation without disciplinary boundaries, which we call polydisciplinary, actually can synthesize new knowledge. It can synthesize something we call unknown unknowns. Suppose we agree the LLM can synthesize unknown unknowns. That is a big problem, because humans' knowledge is limited. If we don't know, we don't know. How can we even ask questions? So I use an analogy to describe the situation. Suppose I'm a 10-year-old kid. I go to this Nobel Laureate award ceremony, and there are 1,000 Nobel Laureates sitting there. If I'm the person asking questions in front of this panel, how can I ask any interesting questions and get insightful answers? So the solution is the following. Humans are not qualified to ask questions. If we want to get insightful information, we can only be a
Starting point is 00:25:47 moderator. We put a subject matter on the table, then ask the Nobel laureates. They can do a debate. They can do a discussion. We are just sitting there to listen. So that's the key insight we obtained. We say, well, you get hallucination. Yeah, maybe the algorithm may not be robust, but most of the time, we asked the wrong questions, or the context we provided to the LLM was not precise enough. So therefore, we didn't get good answers. So that was the motivation, and we ended up with four algorithms together with this kind of debating setting. I think we made really interesting progress. My gosh, that sounds absolutely fascinating.
Starting point is 00:26:29 I think the part that was particularly interesting to me is when you're talking about the boundaries, the interdisciplinary work. Because LLMs are not classifying the content into specific human-defined subjects, if you will, there is no boundary in terms of how you can bring two concepts together. Versus as humans, we may think of something, like you said, classified as biology or physics or psychology. But really, the areas of intersection that we don't expect at all are something that the system that you're building can uncover for us. It kind of blows my mind. So in terms of SocraSynth itself, Edward, what stage of the product or the idea are you at now? And what is the role of human beings in this process?
Starting point is 00:27:16 Yeah, it's already kind of ready to be utilized by various applications. In fact, I'm consulting for multiple companies. It has been used in healthcare, in sales planning, and also in investment banking. So, to give a very quick example, suppose we want to diagnose a disease. We can actually get a lot of ground truth from the US CDC. They have this mapping of symptoms to diseases, right? But it's interesting: when we input a set of symptoms into Bard and also GPT at the same time, and we say, okay, can you diagnose this patient with the following symptoms, they actually come up with different answers. And I said, well, you can have a debate, right? Why did you come up with these answers, and why are your predictions different from each other?
Starting point is 00:28:03 So they actually provided their justifications. They also said, well, the information you provided, the symptoms, they are not sufficient, so you have to ask more questions. And finally, they said you have to conduct certain lab tests, like blood tests, and so on and so forth. So those were interesting results I obtained, because whatever we obtained from the CDC was called ground truth. But the ground truth actually has mistakes. And this is really tied into a very good paper published by Johns Hopkins physicians last year. The analysis was saying that in the US, about 5% of diagnoses are erroneous misdiagnoses. And this is a huge problem,
Starting point is 00:28:43 not only for liability, but for human health. So you cannot take the CDC's data as ground truth. There are some errors in there. And I have done this AI research in healthcare for 10 years. I always treated those data as ground truth. Now I open my eyes, and I say, oh, if I input to SocraSynth and allow agents to debate, they will have new answers and tell me the previous diagnosis may be wrong. And the human role in this situation is zero, just a moderator. Because we are so limited, and if people consider us smart, it will be counterproductive. And Demis, the CEO of DeepMind, gave us three examples saying humans should get out of the way.
Starting point is 00:29:25 The first example is AlphaGo. You know AlphaGo compared to AlphaGo Zero? AlphaGo Zero discarded all the human experience, and AlphaGo Zero wins over AlphaGo. Another example is AlphaFold, protein folding. AlphaFold 1 used human heuristics to build a model in the middle. And with AlphaFold 2, DeepMind said, oh, let's get rid of the human heuristics. And the score of AlphaFold 2 is much higher than AlphaFold 1's. AlphaFold 1 scored something like 50, and AlphaFold 2 scored 90-something. And the last example is
Starting point is 00:29:56 the self-driving vehicle. If you have human knowledge, like putting a map in the middle to try to instruct the driver how to drive, no, it's not going to be very effective, because human sensors and human heuristics always encounter exceptions. So Tesla said, no, forget about the human in the loop. We just do end-to-end training. So they got a much better, much more effective self-driving algorithm. So in short, because humans are limited in knowledge, even when we have one or two PhDs, we cannot compete with the LLM, which has multiple PhDs at the same time synthesized into this kind of polydisciplinary representation. Therefore, the tenet of SocraSynth is that a human can only be a moderator. You'd be in a very passive role. We can evaluate their reasonableness, their logic, whatever, but we had better not contribute our ideas.
Starting point is 00:30:49 That's kind of very sad, but that's what we have learned so far. Wow. Okay. So many pieces in there that I want to dig into a little bit more, right? Let's start with the simplest, which is, have you found that, and I don't know how you would evaluate this, but the precision in terms of the diagnosis is better by using a system such as SocraSynth? Yes, that's true. So I'll just give a quick example. Let's say, of 14 symptoms, one is a headache, and the other one is a fever, right? And then GPT says, well, you should ask additional questions, like, do you have those two symptoms happening simultaneously? And that I had never thought about. And they also say, well, you should ask, is your headache periodic, or does it only happen once a day, or whatever? And is your headache getting better or getting worse?
Starting point is 00:31:35 So all those kinds of refined questions, most of the physicians didn't even think of. They just said, do you have a headache? Do you have a fever? Do you have a runny nose? But the machine even asks for correlation, timing, duration, severity, all those details. So it's interesting. Really, I mean, during the process, I was impressed. This is just one example. There are many other examples, like sales planning and investment banking. In investment banking, suppose you say, oh, I want to invest in this company, I want to buy stock, and it has these SEC filings. You want to see whether the filings are accurate or not. And based on that, you want to make a judgment whether to invest or not to invest. And we really get a lot of insights from this SocraSynth debate process.
Starting point is 00:32:20 In the monologue kind of Q&A, you may not be able to get good answers, because during a monologue discussion, a system like GPT will give you the default biases of the model. In SocraSynth, you force one agent to take the positive position and force another agent to take the negative position. You are forcing them to be biased according to your will, rather than taking the model's bias. With one pushed to the positive and one pushed to the negative, we can modulate the debate with contentiousness. And eventually, after they debate for a while, they reduce their contentiousness and come up with some compromise conclusions.
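[A minimal Python sketch of the mechanism described here: two agents are forced into opposing stances, a contentiousness parameter cools round by round, and the human only moderates. The complete function stands in for whichever LLM client is used; the prompt wording and the decay schedule are illustrative assumptions, not SocraSynth's actual code.]

```python
from typing import Callable, List

def debate(subject: str,
           complete: Callable[[str], str],  # plug in any LLM client here (assumption)
           rounds: int = 3,
           contentiousness: float = 0.9) -> List[str]:
    """Two agents argue opposite sides of `subject`; the human only moderates."""
    stances = {"Agent A": "argue FOR", "Agent B": "argue AGAINST"}
    transcript: List[str] = []
    for _ in range(rounds):
        for agent, stance in stances.items():
            recent = "\n".join(transcript[-4:])  # accumulated context sharpens each round
            prompt = (
                f"You are {agent}. {stance} the proposition: {subject}\n"
                f"Contentiousness (0 = conciliatory, 1 = adversarial): {contentiousness:.1f}\n"
                f"Recent arguments:\n{recent}\n"
                "Rebut the other side; concede a point only when logic demands it."
            )
            transcript.append(f"{agent}: {complete(prompt)}")
        contentiousness *= 0.7  # cool the debate toward a compromise conclusion
    return transcript  # the human moderator reads this and judges reasonableness
```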
Starting point is 00:33:03 And that conclusion can provide much more insight than the default monologue kind of Q&A session. Yeah, no, that was going to be my exact next question, which is, inherently, it sounds like a system like this would remove bias or at least reduce bias in a significant way. And that's kind of what you were just talking about. Yes, it also reduces hallucination, because, you see, during the debate process, the two agents will keep on arguing with each other.
Starting point is 00:33:34 During the argument, the QA is extremely focused. The QA pair will be formulated by the agent, not by the human. When a human forms a question, it can be very fuzzy. And the second thing is, when you start to do this debate on and on, after rounds and rounds, the contextual information is improved. The context is getting better and better and better.
Starting point is 00:33:57 So the debaters on the two sides can delve into deeper and deeper insights. So therefore, hallucination has no room to exist. Also, I have a theory. I say, well, do you ever have the same nightmare twice, exactly the same nightmare twice? The answer is probably not, right? So if you don't have the same hallucination twice, it means you will not have the same bad argument generated twice by the LLM. And then, after this refutation and debate process, all the hallucinations will have disappeared. Got it.
Starting point is 00:34:30 It sounds like a perfect system, Edward. I'm just curious. What are the risks? What are the downsides? What are the challenges that you're facing? Yeah, right now we are doing evaluation, because for every round of debate, we want to make sure the quality is very good. And luckily, we use the Socratic method. The Socratic method has been there for 2,000 years, but strangely, people don't use
Starting point is 00:34:52 it. And we use the Socratic method to evaluate every argument's reasonableness. Here, we say reasonableness is basically evaluating its logic. And we faced a very interesting problem. A lot of people are saying, no, I want to evaluate whether it's fact or truth. And unfortunately, after doing the research for some time, I actually consider that there are no facts in the world. I mean, the same event happens somewhere in the world, right? If you look at different newspapers, the stories, the narratives, can be different. So there's no way I can know the facts unless I go to the news location to eyewitness what happens. Even so, maybe I won't be able to see the whole thing. I couldn't understand the causality, for example. I cannot tell who is right and who is wrong. So then I say the only
Starting point is 00:35:39 thing we can do is to evaluate reasonableness, the argument's quality. I think it turned out to be reasonable. So we have GPT and Bard, for example, doing the debate. Then we have kind of inferior LLMs that do the evaluation. Because when you do evaluation, you don't need to have too much knowledge. You just need to make sure the logic is extremely tight. And we published a paper where we showed the evaluation is quite consistent. So we are pretty happy that we are not only generating content, but also able to evaluate
Starting point is 00:36:11 the quality of the content. Of course, there is much work to be done in the future. Yeah, yeah. No, it sounds great, though. Is this work that you also sort of push forward via Ally, the company that you are the CTO at?
Starting point is 00:36:26 Yes. A company came to look for me because they considered the idea to be intriguing. Actually, it took them some time to really understand the power of the SocraSynth system. There's great potential to apply it to many areas. Got it. I understand. One of the things that I was reading was that there's a program called Stanford OVAL, I think. Yeah, the program was established by Professor Monica Lam, who was my partner even when I was back at HTC, and she would visit our company for collaborations. And when I joined Stanford at the beginning, because Monica was interested in healthcare, we saw a synergy there. So I joined her lab. So yeah, I think right now we are taking different approaches to address the problem of semantic parsing. So we have the same mission,
Starting point is 00:37:14 and we try to address semantic parsing quality, but we are using different methods to address the challenges. Got it. It's absolutely fascinating work. I'm just curious, what's next as you sort of make progress? What are the next set of goals that you have? On the side, I know you asked me about my hobbies. I write a lot of poetry, and I also generate a lot of art and take a lot of photographs. So I'm actually using GPT, currently with DALL-E, to create some interesting art pieces.
Starting point is 00:37:53 And I just published a poetry book. And there's this multi-agent scenario, right? We talked about SocraSynth, where the agents can debate. You can have three or four agents, and each agent is a persona. And you can ask them to have a dialogue, and you can create a fiction, right? Create a novel out of this kind of SocraSynth platform. So I say, well, maybe after my scientific endeavor, when the system becomes more mature, I'll get into writing a novel using the platform. At the same time, when we start working on different applications, we will discover additional research challenges that need to be resolved. And the final grand challenge is,
Starting point is 00:38:31 if we really consider that a polydisciplinary kind of representation can have new insights into knowledge, if we know the LLM may have some unknown unknowns humans cannot tap into, then my final goal is: even if I don't know how to ask questions, will there be a trick by which I can get the LLM to tell me something I don't know? Maybe I don't
Starting point is 00:38:53 even understand it, and it can teach me to explore some unknown unknowns. And that would be really remarkable. This would maybe change the world of research. That's my final dream.
Starting point is 00:39:08 I do have to ask though, I mean, because often the most colloquial question is, oh, AI is going to take over the world and take over all our jobs. And, you know, in a lot of the way
Starting point is 00:39:17 you describe the product, if, you know, the human is out of the loop, how do you see human involvement in products and systems like this in moving forward? What kind of role would we play or what kind of jobs would we have?
Starting point is 00:39:31 So a human can probably be only the moderator, right? But a more skillful moderator or a more knowledgeable moderator can get much better results and much better insights. And so humans still need to improve our own knowledge. And hopefully, because we can work with the LLM, our knowledge acquisition can be much more efficient. So hopefully in the future, we will see some new careers or new pathways to be able to support our own livings. In the short term, for example, in data science and computer science,
Starting point is 00:40:12 AI really can do a lot of work to replace human beings. So in the short term, it can be pessimistic. But in the long run, there should be some new applications, some usages, where humans can be employed. Yeah, for sure. Like you said, areas that we've not thought of combining or bringing together, maybe those will be uncovered through these systems and give opportunities for us to kind of explore new jobs and new careers and passions. It's fascinating, can be unnerving at times, but definitely very exciting. It's amazing that the work that you're doing in this field is going to uncover some of that. But speaking about passions, I spent some time on your YouTube channel, Edward.
Starting point is 00:40:47 I was fascinated with your photography work, and you were just talking about it. I saw the one with the bald eagle and the snake, and it was really quite something. So I was just wondering, what are your other hobbies and passions? You spoke about poetry, which is quite unique. I've never heard of a computer science researcher also being interested in poetry; I haven't come across anybody. So what else do you do? Yeah, poetry is really my biggest subject. Since I was a child, I was extremely interested in literature and philosophy. And poetry is a very good kind of platform for a busy person to write down thoughts, because fiction would take a long time, and poetry typically is much shorter than fiction.
Starting point is 00:41:26 So for me, I think this is a much better kind of medium compared with other media. Amazingly, GPT can help me to write even better poetry, for example, by collecting historical facts, and maybe sometimes it can help me to have better rhyming, better choices of words. So this is a really kind of interesting situation. Very cool. Did I also see that you, did you climb EBC? Were you at the Everest Base Camp?
Starting point is 00:41:54 Almost two years ago, yes. That was a very good experience. And because my oxygen level was depleted, and that was when I was studying consciousness very, very intensively. When the oxygen level gets depleted, you are pretty much in an unconscious situation. And that experience was interesting. Schrodinger once said, well, in a lot of situations we are unconscious. Even, let's say, our vision, right? We look at a person, we pay attention to some object we are focusing on, but our peripheral vision is still processing data, but it's under our unconsciousness.
Starting point is 00:42:30 But if our peripheral vision suddenly senses a car driving toward us, then there will be a quantum jump to elevate that event from our unconsciousness to consciousness, right? So just like Einstein always said, innovation doesn't come from consciousness. Innovation comes from your preparation in consciousness, which pushes all the information into unconsciousness. And one day, when you are whistling on Everest,
Starting point is 00:43:00 something from your unconscious suddenly pops up. And then you say, oh, I realize what you just told me from unconsciousness. A lot of people have that kind of experience. You cannot will yourself into innovation. You can will yourself into preparation. And when the time is right, unconsciousness will tell you innovation has arrived. That's such a great way for us to sort of bring this interview to an end. I mean, I love the analogy that you drew there. But for our final bite, Edward, what are you most
Starting point is 00:43:31 excited about in this field of using generative AI and the work that you're doing with SocraSynth and healthcare over the next five years? Yeah, so far, very critically, the LLM has improved my productivity by 10 times. And that means the next five years will be equivalent to 50 years of my previous endeavors. So this means I cannot squander any minute of my day, because it has become 10 times more precious. And I'm going to move forward to continue polishing SocraSynth. Interestingly enough, I was writing a scraper to scrape information for my investment banking company. And of course, SocraSynth and also ChatGPT helped me to write the code very effectively.
Starting point is 00:44:16 But then I said, okay, now scrape my SocraSynth site, get all my papers, and GPT, tell me what are the major shortcomings of my current algorithms or systems. And remarkably, GPT gave me three new assignments, new insights, which I will be working on in the next few years. Wonderful. Well, thank you so much for taking the time to speak with us. It's been a wonderful conversation, Edward. Thank you for speaking with us at ACM ByteCast. Thank you. Bye-bye. Bye-bye.
Starting point is 00:44:49 ACM ByteCast is a production of the Association for Computing Machinery's Practitioners Board. To learn more about ACM and its activities, visit acm.org. For more information about this and other episodes, please visit our website at learning.acm.org. That's learning.acm.org.
