The Shintaro Higashi Show - What is ChatGPT?
Episode Date: January 27, 2025
Large language models like ChatGPT are transforming the way we interact with AI. Peter explains its inner workings, how it understands language through probabilities, and its applications across various domains. Shintaro brings relatable scenarios, exploring AI's practical uses, its limitations, and ethical concerns like bias and transparency. They also touch on the future of AI, from prompt engineering to potential advancements like agentic AI and multitask robotics. Whether you're curious about how ChatGPT works or how it might shape the future, this conversation offers engaging insights and practical takeaways.
(00:00:00) Introduction
(00:00:45) What Is ChatGPT?
(00:02:19) Language Models and Probabilities Explained
(00:05:28) Making AI Understandable for Everyone
(00:06:43) ChatGPT's Limitations and Real-World Use Cases
(00:13:17) Ethical Concerns and AI Bias
(00:17:36) What Is Prompt Engineering?
(00:20:58) AI in Specialized Applications
(00:25:21) The Turing Test and AI Sentience
(00:30:31) Full Self-Driving and AI in Robotics
(00:44:33) What's After ChatGPT?
(00:50:50) Closing Thoughts
If you're in business, then you have customer churn. Whether you're building a startup, growing a mom & pop shop, or operating in a Fortune 500 powerhouse, Hakuin.ai measures, predicts, and improves your customer retention. https://hakuin.ai
Transcript
large language model.
Yeah, oh, you know large language model.
Wow, you're so condescending, god damn it.
That was not condescending.
I don't know, some people don't know.
Forget the Cho Bros in Korea,
I'm flying to Detroit tonight.
It's a joke, yeah.
Hello, welcome back to the Shintaro Higashi show
with Peter Yoo.
Today's gonna be so out of character for us.
Well, not out of character for Peter,
who's like a-
Yeah, we've been kinda doing this. Princeton slash tech guy, AI PhD, all that stuff. I'm just a judo bum. But yeah, what is ChatGPT is gonna be the thing, and I'm gonna try to make it useful for the grapplers out there.
Yeah, so you know, you've already asked me this question, kind of. You are already using ChatGPT, right?
I've never asked you, what is ChatGPT? What do you think, I'm a caveman?
No, no, as in like, how do you use it? Like I told you about, like, some prompt engineering or something.
Yeah, you asked questions, yeah.
So I thought-
You made it sound like, oh, Peter, what is this damn ChatGPT on the computers?
Oh, no, I meant more like-
You make it sound like that.
My God.
I know.
I meant more-
My God.
He asked more like, how does it work? What is it? And whatever.
So it happens that ChatGPT is kind of close to my research area.
And I was thinking, I guess since it became a reality, a lot of people are using it, including you, and we were just talking about using it for some stuff. So I thought maybe I can try to demystify it a little bit.
Yeah, let's do it.
Because, I think, you also sent me a video. It's like, oh, you know, coding is dead or something.
Yes, because I saw a video like, oh my god, all this AI stuff is better at coding than software engineers, they're going to be obsolete very soon.
You could code stuff through ChatGPT now.
Which is true, but I can maybe give my more nuanced take on that later.
Yeah, let's do it.
Alright, so what is chat GPT, right?
So just tell me your understanding of it.
Like you say it's like something you can ask questions, right?
It's a large language model.
Yeah. Oh, you know large language model.
Wow, you're so condescending, goddammit.
That was not condescending.
I don't know.
Some people don't know.
Forget the Cho Bros in Korea, I'm flying to Detroit tonight.
It's a joke, yeah.
So yeah, it is a large language model, right? So that's great, because then, what is language, what is a language model, right? I think it starts from that. What language modeling is, basically, is trying to answer the question: can we assign a probability of how likely a given sentence is?
Mmm.
Okay, so you can say something like, I love judo, and, you know, I do judo twice a week or something. That's like a valid sentence. It should get a higher probability.
Yeah.
But as opposed to, I, Shintaro, you love judo, all this random string of words shouldn't get assigned a higher probability, right? They should get assigned a lower probability.
Yeah. So that's what ChatGPT and large language models do at the fundamental level. That's all they do. Basically, it knows how to assess whether a given sentence is likely, whether it sounds natural or not.
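To make that concrete, here is a toy sketch of the idea: a language model gives a higher score to natural-sounding sentences than to scrambled ones. This tiny bigram counter is only an illustration, not how ChatGPT is actually built; the corpus, the smoothing, and the example sentences are all made up.

```python
# Toy bigram language model: score sentences by how likely each word is
# to follow the previous one, based on counts from a tiny made-up corpus.
from collections import Counter

corpus = [
    "i love judo",
    "i do judo twice a week",
    "i love wrestling",
    "you love judo",
]

bigrams, unigrams = Counter(), Counter()
for sentence in corpus:
    words = ["<s>"] + sentence.split()
    for prev, curr in zip(words, words[1:]):
        bigrams[(prev, curr)] += 1
        unigrams[prev] += 1

def sentence_probability(sentence: str) -> float:
    """Multiply the chance of each word given the word before it."""
    words = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, curr in zip(words, words[1:]):
        # Add-one smoothing so unseen pairs get a small nonzero probability.
        prob *= (bigrams[(prev, curr)] + 1) / (unigrams[prev] + len(unigrams) + 1)
    return prob

print(sentence_probability("i love judo"))      # natural sentence: higher score
print(sentence_probability("judo you i love"))  # scrambled words: lower score
```

A real large language model does the same kind of scoring with a huge neural network trained on billions of documents instead of a handful of sentences.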
Interesting. So isn't that how little kids kind of learn language as well?
Yeah, so in some sense there's some, you know,
analogy to it, like how kids learn language,
just by kind of like repeatedly hearing what natural language sounds like, right?
So basically language models kind of do the same thing,
but they basically read just a bunch of text
generated by humans, or like natural language text.
And it tries to learn...
What do you mean by read though?
Read? So basically, when you're reading a text file or a document with a bunch of words, ultimately in the computer, when you save it, each letter basically gets assigned like a number.
It's binary, you mean?
Yeah, so it's assigned a number ID, like an ID.
But it's not binary?
It is.
And then a number can be represented as binary.
Oh, yeah, yeah, of course.
Eventually, yeah.
So at the high level, each letter gets assigned the binary.
So that's kind of like how computers understand letters,
right?
But large language models, they use what they call tokens.
Basically, it's like a smaller than a word, but bigger than a letter.
It's kind of like a subword almost.
You know, for example, take the word computer.
It might be split into compute, and then er is a separate token.
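A quick sketch of what those tokens look like in practice. This assumes OpenAI's open-source tiktoken library (pip install tiktoken); the exact way a word gets split depends on the tokenizer, so the pieces shown are illustrative.

```python
# Turn text into token IDs and back, to see the sub-word pieces Peter describes.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer family used by recent GPT models

text = "I love judo and computers"
token_ids = enc.encode(text)
pieces = [enc.decode([t]) for t in token_ids]

print(token_ids)  # one integer ID per token
print(pieces)     # chunks that are usually smaller than a word, bigger than a letter
```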
I'll tell you, let me interrupt you.
Yeah.
And be completely frank.
Yeah.
And let's keep this because it's authentic.
You know why this is not good?
Because you went from zero to 60.
Like a lot of people who are listening who don't have this background, they'll be completely lost.
And we're talking about what is chat GPT, right?
Right.
You ask the thing questions, and it uses AI and then spits out an answer. Yeah. Right. And then there needs to be another layer above it before you can kind of go into the nitty-gritty of this.
I'd say maybe I could start it this way. So yeah, you ask questions-
You know, you gotta do it. You gotta do ELI5, ELI10, ELI20, and then ELI genius like me, you know.
Okay, so ELI5. Okay, let's go back.
So you just kind of remember how these things read, like computers read text. But okay, let's backtrack. Right now with ChatGPT, you can ask questions and it'll generate answers, right?
You could ask it anything.
Anything. Pretty much anything, it'll say something about it.
How accurate is it?
I mean, it depends. For certain things, if there's a lot of information in the training data, so to speak, like if it read a lot about it, it'll be pretty accurate. But if it's something novel, it's gonna be less accurate. That's kind of the general sense you can have.
All right, so quick question.
Yeah.
Yumi had a flu and fever the other night. Her heart rate was like 160, 170 while she was sleeping.
Oh, wow.
I put that into ChatGPT, and it said you've got to take her to a hospital right away. I get to the hospital, and they're like, kids with a fever? Yeah, all the time, 160, 180. As long as they're drinking fluids, you can wake them up, they're not confused and falling over, it's completely fine. I was like, yeah, but ChatGPT and Google were telling me a different story. They were saying it should never be above a certain number, like 130, 140, and it was way above and beyond that. So essentially ChatGPT was completely wrong about this thing. And I also got multiple opinions. I remember Elliot from the dojo, he's an ER doc now, working in pediatrics. He was like, bro, 180? Completely normal for a kid with a fever, you know.
Yeah, so these things are pretty opaque in how the internals work, but my guess is, you know how people joke about WebMD, that WebMD says everything is cancer? That's probably a conscious decision the WebMD people made, because they'd rather err on that side than under-diagnose, right?
Yeah, true, true, true.
And ChatGPT and Google, all these systems, are probably tuned similarly, because it's better to be more careful than less, so that now you're worried and you actually go see a doctor, and doctors are better equipped. I mean, they're more knowledgeable about the whole situation and able to give you a more accurate answer. But, you know, if ChatGPT said, oh, you don't have to do anything, it's fine, then-
So does ChatGPT make the decision, or does Google make that decision? And all the information on Google, when ChatGPT reads it, is it just kind of regurgitating what it said on Google, or where does it pull the information from?
So, okay, let's kind of separate that, because now the current iterations of large language models can use Google, some of them, but large language models themselves don't. All this knowledge-
Interesting.
Yeah, all this knowledge is just contained in the large language model. Basically, all this knowledge has been converted into numbers, and what you can do is ask questions, and the large language model will use all the knowledge contained in it to generate the answers.
But it's connected to the internet now, the ChatGPT 4o.
In 4o, I think, you can have a-
Is it 4.0 or 4o?
4o. 4o.
And then you can have an option where you can turn it on.
Why is it not 4.0? Because it was ChatGPT 3, and now it's 4. I thought it went ChatGPT 3 to 4.0.
It's OpenAI's naming scheme. I don't know why they named it that. I have no idea, actually.
It's like a weird naming scheme.
All right.
Let me ask you another question.
Yeah.
You know you go on Reddit and say, hey, Reddit, you know, my kid has this and that.
And then there's all these subreddits like, oh, ER medicine, parent, you know, parenting,
whatever it is.
Would it ever go through all that and then make a decision based on that?
And then, because it has a source,
like source, I'm a doctor.
Obviously, it's not verified or anything like that,
but would it go through that,
take that into consideration?
So you're asking if ChatGPT will search Reddit for your specific question.
Yes. No, I know they're not searching it.
But I wonder if it's part of the database.
Oh yeah, so they definitely, I wouldn't be surprised if they scraped Reddit posts as part of their data.
I mean Reddit has a lot of knowledge in it, right?
But would it weight it more because it's coming from an ER doc versus, like-
No, so that's the stuff that's not clear. Even right now, there's no way to tell how ChatGPT will weigh different things. It's not like a conscious being. It basically just treats all the text the same, and it just keeps trying to learn the distribution, like the probability of a sentence, from this gigantic training set.
So everything is like, oh, this is probably right.
Exactly.
That's why it's dangerous.
That's why OpenAI hires a lot of people to actually filter the data, to get the training data as accurate as possible.
That's a very, yeah.
Number one, you know how you published a paper recently?
Yeah.
Does ChatGPT know about it already?
I don't know. I'm not sure how they collect data, but if they continuously collect data...
But should it be live in real time? Like anytime you publish something, shouldn't it go into the thing?
Because training this thing takes so much energy and resource. It takes millions of dollars to train these things once.
So training is not live, basically. That's why people are trying to give ChatGPT and large language models access to Google, because then it can kind of pull in live information.
If there's a team filtering the information, and there is top-down leadership within that company, what's to say that a lot of the stuff we're getting is influenced by Sam Altman?
I'm sure that happens. It's just like how all the social media companies have their own moderation teams, right? And they have their own biases, and you can agree with them or not. But the thing is, these companies don't really want to publish in detail how they collected their data.
So it's hard to tell, but they definitely have moderation teams, data cleaning teams.
Yeah, because that's very important.
These people are going to filter something towards their own biases.
I remember asking about Trump versus Kamala on ChatGPT, and when you mess around with certain prompts, you kind of feel like there's a bias towards the left.
So it's hard to tell if it's because the data it was trained on had that kind of bias, or they put something in front of the GPT, what they call guardrails, to kind of filter through some stuff. It's hard to tell. Or it might just be that people on the internet are generally more liberal or something, and they generate more data. That could be it too. So it's really hard to tell. These are very complex, opaque systems.
Yeah.
So that's why, although ChatGPT seems smart, you have to kind of be careful. You always have to verify if what's generated is accurate. And same with coding right now. You can code, but once you get to more complex tasks, you have to check.
Okay, so level one is like, all right, hey man, how many grams of protein can I absorb in one sitting? Or like, what's the difference between skull crushers, incline versus flat? Like, that's level one question asking, right?
Yeah, it's pretty, you know, they call it encyclopedic knowledge, right? You just kind of have to memorize it. ChatGPT should be very good at it.
But, you know, there's little intricacies like that, because those are two things I actually asked ChatGPT, right? And then it was like, well, you know,
flat works majority of the long head of the triceps.
And then incline works more of the long head of the triceps.
And I was like, bro, that doesn't really help.
And then it was like, yeah, well, the angle of the muscle,
you know, fatigue is like, is the same.
And I'm like, yeah, no shit.
You're on an incline, you know what I mean?
And it wasn't very useful, you know?
Yeah, so that's-
Even though I know the answer to this question already, I wanted to ask it, right? So, all right, that's level one stuff. How can I make it level two, sort of, like a little bit more prompt engineering-ish? Let's start with, what is prompt engineering, even though I know what it is. How would you condescendingly explain it to me multiple times?
So prompt engineering is basically, can we be smart about asking questions to ChatGPT so that we get the right answer? Because it's called a prompt, like how to prompt ChatGPT so that it gives us the right answer. It's becoming kind of an art, and that kind of goes to the core of the limitation of ChatGPT.
Basically, again, ultimately, fundamentally, it just knows how, when you ask a question, to complete that question. And it just happens that its training is in the form of question and answer.
So because ChatGPT knows that a conversation that starts with a question should end with an answer, that's what it is doing. It's basically like, okay, I know how this conversation that started with a question should end, so I'm just going to produce the most likely sequence of words.
Yeah.
So that's what it is.
So ultimately, that's the limitation. That's why right now it's very sensitive to how you ask the question, and then you have to do prompt engineering. And how do I do it? I don't know. I'm not a prompt engineer, and there's a whole research field on how to ask the right question. People have tried different things, but it's more of an art than a science.
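As one concrete sketch of what prompt engineering looks like in code, here is the "give it a role, a context, and an audience" pattern, assuming the official openai Python client (v1-style). The model name, the persona, and the question are illustrative; this is one way to structure a prompt, not a recipe endorsed in the episode.

```python
# Prompt engineering sketch: same question, but wrapped in a role and an audience.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

system_prompt = (
    "You are an equity analyst on a value-investing team. "
    "Reason step by step, then give a final summary that is friendly and "
    "jargon-free, as if explaining it to a college friend who doesn't follow finance."
)
user_prompt = (
    "Based on the most recent 10-K financials, would you consider investing in SoFi? "
    "List the main risks before giving your overall take."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; use whichever model you have access to
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
)
print(response.choices[0].message.content)
```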
You know, after we had a discussion about what prompt engineering is, you condescendingly told me the different things: there's the context, and this and this, and the "as if I was," and "pretend like I am," you know. And then you can kind of say, like, hey, make the answer very friendly, because I'm explaining something to Shintaro, who's a retard, right? So, something like this, and you play around with it.
Yeah, it really is. I forgot where I was going with this,
but it was actually really amazing.
You wanna give us an example of something like that?
Oh, prompt engineering?
Yeah, oh yeah, so I asked you these things, right?
And then you kind of gave me sort of a guideline,
and then just kind of give an example.
I said, hey, if you were on Warren Buffett's team,
you know, and if you were to analyze a company like SoFi
based on 10K and all this stuff,
and if you look at the financials,
like, would you invest in it?
And then give me an answer,
if I were to explain this to my friend
who I went to college with, who wrestles,
who doesn't really know what he's talking about,
who thinks he knows what he's talking about.
You know, and it gave a really, really good answer.
Wow. It is powerful. And then, yeah, basically, it's kind of read that. If you ask someone online, people probably have the "hey, pretend you're someone" thing, and then it'll probably go through the scenario, right? It's basically read that, and then it's kind of mimicking it. But that's not to say this doesn't
think or doesn't know things, because it does have a lot of knowledge, it really does. And a good way to think about this, this was an explanation used by Ilya Sutskever, who used to be the chief scientist of OpenAI, basically the person who led the effort to develop ChatGPT. And he said,
say you're reading a detective novel, and you hear all these stories about all the characters, their motives and backgrounds. And at the end, the last sentence is like, so now the murderer is blank, right?
Ooh, yeah.
So in order to fill in that blank, to generate the next word, you have to have all the knowledge of the context and the background knowledge and all of that, to get it right, right?
Yeah.
So chat GPT basically is trained that way.
Like it just predicts the next word based on the previous words.
But that's a way to estimate the probability of a sentence. And it's read so much. Its only goal is to be good at predicting the next word, but in order to do that, you actually have to have a lot of knowledge.
Interesting.
So that's why it knows all this,
it knows how to follow your prompt, how to give you an answer.
But at the same time, because its only purpose is to predict the next word, if it doesn't know, sometimes it'll just make shit up.
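Here is a toy version of that "finish the next word" loop. The probability table is invented for the example; a real model computes these probabilities with a neural network conditioned on everything that came before, which is exactly why it can confidently fill in a blank it doesn't actually know.

```python
# Toy next-word generator: repeatedly sample the next word given the last two words.
import random

next_word_probs = {
    ("the", "murderer"): {"is": 0.9, "was": 0.1},
    ("murderer", "is"): {"the": 0.7, "obviously": 0.3},
    ("is", "the"): {"butler": 0.6, "gardener": 0.4},
}

def generate(prompt_words, steps=3):
    words = list(prompt_words)
    for _ in range(steps):
        context = tuple(words[-2:])          # condition on the last two words
        dist = next_word_probs.get(context)
        if dist is None:                     # nothing learned for this context: stop
            break
        choices, weights = zip(*dist.items())
        words.append(random.choices(choices, weights)[0])
    return " ".join(words)

print(generate(["so", "the", "murderer"]))   # e.g. "so the murderer is the butler"
```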
I wonder if it could, all right, so when is it gonna be a thing where, if you're a judo player or a wrestler or whatever it is, you know, suppose you're wrestling Spencer Lee in the finals of the Olympics, and you feed it all these different things, and it tells you what a good strategy is, right? Do you think one day it'll be able to answer that question? Because there are not a lot of people who've written text on this stuff at all.
I'm sure it can. I'm sure you can do it right now. I mean, it may not have the-
Yeah, but how could it do it without having any text around it? No one's writing articles in detail about, like, he leads left, right, and his left hand goes to the post on the head, you know, if he grabs the wrist he goes for an arm drag, or whatever it is. So it doesn't have that information to read.
I'm sure it knows a lot about wrestling
Strategies because people have written articles about it, I'm sure.
So, but it probably wouldn't have... I'm pretty certain that people have not.
But yeah, so if it doesn't, what you could do is provide that information in the context, in the prompt, right? Like, Spencer Lee does this and this, whatever. And then you can kind of have a conversation, because it knows how to reason about it. It probably has a basic knowledge of wrestling and all. But if you want it to be like a net wrestling expert, you probably have to collect some data on wrestling and basically train it further on the wrestling data, if you want to instill that. And a lot of these companies, OpenAI, Anthropic, all these companies, actually offer this type of service to enterprise clients. So, for example, say KBI wants to pay for a ChatGPT that's like a judo expert. You could gather all the data you have around judo and then give it to OpenAI or Anthropic.
What if I want to analyze churn?
Churn, yeah. You could also do that too.
You can kind of, you know, Hakuin AI or Andrew's company, Drew's company.
What a great segue.
I know.
Drew's company could gather all this data about their customer churn and then give it to, partner with, OpenAI or Anthropic or what have you. They'll develop their own ChatGPT in a couple of weeks.
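For readers who want to see what "give them your data" roughly looks like, here is a sketch using the openai Python client: format your own question-and-answer examples as chat transcripts, upload them, and start a fine-tuning job. The file name, the single judo example, and the base model snapshot are all illustrative assumptions, not details from the episode, and Anthropic's process would look different.

```python
# Rough fine-tuning sketch: turn your own domain Q&A into training data
# and hand it to the provider to build a customized model.
import json
from openai import OpenAI

client = OpenAI()

# One training example per line: a question your members actually ask,
# paired with the answer you want your "judo expert" model to give.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a judo coaching assistant."},
        {"role": "user", "content": "How do I set up uchi mata against a stiff-arming opponent?"},
        {"role": "assistant", "content": "Break their posture first: snap down, then enter as they react..."},
    ]},
]

with open("judo_finetune.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

uploaded = client.files.create(file=open("judo_finetune.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=uploaded.id, model="gpt-4o-mini-2024-07-18")
print(job.id)  # poll this job; when it finishes you get a custom model name to query
```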
Drew came to Judo, by the way.
Oh, yeah?
Last night.
So do you think one day ChatGPT can read all the Mindbody data and the attendance data, given that I take attendance consistently,
and then say, you know what, it's about time Drew is coming, and then it could sort of monitor Drew's freaking internet activity or something, right?
Yeah.
And then predict when he's coming in next?
So it can kind of do it already in a reasonable sense as long as you give access to the information.
But Google has access to all that information, no?
Yeah, so the limitation right now is that you're kind of going into what they call agentic instead of generative.
The new hype now is agentic AI.
Yes.
You've heard of that term?
Wow, you and your condescension man.
What do I mean?
No, I don't know.
Guys, listen to the tone of the way Peter talks to me and then comment below on the video whether or not he's being condescending.
Maybe it's just my insecurity.
I have no idea.
Do you know what agentic AI is?
I only heard about agentic AI the term only recently.
It's new for me.
And so basically, the idea is, can we give these large language models access to tools? It's actually kind of hard to teach these things how to use tools.
Yeah.
Because, you know, if you want to use tools, you have to be very precise on the commands and whatever. And like I said, ChatGPT, large language models, can mess up on that. Anyway, that's kind of the newer research area, but you can imagine a more agentic ChatGPT that has access to your Mindbody system, that can go and pull the information, process it, and then output some actions. And yeah, of course that's actively being researched on as a product.
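A toy sketch of that agentic loop, with everything faked for illustration: a pretend attendance tool and a pretend model that decides to call it, reads the result, and then answers. Real systems wire this up through the providers' tool-calling APIs; none of the names here come from Mindbody or the episode.

```python
# Fake agent loop: the "model" asks for a tool, gets the result, then answers.
import json

def lookup_attendance(member: str) -> list:
    """Pretend attendance tool. In reality this would call a real API."""
    fake_db = {"Drew": ["2025-01-06", "2025-01-13", "2025-01-20"]}
    return fake_db.get(member, [])

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: if it has no data yet, it emits a tool call."""
    if "TOOL RESULT" not in prompt:
        return json.dumps({"tool": "lookup_attendance", "args": {"member": "Drew"}})
    return "Drew has come every Monday for three weeks, so he will probably show up next Monday."

def run_agent(question: str) -> str:
    reply = fake_llm(question)                        # model decides it needs the tool
    action = json.loads(reply)
    result = lookup_attendance(**action["args"])
    followup = question + f"\nTOOL RESULT: {result}"  # feed the tool output back in
    return fake_llm(followup)                         # model answers using the result

print(run_agent("When is Drew likely to come to judo next?"))
```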
Every AI movie has this Turing test, you know? Like, all right, so if you were having a conversation-
What is the Turing test? Go, for the audience.
It's when you're talking to the thing in sort of a blinded setting. Do you know that you're talking to a machine, or do you feel like, oh, is this a human on the other side?
How was that for you, Peter? God damn, that was so perfect.
Man, you wanted to get me so bad.
No, no, I could say, I know you know.
I know you know.
Yeah, you love that stuff.
I mean, I love the movie Ex Machina, that's why.
That's the only reason why I know that.
Oh, did they have a Turing test scene?
Well, the whole movie was about-
The whole movie is a Turing test.
A Turing test within a Turing test, yeah.
So what's your question about-
Do they do these tests, like, okay, let's see if this will pass the Turing test? Do researchers actually perform Turing tests on this thing?
Yeah, the Turing test was beaten a long time ago.
Oh, wow.
Yeah, before ChatGPT, actually. And that's why, so there are some flaws. The Turing test itself is kind of underspecified. It's a cool thought experiment, but it's not a rigorous scientific test. That said, scientists have performed that test, but it turns out it's very easy to fool a human. You can kind of see how, you know, like all these people scamming other people on the phone. Like when you're on Tinder or something, sometimes I don't know exactly, am I talking to a robot, a prostitute, or a human on the other side? So it's pretty easy to fool humans as a machine.
All right, next question. Ready?
Yeah.
Is it possible for these things to be sentient one day? It's kind of like a Lex Fridman kind of question, right?
So, first of all, for me, I tend to avoid that question, because for me, sentience or consciousness are not precise terms.
So for some people, I think it's valid that you could say ChatGPT is a sentient being. Some people have made that claim. I'm not saying I don't agree.
Let's discuss a definitive term, then. Assign what it means to be sentient.
OK, let's just say the moment. What is your earliest memory? Maybe off the top of your head, quick.
Yeah, I was playing with toy trains at my grandpa's house.
You sure there weren't Barbies?
I didn't have Barbies. I wish I did. I would have loved it.
Playing trains with your grandpa. Yeah. That moment, when do you think a computer can have that? Or is it even possible for that to happen? See, that moment is, okay, let's just say, unconscious to conscious.
You're talking about subjective experience, right? Like your own subjective experience of the world, kind of like that? I mean,
I think this is me making stuff up now, but the moment you have that first memory, now that you recall that thing, there was something that happened at that moment, right? Now all of a sudden, obviously you were sentient before that, but something shifted over.
Right.
Uh-huh.
Can that happen?
So if you're just saying we're just talking about forming memories, I think ChatGPT can definitely do that already. It can remember things about you.
The moment it becomes aware of itself.
I don't even know what that means. What does that mean?
That you're a separate being?
Is that what, I don't know what that means, aware of yourself. What does that mean to you?
I mean, pre you playing with trains with your grandpa,
you were just kind of existing.
So yeah, I think what you mean is that it's not just about forming a memory about the world, but forming a memory about yourself, like how you felt, like this is me, who I am.
Yeah.
I'm sure.
So this is kind of what they call the existence proof, right? We already have things that can do that, like animals, like humans definitely.
I would say my dog definitely has that kind of subjective experience.
Is your dog smarter than me?
I don't think so.
I should write, yeah, that would be crazy.
Yeah.
And in that sense, I think it's possible that a machine can do it. Because at the end of the day, if you take this reductionist view, we are a machine that's made of proteins. So I think it is possible, but I don't know how far away we are from that moment. And for me, it's a fun question to ponder, but I think it's kind of a pointless question.
Yeah, interesting. Do you think you could teach Tesla's robot to do judo one day?
Oh, yeah, I'm sure. But you know what's funny, teaching a physical task is a lot harder than teaching the machine these mental tasks,
like math.
So, I forget the name of the paradox, but in the early days of AI, a lot of people thought it would be harder to teach a machine to play chess than to teach it how to walk. But it turns out it's the opposite. It's a lot harder to teach a robot to walk properly than to teach it how to play chess.
So yeah, and there's a reason why evolution took a lot longer to have animals that can walk and really navigate through the world than developing these cognitive abilities.
Yeah, so judo, teaching the robot to actually do judo with people, is gonna be harder.
Very hard, yeah. Humans are amazing at physical tasks.
Yeah.
It's amazing how, even driving, you know, people say, oh, humans are terrible drivers, so we should have robots drive for us. Actually, if you get into it, driving is a very hard task and humans are exceptional at it.
Yeah. What's the new thing in ChatGPT? They talk about, all right, it leveled up from 3.0 to 3.5 to 4.0, like huge jumps, right? Like logarithmically humongous jumps.
Yeah, exponentially, yeah.
So what makes it such a big jump from one to the next?
Is it computing power or what is it?
Actually, in the research field, we don't think we're making these step-function jumps. A lot of researchers, including me, think we've kind of been stagnating, actually, since ChatGPT. So a lot of people are trying to find the next breakthrough. And then, for example, with 4.0, I think, OpenAI basically gave ChatGPT the ability to reason, as in multi-step reasoning.
So right now, the first iteration of ChatGPT is just kind of giving you gut answers, like whatever comes to your mind first. So that's why a lot of it's wrong.
But what humans do is like, we can kind of stop and kind of think through it before we
answer, right?
We can gather all this, you know, like when you play chess, you know, Magnus Carlsen said,
Magnus Carlsen, no, what's his name?
Magnus Carlsen, right?
Yeah, Magnus Carlsen.
Yeah, he's the same guy.
He says when he looks at the board, he has all these moves, right?
And he spends the next minute or whatever to disprove that these are good moves.
So that's kind of what people are trying to make large language models do now.
So we can generate all these gut answers, but like, can we actually think about these answers and then reason through them to be more accurate?
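One way to approximate that "think before you answer" behavior today is simply prompting for it: ask for the reasoning, ask the model to argue against itself, then ask for the final answer. This sketch assumes the openai client; the wording is one possible recipe, not how OpenAI's reasoning models work internally.

```python
# Prompt the model to reason, self-check, and only then commit to an answer.
from openai import OpenAI

client = OpenAI()

question = ("A gym has 120 members, loses 5% of them each month, and signs up "
            "4 new members a month. Roughly how many members after a year?")

prompt = (
    f"Question: {question}\n"
    "First, work through the problem step by step.\n"
    "Then list two ways your reasoning could be wrong and check each one.\n"
    "Only after that, state your final answer on its own line."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```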
So that's another breakthrough that people are trying to improve on.
Yeah, and so now it's also generating images, ChatGPT, right?
So yeah, that's done with image generators. The image generator is a separate model. ChatGPT can't generate images by itself. They have another model that takes text and then generates images. So what happens is, if you ask ChatGPT to generate an image, it'll take that request and write a prompt for that image generation model, and then once the image is generated, it returns it to you.
So that's why a lot of times it messes up.
It messes up big time, yeah.
Yeah, because it's not doing it,
it's kind of using a tool, kind of like it's an agent,
but the tool's not as good.
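The hand-off Peter describes can be sketched in two calls: the chat model writes a detailed prompt, and a separate image model turns that prompt into a picture. This assumes the openai client; the model names are illustrative.

```python
# Step 1: the language model turns a vague request into a detailed image prompt.
# Step 2: a separate image model actually generates the picture.
from openai import OpenAI

client = OpenAI()

chat = client.chat.completions.create(
    model="gpt-4o",  # illustrative
    messages=[{"role": "user",
               "content": "Write a one-sentence image prompt for a dramatic judo throw at sunset."}],
)
image_prompt = chat.choices[0].message.content

image = client.images.generate(model="dall-e-3", prompt=image_prompt, size="1024x1024")
print(image.data[0].url)  # link to the generated image
```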
But if it can do that really well, then it can make moving pictures, therefore it can make-
Yeah, they have video generators now, so text to video.
Yeah, I've seen the Will Smith eating spaghetti one. It's gotten really good now, right?
Really good, yeah.
So, wow, and that's what you're doing?
No, I'm more on the video understanding side.
Oh, video to text?
Yeah, video to text, not text to video. I'm more into video to some action or text. They call it video understanding. And what you described, video generation, the approach there is a little different.
Interesting. So, all right, you're applying for jobs now, right? And you applied to some big finance firms, because you're a total cop-out. If someone's listening who might be in a position to hire someone with your experience, what would be your value add to whatever company they're at, based on you being in the AI, ChatGPT field, and your research?
So yeah, so my research focus is on video understanding, specifically what they call
vision language models. They can process videos.
And I specifically focus on applying these VLMs in the context of what they call
embodied AI, so like physical things. So
There are some limitations and technical challenges there. So that's my AI expertise. But at the same time, I worked as a software engineer before, for seven years. So I know how to engineer an actual product that's production ready. So I'm someone who can research new approaches and then take it all the way to production.
But you can't do that by yourself. Surely you need a team of software engineers to build certain pieces.
Of course, yeah.
That's why I've worked on teams and I know how to work
within a team. I know how to lead a team, and yeah, that's a plus too.
So would it just make sense to apply to Google and YouTube and stuff like that? Because that's the kind of field that they're in.
Yeah, of course, but you know, it's tough. I am gonna apply, but it's very competitive.
Super competitive. So what's the use case for you at a bank?
So, if you think about videos, it's a series of frames, video frames, right?
And each frame is what they call noisy data. There are a lot of pixels that are useless for understanding. So the task is, given this long sequence of things, can you extract the right information out of it?
Right?
Give you an example, ready?
Yeah.
If you were to take all the video feed
from an ATM machine, is this person being coerced to take money out of the freaking machine?
Yeah. Or not?
Peter, go ahead, build a freaking product
specifically using your abilities.
Yeah. Can you do it?
I'm sure it's technically possible,
but I think it's probably gonna be hard to get the data,
so I'll probably have to rely a lot on human psychology
and body language.
No, but if you work for, oh, okay, gotcha, gotcha.
It should be a definitive yes, give me this job.
Yes, I'm sure I can come up with a solution and then we can...
Is that something a bank would likely hire somebody for to do or is that like...
They may. It's on the security side. But I've been applying more on the investment strategies side.
Oh my god, man. With all your, with the liberal, "I wanna do good for the world," and you're applying to big banks?
So they can make more money?
Just to see what's out there.
Wow, Peter.
The hypocrisy.
We'll see what happens.
It's not like they might take me.
Like I say, I'm on the video side,
so my expertise would be most fitting
to autonomous vehicles, robotics companies, or just like
straight up video understanding like sports analytics or video generation.
But like I said, there's some crossover that could happen between my expertise and what they need at financial firms.
So we both have Teslas.
What about full self-driving, did you buy it?
No, I don't have it, but I've tried the trial.
Yeah.
Does it work well? How far are we away from it?
I think it depends on our appetite for mistakes made by FSD systems. I think it's one of those cases where, you know, up to 80 or 90 percent is easy, and then, like, the first 90 percent requires 10 percent of the effort and the last 10 percent requires 90 percent of the effort. One of those things like the power law.
Yeah, Pareto.
Yeah, Pareto, or the power law distribution, whatever.
Hey man, you killed it, bro, right away. Wow, okay, nice job.
No, no, I'm impressed, man. You read a lot about economics and business, I know. Little by little, talking to you, like, you know the Diderot effect?
Yeah, yeah, the Diderot effect.
Yeah, the Diderot effect, that's right.
The Diderot effect, yes, I know. Yeah, it's a French guy, yeah.
Oh yeah, that's a funny one, right? That's a good one, yeah. Do you want to kind of succinctly explain it to our listeners?
When did we even talk about it? It's basically like lifestyle creep, right?
Yes.
Yeah.
So the French philosopher Diderot wrote a funny essay about it.
I love that essay.
It's like, let me buy a nice jacket.
Oh, I gotta buy nice jeans.
He got it as a gift, remember?
Yeah, yeah.
You didn't even buy it.
No, that's right.
I got a nice jacket.
I need nice jeans.
Oh, these nice jeans can't go with a regular bag. I gotta get a new belt. Yeah, I gotta buy nice sneakers, you know. And then your closet looks grimy next to your nice stuff, so you gotta get a new closet, and then your living room looks like shit. So yeah, the Diderot effect.
But for me, I think FSD is in a good place. You know, I've used Waymo in San Francisco. It's amazing, it works great. Really, I was so impressed by it.
How do you know there's not a person driving it remotely?
Well, there's no way they could manage or hire enough people to remotely drive all these things.
How many machines are there out there?
I don't know, but you know, it's like a whole fleet.
It's almost like, you know, the whole Uber operation.
No, I believe they can.
There's not 30,000 vehicles out there.
I don't know the number, but I know...
There's gotta be only a couple thousand at most, and you get a couple thousand Uber drivers to drive the thing remotely. I believe it still could be possible, like how the Tesla robots are controlled by humans.
Oh, yeah, that was like the demo thing, but, I mean, they were kind of deceptive and they will probably be sued for that. But also, this remote control communication, the delay and all, it would probably at this point be a lot less safe than actually having the car drive itself, because of the whole delay, and, you know, you're not attached to a cell tower.
So then it'd probably be like Starlink or something, right?
So a little bit more reliable.
I mean, it's like, you know, DARPA can do that because they have like direct link to the drones or something
But Waymo can't afford to do that. It would take an enormous amount of money, based on my understanding. But let's say, you know, let's take Waymo at face value. It's amazing. And I think Tesla, I heard the V13, the newest version, is amazing too.
So I think we're at the point where like,
it can handle most of the cases,
but now it's like, to get to the point of the policy, the road design, our urban design, now it's, what's our appetite for the risk of, you know, human casualty accidents. I think now it's into that area. It's getting into government regulation.
So Waymo is fully self-driving, you think?
Yeah.
And there's not a human intervention on the other side?
There's no driver.
No, no, no, but there's not a remote driver somewhere else controlling the computer with the inputs?
No, I don't think so. Yeah, it's an amazing technology. I encourage the government to come up with the right regulations.
We might have to change our road infrastructure and stuff like that.
Yeah, we'll see how that goes. I think technologically it'll keep improving, but it's gonna be asymptotic. It's never gonna be perfect, that's just impossible. And then it's just how much more effort we are willing to put into this thing.
So what's the next level up from ChatGPT?
So I think that's it, like the agents, you know, the ChatGPT that can use tools. And specific to my research, ChatGPT, these large language models, have this amazing capability where they can do a lot of different things at once, right? It can play chess with you, it can talk about wrestling with you, judo with you, generate images, whatever, right?
Now you can talk to it, and, yeah, a human voice talks back. You can choose what kind of voice, like, you know, smart, concise, man, woman, sharp. You can choose all these little ways to have it talk to you, and you can just go back and forth like a conversation.
It's kind of scary, man. You put a beautiful face on that, I could fall in love with that thing, like the movie Her, right?
I saw that movie.
Yeah, that was an amazing movie. It doesn't even need to have a face, you know. People will fall in love.
It's kind of nuts man because then these things remember things that you've asked it four
months ago that your girlfriend or wife is never gonna remember. Yeah.
You know what I mean?
It makes you feel like, yeah.
You can never, you know, what if your Chat GPT girlfriend
is mad at you and then brings up all this like old stuff?
That's true, that's bad, man. Wow.
Yeah, but anyway, so then.
Do they have Chat GPT API based girlfriends?
I'm sure there are.
They're already out there?
Yeah, that's going to be one of the first use cases people make.
Must be. I don't know, I haven't looked into it, but I'm sure there are.
And so that's another avenue, I guess.
You sure you don't know?
I really don't know. I wish I did.
And so, I think the big thing is what they call the generalization capability, where it can do multiple tasks. That's very sought after in robotics. So for example, Tesla FSD cannot play judo with you, right? But humans can; the same human, one person, can do judo, can drive, and all that. So that's kind of the holy grail now, like, can we make a robot that can do multiple things? And ChatGPT, these large language models, kind of show a path forward towards that lofty goal.
We don't know when it will get there. A lot of people are working on it, and I'm kind of in that field with this video stuff, video understanding. Yeah, that's likely to be the holy grail.
Yeah. Did you make any discoveries?
I made some small discoveries. Research papers don't typically make crazy discoveries.
What are the small discoveries in layman's terms?
So one challenge in making ChatGPT understand videos, that's my research field, is how we can give ChatGPT or large language models the ability to watch long videos. Typically before, people were just focusing on clips that are seconds long, like 8 seconds, 10 seconds. But that's not really useful, right? We want it to be able to watch a whole movie, or just continuously see the world in videos that are minutes and hours long, right? So there are many challenges, but I basically proposed a way to process long videos in an efficient way. And one key point was that we want to separate out the spatial things, where things are, basically, and the temporal information, how these objects move through time. If we separate those out and then give the information back to large language models, they're able to understand long videos a little better.
So for instance, this mic's not moving, that picture is not moving, this TV in the back is not moving. They're fixed, so block it all out and only focus on-
Yeah, so that's kind of like, yeah. We don't know exactly how it works. I'm just designing the architecture so that the spatial information and temporal information flow through different paths. But intuitively, yes, something like that would be happening under the hood.
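A cartoon of that split, with random pixels standing in for a long video: one path summarizes what the scene looks like, the other summarizes how things change over time. This is only meant to illustrate the intuition described here, not the actual architecture in the paper.

```python
# Separate a long clip into a spatial summary and a temporal summary.
import numpy as np

rng = np.random.default_rng(0)
video = rng.random((300, 64, 64, 3))  # 300 frames of 64x64 RGB, a stand-in for a long clip

# Spatial path: what the scene looks like (here, just an average frame).
spatial_summary = video.mean(axis=0)                  # shape (64, 64, 3)

# Temporal path: how things move (here, frame-to-frame change per step).
frame_diffs = np.abs(np.diff(video, axis=0))          # shape (299, 64, 64, 3)
temporal_summary = frame_diffs.mean(axis=(1, 2, 3))   # one motion score per transition

print(spatial_summary.shape, temporal_summary.shape)
# A real system would turn each summary into tokens for the language model,
# which is far cheaper than feeding it every pixel of every frame.
```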
I love how I asked you this question a million times and every single time you're like, you
wouldn't understand it if I told you.
But now you were able to explain it in a way where probably anybody in the world can understand it. Thanks a lot, Peter.
I've gotten better at it. I think my previous research was very niche. Sometimes it's hard for me to explain it to researchers unless they're in that specific field. This is a newer paper that I just worked on over the summer.
Wow.
So that one is a little easier to explain.
How did you come up with that? You just came up with it yourself?
Dude, it was a process.
So this is why scientists are kind of like, you know, artists in a sense.
You just kind of need some inspiration.
You have intuition on what could work and you read what others have done and
then you just try a bunch of stuff out.
Did my teachings in judo have anything to do with any of the inspirations?
Yeah, I mean, some resilience. I actually like what you say about only caring about the things that you can control, because a lot of times in research there are so many factors that are out of your control, so you just have to distill down what you can do. I always try to keep that in mind.
You think this is so long that we should do this episode in two parts? I think we usually do one hour, right?
I mean, we can just put it all in one. It's good.
I would like to cut video stuff into it, so it's kind of more visually engaging, you know. And this is such a cool topic, like, what is ChatGPT. Imagine we rank high on that video and it becomes one of the things. That'd be really cool, right?
Yeah, we'll see how it does. I mean, I do apologize.
Like, it's hard for me, explaining things is very hard.
For us dumb grapplers.
No, not even that. I think in the beginning I just couldn't really adjust to what you were more curious about.
Because I thought maybe my initial thought was that maybe
you'll be more curious about the inner workings of it,
how this actually learns.
But maybe that was not really that useful for me.
I think now that we've spoken about it,
the base level stuff and people kind of get into it,
that could be another episode.
But yeah, AI, really interesting stuff.
Thank you, Peter, for your expertise.
And we'll thank our sponsor.
We already thanked Drew.
Thank you, Drew, Hakuin.ai, for your churn needs.
And Jason and Levan.
Thank you again, our staff and supporters.
Yeah.
Fujisports.com, judotv.com, higashibrand.com.
Yup. Thank you very much guys.
And we'll see you guys in the next episode.
Let us know in the comments if Peter is talking to me in a condescending tone.
You're saying that, man.
I'm half kidding.
Alright. See you guys. Bye. Bye.