The Shintaro Higashi Show - Judo Bots! | The Shintaro Higashi Show

Starting point is 00:00:00 The Shintaro Higashi Show is sponsored by Judo TV, your premier destination for live and on-demand judo coverage. Never miss a throw. Hakuen AI. Hakuen AI helps you measure, predict, and solve customer churn. Visit Hakuin AI and start your free churn audit today. Higashi brand. Train hard.

Starting point is 00:00:20 Live strong. Wear Higashi brand. All right. Everyone, it's a little bit of a different episode today. It is the Shintaro Higashi Show, but this time with David Kim and Peter U and not Shintaro Higashi, partially for scheduling reasons, you know, no big deal. But you're just going to get more podcast goodness now that we are dividing and conquering. But, you know, this is a promise that I made to myself a couple episodes ago because I wanted to talk to the newly christened Dr. Peter U.

Starting point is 00:00:55 As he so painstakingly reminded us. You know, you got to call me doctor now. And I respect that. I respect that. So we got him back and we're going to talk about what many call a game of inches. They talk about judo and grappling in general as a game of inches or even maybe less. But my question has always been, well, how many inches is it? You know, is it one inch?

Starting point is 00:01:25 Is it half an inch? Is it three inches? And that's what got me thinking about a more quantitative approach to grappling. And this was actually inspired by, do you ever remember this show that was on TV? I think it was called Fight Science, but I'm not sure. Oh, yeah, on Discovery Channel. I remember. Yeah, yeah.

Starting point is 00:01:45 And they had their, like, 3D, like, CGI. They had a judo episode, I'm pretty sure. Yeah, yeah, yeah. Like every martial art they had on there. Like, how much force does it take to, like, break a man's ribs with your elbow, you know? And they would have all those stupid, like, sensors, visualizations of, like, what's going on. And this is an old show. I mean, this is like, I'm sure

Starting point is 00:02:09 you can catch some of it on YouTube. Yeah, the mid-2000s. I think, though, the animation that they made about Uchimata is still floating around the web somewhere. Oh, really? I see it sometimes. Oh, okay. But that sort of was the kernel here. I saw that and I was like, oh, that's sort of interesting. Over the years, it seemed like we should have the technology to sort of combine video, and even just high-resolution pictures, with, like, a physics-based model of the human body. And so the basic idea here is that the way it would work is you can model in gravity. You can model in the basic environment. You can model the bones, to some degree the elasticity, which way your joints are supposed to move, all that kind of stuff, right?

Starting point is 00:02:58 And so the thought was that given certain pictures in a sequence, or frames of a video, of a clear video, you could then infer the motion the body has to go through to get in those positions. So that's point number one. Point number two would be, if you have two of these models, like interacting models, you could then do a ton of calculations on, like, okay, what force is being applied to this part of the body, how is the center of gravity moving, you know, for a particular player versus what the other guy is doing? With the idea being you could create almost like this theoretical understanding of grappling, using like a notional grappler at first, sort of like the platonic ideal of a grappler.
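A minimal sketch of the "how is the center of gravity moving" calculation, assuming each player is reduced to a few point-mass body segments in 2D. The segment names, masses, and positions are made-up illustration values, not measured data, and `center_of_mass` is a hypothetical helper, not part of any tool mentioned in the episode.

```python
from typing import List, Tuple

# (segment name, mass in kg, (x, y) position in meters) -- all values hypothetical
Segment = Tuple[str, float, Tuple[float, float]]

def center_of_mass(segments: List[Segment]) -> Tuple[float, float]:
    """Mass-weighted average position of all body segments."""
    total_mass = sum(mass for _, mass, _ in segments)
    x = sum(mass * pos[0] for _, mass, pos in segments) / total_mass
    y = sum(mass * pos[1] for _, mass, pos in segments) / total_mass
    return (x, y)

# A toy "notional grappler" standing upright, then leaning forward.
upright = [("head", 5.0, (0.0, 1.7)), ("torso", 40.0, (0.0, 1.2)), ("legs", 30.0, (0.0, 0.5))]
leaning = [("head", 5.0, (0.4, 1.6)), ("torso", 40.0, (0.2, 1.1)), ("legs", 30.0, (0.0, 0.5))]

shift_x = center_of_mass(leaning)[0] - center_of_mass(upright)[0]
print(f"center of mass moved {shift_x:.3f} m forward")
```

Tracking this point frame by frame for both players is one way to turn "his center of gravity moved" into a number.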

Starting point is 00:03:48 But then you could change the constraints for, like, a specific person, right? So like, oh, no, I'm short and fat. I have no flexibility. What is my sort of local optimum? Like, what game should I be playing? Oh, so you develop the understanding of a grappling art, and then eventually you can use it to, like, maybe simulate situations and then try to develop your own style. But that's sort of the basic idea, and I wanted to run it by my newest Ph.D. friend.

Starting point is 00:04:22 So, newly minted. So this is a, so my dissertation was on, yeah, like, how to combine video understanding and language understanding together and then try to use that. So this is a little different. But, you know, since I do work for an autonomous vehicle company, a lot of the ideas actually apply to what they call embodied AI. It just means that the AI has a body that interacts with the physical world. But, you know, embodiment.

Starting point is 00:04:54 And in your case, it would be a car. Or, yeah, so exactly. Yeah, yeah, right, yeah. And then, you know, you can kind of, like, make the line fuzzy, and then the embodiment can be like, you know, the current buzzword is agentic AI, right? Like, you know. Yeah, of course.

Starting point is 00:05:11 And that could be considered an embodiment, too. Essentially, it does interact with an external system. But I think in the strictest sense, embodied AI refers to AI systems that interact with the physical world. And then to do that, you need to develop this intuitive understanding of physics, how things work. And that's kind of what you alluded to in this idea,

Starting point is 00:05:39 you know, basically like a model of grappling. And then you mentioned that there are two systems: one that learns the mechanics, the physics, from videos, and another that is more based on the laws of physics and then, you know, basically numerically simulates things, right? So those are actually approaches that embodied AI companies use, like robotics companies that focus on learning-based approaches, or autonomous vehicle companies. Like, famously, Tesla uses these two approaches to train their model. Oh, okay, I didn't know that. So basically, just to give you some background on why people do this both ways,

Starting point is 00:06:37 it's because the success of LLMs hinged upon the fact that we had a lot of training data, text training data, readily available online. But the problem with embodied AI is that it's really hard to collect such data. I mean, while you were thinking about this, you probably realized, you know, the IJF does have huge swaths of judo videos, but ultimately, videos are a harder medium to deal with in terms of engineering than text. And not only that, the volume just doesn't match the textual data. And then that kind of goes to show how incredible

Starting point is 00:07:22 written language is. It's like, I always like to say that's nature's way, like, nature came up with the best compression algorithm, really. Yeah, through millennia of evolution, they came up with this, like, brilliant system that could compress knowledge down to very, you know, digestible bits. The first abstraction. Yeah, basically. Yeah. It's incredible.

Starting point is 00:07:51 I mean, but, you know, nothing like that existed in embodied AI. So, you know, to overcome that, one possible way is to create, like, a simulator based on the laws of physics, like you said, and try to simulate the interactions or try to predict what's going to happen. I mean, simulations have been used for a long time. That's how we got to the moon, you know. Word. Yeah. But now the new approach is the other one, learning from videos, like data. Yeah. So now, before we keep going, though, when we talk about the video and we talk about, like, two judo guys trying to throw each other, what are some of the, when you say it's more difficult, you know, it's

Starting point is 00:08:52 maybe not as, you know, dense in terms of information, like, what are some of the problems with video? Because, like, when I think about it, you remember The Matrix? Yeah, yeah, yeah. And that bullet time, right? And have you ever seen the rig that they had to set up to make that shot? A bunch of cameras around the, yeah, it was just like a ring of cameras, like 360 degrees. And that's sort of what I imagine. It's like you would need to have, like, a bullet time set to record something. Yeah. Video is a very light-information, a light medium, actually. I mean, it's kind of paradoxical because, you know, there's the saying that a picture is worth a thousand words,

Starting point is 00:09:36 but it's actually not that simple. Yeah. Sometimes a word does carry more information than the visual. And we think otherwise because we intuitively understand the world better, because we spend most of our childhood interacting with the physical world, the best possible simulator, which is the real world. So because of that, I think visual information is a lot easier for us to digest. And also the visual cortex was developed first, before the executive function, right? Like, the neocortex came way later, and that's, like, the language understanding.

Starting point is 00:10:20 But so the video, the problem is, yeah, that it's not even just like the angle. It's just that it's, you basically have to learn the movement of pixels from very similar frames. Like if you think about it, if you, I don't know if you, if you're like kind of.

Starting point is 00:10:41 swipe through the video, you'll notice that video frames are very similar to each other. And so because of that, there's a lot of noise. So the key is, the conventional way of feeding videos into a model, a deep neural network model, is just to give it the raw pixel values. So every single pixel has the same weight, basically, initially the same significance, but the neural network has to learn what to ignore. And humans, I think the brain is very, like, we are born with the ability to kind of ignore a lot of different things, and we also

Starting point is 00:11:30 learn to ignore a lot of things as we experience the world. But, you know, it's really hard to teach that from scratch to a system that's never done that. So I think, yeah, in that way images and videos are a lot harder than language. So language actually is an easier medium in that sense, I think, because it's so information-dense. Yeah, yeah. I was thinking about more trivial things, like how do you tell two people apart? Yeah, I mean, I think those things all hinge upon this, you know. Right, they're all just symptoms. Yeah, of the basic problem. And then, like, occlusion, like when things are, and also you have to understand that, like, for us occlusion is

Starting point is 00:12:15 easy to understand when we see it in the video, because we know what happens in the real world. We understand that videos are just a representation, a projection, of what's happening in the real world. Right. But these systems don't have that at all. We have to kind of teach them, and people have figured out ways to do so, and I can go into details about some of the models that learn this type of physics of the world just from video. But again, you just have to understand that, you know, it's what they call the prior, right? There's no prior knowledge about the world in these systems. Neural networks are very generic. I mean, they used to put in a lot of inductive bias, they say, and try to like

Starting point is 00:13:09 Like, they used to use architectures called convolutional neural networks to process images. And convolution, it's just a pure mathematical concept, but it has some bearing on how the visual cortex works, the human visual cortex. So you can kind of see that, oh, you know, it's useful to process images. But now, I mean, in certain applications, we still use convolutional neural networks, but the architectures we use now are very generic. Any type of information can be.

Starting point is 00:13:47 So, like, mathematically, very simple. There's no. So that's what that SciPy method is, convolve. Yeah, convolve. Yeah, that's the verb for convolution. Yeah, convolve, it comes from. Right. I never used it.
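A minimal sketch of what "convolve" does, written in plain Python so it runs without SciPy. The function name `convolve_1d` is a hypothetical stand-in; it computes the same "full" discrete convolution that `scipy.signal.convolve` returns by default for 1D inputs. The input signal and the `[1, -1]` kernel are illustrative choices only.

```python
def convolve_1d(signal, kernel):
    """Full discrete convolution: out[n] = sum over i of signal[i] * kernel[n - i]."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

# A row of pixels that jumps from dark (0) to bright (1).
row = [0, 0, 1, 1, 1]
# A difference kernel responds only where neighboring pixels differ, i.e. at edges.
edges = convolve_1d(row, [1, -1])
print(edges)  # non-zero entries mark where the brightness changes
```

Sliding a small kernel across an image like this is the building block of a convolutional neural network; in a CNN the kernel values are learned rather than fixed by hand.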

Starting point is 00:14:02 It's from, yeah, like digital signal processing. And then later, they kind of figured out there's some connection, similarities with how the visual cortex works. What about mocap? What is... What do you think about motion capture? Motion capture as, like, a basis for sort of seeding this more physics-based model. Because, like, we're getting away from...

Starting point is 00:14:30 Yeah, I mean, the information is more dense, but it's harder to... It's just the quantity is not there. You're going to hire a bunch of people? You're going to pay judo people to wear mocap gear? Yeah. Well, we're just talking about in theory. Yeah, I mean, in theory, if you can scale up, yes, it'll be able to learn about the... Well, the idea here...

Starting point is 00:14:54 Yeah. Yeah, the idea here would be to get almost like a... Not sort of like a full distribution of, like, everything that could ever happen. It's more like templates or, like, canonical movements, right? Just enough to get to maybe some kind of reinforcement learning situation, right, where you have the model, you have enough of these, yeah, I don't know what you would call them, templates or something, such that it has enough reference to, you know, do the simulation. So now we're learning that it's better to build a model that has a lot of general knowledge and then focus on the specific.

Starting point is 00:15:43 so oh so you're saying the opposite it's actually the opposite just because you were focusing on only judo moves ultimately if you want to accurately model physics

Starting point is 00:15:56 of judo you probably have to know about the world I mean that will always help so that kind of goes into like now it's called nowadays people call it world models because you're learning about the world and there are many

Starting point is 00:16:14 ways to teach a model to do that, but one popular way is to have the model predict the, you know, subsequent frames, right? And then, I mean, that's kind of what you are alluding to, right, like from videos. And then you just show the model a bunch of videos, and eventually, in order to be good at predicting, you know, frames that are seconds later, it has to learn about the world. Yeah, yeah, yeah, yeah. Just like an LLM does. Yeah, it sort of predicts that next word, right?
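A minimal sketch of the next-frame prediction objective described above, assuming a frame is just a flat list of pixel brightness values. The two predictors (`predict_copy`, `predict_linear`) are hypothetical hand-written baselines, not learned models; in a real world model a neural network would take their place and be trained to drive the same error down.

```python
def mse(predicted, actual):
    """Mean squared error between two frames (lists of pixel values)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def predict_copy(frames):
    """Baseline 1: assume nothing moves, so the next frame equals the last one."""
    return list(frames[-1])

def predict_linear(frames):
    """Baseline 2: assume each pixel keeps changing at its most recent rate."""
    return [2 * cur - prev for prev, cur in zip(frames[-2], frames[-1])]

# Toy clip: two pixels getting steadily brighter (values are illustrative).
history = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
true_next = [3.0, 6.0]

print("copy-last error:", mse(predict_copy(history), true_next))
print("linear error:   ", mse(predict_linear(history), true_next))
```

Because consecutive frames are so similar, the copy baseline is already hard to beat on short horizons; a model has to capture actual motion to do better, which is the sense in which the task forces it to learn about the world.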

Starting point is 00:16:55 And, you know, there's a lot of discussion about if this is the best task and whatnot. This is all, like, boil the ocean, kind of. Yeah, it becomes academic, but then, I mean, boil the ocean, yeah, but it turns out that's kind of the best approach, because the other way is just building a simulator. And then the simulator is kind of falling out of favor, you know, because ultimately it's so brittle.

Starting point is 00:17:28 So even if you can kind of account for every single variable, I mean, it's impossible to basically account for every single variable and then predict the next frame. So it's actually sometimes more accurate to build a generative video model than to try to numerically simulate what's going to happen next in terms of physics. And, you know, I had been thinking about it before this episode so that I can talk more about it. But one good analogy is, like, we don't really know the laws of physics, like, we don't have to know the formula to predict where the ball will go, right? Yeah. So it's kind of like the same idea. The

Starting point is 00:18:21 answer is it might be actually more computationally efficient to just kind of imagine what will happen after just having seen it. Yeah, exactly. That's actually why we're so good at that, because we're always, like, experimenting. We always get immediate feedback from the world, right? So it's like a never-ending A/B test. Exactly, exactly. And then, so I think simulations are, I mean, based on the public information, Tesla apparently still uses them to, you know, test and train their driving model. And I'm sure it has some utility, but it's just so labor intensive. You just need to hire so many people.

Starting point is 00:19:15 And it's always, like, you're chasing the dragon. You will never be able to catch it. I mean, that's why, yeah, your idea about how to guide the model with the laws of physics to learn about the world, or at least the judo world. So these are some of the things that you might have to think about, because building a physics simulator might be a lot harder than just trying to have a model learn from videos.

Starting point is 00:19:49 Yeah, the hope was that it would be kind of the opposite, right? Because you're going to have this environment, obviously, physics-based to get back to the original question that's like, you know, it's a game of inches, how many inches, right? Like, that should be the next logical question, right? Like, how many inches is it? So when I pull on that collar, you know, the ultimate feedback would be you needed to get under him another two inches or you didn't you didn't apply enough, right? Like how far is far enough? Did I not?
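A minimal sketch of how "how many inches" could become a number, assuming a deliberately crude static picture: in one horizontal direction, a player is balanced while the center of gravity stays between the feet, and the "missing inches" are how far it sits outside that span. The function name `off_balance_inches` and all positions are hypothetical; a real throw is dynamic and this ignores momentum, grips, and the second player entirely.

```python
METERS_PER_INCH = 0.0254

def off_balance_inches(com_x, foot_xs):
    """Distance (in inches) by which the center of gravity lies outside the feet.

    com_x:   horizontal position of the center of gravity, in meters
    foot_xs: horizontal positions of the feet, in meters
    Returns 0.0 while the center of gravity is still over the base of support.
    """
    lo, hi = min(foot_xs), max(foot_xs)
    if com_x < lo:
        overshoot = lo - com_x
    elif com_x > hi:
        overshoot = com_x - hi
    else:
        overshoot = 0.0
    return overshoot / METERS_PER_INCH

feet = [-0.2, 0.2]  # feet 40 cm apart (illustrative)
print(off_balance_inches(0.10, feet))   # still inside the base
print(off_balance_inches(0.25, feet))   # pulled past the front foot
```

Comparing this margin before and after a pull on the collar is one hedged way to phrase feedback like "his center of gravity didn't move" in inches.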

Starting point is 00:20:19 Like, his center of gravity didn't move. Yeah. Yeah. No Kuzushi, right? Like, it's putting a number against the things we talk about in very general terms, like Kuzushi. Like, well, what, I mean, we all understand what Kuzushi is, right? But what is enough Kuzushi? Like, and am I even taking advantage of it when I get it? Yeah, right. Um, and so, and you

Starting point is 00:20:45 if you just take that idea and apply it to any grappling situation, right? Because it's all just manipulating, you know, force here and reaction there, and, you know, the center of gravity moves. Yeah, yeah. Right? That's basically everything in grappling. Yeah. And then if you take that a step further, in terms of whether you're an athlete, you're a coach, you're a commentator, right? Imagine the telestrator, you know, you would have on Judo TV, right? You could, oh, this is why that failed. This is why they succeeded. I remember I was talking about it with Shintaro once, about, like, sometimes things happen so fast. Yeah. It's hard to tell what happened. Yeah, yeah. Right? Well,

Starting point is 00:21:31 Well, a model like this, you could sort of infer, you know, like, okay, well, this is what must have happened in order for that to work. And finally, like, you look at an Olympic athlete or something like that. It's like, why is his Uchimata so special? Why is his Osoto Gari so special? And just, like, to use your golf analogy, right? Yeah. I'm sure they have, like, golf swing analysis, like, what's your swing look like compared to, what's your swing

Starting point is 00:22:04 look like next to Jim Furyk or, you know, whoever. You know Jim Furyk? Okay. Yeah, well, I'm old. I know Jim Furyk. I'm sure they have those comparisons. Yeah, they always, yeah, a library of those guys. Yeah, yeah, they have that. You could totally do that, yeah, in judo. I think I was about to mention golf or, like, skiing. Like, these individual sports, it is easier to model, so you don't need to have this learning-based method, I don't think. Yeah, and because it's just you yourself. It's not like a, what, a second-order system, right? Like, it's just action, reaction. But judo, it is a sec...

Starting point is 00:22:45 And very position... Yeah, exactly. Like, a golf swing is very positional. Golf is even easier, I think. I mean, they're just literally, like, drawing lines on a screen, you know. Yeah. And then they have 3D models of players. Like, they reconstruct, I don't know how they really reconstruct, maybe it's manual, but they actually just have a video. Is it a marketing thing or is it real? Like, I've seen it, like a video game. No, it's more like they market it as, like, a teaching assistant tool. Like, some coaches will use that. It's like, hey, we model you based on the video. I don't know if they do mocap or whatever, but yeah, they construct this, like, 3D model of your

Starting point is 00:23:34 swing and then overlay it with someone else's, like a Rory McIlroy or something, and just say, hey, you see, like, you are casting. McIlroy? So yeah, casting too early, you gotta, like, keep the, like, yeah. Thank you for not calling me out. I thought I was, like, I thought maybe there was a Rory MacDonald, I don't know. But, you know, I'm a huge McIlroy fan. But anyway, see? Yeah. So people do do that for judo.

Starting point is 00:24:06 And then even with skiing, they have, that's another individual sport I do. And they do that. But I think with judo, it's like a second order system. So there's action, reaction. And then, like, that action changes based on the reaction, you know? Yeah. Yeah. So it's more dynamic.

Starting point is 00:24:28 dynamic system. So I think there's a combinatorial explosion happening. That's probably why it's hard to model actual Randori or matches in this way. Right. So I think that's why learning-based approaches, just purely based on videos, it needs more scale, but it has a higher ceiling, right? Yeah. Yeah.

Starting point is 00:24:53 I mean, that would definitely be better. But would that yield the same kind of, like... It won't be like, oh, inches. I mean, so now, say we built, essentially, a world model for judo. That's more fun, yes, let's just assume. Then you could use this system to basically give all the context so far, and then someone's trying to throw. You could basically give the model everything up until the execution of the throw and then say, hey, what's going to happen next? And then you can compare that with what actually happened, and then you can kind of see where things went wrong.
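A minimal sketch of the "compare the prediction with what actually happened" step, assuming both the predicted and the real clip are available as lists of frames, where each frame is a flat list of numbers (pixel values or tracked joint positions). The function name `first_divergence` and the threshold are hypothetical illustration choices.

```python
def first_divergence(predicted, actual, threshold):
    """Return the index of the first frame where prediction and reality differ
    by more than `threshold` in any single value, or None if they never do."""
    for t, (pred_frame, real_frame) in enumerate(zip(predicted, actual)):
        worst = max(abs(p - r) for p, r in zip(pred_frame, real_frame))
        if worst > threshold:
            return t
    return None

# Toy example: one tracked value per frame (e.g. hip height in meters, illustrative).
predicted_clip = [[1.00], [0.90], [0.60], [0.20]]   # the model expected the throw to finish
actual_clip    = [[1.00], [0.92], [0.95], [1.00]]   # the opponent stayed up

print(first_divergence(predicted_clip, actual_clip, threshold=0.1))
```

The returned frame index is only a pointer to *when* the match departed from the model's expectation; it says nothing by itself about why.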

Starting point is 00:25:43 Like, that would be one way to use this type of model. And then one other thing, like, if we bring in language, maybe we can have the model explain it to us, you know. But this is my question, though. Let's say it's predicting the next, yeah, 24 frames, right, or whatever. Yeah. You fail. Yeah. It seems to me this model would not be able to postulate the reason why, though. Yeah, so if it's a purely visual physics model, like a world model, it won't be able to explain it to you. I mean, that's the whole thing, right, explainability. So now another approach would be, for example, to overlay some kind of, you know, visualization on top of that. So for example,

Starting point is 00:26:43 I recently learned this, you know, have you been in a Tesla with FSD? No, I have not. So if you turn it on, it drives, and then you can kind of see what the car supposedly sees. Like, it has a reconstruction. Oh, no, I take it back, I have seen it. You can see, like, the semi-trucks going by. Yeah, yeah, yeah, yeah, yeah. And this is, you know, kind of the industry knowledge. I don't actually know if they actually do it this way, right? Like, you know, I don't work for Tesla.

Starting point is 00:27:16 But based on what I gather from, you know, talking to people, they basically have a model with different heads. They call it, like, a HydraNet, like the monster Hydra. So all the sensor information gets processed by one gigantic trunk, basically, and this branch focuses on driving and another branch focuses on rendering things. So now it kind of gives you that feedback. So if you apply this approach to judo, we could have this gigantic trunk that processes all the videos, and then one head predicts the next 24 frames, like you said, and the other maybe could generate

Starting point is 00:28:08 commentary, or, like, why it thinks the throw will fail, right? Which is, okay, plausible. So the problem is, so far, there's no guarantee that the explanation will match up with the actual frame prediction. Right, right, right. It could be completely different. Yeah, so, I mean, I think human brains kind of do this too. Like, we do this one thing and then we make up the justification after, you know, that kind of thing. That is the guy who stole my car. Yeah, yeah, it's just like, yeah, exactly.
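A toy sketch of the shared-trunk, two-head layout described above, assuming frames are flat lists of numbers. This is not Tesla's architecture or any real network: `trunk`, `frame_head`, and `commentary_head` are hypothetical hand-written functions standing in for learned components, used only to show how two outputs can hang off one shared representation.

```python
def trunk(frames):
    """Shared processing: summarize the clip as its last frame plus per-value velocity."""
    prev, cur = frames[-2], frames[-1]
    return {"last": list(cur), "velocity": [c - p for p, c in zip(prev, cur)]}

def frame_head(features):
    """Head 1: predict the next frame by continuing the current motion."""
    return [v + dv for v, dv in zip(features["last"], features["velocity"])]

def commentary_head(features):
    """Head 2: produce a text description from the same shared features."""
    speed = sum(abs(dv) for dv in features["velocity"])
    return "moving" if speed > 0 else "static"

clip = [[0.0, 0.0], [1.0, 2.0]]       # illustrative values
features = trunk(clip)                 # computed once, reused by both heads
print(frame_head(features))
print(commentary_head(features))
```

Both heads read the same features, but nothing in this structure forces the commentary to be the true reason behind the frame prediction, which is the mismatch raised in the conversation.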

Starting point is 00:28:50 And so maybe that's just a problem that can never be solved but I'm not saying this approach always fails Yeah Well that's the Optimization process right

Starting point is 00:29:05 You're trying to reduce that error. I mean, I'm sure Tesla engineers put a lot of work in to minimize that. But yeah, so it would probably have to be something like that. So, like, one head

Starting point is 00:29:18 would predict the next frames and then another component will try to explain I suppose it would have a notion of distance like if you change the scenario a little bit possibly I don't know how you would do it but maybe it's like oh if he was one inch

Starting point is 00:29:41 in this direction, now what do you predict? Yeah. Then it's like, you could jitter the input, kind of, right? Or you could do that, or, you know, the video generation models are all text-conditioned now, so you could describe the change you'd like to make. Oh, now we're, yeah. I mean, so that's kind of the dream, right? Like, yeah, exactly. I mean, we've gotten very good at generating videos with text as an input, like a text

Starting point is 00:30:13 conditioned video generation. There's been an explosion of things, but, you know, applying that to embodied AI... The physics don't look right there. Yeah, exactly. I mean, that's a challenge, yeah. It's gotten a lot better. It's obviously gotten a lot better. But every now and then you just see, like, the human eye is so good.
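A minimal sketch of the "jitter the input" idea from a moment earlier: nudge one input by a small amount, ask the predictor again, and look at how much the answer changes. `sensitivity` and `toy_throw_score` are hypothetical names, and the scoring formula is invented purely so the example runs; a real system would call a learned model in its place.

```python
def sensitivity(predict, state, index, delta):
    """Change in the prediction when state[index] is nudged by delta."""
    baseline = predict(state)
    nudged = list(state)          # copy so the original state is untouched
    nudged[index] += delta
    return predict(nudged) - baseline

def toy_throw_score(state):
    """Made-up stand-in for a model: state = [entry_depth_inches, opponent_resistance]."""
    entry_depth, resistance = state
    return 2.0 * entry_depth - resistance

state = [1.0, 0.5]
# "If he was one inch further in this direction, now what do you predict?"
print(sensitivity(toy_throw_score, state, index=0, delta=1.0))
```

Repeating this over several inputs and step sizes gives a rough picture of which inches matter most to the predicted outcome.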

Starting point is 00:30:31 You're just like, that is weird. Because we, yeah, we are extremely well trained in spotting these things. So, I mean, that's the ultimate challenge. But you can kind of see the way I'm going with this, right? Like, a lot of these problems, like autonomous vehicles, robotics, even your idea about judo, are kind of related. I think ultimately, in order to build a robust system, you should probably follow something like this. But then again, if you go into the practicality, like, is it even possible with judo? I don't know. You know, like, with the videos

Starting point is 00:31:06 we have um right so to to to sum this up a little bit it sounds like okay the original idea with a physics-based model. It sounds like maybe technically, it might be possible. It just economically would never make any sense because you just would never be able to or the economic motivation to tackle this problem is just so much beyond probable demand for this kind of solution. On the other hand, we've got this sort of more AI-based, you know, text versus video kind of play but that's still it's it's sort of a little bit beyond a bleeding edge still yeah that's still

Starting point is 00:31:52 I mean, yeah, that was my dissertation. You can still get a PhD with it. That's pretty, yeah, yeah, exactly. It's not just something you just sign up for. It's like, yeah, let's do this, man. You can still get a PhD researching it, so it's pretty bleeding edge. Yeah, yeah. But there's a lot of money going into it. I think that's the thing. I think a lot of the applied research, the application of this type of research,

Starting point is 00:32:23 goes towards autonomous vehicles because there's money in it. There's more data. Because even Boston Dynamics, I mean, how many times have they been bought and sold by now? It's amazing technology, but it's just not making it. And then it's hard to find

Starting point is 00:32:39 a use case. And Boston Dynamics, they actually don't use this type of method. They're more on the classical robotics side. It's very math-heavy, classical planning. It's algorithmic. And yeah, I know, but it looks so good. It looks good, but yeah, it takes a lot of work to get there. And yeah, I worked with some of those people in classical planning, and they're brilliant. They're, like, brilliant mathematicians. But it's completely different. I'm not good at math, so I didn't even dream of, you know.

Starting point is 00:33:18 The irony. Right. It's a, you don't, you don't require, yeah, they, they have to like. You're not good at PhD. Yeah. Yeah. But normal math. I'm pretty sure you're okay.

Starting point is 00:33:29 I guess I could get. Everybody listening, do not believe this guy. That, that robotic, the math they do is like, you need, like, type, actual guarantee. You have to actually do a lot of drugs. Yeah, like, yeah, you know, you probably do, like, a lot of cocaine going in there, I guess. yeah right or something some shrooms something

Starting point is 00:33:48 keep that keep that dream but yeah so I think that this is yeah but it is viable I think you know you can kind of see

Starting point is 00:33:58 where this could go, right? Like, you know, yeah, basically a lot of the capabilities we've seen with ChatGPT on text, and, you know,

Starting point is 00:34:08 combine that with judo videos maybe something could happen I mean that would be the dream right like you just put in video of your match and it tells you it breaks down like all the significant situations and this is why you failed this is why you succeeded you know X degrees or you know whatever you were

Starting point is 00:34:27 able to... Having a personal coach, basically. Yeah, yeah, yeah, yeah. That would be the ultimate, in your pocket, kind of. Right, right. Well, actually, I think about Chris Round, you know. Oh, yeah, yeah, yeah, yeah. I've talked to him. Yeah, and he does all that, like, crazy scouting stuff for his athletes. Oh, okay. I mean, that would be a dream for him, right? Like, you feed in everybody's matches, you know, all their tendencies, you know, like they like to go left here, they go right here, they like this Kosoto, they like this De Ashi in these situations. And, you know, if you can stay on this angle, like, his job would be to create the drill

Starting point is 00:35:02 to get them to respond, right? Like, they stay outside, like, this danger zone. Maybe that's the economic angle, scouting, then, whether that this coach. The thing is judo is always hard. because I feel like in other sports, you have this abstraction called a ball. Oh, yeah. Soccer, it would be a little easier, I guess. Yeah, soccer would be a good one.

Starting point is 00:35:23 F1 is already up to the gills in technology. I don't think they need anybody else's money. They burn sacks of money every second, it seems like. But these other sports with the ball, you have the separation. So some of these other approaches, I feel like, they're easier in a way, right? But even then, if you go to, like, the Sloan sports analytics conference, you know, I haven't been recently, but, you know, there used to be a lot of talks on, like, machine learning and, you know, breaking down plays, you know, all that kind of stuff.

Starting point is 00:35:55 And I'm sure people are using it, but I'm not sure it ever met the promise that people were hoping for. I forget what, man, I recently read something interesting. It basically classifies, like, why Moneyball works so well in certain sports and not... oh, it was, like, why didn't Moneyball work that well in soccer? That was what it was. And, right, it's because it's very different. Yeah, baseball is a very statistics-heavy game, right? It's turn-based. It's basically easy to keep track. But soccer is very fluid. There are no turns. Yeah. Because of that, it's even harder to,

Starting point is 00:36:40 you kind of have to have a feel for it, you know. So, yeah, I think Moneyball, the impact wasn't as great, you know. Yeah, it certainly did help to try to find cheap, good, underpriced players. Yeah, it wasn't to the point where one team, like, you know, like the Athletics, just kind of went to the World Series. Yeah, never happened like that. I was going to put a challenge out, yeah, to the audience, to any of the ambitious software engineers out there. Yeah, I'm telling you, a lot of software engineers do BJJ and judo now. Yeah. Yeah. Oh my God. We'll leave that for another conversation because I'm just, this is your

Starting point is 00:37:23 mission, I guess, since you're in Silicon Valley. What is my mission? You can actually find out. Yeah. Oh, my mission. If it is, if it is getting big or not. I'm just curious. It's like, anecdotally, I've heard that it's getting more popular in my circle. Yeah. Yeah, which is very surprising to me because, when I started in, like, the mid-2000s, like, wrestling and just, it was not a sport for... An attractive thing. Yeah, it was not an attractive thing to do.

Starting point is 00:37:54 Yeah, yeah, exactly. So it's like, I happen to be in Michigan, and wrestling is very popular in Michigan. That's why I, like, started doing it. Oh, did you grow up? You actually grew up there? Yeah, I went to high... I moved to Michigan when I was 15 from Korea. Oh, okay.

Starting point is 00:38:09 So this is almost like you're going home. Yeah, yeah, exactly. Yeah, I still, yeah, yeah, like, I'm reconnecting with my high school friends and, yeah, all that, yeah. Oh, you're such a good guy. Yeah, so, yeah, I'm a Midwestern boy. Yeah, I was going to, yeah, stereotypical Midwestern boy. Who wrestled and now does judo, yeah. That's right.

Starting point is 00:38:32 Yeah, I was going to put out a list of some of these open source packages. I guess I'll still put them into the description and just sort of see. But there's just been so much development, you know, in machine learning and this modeling and now in this sort of neural networks and LLMs that you can't help, but, you know, and it's like, is it getting cheap enough and widespread enough and understood enough to apply this to not, like, going-to-the-moon problems? You know what I mean? But it seems like we're not quite there yet, but you never know. There might be a clever solution out there. I wasn't like trying to say this is like,

Starting point is 00:39:11 I wasn't trying to be like, oh, it's not there. Yeah, I think this type of constraint actually gives a fertile ground for creativity. So maybe, you know, you have to kind of strike a balance. Of course, you can have this whole deal. Maybe if you really want to solve this, maybe this type of crazy approach, learning-based approach, should be applied. But the reality is, you know, you have to consider the practicality. And maybe this type of, your idea of kind of like using the physics simulation as a crutch to get the model to some usable space.

Starting point is 00:39:46 And then they get into that loop of, like, you know, virtuous cycle of improvement. Yeah, yeah, people can definitely start. You should start like that, yeah. Well, before we close things down, there is one for our software and technical people out there. There is one other idea that you could explore that is far more practical, that I know is possible right now, and that is the bot. Yeah. So if you imagine even someone like Shintaro, right, and this is something I'm actively messing around with right now, Shintaro has an inordinate amount of videos on YouTube, and he's also recorded his fair share of instructional

Starting point is 00:40:33 material, both on his own site and for BJJ Fanatics and such. So the idea here would be to create, like, a Shin Bot. Shin Bot, right. That would encompass all the material he's ever put out. That's sort of the idea. And people are using these systems, you know, people call them RAG systems or KAG systems or whatever, where you're using an LLM to essentially aggregate, I guess, I mean, it's not really, you're doing the aggregating as a developer, but you know what I'm saying. It sort of aggregates all the information on the subject, and you can ask it questions and all that kind of stuff, and it'll sort of serve as your, like, your consigliere, you know, on a particular knowledge base of material. And so this is something that I'm just sort of messing around with, with Shintaro's material,

Starting point is 00:41:23 just to sort of see how good you can get it. I would never, never try to simulate Shintaro himself because I think he's too unique for that kind of thing. But I could create sort of maybe an autistic, you know, assistant. Yeah, there's like a, that sort of knows everything, anything that he's said and done. And it's sort of interesting. It's sort of working. Oh, really?

Starting point is 00:41:47 What did you use? Oh, yeah, yeah. Right now I'm using, I think I'm using Gemini. Okay, okay. Right now, like this sort of, you know, standard stuff, you know. I've used, I've used all three of them, but, you know, just for the context window and stuff. I'm just using Gemini for convenience.
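[Editor's sketch] The retrieval half of the RAG idea described above can be illustrated roughly as follows. This is a minimal sketch and not the setup used on the show: it assumes the material has already been transcribed and split into text chunks, it swaps in a simple TF-IDF score where a real system would more likely use embeddings, and the names `build_index`, `retrieve`, and `build_prompt` are hypothetical. The resulting prompt would then be sent to whichever LLM you prefer; that call is left out on purpose.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Return one TF-IDF vector (a dict) per chunk, plus the IDF table."""
    counts_per_chunk = [Counter(tokenize(chunk)) for chunk in chunks]
    doc_freq = Counter()
    for counts in counts_per_chunk:
        doc_freq.update(counts.keys())
    n = len(chunks)
    # Smoothed IDF so that every known term gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in counts.items()}
               for counts in counts_per_chunk]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, vectors, idf, k=2):
    """Return up to k chunks that share vocabulary with the question."""
    q_counts = Counter(tokenize(question))
    q_vec = {t: c * idf[t] for t, c in q_counts.items() if t in idf}
    scored = sorted(((cosine(q_vec, v), i) for i, v in enumerate(vectors)),
                    reverse=True)
    return [chunks[i] for score, i in scored[:k] if score > 0.0]


def build_prompt(question, passages):
    """Assemble the text that would be handed to an LLM of your choice."""
    context = "\n".join(f"- {p}" for p in passages)
    return (f"Answer using only these excerpts:\n{context}\n\n"
            f"Question: {question}")
```

The point of the design is that the model only sees the few excerpts that match the question, which is what keeps the answers tied to the instructor's own material.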

Starting point is 00:42:03 Nice, nice. But, so there's that. But then to take it a step further, for old guys like me who have some income, disposable income, I'm sure many of us have an embarrassingly large library of these instructionals. And there are some instructors that you like and some of them that you like but you just can't bear to watch. I won't, you know, I won't name mine by name because I'm sure somebody else would love them. But you know who I mean. So, you know, imagine being able to synthesize and distill everything in your library.

Starting point is 00:42:40 But it would be on demand, just in time. You'd say, hey, I'm having problems from this position in half guard or whatever position. And you would have access to the exactly relevant material from your different instructionals. Now, practically speaking, it is a lot of work because you've got to strip the audio. You've got to, you know, some of these transcription packages are amazing. They just, like, zip, an hour of audio is transcribed in a second. Pretty well, pretty well. So that's pretty amazing.
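[Editor's sketch] Once the audio has been transcribed, the next step in a pipeline like this is usually to cut the transcript into chunks that remember which instructional and which timestamp they came from, so an answer can point back to the exact spot in the video. The sketch below assumes the transcription tool gives you a list of segments with a start time in seconds and some text; the `Segment` and `Chunk` types, the 60-second window, and the function names are all illustrative assumptions, not part of any particular package.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the video
    text: str


@dataclass
class Chunk:
    source: str   # which instructional this chunk came from
    start: float  # start time of the first segment in the chunk
    end: float    # start time of the segment that follows (or of the last one)
    text: str


def _close(source, segments, end):
    """Join the buffered segments into a single chunk."""
    return Chunk(source, segments[0].start, end,
                 " ".join(s.text for s in segments))


def chunk_segments(source, segments, window_seconds=60.0):
    """Group consecutive segments into chunks of roughly window_seconds."""
    chunks = []
    current = []
    for seg in segments:
        # Start a new chunk once the buffered one spans the whole window.
        if current and seg.start - current[0].start >= window_seconds:
            chunks.append(_close(source, current, seg.start))
            current = []
        current.append(seg)
    if current:
        # No later segment exists, so the last start time stands in as the end.
        chunks.append(_close(source, current, current[-1].start))
    return chunks


def format_timestamp(seconds):
    """Render seconds as HH:MM:SS, the format used in this transcript."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

Each chunk can then be indexed for search, and a hit can be shown as something like "half guard instructional, 00:34:27" instead of making you rewatch the whole thing.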

Starting point is 00:43:14 So that's something that you may see. I'll invite you to the repo. Oh, okay. Okay, I'll take a look. So you can take a look. Oh, so this is official, it's like a repository and everything. Yeah. Oh, yeah, we got a repo, baby.

Starting point is 00:43:27 Everything's got to be organized in this house. Plus, it's a lot easier when it's digital. If you look at my room, there's socks everywhere. But that's sort of one idea to throw out to people who are technically inclined and very desperate about their own technical development. I bet, I bet this could be a good addition to your resume, if you want to, like... Oh, it's too late for my resume. Well, no, I meant, like, the audience. Oh, yeah, for sure. Yeah, for sure. And it

Starting point is 00:44:02 would be, you know, and who knows, you can open it up to your best buddies, only your best buddies, because, you know, you don't want to spread this stuff around. But a lot of us own a lot of this material, and it's just sitting there because we're just like, I'm not watching this, I'm not going to watch this again, you know, no way. But if you could query it, if you could, yeah, query it at will. Why, why would I ignore the half of my body and thing, you know? Yeah, yeah, 50% of the body. Why ignore 50% of the body? Yeah, and it's just hard, man, to watch hours and hours of material, guys. It's just hard. Just give me the golden nuggets, for God's sake.

Starting point is 00:44:39 Man's a serious academic. So he has to work within this tightly formulated framework, you know, where everything is well defined. Yeah. But the nice thing, too, would be like if you could take it a step further, just aside from just pure recall, right? If there are drills or exercises or stuff like that, you could combine that kind of information and, you know, create your own plan, create your own, you know, something tailored

Starting point is 00:45:07 to you. Yeah, right. Yeah, the synthesis will be amazing. Yeah. I'm sure it'll happen. I mean, with just the base LLM, the knowledge already in LLM. The technology is, I think, is there with a little bit of extra, you know, spice. Yeah, yeah, yeah, yeah.

Starting point is 00:45:26 You could do it. I think you could do it. Maybe this will inspire a lot of people to actually set up this judo bot or BJJ bot. Whatever. If you haven't gotten involved with this stuff for your own personal projects, people, even if you don't know anything about writing software, I highly encourage you to check it out because my wife, who is not technical at all, has written, I would say, like three different applications for our own use.

Starting point is 00:45:58 Oh. Yeah, in the workplace, like her own sort of dashboards and stuff. I mean, with a little bit of coaching from me. Wow. Because I love my wife. When you're in a corporate environment, you don't need to worry about, like, serving millions of people. You're just serving yourself and maybe a few of your colleagues.

Starting point is 00:46:16 So you are the, you know that old saying about, like, you are not the user, you are not your customer, so you need to, like, think more broadly? In this case, the opposite is absolutely true. You are the user and you know exactly what you need. You know? So I would highly encourage people to check it out. Yeah. That's my soapbox for today.

Starting point is 00:46:34 Thank you, everyone, for listening, if you're even listening this far. We'll talk to you again. All right. Thanks, guys. Thanks, Peter.

