The Shintaro Higashi Show - Judo Bots! | The Shintaro Higashi Show
Episode Date: July 28, 2025
In this episode of The Shintaro Higashi Show, David Kim and newly minted PhD Peter Yu explore the intersection of judo and cutting-edge AI. With Shintaro away, they dive deep into what it would take to quantitatively model grappling as a "game of inches," from motion capture to video-based simulation, and the promise of building smarter, more personalized training tools using machine learning and embodied AI.
🚨 LIMITED-TIME OFFER: 40% OFF 🚨 The All-in-One Instructional Bundle just got even better. Every major instructional. One complete system. Now at our biggest discount yet. Grab yours now at 40% off: https://higashibrand.com/products/all-instructionals This won’t last. Build your game today.
🔥 Get 20% OFF FUJI Gear! 🔥 Looking to level up your judo training with the best gear? FUJI Sports has you covered. Use my exclusive link to grab 20% OFF high-quality gis, belts, bags, and more. 👉 https://www.fujisports.com/JUDOSHINTARO 👈 No code needed – just click and save!
Links:
🇯🇵 Kokushi Budo Institute (The Dojo) Class Schedule in New York, NY 🗽: https://www.kokushibudo.com/schedule
🇯🇵 Higashi Brand Merch & Instructionals: https://www.higashibrand.com
📚 Shintaro Higashi x BJJ Fanatics Judo Courses & Instructionals Collection: https://bjjfanatics.com/collections/shintaro-higashi/
David Kim YT/Insta: @midjitsu
 Transcript
    
                                        The Shintaro Higashi Show is sponsored by Judo TV, your premier destination for live and on-demand
                                         
                                        judo coverage.
                                         
                                        Never miss a throw.
                                         
                                        Hakuen AI.
                                         
                                        Hakuen AI helps you measure, predict, and solve customer churn.
                                         
                                        Visit Hakuin AI and start your free churn audit today.
                                         
                                        Higashi brand.
                                         
                                        Train hard.
                                         
    
                                        Live strong.
                                         
                                        Wear Higashi brand.
                                         
                                        All right.
                                         
                                        Everyone, it's a little bit of a different episode today.
                                         
                                        It is the Shintaro Higashi show.
                                         
                                        But this time with David Kim and Peter Yu and not Shintaro Higashi, partially for scheduling reasons, you know, no big deal.
                                         
                                        But you're just going to get more podcast goodness now that we are dividing and conquering.
                                         
                                        But, you know, this is a promise that I made to myself a couple episodes ago because I wanted to talk to the newly christened Dr. Peter Yu.
                                         
    
                                        As he so painstakingly reminded us.
                                         
                                        You know, you got to call me doctor now.
                                         
                                        And I respect that.
                                         
                                        I respect that.
                                         
                                        So we got him back and we're going to talk about what many call a game of inches.
                                         
                                        They talk about judo and grappling in general as a game of inches or even maybe less.
                                         
                                        But my question has always been, well, how many inches is it?
                                         
                                        You know, is it one inch?
                                         
    
                                        Is it half an inch?
                                         
                                        Is it three inches?
                                         
                                        And that's what got me thinking about a more quantitative approach to grappling.
                                         
                                        And this was actually inspired by, do you ever remember this show that was on TV?
                                         
                                        I think it was called Fight Science, but I'm not sure.
                                         
                                        Oh, yeah, on Discovery Channel.
                                         
                                        I remember.
                                         
                                        Yeah, yeah.
                                         
    
                                        And they had their like 3D, like, CGI.
                                         
                                        They had a judo episode.
                                         
                                        I'm pretty sure.
                                         
                                        Yeah, yeah, yeah.
                                         
                                        Like every martial art they had on there.
                                         
                                        Like, how much force does it take to break a man's ribs with your elbow? And they would have all those stupid, like, sensors and visualizations of what's going on. And this is an old show. I mean, I'm sure you can catch some of it on YouTube.

                                        Yeah, the mid-2000s. It's dated. I think, though, the animation that they made about uchi mata is still floating around the web somewhere.

                                        Oh, really?

                                        I see it sometimes.

                                        Oh, okay. But that sort of was the kernel here. I saw that and I was like, oh, that's sort of interesting.

                                        Through life, it seemed like we should have the technology to sort of combine video, and even just high-resolution pictures, with, like, a physics-based model of the human body. And so the basic idea here is that the way it would work is you can model in gravity. You can model in the basic environment. You can model the bones, to some degree the elasticity, which way your joints are supposed to move, all that kind of stuff, right?
                                         
    
                                        And so the thought was that given certain pictures in a sequence, or frames of a clear video,
                                         
                                        you could then infer the motion the body has to go through to get in those positions.
                                         
                                        So that's point number one.
                                         
                                        Point number two would be if you have two of these models, like interacting models,
                                         
                                        you could then do a ton of calculations on like, okay, what force is being applied to this part of the body,
                                         
                                        how is the center of gravity moving, you know, for a particular player versus what the other guy is doing?
                                         
                                        With the idea being you could create almost like this theoretical understanding of grappling,
                                         
                                        using like a notional grappler at first, sort of like the platonic ideal of a grappler.
                                         
    
                                        But then you could change the constraints for like a specific person, right?
                                         
                                        So like, oh, no, I'm short and fat.
                                         
                                        I have no flexibility.
                                         
                                        What is my sort of local optimum?
                                         
                                        Like, what game should I be playing?
                                         
                                        Oh, so it's more, so you develop the understanding of a grappling art,
                                         
                                        and then eventually you can use it to like maybe simulate situations and then try to develop your own style.
                                         
                                        But that's sort of the basic idea, and I wanted to run it by my newest Ph.D. friend.
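To make the "how many inches is it?" question concrete, here is a minimal sketch of the center-of-gravity calculation David describes. It assumes a hypothetical three-segment body model with made-up masses and positions, and a made-up base of support under the feet; none of these names or numbers come from the episode.

```python
from dataclasses import dataclass

METERS_PER_INCH = 0.0254


@dataclass
class Segment:
    """One rigid body part (hypothetical model, illustrative values only)."""
    name: str
    mass_kg: float
    x: float  # horizontal position of the segment's center, in meters
    y: float  # height of the segment's center, in meters


def center_of_mass(segments):
    """Mass-weighted average position of all segments."""
    total = sum(s.mass_kg for s in segments)
    cx = sum(s.mass_kg * s.x for s in segments) / total
    cy = sum(s.mass_kg * s.y for s in segments) / total
    return cx, cy


def inches_past_support(segments, support_min_x, support_max_x):
    """How far (in inches) the center of mass sits outside the base of support."""
    cx, _ = center_of_mass(segments)
    if cx < support_min_x:
        overshoot = support_min_x - cx
    elif cx > support_max_x:
        overshoot = cx - support_max_x
    else:
        overshoot = 0.0
    return overshoot / METERS_PER_INCH


# A player standing upright, then the same player pulled forward at the torso.
upright = [
    Segment("legs", 30.0, 0.0, 0.5),
    Segment("torso", 40.0, 0.0, 1.2),
    Segment("head", 5.0, 0.0, 1.6),
]
leaning = [
    Segment("legs", 30.0, 0.0, 0.5),
    Segment("torso", 40.0, 0.3, 1.2),
    Segment("head", 5.0, 0.45, 1.6),
]

# Assumed base of support: feet cover x from -0.15 m to 0.15 m.
print(inches_past_support(upright, -0.15, 0.15))  # 0.0, still balanced
print(inches_past_support(leaning, -0.15, 0.15))  # about 1.57 inches past the feet
```

Swapping in different segment masses or a narrower base of support is one simple way to express the "specific person" constraints mentioned above.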
                                         
    
                                        So, newly minted.
                                         
                                        So my dissertation was on, yeah, how to combine video understanding and language understanding together and then try to use that.

                                        So this is a little different.

                                        But, you know, since I do work for an autonomous vehicle company, a lot of the ideas actually apply to what they call embodied AI.

                                        It just means that the AI has a body.

                                        It interacts with the physical world.
                                         
                                        But, you know, embodiment.
                                         
    
                                        And in your case, it would be a car.
                                         
                                        Or, yeah, so exactly.
                                         
                                        Yeah, yeah, right, yeah.
                                         
                                        And then, you know, you can kind of, like, make the line fuzzy,
                                         
                                        and then the embodiment can be like, you know,
                                         
                                        the current buzzword is agentic AI, right?
                                         
                                        Like, you know.
                                         
                                        Yeah, of course.
                                         
    
                                        And that could be considered as an embodiment, too.
                                         
                                        Essentially, it does interact with an external system.
                                         
                                        But I think in the strictest sense, embodied AI refers to AI systems
                                         
                                        that interact with the physical world.
                                         
                                        And then to do that,
                                         
                                        you need to develop this intuitive understanding of physics,
                                         
                                        how things work.
                                         
                                        And that's kind of what you alluded to in this idea,
                                         
    
                                        you know, basically like a model of grappling.
                                         
                                        And then you mentioned that there are two systems that could,
                                         
                                        you know, one that learns the mechanics, the physics from videos.
                                         
                                        And another is more about, like, based on the laws of physics and then, you know, basically numerically simulate things, right?
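As a rough illustration of what "based on the laws of physics and numerically simulate" can mean, here is a minimal sketch that steps a single point mass forward in time under gravity. The function name, time step, and starting values are assumptions for illustration; a real grappling simulator would model many linked body segments and contact forces.

```python
import math

GRAVITY = -9.81  # m/s^2, acting downward


def simulate_fall(height_m, vx=0.0, vy=0.0, dt=0.001):
    """Step a point mass forward with semi-implicit Euler until it reaches the ground.

    Returns (elapsed_time_s, horizontal_distance_m).
    """
    x, y, t = 0.0, height_m, 0.0
    while y > 0.0:
        vy += GRAVITY * dt  # update velocity from acceleration first
        x += vx * dt        # then update position from velocity
        y += vy * dt
        t += dt
    return t, x


# Something released 1 m above the mat while moving sideways at 2 m/s.
elapsed, distance = simulate_fall(1.0, vx=2.0)
analytic = math.sqrt(2 * 1.0 / 9.81)  # closed-form fall time, used as a sanity check
print(f"simulated fall time {elapsed:.3f} s, analytic {analytic:.3f} s")
print(f"horizontal travel {distance:.3f} m")
```

The appeal of this approach is that no training data is needed. The cost is that every relevant law and constraint has to be written down by hand.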
                                         
                                        So those are actually approaches that embodied AI companies use, like robotics companies that focus on learning-based approaches, or autonomous vehicle companies.

                                        Like, famously, Tesla uses these two approaches to train their model.
                                         
                                        Oh, okay, I didn't know that.
                                         
                                        So basically, just to give you some background on why people do this both ways:

                                        it's because the success of LLMs hinged upon the fact that we had a lot of training data,

                                        the text training data, readily available online.
                                         
                                        But the problem with embodied AI is that it's really hard to collect such data.
                                         
                                        I mean, while you were thinking about this, you probably realized,

                                        you know, the IJF does have huge swaths of judo videos, but ultimately,

                                        videos are a harder medium to deal with in terms of engineering than text.

                                        And not only that, the volume just doesn't match the textual data. And that kind of goes to show how incredible written language is. I always like to say that it's like nature came up with the best compression algorithm, really.

                                        Yeah. Through millennia of evolution, it came up with this brilliant system that could compress knowledge down to very, you know, digestible bits.
                                         
                                        The first abstraction.
                                         
                                        Yeah, basically.
                                         
                                        Yeah.
                                         
                                        It's incredible.
                                         
    
                                        I mean, but, you know, nothing like that exists in embodied AI.

                                        So, you know, to overcome that, one possible way is to create, like, a simulator based on the laws of physics, like you said, and try to simulate the interactions or try to predict what's going to happen. I mean, simulations have been used for a long time. That's how we got to the moon, you know.

                                        Word.

                                        Yeah. But now the new approach is the other one, where they're learning from videos, like, data.

                                        Yeah.
                                         
                                        So now, before we keep going, though: when we talk about the video, and we talk about, like, two judo guys trying to throw each other, when you say it's more difficult, that it's maybe not as dense in terms of information, what are some of the problems with video? Because when I think about it, you remember The Matrix?

                                        Yeah, yeah, yeah.

                                        And that bullet time, right? Have you ever seen the rig that they had to set up to make that shot?

                                        A bunch of cameras around the...

                                        Yeah, it was just like a ring of cameras, like, 360 degrees. And that's sort of what I imagine: you would need to have, like, a bullet-time setup to record something.

                                        Yeah. Video is a very information-light medium, actually. I mean, it's kind of paradoxical, because, you know, there's a saying that a picture is worth a thousand words.

                                        But it's actually not that simple.
                                         
                                        Sometimes a word does carry more information than the visual. And we think that because we intuitively understand the world better, because we spent most of our childhood interacting with the physical world, the best possible simulator, which is the real world. So because of that, I think visual information is a lot easier for us to digest. And also, the visual cortex was developed first, before the executive function, right?

                                        Like, the prefrontal cortex came way later; that's, like, the language understanding.
    
                                        But so with video, the problem is, yeah, it's not even just the angle.

                                        It's that you basically have to learn the movement of pixels from very similar frames.

                                        Like, if you think about it, if you kind of swipe through a video, you'll notice that video frames are very similar to each other.
                                         
                                        And so because of that, there's a lot of noise.
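A quick way to see how similar consecutive frames are is to count the pixels that change between them. The sketch below uses two tiny synthetic grayscale "frames" (a made-up 16x16 image with a small bright square that moves one pixel to the right), not real judo footage; the helper names are hypothetical.

```python
def make_frame(width, height, square_x, square_size=2):
    """Build a black grayscale frame with one small white square starting at row 3."""
    frame = [[0] * width for _ in range(height)]
    for row in range(3, 3 + square_size):
        for col in range(square_x, square_x + square_size):
            frame[row][col] = 255
    return frame


def changed_fraction(frame_a, frame_b):
    """Fraction of pixel positions whose value differs between two frames."""
    total = 0
    changed = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if a != b:
                changed += 1
    return changed / total


frame_1 = make_frame(16, 16, square_x=5)
frame_2 = make_frame(16, 16, square_x=6)  # the square moved one pixel right

# Only 4 of 256 pixels differ, so under 2% of the raw input carries the motion.
print(changed_fraction(frame_1, frame_2))  # 0.015625
```

A model fed raw pixels sees all 256 values with equal initial importance and has to discover on its own that only those few changing pixels matter.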
                                         
                                        So the conventional way of feeding videos into a model, a deep neural network model, is just to give it the raw pixel values.

                                        So every single pixel has the same weight, basically, initially the same significance, and the neural network has to learn what to ignore. And humans, I think, we are born with the ability to kind of ignore a lot of different things, and we also learn to ignore a lot of things as we experience the world. But, you know, it's really hard to teach that from scratch to a system that's never done that. So I think, yeah, in that way images and videos are a lot harder than language. So language actually is an easier medium in that sense, I think, because it's so information-dense.

                                        Yeah, yeah. I was thinking about more trivial things, like how do you tell two people apart?

                                        Yeah, I mean, I think those things all hinge upon this, you know?

                                        Right, they're all just symptoms.

                                        Yeah, of the basic problem. And then there's, like, occlusion,
                                         
                                        like when things are... And also, you have to understand that, for us, occlusion is easy to understand when we see it in a video, because we know what happens in the real world. We understand that videos are just a projection of what's happening in the real world.

                                        Right.

                                        But these systems don't have that at all. We have to kind of teach them. And people have figured out ways to do so, and I can go into details about some of the models that learn this type of physics of the world just from video. But again, you just have to understand that, you know, they call it the prior, right? There's no prior knowledge about the world in these systems. Neural networks are very generic. I mean, they used to put in a lot of inductive bias, as they say.

                                        Like, they used to use architectures called convolutional neural networks to process images.

                                        And convolution is just a pure mathematical concept, but it has some bearing on how the visual cortex works, the human visual cortex.

                                        So you can kind of see that, oh, you know, it's useful to process images.

                                        But now, I mean, in certain applications, we still use convolutional neural networks, but the architectures we use now are very generic.

                                        Any type of information can be...
                                         
    
                                        So, like, mathematically, very simple.
                                         
                                        There's no.
                                         
                                        So that's what that SciPy method is, convolve.
                                         
                                        Yeah, convolve.
                                         
                                        Yeah, that's the verb for convolution.
                                         
                                        Yeah, convolve, it comes from.
                                         
                                        Right.
                                         
                                        I never used it.
                                         
    
                                        It's from, yeah, like digital signal processing.
                                         
                                        And then later, they kind of figured out there are some connections, similarities with how the visual cortex works.
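For readers who, like David, have seen `convolve` without using it, here is a minimal pure-Python sketch of the discrete 1D convolution from signal processing. The helper name `convolve_full` is hypothetical; it is meant to mirror what `numpy.convolve` and `scipy.signal.convolve` compute in their default "full" mode, but it is an illustration rather than either library's implementation.

```python
def convolve_full(signal, kernel):
    """Discrete 1D convolution, returning all len(signal) + len(kernel) - 1 outputs."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            # Each input sample spreads a scaled copy of the kernel into the output.
            out[i + j] += s * k
    return out


# Small numeric example.
print(convolve_full([1, 2, 3], [0, 1, 0.5]))  # [0.0, 1.0, 2.5, 4.0, 1.5]

# A difference kernel responds only where the signal changes, which is the
# basic idea behind edge detection in image processing.
step = [0, 0, 1, 1, 1]
print(convolve_full(step, [1, -1]))  # [0.0, 0.0, 1.0, 0.0, 0.0, -1.0]
```

Convolutional neural networks apply the same sliding operation in two dimensions, with kernel values that are learned rather than chosen by hand.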
                                         
                                        What about mocap?
                                         
                                        What is...
                                         
                                        What do you think about motion capturing?
                                         
                                        Motion capture as like a base, a basis for sort of seeding this more physics-based model.
                                         
                                        Because like we're getting away from...
                                         
    
                                        Yeah, I mean, information is more dense, but it's harder to...
                                         
                                        It's just the quantity is not there.
                                         
                                        You're going to hire a bunch of people.
                                         
                                        You're going to pay judo people to wear mocap gear?
                                         
                                        Yeah.
                                         
                                        Well, we're just talking about in theory.
                                         
                                        Yeah, I mean, in theory, if you can scale up, yes, it'll be useful to learn about the...
                                         
                                        Well, the idea here...
                                         
    
                                        Yeah.
                                         
                                        Yeah, the idea here would be to get almost like a...
                                         
                                        Not sort of like a full distribution of like everything that could ever happen.
                                         
                                        It's more like templates or like canonical movements, right?
                                         
                                        just enough for to get to maybe some kind of reinforcement learning situation right where you have the model you have enough of these yeah I don't know what you would call them templates or something such that it has enough reference to yeah you know do the simulation so so now we're learning that in order for it's better to have a build a bottle
                                         
                                        that has a lot of general knowledge
                                         
                                        and then
                                         
                                        focus on the specific
                                         
    
                                        so
                                         
                                        oh so you're saying
                                         
                                        the opposite
                                         
                                        it's actually the opposite
                                         
                                        just because you were focusing
                                         
                                        on only judo moves
                                         
                                        ultimately if you want to accurately model
                                         
                                        physics
                                         
    
                                        of judo
                                         
                                        you probably have to know
                                         
                                        about the world
                                         
                                        I mean that will always help
                                         
                                        so that kind of goes into
                                         
                                        like now
                                         
                                        it's called
                                         
                                        nowadays people call it world models because you're learning about the world and there are many
                                         
    
                                        ways to teach a model to do that, but one way, one popular way, is to have the model predict the
                                         
                                        you know subsequent frames right you know and then i mean that's kind of what you are alluding to right
                                         
                                        like from videos, and then basically through that, you just show the model a bunch of videos
                                         
                                        and eventually, in order to be good at predicting, you know,
                                         
                                        frames that are seconds later, it has to learn about the world.
                                         
                                        Yeah, yeah, yeah, yeah.
                                         
                                        Just like an LLM does.
                                         
                                        Yeah, it sort of predicts that next word, right?
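The next-frame prediction idea can be illustrated with a toy sketch. It assumes a one-dimensional "ball height" signal in place of video frames and a linear predictor in place of a neural network; the function names (simulate_height, fit_next_step) and all numbers are made up for the example. The predictor is never given the formula for gravity, yet fitting it to predict the next sample from the previous two recovers the underlying dynamics.

```python
# Toy next-step predictor: learn x[t+1] from (x[t], x[t-1]) without being told the physics.

def simulate_height(v0, steps, dt=0.1, g=-9.8):
    """Heights of a ball thrown upward at speed v0 (used only to generate data)."""
    return [v0 * (k * dt) + 0.5 * g * (k * dt) ** 2 for k in range(steps)]

def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def fit_next_step(trajectories):
    """Least-squares fit of x[t+1] ~ a*x[t] + b*x[t-1] + c via the normal equations."""
    ata = [[0.0] * 3 for _ in range(3)]
    aty = [0.0] * 3
    for traj in trajectories:
        for t in range(1, len(traj) - 1):
            feats = [traj[t], traj[t - 1], 1.0]
            target = traj[t + 1]
            for i in range(3):
                aty[i] += feats[i] * target
                for j in range(3):
                    ata[i][j] += feats[i] * feats[j]
    return solve_3x3(ata, aty)

# "Watch" a few throws, then predict one frame ahead on a throw never seen in training.
train = [simulate_height(v0, steps=12) for v0 in (3.0, 5.0, 7.0)]
a, b, c = fit_next_step(train)

test = simulate_height(6.0, steps=12)
predicted = a * test[5] + b * test[4] + c
print(round(predicted, 4), round(test[6], 4))
```

A real world model would replace the three-parameter fit with a large network and the height signal with pixels, but the training signal is the same: be good at guessing what comes next.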
                                         
    
                                        And, you know, there's a lot of discussion about if this is the best task and whatnot,
                                         
                                        and then whatever.
                                         
                                        This is all like boil the ocean kind of.
                                         
                                        Yeah, it becomes academic, but then the.
                                         
                                        I mean, boil the oceans, yeah, but turns out that's kind of like the best approach
                                         
                                        because the other way is just building a simulator.
                                         
                                        And then the simulator is kind of falling out of favor, you know,
                                         
                                        because ultimately it's so brittle.
                                         
    
                                        So even though, even if you can kind of account for every single variable, I mean,
                                         
                                        it's impossible to basically account for every single variable and then predict
                                         
                                        the next frame. So it's actually sometimes more accurate to build a generative video model
                                         
                                        than trying to simulate it, numerically simulate, what's going to happen next in terms of physics.
                                         
                                        And, you know, I had been thinking about it before this episode
                                         
                                        so that I can talk more about it. But one good analogy is like,
                                         
                                        we don't really know the laws of physics, like, we don't have to know the formula
                                         
                                        to predict where the ball will go, right? Yeah. So it's kind of like the same idea.
                                         
    
                                        It might actually be more computationally efficient to just kind of imagine
                                         
                                        what will happen, just having seen it. Yeah, exactly. That's actually why
                                         
                                        we're so good at that, because we're always experimenting, we always
                                         
                                        get immediate feedback from the world, right? So it's like a never-ending
                                         
                                        A/B test. Exactly, exactly. And so I think simulations are, I mean, based on the public
                                         
                                        information Tesla apparently still uses it to you know test and train their driving model
                                         
                                        And I'm sure it has some utility, but it's just so labor intensive.
                                         
                                        You just need to hire so many people.
                                         
    
                                        And it's always like never, like you're chasing the dragon.
                                         
                                        You will never be able to catch it.
                                         
                                        You'll never be able to.
                                         
                                        I mean, that's why physics, yeah, your idea about like how to guide the model
                                         
                                        with the laws of physics to learn about the world or at least the judo world.
                                         
                                        So these are some of the things.
                                         
                                        that you might have to think about, because building a physics simulator might be a lot harder
                                         
                                        than just trying to have a model learn from videos.
                                         
    
                                        Yeah, the hope was that it would be kind of the opposite, right?
                                         
                                        Because you're going to have this environment, obviously, physics-based to get back to the
                                         
                                        original question that's like, you know, it's a game of inches, how many inches, right?
                                         
                                        Like, that should be the next logical question, right?
                                         
                                        Like, how many inches is it?
                                         
                                        So when I pull on that collar, you know, the ultimate feedback would be you needed to get under him another two inches or you didn't you didn't apply enough, right?
                                         
                                        Like how far is far enough?
                                         
                                        Did I not?
                                         
    
                                        Like his center of gravity didn't move.
                                         
                                        Yeah.
                                         
                                        Yeah.
                                         
                                        No Kuzushi, right?
                                         
                                        Like it's putting a number against the things we talk about in very general terms like Kuzushi.
                                         
                                        Like, well, what, I mean, we all understand what Kuzushi is, right?
                                         
                                        But what is enough
                                         
                                        Kuzushi? Like, and am I even taking advantage of it when I get it? Yeah. Right. Um, and so, and you,
                                         
    
                                        if you just take that idea and apply it to any grappling situation right because it's all just
                                         
                                        manipulating you know force here and reaction there and you know center of gravity moves yeah yeah
                                         
                                        right that's that's that's that's basically everything in grappling yeah and then if you
                                         
                                        take that a step further in terms of whether you're an athlete, you're a coach, you're a commentator,
                                         
                                        right? Imagine the telestrator, you know, you would have on Judo TV, right? You could,
                                         
                                        oh, this is why that failed. This is why they succeeded. I remember we were talking about,
                                         
                                        I was talking about it with Shintaro once about like, sometimes things happen so fast. Yeah.
                                         
                                        It's hard to tell what happened. Yeah, yeah. Right? Well,
                                         
    
                                        with a model like this, you could sort of infer, you know, like, okay, well, this is what must have happened in order for that to work.
                                         
                                        And finally, like, you look at an Olympic athlete or something like that.
                                         
                                        It's like, why is his Uchimata so special?
                                         
                                        Why is his Osoto Gari so special?
                                         
                                        And just like to use your golf analogy, right?
                                         
                                        Yeah.
                                         
                                        I'm sure they have, like, golf swing analysis, like, what's your swing?
                                         
                                        look like compared to died what's your swing
                                         
    
                                        look like next to Jim Furyk or, you know, whoever. You know Jim
                                         
                                        Furyk? Okay. Yeah, well, I'm old, I know Jim Furyk. I'm sure they have those
                                         
                                        comparisons yeah they always yeah library of those guys yeah yeah they have that you
                                         
                                        could totally do that, yeah, in judo. I think, I was about to
                                         
                                        mention golf or like skiing, like with these individual sports, it is easier to
                                         
                                        model, so you don't need to have this learning-based method,
                                         
                                        i don't think uh yeah and because it's just you're just in yourself it's more like uh yeah it's
                                         
                                        not like a what the second order system right like it's just action reaction but judo it is a sec
                                         
    
                                        and very position yeah exactly like a golf swing is very position golf even easier i think
                                         
                                        i mean they're just literally like drawing lines on a screen you know yeah and then uh and then they have
                                         
                                        3D physics of 3D models of players like they reconstruct I don't know how they
                                         
                                        really reconstruct maybe it's manual but they actually just have a video is it a
                                         
                                        marketing thing or is it real like I've seen it like a video game no it's more like
                                         
                                        it's like, they market it as like a teaching assistant tool, like some coaches will
                                         
                                        use that it's like hey you're compared of like we model you based on the video I don't
                                         
                                        know if they do mocap or whatever but yeah they construct this like 3d model of your
                                         
    
                                        swing and then overlay with someone else's like a Rory McIlroy or something and
                                         
                                        just say hey you see like you are casting McIlroy so yeah casting too early you gotta
                                         
                                        like keep the like yeah thank you for not calling me out I thought I was
                                         
                                        Paul like I thought you're there maybe there was a Rory MacDonald I don't know
                                         
                                        But, you know, I'm a huge McIlroy fan.
                                         
                                        But anyway, see?
                                         
                                        Yeah.
                                         
                                        So people do do that for golf.
                                         
    
                                        And then even with skiing, they have, that's another individual sport I do.
                                         
                                        And they do that.
                                         
                                        But I think with judo, it's like a second order system.
                                         
                                        So there's action, reaction.
                                         
                                        And then, like, that action changes based on the reaction, you know?
                                         
                                        Yeah.
                                         
                                        Yeah.
                                         
                                        So it's more dynamic.
                                         
    
                                        dynamic system.
                                         
                                        So I think there's a combinatorial explosion happening.
                                         
                                        That's probably why it's hard to model the actual Randori or matches in this way.
                                         
                                        Right.
                                         
                                        So I think that's why learning-based approaches, just purely based on videos,
                                         
                                        it's more scale, but it has a higher ceiling, right?
                                         
                                        Yeah.
                                         
                                        Yeah.
                                         
    
                                        I mean, that would be, that would definitely be better.
                                         
                                        But would that yield the same kind of like,
                                         
                                        It won't be like, oh, inches, I mean, so now there's, so immediate, like, say we built a video, essentially a world model for judo.
                                         
                                        That's more fun, yes, let's just assume.
                                         
                                        Then you could have, you could use this system to basically give all the context so far, and then someone's trying to throw.
                                         
                                        but then you can imagine you know like you could like basically give the model up until like
                                         
                                        the execution of the throw and then say hey what's going to happen next and then you can then
                                         
                                        compare with what actually happened and then you can kind of see where things went wrong
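The "predict, then compare with what actually happened" step can be sketched in a few lines. This is only an illustration under simple assumptions: each frame is reduced to a short list of (x, y) keypoints, the numbers are invented, and the helper names frame_error and first_divergence are hypothetical. The idea is to find the first frame where the model's expectation and reality part ways.

```python
import math

def frame_error(pred, actual):
    """Mean Euclidean distance between matching keypoints in one frame."""
    return sum(math.dist(p, q) for p, q in zip(pred, actual)) / len(pred)

def first_divergence(pred_frames, actual_frames, threshold):
    """Index of the first frame whose error exceeds the threshold, or None."""
    for i, (pred, actual) in enumerate(zip(pred_frames, actual_frames)):
        if frame_error(pred, actual) > threshold:
            return i
    return None

# Made-up data: two keypoints per frame (say, hips and shoulders), three frames.
# The prediction expects the body to rotate in frame 2; in reality it stays put.
predicted = [
    [(0.0, 1.0), (0.0, 1.5)],
    [(0.1, 1.0), (0.1, 1.5)],
    [(0.3, 0.8), (0.4, 1.2)],
]
actual = [
    [(0.0, 1.0), (0.0, 1.5)],
    [(0.1, 1.0), (0.1, 1.5)],
    [(0.1, 1.0), (0.1, 1.5)],
]

print(first_divergence(predicted, actual, threshold=0.05))
```

The frame index says when things went wrong, not why, which is the explainability gap the conversation turns to next.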
                                         
    
                                        like that would be like one way to right use this type of model and then say now the whole a lot
                                         
                                        one other thing like if we bring in language maybe we can have the model explain it to us
                                         
                                        you know. But this is my question, though: let's say it's predicting the next
                                         
                                        yeah 24 frames right or whatever yeah you fail yeah it seems to me this model would not
                                         
                                        be able to um postulate the reason why though yeah so then now this
                                         
                                        is if it's a purely visual physics model like world model it won't be able to explain it to you
                                         
                                        I mean that's the whole thing right explainability so now another approach would be like for
                                         
                                        example it's to overlay some kind of you know visualization on top of that so for example
                                         
    
                                        this I recently learned this you know have you been on the Tesla FSD so no I have no I have
                                         
                                        So they have, if you turn it on, it drives, and then you can kind of see what the car supposedly sees.
                                         
                                        Like it has like a reconstruction.
                                         
                                        Oh, no, I take it, but I have seen, you can see like the semi-trucks going on.
                                         
                                        Yeah, yeah, yeah, yeah, yeah.
                                         
                                        And this is, you know, the kind of like the industry knowledge.
                                         
                                        I don't actually know if they actually do it this way, right?
                                         
                                        Like, you know, I don't work for Tesla.
                                         
    
                                        But based on what I gather from, you know, talking to people, they,
                                         
                                        basically have a model with different heads, basically. They call it like a HydraNet, whatever, like
                                         
                                        the monster Hydra. So all the sensor information gets processed by
                                         
                                        one gigantic trunk, basically, and this branch focuses on driving and another branch focuses on
                                         
                                        rendering things based on, like, so now you can already,
                                         
                                        it kind of gives you that feedback. So if you apply this
                                         
                                        approach to judo, we could have this gigantic trunk that processes all the videos, and then one
                                         
                                        head predicts the next 24 frames, like you said, and the other maybe could generate
                                         
    
                                        commentaries or, like, why it thinks the throw will fail, right? Which is, okay, that's
                                         
                                        plausible. So the problem is,
                                         
                                        So far, the system, there's no guarantee that the explanation will match up with the actual frame prediction.
                                         
                                        Right, right, right.
                                         
                                        It could be completely different.
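The shared-trunk, two-head layout described above can be sketched as follows. This is a minimal illustration with random weights, not Tesla's architecture and not a trained model; the class name TwoHeadModel, the layer sizes, and the three explanation labels are all assumptions invented for the example.

```python
import random

random.seed(0)

def linear(n_in, n_out):
    """Random weight matrix standing in for a trained layer."""
    return [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def apply(weights, x, relu=False):
    """Matrix-vector product, optionally followed by a ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [max(0.0, o) for o in out] if relu else out

class TwoHeadModel:
    def __init__(self, n_in, n_hidden, n_frame, n_labels):
        self.trunk = linear(n_in, n_hidden)             # shared by both heads
        self.frame_head = linear(n_hidden, n_frame)     # predicts next-frame features
        self.explain_head = linear(n_hidden, n_labels)  # scores canned explanations

    def forward(self, x):
        h = apply(self.trunk, x, relu=True)
        # Both heads read the same trunk features, but nothing here forces
        # the chosen explanation to agree with the predicted frame.
        return apply(self.frame_head, h), apply(self.explain_head, h)

LABELS = ["not enough kuzushi", "entry too shallow", "throw succeeds"]  # hypothetical
model = TwoHeadModel(n_in=8, n_hidden=16, n_frame=8, n_labels=len(LABELS))

frame = [random.uniform(-1, 1) for _ in range(8)]
next_frame, scores = model.forward(frame)
explanation = LABELS[scores.index(max(scores))]
print(len(next_frame), explanation)
```

Because the two heads only share the trunk, keeping the explanation consistent with the frame prediction would need an extra training objective, which is exactly the gap being pointed out here.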
                                         
                                        Yeah, so it's, you know, there's, it's, uh, I mean, I think human brains kind of do this too, like, kind of like we do this one thing and then our, we make up the justification after, you know, like kind of thing.
                                         
                                        That is the guy who stole my car.
                                         
                                        Yeah, yeah, it's just like, yeah, exactly.
                                         
    
                                        And so maybe that's just a problem
                                         
                                        that can never be solved
                                         
                                        but
                                         
                                        I'm not saying
                                         
                                        this approach always fails
                                         
                                        Yeah
                                         
                                        Well that's the
                                         
                                        Optimization process right
                                         
    
                                        You're trying to reduce that error
                                         
                                        So that if you are trying to
                                         
                                        I mean I'm sure
                                         
                                        Tesla engineers and Tesla
                                         
                                        Put a lot of work to minimize that
                                         
                                        But yeah, so it would probably have to be
                                         
                                        something like that
                                         
                                        So like one head
                                         
    
                                        would predict the next frames
                                         
                                        and then another component
                                         
                                        will try to explain
                                         
                                        I suppose it would have a notion of distance
                                         
                                        like if you change the
                                         
                                        scenario a little bit possibly
                                         
                                        I don't know how you would do it
                                         
                                        but maybe it's like oh if he was one inch
                                         
    
                                        in this direction
                                         
                                        now what do you predict
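The "one inch in this direction, now what do you predict" question amounts to perturbing one input and re-querying the predictor. A rough sketch, assuming the predictor is any black-box function from a state to a success score: toy_predictor, its coefficients, and the state keys below are invented stand-ins, not a real judo model.

```python
def sensitivity(predict, state, key, delta):
    """Change in the prediction when one input is shifted by +delta and -delta."""
    base = predict(state)
    plus = predict({**state, key: state[key] + delta})
    minus = predict({**state, key: state[key] - delta})
    return {"base": base, "plus": plus - base, "minus": minus - base}

def toy_predictor(state):
    """Hypothetical stand-in for a learned model: deeper entry helps, a higher
    opponent hip position hurts. The score is clipped to the range [0, 1]."""
    raw = 0.5 + 0.2 * state["entry_depth_in"] - 0.1 * state["opponent_hip_height_in"]
    return min(1.0, max(0.0, raw))

state = {"entry_depth_in": 1.0, "opponent_hip_height_in": 2.0}
result = sensitivity(toy_predictor, state, "entry_depth_in", delta=1.0)
print(result)
```

With a video model the perturbation would have to be applied to the scene itself rather than to a named variable, which is the hard part; the probe-and-compare loop stays the same.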
                                         
                                        yeah then it's like the
                                         
                                        like you could jitter the
                                         
                                        jitter the input kind of right or you could do that or you could do you know the video generation
                                         
                                        models that are all text-conditioned now, so you could describe the change you'd like to make. Oh, now we're,
                                         
                                        yeah i mean so that's like kind of the dream right like yeah exactly i mean yeah i'm not this is actually
                                         
                                        I mean, we've gotten very good at generating videos with text as an input, like
                                         
    
                                        text-conditioned video generation. There's been an explosion of things, but
                                         
                                        you know, applying that to embodied AI.
                                         
                                        The physics don't look right there.
                                         
                                        Yeah, exactly.
                                         
                                        I mean, that's a challenge, yeah.
                                         
                                        It's gotten a lot better.
                                         
                                        It's obviously gotten a lot better.
                                         
                                        But every now and then you just see, like, the human eye is so good.
                                         
    
                                        You're just like, that is weird.
                                         
                                        Because we, yeah, we are extremely well trained in spotting these things out.
                                         
                                        So, I mean, that's the ultimate challenge.
                                         
                                        But you can kind of see where I'm going with this, right?
                                         
                                        Like a lot of these problems, like autonomous vehicles, robotics,
                                         
                                        even your idea about judo are kind of related i think ultimately uh if in order to build a
                                         
                                        robust system you should probably follow something like this but then again if you go into the
                                         
                                        practicality like is it even possible with judo i don't know it's you know like with the videos
                                         
    
                                        we have um right so to to to sum this up a little bit it sounds like okay the original idea
                                         
                                        with a physics-based model.
                                         
                                        It sounds like maybe technically, it might be possible.
                                         
                                        It just economically would never make any sense because you just would never be able
                                         
                                        to or the economic motivation to tackle this problem is just so much beyond probable
                                         
                                        demand for this kind of solution.
                                         
                                        On the other hand, we've got this sort of more AI-based, you know, text-to-video kind of
                                         
                                        play but that's still it's it's sort of a little bit beyond a bleeding edge still yeah that's still
                                         
    
                                        i mean that yeah that was my dissertation you can still get a PhD with it that's pretty yeah
                                         
                                        yeah exactly it's not just something you just signed up for it's like yeah let's do this man
                                         
                                        You can still get a PhD researching it, so it's pretty bleeding edge.
                                         
                                        Bleeding edge, yeah, yeah. But there's a lot of money going into it, I think, that's the thing.
                                         
                                        I think a lot of these
                                         
                                        the research
                                         
                                        applied research
                                         
                                        the application of this type of research
                                         
    
                                        goes towards autonomous vehicles
                                         
                                        because there's money in it.
                                         
                                        There's more data.
                                         
                                        Because even Boston Dynamics, I mean, how many
                                         
                                        times have they been bought
                                         
                                        and sold by now? It's amazing
                                         
                                        technology, but it's just not
                                         
                                        making it. And then it's hard to find
                                         
    
                                        a use case. And that Boston Dynamics
                                         
                                        data, they actually don't use this
                                         
                                        type of method. They're more on the
                                         
                                        classical robotics. It's very math-heavy classical planning, it's algorithmic, and yeah, it's, uh,
                                         
                                        I know, but it looks so good. It looks good, but yeah, it takes a lot of work to get there, and yeah,
                                         
                                        those people, I worked with some of those people in classical planning, and they're brilliant.
                                         
                                        They're, like, brilliant mathematicians, and yeah, but it's completely different. I'm not good at
                                         
                                        math, so I didn't even dream of, you know.
                                         
    
                                        The irony.
                                         
                                        Right.
                                         
                                        It's a, you don't, you don't require, yeah, they, they have to like.
                                         
                                        You're not good at PhD.
                                         
                                        Yeah.
                                         
                                        Yeah.
                                         
                                        But normal math.
                                         
                                        I'm pretty sure you're okay.
                                         
    
                                        I guess I could get.
                                         
                                        Everybody listening, do not believe this guy.
                                         
                                        That robotics, the math they do is like, you need, like, tight, actual guarantees.
                                         
                                        You have to actually do a lot of drugs.
                                         
                                        Yeah, like, yeah, you know, you probably do, like, a lot of cocaine going in there, I guess.
                                         
                                        yeah right or something
                                         
                                        some shrooms
                                         
                                        something
                                         
    
                                        keep that
                                         
                                        keep that dream
                                         
                                        but yeah
                                         
                                        so I think that
                                         
                                        this is yeah
                                         
                                        but it is viable
                                         
                                        I think you know
                                         
                                        you can kind of see
                                         
    
                                        where this could go
                                         
                                        right like you know
                                         
                                        yeah
                                         
                                        basically
                                         
                                        a lot of the capabilities
                                         
                                        we've seen
                                         
                                        with ChatGPT on text
                                         
                                        and you know
                                         
    
                                        combine that with
                                         
                                        judo videos
                                         
                                        maybe something could happen
                                         
                                        I mean that would be
                                         
                                        the dream right
                                         
                                        like you just put in
                                         
                                        video of your match and it tells you, it breaks down, like, all the significant situations, and
                                         
                                        this is why you failed, this is why you succeeded, you know, X degrees or, you know, whatever you were
                                         
    
                                        able to... Having a personal coach, basically. Yeah, yeah. That would be the ultimate in your
                                         
                                        pocket kind of, right? Right. Well, actually, I think about Chris Round, you know. Oh yeah, yeah, I've
                                         
                                        talked to him. Yeah, and he does all that, like, crazy scouting stuff for his athletes. Oh, okay.
                                         
                                        I mean, that would be a dream for him, right?
                                         
                                        Like, you feed in everybody's matches, you know, all their tendencies, you know,
                                         
                                        you know, like they like to go left here, they go right here, they, you know, like this
                                         
                                        Kosoto, they like this Deashi in these situations.
                                         
                                        And, you know, if you can stay on this angle, like, his job would be to create the drill
                                         
    
                                        to get them to respond, right?
                                         
                                        Like, they stay outside, like, this danger zone.
                                         
                                        Maybe that's the economic angle: scouting, rather than this coach.
                                         
                                        The thing is, judo is always hard,
                                         
                                        because I feel like in other sports,
                                         
                                        you have this abstraction called a ball.
                                         
                                        Oh, yeah. Soccer, it would be a little easier, I guess.
                                         
                                        Yeah, soccer would be a good one.
                                         
    
                                        F1 is already up to the gills in technology.
                                         
                                        I don't think they need anybody else's money.
                                         
                                        They just spend, they burn sacks of money every second, it seems like.
                                         
                                        But these other sports with the ball, you have the separation.
                                         
                                        So some of these other approaches, I feel like, are more, they're easier in a way, right?
                                         
                                        But even then, if you go to like the Sloan analytics,
                                         
                                        sports analytics conference, you know, I haven't been recently, but, you know, there used to be a lot of talks
                                         
                                        on, like, machine learning and, you know, breaking down plays, you know, all that kind of stuff.
                                         
    
                                        And I'm sure people are using it, but I'm not sure it ever met the promise that people were hoping
                                         
                                        for.
                                         
                                        It's a, I forget what, man, I recently read something interesting.
                                         
                                        It basically, it classifies, like, why Moneyball works so well in certain
                                         
                                        sports and not... oh, it was, like, why Moneyball, why didn't Moneyball work that well in soccer. That
                                         
                                        was what it was. And, right, it's because it's very different. Yeah, baseball, it's easy to, it's a
                                         
                                        very statistics-heavy game, right? Like, it's turn-based, it's basically easy to keep track.
                                         
                                        But soccer is very fluid, there's no turn. Yeah, uh, because of that, it's even harder to,
                                         
    
                                        you kind of have to have a feel for it you know
                                         
                                        So yeah, I think Moneyball, the impact wasn't as great, you know. Yeah, it certainly did
                                         
                                        help to try to find cheap, good, underpriced players. Yeah, it wasn't to the point where one
                                         
                                        team, like, you know, like the Athletics, just kind of went to the World Series. Yeah, never happened like
                                         
                                        that. I was going to put a challenge out, yeah, to the audience, to any of the, uh, ambitious software
                                         
                                        engineers out there. Yeah, I'm telling you, a lot of software engineers do
                                         
                                        BJJ, judo now. Yeah. Yeah. Oh my God. We'll leave that for another
                                         
                                        conversation because I'm just, this is your
                                         
    
                                        mission, I guess, since you're in Silicon Valley. What is my mission? You can
                                         
                                        actually find out. Yeah. Oh, my mission. If it is, if it is getting
                                         
                                        big or not. I'm just curious. It's like anecdotally, I've heard that it's
                                         
                                        getting more popular in my circle, though. Yeah. Yeah, which is
                                         
                                        very surprising to me because, back when
                                         
                                        I started at, like, mid-2000s, like, wrestling and just, it was not a sport for...
                                         
                                        Attractive thing.
                                         
                                        Yeah, it was not an attractive thing to do.
                                         
    
                                        Yeah, yeah, exactly.
                                         
                                        So it's like, I happen to be in Michigan, and wrestling is very popular in Michigan.
                                         
                                        That's why I, like, started doing it.
                                         
                                        Oh, did you grow up?
                                         
                                        You actually grew up there?
                                         
                                        Yeah, I went to high...
                                         
                                        I moved to Michigan when I was 15 from Korea.
                                         
                                        Oh, okay.
                                         
    
                                        So this is almost like you're going home.
                                         
                                        Yeah, yeah, exactly.
                                         
                                        Yeah, I still, yeah, yeah, like I'm reconnecting with my high school friends and, yeah, all that, yeah.
                                         
                                        Oh, you're such a good guy.
                                         
                                        Yeah, so, yeah, I'm a Midwestern boy.
                                         
                                        Yeah, I was going to, yeah, stereotypical Midwestern boy.
                                         
                                        Who wrestled and now does judo, yeah.
                                         
                                        That's right.
                                         
    
                                        Yeah, I was going to put out a list of some of these open source packages.
                                         
                                        I guess I'll still put them into the, into the description and just sort of see.
                                         
                                        But there's just been so
                                         
                                        much development and just, you know, in machine learning and this modeling and now in this sort
                                         
                                        of neural networks and LLMs that you can't help, but, you know, and it's like, is it getting
                                         
                                        cheap enough and widespread enough and understood enough to apply this to not like going to the moon
                                         
                                        problems? You know what I mean? But it seems like we're not quite there yet, but you never know.
                                         
                                        There might be a clever solution out there. I wasn't like trying to say this is like,
                                         
    
                                        I wasn't trying to be like, oh, it's not there.
                                         
                                        Yeah, I think this type of constraint actually gives, it's a fertile ground for creativity.
                                         
                                        So maybe, you know, you have to kind of strike a balance.
                                         
                                        Of course, you can have this whole deal.
                                         
                                        Maybe if you really want to solve this, maybe this type of crazy approach, learning-based approach should be applied.
                                         
                                        But the reality is, you know, you have to consider the practicality.
                                         
                                        And maybe this type of, your idea of kind of like using the
                                         
                                        physics simulation as a crutch to get the model to some usable space.
                                         
    
                                        And then they get into that loop of, like, you know, a virtuous cycle of improvement.
                                         
                                        Yeah, yeah, people can definitely start.
                                         
                                        You should start like that, yeah.
                                         
                                        Well, before we close things down, there is one for our software and technical people out there.
                                         
                                        There is one other idea that you could explore that is far more practical, that I
                                         
                                        know is possible right now, and that is the bot. Yeah. So if you imagine even someone like
                                         
                                        Shintaro, right, and this is something I'm actively messing around with right now. Shintaro has an
                                         
                                        inordinate amount of videos on YouTube, and he's also recorded his fair share of instructional
                                         
    
                                        material, both on his own site and, uh, for BJJ Fanatics and such. So, uh, the idea here would be
                                         
                                        to create, like, a Shin-bot. Shin-bot, right, that would, uh, encompass all the material he's ever put out.
                                         
                                        That's sort of the idea. And people are using these systems, you know, people call them RAG systems
                                         
                                        or KAG systems or whatever, where you're using an LLM to essentially, um, aggregate, I guess. I mean, it's not
                                         
                                        really, you're doing the aggregating as a developer, but you know what I'm saying. It sort of
                                         
                                        aggregates all the information on the subject, and you can ask it questions and all that kind of stuff,
                                         
                                        and it'll sort of serve as your, like, your consigliere, you know, on a particular knowledge base of material.
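A minimal sketch of the retrieval half of a RAG-style bot like the one described above, assuming the material has already been transcribed and split into text chunks. It ranks chunks against a question with a simple TF-IDF cosine score and assembles a prompt for whichever LLM is used. The function names and the placeholder excerpts are hypothetical, not taken from the actual project, and a real system would more likely use embedding-based search.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder chunks; real ones would come from transcripts.
CHUNKS = [
    "Placeholder excerpt about breaking grips before attacking with a foot sweep.",
    "Placeholder excerpt about passing half guard by flattening the opponent.",
    "Placeholder excerpt about warm-up drills and ukemi practice.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Return one TF-IDF vector per chunk plus the shared IDF table."""
    docs = [Counter(tokenize(chunk)) for chunk in chunks]
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    n = len(docs)
    # Smoothed IDF so that terms appearing in every chunk still get a weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, vectors, idf, k=2):
    """Return the k chunks most similar to the question."""
    counts = Counter(tokenize(question))
    # Question terms never seen in the chunks carry no information here.
    qvec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: cosine(qvec, vectors[i]),
        reverse=True,
    )
    return [chunks[i] for i in ranked[:k]]


def build_prompt(question, passages):
    """Assemble the text that would be sent to an LLM of your choice."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only these excerpts:\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    vectors, idf = build_index(CHUNKS)
    question = "How do I pass half guard?"
    print(build_prompt(question, retrieve(question, CHUNKS, vectors, idf, k=1)))
```

The LLM call itself is left out on purpose, since the choice of model and client library is up to the developer.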
                                         
                                        And so this is something that I'm just sort of messing around with, with Shintaro's material,
                                         
    
                                        just to sort of see how good you can get it.
                                         
                                        I would never, never try to simulate Shintaro himself because I think he's too unique for that kind of thing.
                                         
                                        But I could create sort of maybe an autistic, you know, assistant.
                                         
                                        Yeah, there's like a, that sort of knows everything,
                                         
                                        anything that he's said and done.
                                         
                                        And it's sort of interesting.
                                         
                                        It's sort of working.
                                         
                                        Oh, really?
                                         
    
                                        What did you use?
                                         
                                        Oh, yeah, yeah.
                                         
                                        Right now I'm using, I think I'm using Gemini.
                                         
                                        Okay, okay.
                                         
                                        Right now, like this sort of, you know, standard stuff, you know.
                                         
                                        I've used, I've used all three of them, but, you know, just for the context window
                                         
                                        and stuff.
                                         
                                        I'm just using Gemini for convenience.
                                         
    
                                        Nice, nice.
                                         
                                        But, so there's that.
                                         
                                        But then to take it a step further, for old guys
                                         
                                        like me who have some income, disposable income, I'm sure many of us have an embarrassingly large
                                         
                                        library of these instructionals. And there are some instructors that you like and some of them
                                         
                                        that you like but are just, you can't bear to watch. I won't, you know, I won't name mine by name
                                         
                                        because I'm sure somebody else would love them. But you know who I mean. So, you know,
                                         
                                        imagine being able to synthesize and distill everything in your library.
                                         
    
                                        But it would be on demand just in time.
                                         
                                        You'd say, hey, I'm having problems from this position in half guard or whatever position.
                                         
                                        And you would have access to the exactly relevant material from your different instructionals.
                                         
                                        Now, practically speaking, it is a lot of work because you've got to strip the audio.
                                         
                                        You got to, you know, some of these transcription packages are amazing.
                                         
                                        They just like zip, just an hour of audio in a second is transcribed.
                                         
                                        Pretty well, pretty well.
                                         
                                        So that's pretty amazing.
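A rough sketch of the preparation steps mentioned above, under stated assumptions: the audio is stripped with the ffmpeg command-line tool, some transcription package (not shown, and not named in the conversation) turns it into text, and the text is then cut into overlapping chunks for later search. The function names, the output folder, and the chunk sizes are all hypothetical.

```python
from pathlib import Path


def audio_extract_command(video_path, out_dir="audio"):
    """Build an ffmpeg command that strips the audio track from a video.

    -vn drops the video stream, -ac 1 downmixes to mono, and -ar 16000
    resamples to 16 kHz, a common input format for speech recognition.
    The command is only built here; run it with subprocess.run(cmd, check=True).
    """
    video = Path(video_path)
    wav = Path(out_dir) / (video.stem + ".wav")
    return ["ffmpeg", "-i", str(video), "-vn", "-ac", "1", "-ar", "16000", str(wav)]


def chunk_words(text, size=200, overlap=40):
    """Split transcript text into overlapping chunks of roughly `size` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once this chunk has reached the end of the transcript.
        if start + size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    print(audio_extract_command("lesson01.mp4"))
    print(chunk_words("a b c d e f g", size=3, overlap=1))
```

The overlap keeps a sentence that straddles a chunk boundary findable from either side, at the cost of some duplicated text.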
                                         
    
                                        So that's something that you may see.
                                         
                                        I'll invite you to the repo.
                                         
                                        Oh, okay.
                                         
                                        Okay, I'll take a look.
                                         
                                        So you can take a look.
                                         
                                        Oh, so this is official, it's like a repository and everything.
                                         
                                        Yeah.
                                         
                                        Oh, yeah, we got a repo, baby.
                                         
    
                                        Everything's got to be organized in this house.
                                         
                                        Plus, it's a lot easier when it's digital.
                                         
                                        If you look at my room, there's socks everywhere.
                                         
                                        But that's sort of one idea to throw out to people who are
                                         
                                        technically inclined and very desperate, and very desperate about their own
                                         
                                        technical development I bet I bet this could be a good addition to your
                                         
                                        resume if you if you want to like oh it's too late for my resume well now I
                                         
                                        bet you as it like audience oh yeah for sure yeah for sure yeah for sure and it
                                         
    
                                        would be you know and who knows you can open it up to your best buddies only
                                         
                                        your best buddies because you only you know you don't want to spread this stuff
                                         
                                        around but a lot of us own a lot of this material and uh it's just sitting there because we're
                                         
                                        just like i'm not watching this i'm not going to watch this again you know no way but if you could
                                         
                                        query it if you could um yeah query it at will why why would i ignore the half of my body and thing
                                         
                                        you know yeah yeah 50% of the body why ignore 50% of the body yeah and it's just hard man
                                         
                                        to watch hours and hours of material guys it's just hard just give me this
                                         
                                        the golden nuggets, for God's sake.
                                         
    
                                        Man's a serious academic.
                                         
                                        So he has to work within this tightly formulated framework, you know, where everything is
                                         
                                        well defined.
                                         
                                        Yeah.
                                         
                                        But the nice thing, too, would be like if you could take it a step further, just aside
                                         
                                        from just pure recall, right?
                                         
                                        If there are drills or exercises or stuff like that, you could combine that kind of
                                         
                                        information and, you know, create your own plan, create your own, you know, something tailored
                                         
    
                                        to you.
                                         
                                        Yeah, right.
                                         
                                        Yeah, the synthesis will be amazing.
                                         
                                        Yeah.
                                         
                                        I'm sure it'll happen.
                                         
                                        I mean, with just the base LLM, the knowledge already in LLM.
                                         
                                        The technology is, I think, is there with a little bit of extra, you know, spice.
                                         
                                        Yeah, yeah, yeah, yeah.
                                         
    
                                        You could do it.
                                         
                                        I think you could do it.
                                         
                                        Maybe this will, be, inspire a lot of people to actually set up this judo bot or BJJ bot.
                                         
                                        Whatever.
                                         
                                        If you haven't gotten involved with this stuff for your own personal projects, people,
                                         
                                        even if you don't know anything about writing software, I highly encourage you to check it out
                                         
                                        because my wife, who is not technical at all, has written, I would say, like three different
                                         
                                        applications for our own use.
                                         
    
                                        Oh.
                                         
                                        Yeah, in the workplace, like her own sort of dashboards and stuff.
                                         
                                        I mean, with a little bit of coaching from me.
                                         
                                        Wow.
                                         
                                        Because I love my wife.
                                         
                                        When you're in a corporate environment, you don't need to worry about like millions of people,
                                         
                                        serving millions of people.
                                         
                                        You're just serving yourself and maybe a few of your colleagues.
                                         
    
                                        So, you know that old saying about, like, "you are not the user, you are not your customer"?
                                         
                                        So you need to like think more broadly.
                                         
                                        In this case, it is absolutely true.
                                         
                                        You are the user and you know exactly what you need.
                                         
                                        You know?
                                         
                                        So I would highly encourage people to check it out.
                                         
                                        Yeah.
                                         
                                        That's my soapbox for today.
                                         
    
                                        Thank you, everyone, for listening, if you're even listening this far. We'll talk to you again.
                                         
                                        All right. Thanks, guys. Thanks, Peter.
                                         
