Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 2x20: Expanding ML Models Beyond Current Limits with Groq
Episode Date: May 18, 2021

Machine learning models have grown tremendously in recent years, with some having hundreds of billions of data points, and we wonder how big they can get. How do we deploy even bigger models, whether it's in the cloud or using captive infrastructure? Models are getting bigger and bigger, then are distilled and annealed, and then grow bigger still. In this episode, Dennis Abts of Groq discusses the scalability of ML models with Stephen Foskett and Chris Grundemann. HPC architecture and concepts are coming to the enterprise, enabling us to work with unthinkable amounts of data. But we are also reducing the precision and complexity of models to reduce their size. The result is that businesses will be able to work with ever-larger data sets in the future.

Three Questions
How long will it take for a conversational AI to pass the Turing test and fool an average person?
Will we ever see a Hollywood-style "artificial mind" like Mr. Data or other characters?
How small can ML get? Will we have ML-powered household appliances? Toys? Disposable devices?

Guests and Hosts
Dennis Abts, Chief Architect at Groq. Connect with Dennis on LinkedIn or on Twitter @DennisAbts.
Chris Grundemann, Gigaom Analyst and Managing Director at Grundemann Technology Solutions. Connect with Chris on ChrisGrundemann.com or on Twitter at @ChrisGrundemann.
Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett.

Date: 5/18/2021
Tags: @SFoskett, @ChrisGrundemann, @DennisAbts, @GroqInc
Transcript
Welcome to Utilizing AI, the podcast about enterprise applications for machine learning,
deep learning, and other artificial intelligence topics. Each episode brings experts in enterprise
infrastructure together to discuss applications of AI in today's data center. Today, we're
discussing expanding ML models beyond the current limits with Dennis Abts of Groq.
Hi, I'm Dennis Abts, the chief architect at Groq. Nice to be here.
And I'm your co-host, Chris Grundemann. I'm a consultant, coach, and mentor.
And you can check out everything I do at chrisgrundemann.com.
And I'm Stephen Foskett, organizer of AI Field Day and publisher of Gestalt IT.
You can find me on Twitter and other social media networks at sfoskett.
So, Dennis, one of the things that we've talked about quite a lot on the podcast in the last two seasons is the fact that AI models are expanding in terms of the amount of data that they're utilizing, as well as the reach of this data
and location of this data. But it seems that there are natural limits to the size of ML models.
In other words, there's a point where you can have too much data to make it work. What are
those limits? How do we decide when an ML model is just too big?
That's a great question. I think there's really two ways to approach it.
And as we've seen in the last several years, it's been really this voracious growth of models as they get trained.
And specifically, for example, natural language
processing models like GPT-2 and GPT-3 are great examples, where the corpus of data and the number
of parameters is in the hundreds of billions, and where the amount of time it takes to train
a model like that is measured in weeks, and millions of dollars just in energy
costs, not to mention the scientist cost and the fine-tuning and
all the things that go into getting it right, and the firefighting that goes into bringing
a model to convergence so that you can deploy it in the cloud.
And by deploying it, I mean you can actually take that model now and the parameters
that you've learned and actually put it to good use. And that deployment process is often one that
can be very expensive, right? So there's this process of distillation that we go through and
it ends up being kind of a seesaw, like a sawtooth kind of waveform where the models are really big,
and then we shrink them with distillation, and then we train new models that are even bigger,
and then we shrink those. And that process has been kind of this almost like simulated
annealing that we've been going through over the last several years, as we increasingly grow models into the trillions
of parameters and what that means for increasing amounts of data.
And that's really some of the limits of scalability.
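To make the distillation step Dennis describes concrete, here is a minimal sketch of Hinton-style knowledge distillation, where a smaller student model is trained to match the softened outputs of a larger teacher. The function names, temperature, and toy logits are illustrative only, not anything specific to Groq or to GPT-scale models.

```python
# A minimal sketch of distillation: a small "student" is trained to match the
# softened output distribution of a large "teacher". Temperature and sizes are
# illustrative, not taken from the episode.
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student), scaled by T^2 as in Hinton et al. (2015)
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12)), axis=-1)
    return (temperature ** 2) * kl.mean()

# Toy usage: one batch of logits over a 10-class output.
teacher = np.random.randn(8, 10) * 3.0
student = np.random.randn(8, 10)
print(distillation_loss(student, teacher))
```

In practice this distillation term is usually combined with the ordinary cross-entropy loss on the hard labels, which is what lets the shrunken student retain most of the teacher's behavior.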
And we see a number of big systems today.
Right now, there's a system that Nvidia makes that's called Selene.
They build it as a reference,
basically a reference supercomputer.
And that serves as an example of a large scale system
that one could build to be able to do training on it.
And right now, Selene, for example,
I believe is rated one of the top five supercomputers
in the world based on, you know, just its performance
on LINPACK. But it really demonstrates just the onerous amount of
computational power that you need to bring to bear in order to train some of these really large-scale
models. So we talked about both, you know, the size of the model in bits and bytes, and a nice example where we can train this model,
and then to use it, you actually have to infer on it. And it gets used in a variety of places. But
for example, just using it to do email completion. For example, if you're typing an email,
and it's going to complete the remainder of the sentence, and that's a query into the
predictive model.
And if that, for example, if you do that inference
and if it takes eight seconds for that inference to complete
that's not very practical if you're trying
to actually do something in real time
or quasi real time to be able to use it.
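As a rough illustration of why an eight-second completion breaks real-time use, here is a toy latency-budget check. The per-token latencies, token count, and the 300 ms target are made-up numbers for illustration, not figures from the episode or from any particular deployment.

```python
# A toy illustration of the real-time constraint described above. All numbers
# are hypothetical; real per-token latencies and SLA budgets vary widely by
# model, hardware, and deployment.

def completion_latency_ms(num_tokens: int, ms_per_token: float, overhead_ms: float = 20.0) -> float:
    """Rough end-to-end latency for generating a completion token by token."""
    return overhead_ms + num_tokens * ms_per_token

SLA_MS = 300.0  # hypothetical response-time target for an email-completion feature

for ms_per_token in (5.0, 50.0, 400.0):
    latency = completion_latency_ms(num_tokens=20, ms_per_token=ms_per_token)
    verdict = "OK" if latency <= SLA_MS else "too slow for interactive use"
    print(f"{ms_per_token:6.1f} ms/token -> {latency:8.1f} ms total ({verdict})")
```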
So in other words, there's some scalability issues
that creep into the deployment of these models.
And those usually arise as a result of service level agreements, in that the cloud provider or the
application provider has some goals about the response time that they want to be able to
deliver. And those kinds of scalability and computational complexities bump into those kinds of
service level agreements. For example, if it takes eight seconds for
it to complete a sentence in your completion, it makes it really impractical to use that as a
large-scale deployment and a usable tool, right? So there's been more interest in scalable
deployment on the inference side. In fact, if you've seen the keynote from Jensen at GTC,
one of the areas that they're growing into is essentially multi-GPU inference to try to reduce the response time. In other words,
when you scale your application, there's a number of ways you can scale it. You can scale it
so that you want to do larger problem sizes. In other words, kind of strong scaling. You want to
take the amount of work that you do and increase it so that you can get more work done.
Or you can try to scale it up so that you can get more work done per unit of time. In other words, you want to get more
throughput. And so there's a variety of ways you can kind of try to scale your problems so that
you can fit usually within some fixed time horizon, as an example. I think many people would have
assumed that this conversation
was just going to go straight to deep learning and talking about training. But it's interesting
that you're going to inferencing. How large are we going to see the models on the inferencing side
get in practical respects? Because the things that you're bringing up are going to be true almost no
matter what the infrastructure is. I mean, we've all thought about this with, you know, AI in the cloud where,
you know, there's only so much data that you can transfer. There's only so much, you know,
latency that you can tolerate. So that does seem to put a limit on the size of inferencing.
That's right. And for a very practical reason, these
tend to be kind of like liquid.
Liquid fills the size of the container.
Data is the same kind of thing.
It fills the size of the GPU or the TPU
in which it's running, right?
And so these kinds of things will be deployed.
And the container, so to speak, is essentially the memory footprint that we can fit that model in. For
example, if it's 40 gigabytes of GPU memory that you have to fit in, that's kind of the container
you have to live within. And so practitioners and model designers have to live within the containers that they have.
And in a sense, you go to war with the weapons you have, not the ones you want, right? Ideally, we'd
love to have, you know, hundreds of gigabytes available on each node, but in reality that's
just not the case. And so given those kinds of constraints with smaller memory footprints,
that really constrains the size of models that you can deploy.
And deploying models is really a scalable question, a question of scalability.
Because when you deploy the model, you're going to be deploying it, for example, if it's a translation model, you're going to be deploying it in different languages, English, French,
German, and so forth. And that takes up compute resources, as well as model designer resources,
people's time, site reliability engineers, and so forth.
And as we've discussed quite a few times as well, we're starting to see models being deployed from an inferencing perspective on all sorts of hardware and all sorts of locations. So we're not just talking about,
you know, 40 gigabyte GPUs in the data center here. I mean, we're talking about smaller devices,
we're talking about edge computing, we're even talking about mobile devices. I mean,
is that the topic that you're bringing up? Are you really focused on the sort of centralized processing?
So it's a great point in that scalability really affects a variety of things, right?
As a model designer, if you want to bring to bear a new model and deploy it at the edge,
well, that might be anything that's running an ARM processor from a printer, for example,
an embedded processor in a printer or Raspberry Pi at the low end,
all the way up to a supercomputer or a cloud instance, a large memory cloud instance. So
there's a variety and a wide breadth and scale of different hardware options on which these models
are deployed. And often these kind of state-of-the-art models really get deployed on state-of-the-art hardware, right?
These are really very typically computationally intensive kind of problems.
And the rule of thumb I typically use is for training, it's generally 8 to 10 times more footprint in terms of model complexity and model size just to carry all the intermediate
gradients that are part of the model. Because after all, it is gradients that make the world
go around. It's not money that makes the world go around. It's gradients that makes the world go
around. Or so I tell my daughters anyway. And so it's really kind of a fascinating spot that we find ourselves in building these models and then deploying them in all sorts of different areas.
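A back-of-the-envelope version of the 8-to-10x rule of thumb Dennis mentions might look like the following. The byte counts assume FP16 weights plus gradients and FP32 Adam optimizer state, which is one common accounting; activation memory is ignored, and none of this is a Groq-specific figure.

```python
# A back-of-the-envelope sketch of the training-footprint rule of thumb above.
# Multipliers are common rough estimates (FP16 weights + gradients + FP32
# master weights and Adam moments); activations are ignored and depend on
# batch size and architecture.

def inference_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory just to hold the weights for inference (e.g., FP16 = 2 bytes)."""
    return num_params * bytes_per_param / 1e9

def training_footprint_gb(num_params: float) -> float:
    """Weights (2 B) + gradients (2 B) + FP32 master weights and Adam moments (12 B)."""
    return num_params * (2 + 2 + 12) / 1e9

params = 1.75e11  # a GPT-3-scale parameter count
print(f"inference: ~{inference_footprint_gb(params):,.0f} GB")
print(f"training:  ~{training_footprint_gb(params):,.0f} GB "
      f"({training_footprint_gb(params) / inference_footprint_gb(params):.0f}x)")
```

Under these assumptions the training footprint works out to roughly 8x the inference footprint, which lines up with the rule of thumb above.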
And to your point, there's places for which it just doesn't make economic sense to deploy it at scale on some form factors and some platforms where it just simply
may not make sense given the requirements and given the constraints of that edge configuration,
for example. Yeah, that's certainly fascinating. And to both of those points, it sounds like
in this case anyway, in all cases, right, in many cases, anyway, scalability is both scaling up and scaling down.
And there's a limit to both.
I've definitely, I've personally never worked in an environment that was completely unconstrained resource-wise, right?
There's always been some kind of limit, whether it was dollars or space or cooling or power or something that was a limiting factor.
I'm sure there may be some deployments somewhere that are beyond that, but I've never built one.
So within that, I mean, obviously some of those things are some of the factors that are obviously challenges with scaling both up or down.
Maybe you could talk a little bit more about other current challenges with scalability in general.
Yeah, we find ourselves in this brave new world of what I'll call converged HPC,
which is really, I think of it as take all your conventional high-performance computing type of codes.
I mean, you know, old things that are built on Fortran 77,
all the way up to more modern climatology codes and COVID models, for example.
And there's a lot of discussion in the HPC community around, you know,
really how expensive that 64-bit data type is.
And really this converged high-performance computing is really looking at HPC from the perspective of what can you do with these lower resolution data types that are, for example, 8-bits or 16-bits wide to supplant or act as a surrogate or proxy for these wider data types,
these 64-bit values, where if you need the dynamic range, that's fine, and they're there,
and you can use them. But if you don't, if you really don't need all the precision that those
64 bits are delivering you, there's a very real performance cost, power cost, and area cost of
deploying models with those kinds of data types.
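Here is a minimal sketch of that precision trade-off: the same values held as 64-bit floats, 16-bit floats, and 8-bit integers, showing the storage savings and the error each narrower type introduces. The symmetric single-scale quantization is just one simple scheme, chosen for illustration.

```python
# A minimal sketch of the precision trade-off described above. Purely
# illustrative; real quantization schemes are more sophisticated.
import numpy as np

rng = np.random.default_rng(0)
x64 = rng.normal(size=1_000_000).astype(np.float64)

# 16-bit floating point: keeps dynamic range, loses mantissa precision.
x16 = x64.astype(np.float16)

# 8-bit integers: simple symmetric quantization with a single scale factor.
scale = np.abs(x64).max() / 127.0
x8 = np.clip(np.round(x64 / scale), -127, 127).astype(np.int8)
x8_dequant = x8.astype(np.float64) * scale

for name, approx, nbytes in [("float16", x16.astype(np.float64), x16.nbytes),
                             ("int8   ", x8_dequant, x8.nbytes)]:
    err = np.max(np.abs(approx - x64))
    print(f"{name}: {x64.nbytes // nbytes}x smaller than float64, max abs error {err:.2e}")
```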
And so what we're finding is that instead of, for example, if you're doing finite element analysis,
or if you're doing, for example, non-destructive detonation of a bomb, computing
the Navier-Stokes equations to compute all the different elemental interactions of all the particles is very computationally expensive. But we're bringing
to bear machine learning models in a way that allows us to predict what the end result of,
for example, an explosion might be. So you don't have to do all the part-by-part finite element
analysis. You can just kind of go to the punchline.
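A minimal sketch of that surrogate idea, under the assumption that you can afford to run the expensive solver offline to generate training data: a small learned model then predicts the end result directly. The toy "simulation" function and the scikit-learn regressor are stand-ins, not the codes or models discussed in the episode.

```python
# A minimal sketch of an ML surrogate for an expensive physics calculation:
# run the real solver offline to build a training set, then let a small model
# "go to the punchline" at inference time.
import numpy as np
from sklearn.neural_network import MLPRegressor

def expensive_simulation(x: np.ndarray) -> np.ndarray:
    """Stand-in for a costly solver; imagine thousands of solver iterations here."""
    return np.sin(3 * x[:, 0]) * np.exp(-x[:, 1] ** 2) + 0.5 * x[:, 2]

rng = np.random.default_rng(42)
X_train = rng.uniform(-1, 1, size=(5000, 3))   # sampled simulation inputs
y_train = expensive_simulation(X_train)        # run the real solver offline

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
surrogate.fit(X_train, y_train)

X_test = rng.uniform(-1, 1, size=(1000, 3))
pred = surrogate.predict(X_test)               # cheap prediction of the end result
print("mean abs error vs. real solver:", np.mean(np.abs(pred - expensive_simulation(X_test))))
```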
And that's really an exciting thing as we move forward with this converged HPC. We're going to
see several orders of magnitude speed up in what I'll call classical simulation or conventional
simulation tools where you're able to use both conventional classical methods as well as machine learning models where
you're predicting where to go to next, for example. A good example of this is state space exploration
where you might be exploring the state space of a fluid dynamics model and in one part of that
state space it's metastable. So you don't want to go into that part.
And so you can use your predictive models, your machine learning models as a sort of scout to
make sure that you're exploring the state space in a productive way. And you don't end up encroaching
in some strange little corner of the model that pushes you into a metastable weird state where,
for example, the physics break down and they don't work anymore.
So there's all sorts of really creative ways and neat ways that are being brought to bear that I think land squarely in this area of converged HPC that certainly admit to the scalability problems. Obviously, you know, large HPC machines that are national resources
like Oak Ridge's Summit supercomputer, those are tens of thousands of nodes, and they're
national resources with many NSF scholars using them for a variety of computational science from
biology to COVID and drug research, for example. So it's really an exciting time,
not only from machine learning,
but from high-performance computing
and really this intersection, if you will,
this Venn diagram over where those two intersect.
As you can imagine,
there's quite a lot of commonality between the two.
It's really interesting what you're saying
about the sort of collision of HPC
with artificial intelligence, because we've certainly
seen a collision from HPC into enterprise infrastructure as well. So you mentioned Summit,
for example. Summit showed the architecture that has now been deployed by NVIDIA with Grace, in terms of
creating an NVLink-linked CPU sidecar for GPU machine learning tasks. And I think that it is sort of
indicative of how the entire industry is moving, not even just artificial intelligence industry,
but the entire enterprise compute industry. We've been talking, for example, about the CXL
technology, which will allow us to scale compute resources
outside the box and so on. But it seems sort of a paradox because on the one hand, we are figuring
out ways of bringing HPC to bear to make our computers bigger. But on the other hand, we're
actually taking the same HPC concepts to build, I don't know, smaller, bigger compute,
if you kind of get where I'm going with that.
So, you know, you've got Summit with all its, you know,
thousands and thousands of GPUs,
and then we're taking that architecture
and building something with tens of GPUs.
So it seems really interesting
that we're building bigger systems in a smaller way.
We're building sort of smaller instantiations of bigger systems.
And one of the things, too, that you brought up was this whole idea that we're starting to discover that some applications perhaps don't need the precision that we had been using previously.
We've seen studies of that.
And we, of course, heard from companies like Intel and recently BrainShift talking about how you can use lower precision. Does this sort of put a monkey wrench in your
scalability claims? Could it be that we're going to solve the scalability problem simply by
using lower precision ML on smaller systems? Very good points. In fact, you're right.
The additional precision that we get
comes from the hidden state,
the hidden layers in the model, right?
So you can think of it as really a trade-off.
We're not losing that precision necessarily.
We're replacing it with different types of state
that capture the essential qualities of what that
quote variable would be capturing if it were in a 64-bit data type. But you're absolutely right
in that we're building these large heterogeneous supercomputers, but we're also, we have a need to
carve those up into smaller units of, call it smaller granularities,
smaller atomic units of compute that we can deploy at a smaller scale, but more of them.
For example, when you do deploy your model, you're going to want to deploy multiple instances of it
so that you can have throughput, obviously. And that it takes cost and power and resources
to be able to do that as you point out.
So yeah, there is definitely kind of this,
these push and pull kind of compromise and trade offs
we'll call it for lack of a better word.
And this is where different architectures can shine
and show their ability
to scale. You kind of pointed out the NVIDIA GPUs more recently have the ability to scale up and to
create larger, what's called a shared address space, a PGAS address space, a partitioned global
address space that they all share. And that allows them to be able to essentially do that, you know,
what I talked about earlier, biting off more, being able to process a larger chunk and be able to do more
capable things.
It reminds me of, you know, one of the expressions from Jim Smith, who was a computer
architect at Cray Research.
He used to say that if you have a vector problem, you're always best served to use
a vector processor on a vector problem. And the same thing really applies for a tensor problem. If you have tensors, you
want to use a tensor processor for the same reasons that make graphics processors really good at doing
graphics, right? If you're pushing polygons around a screen, there's nothing better than a
graphics processor for that. So it really brings to bear some really interesting architectural tradeoffs that you see in the marketplace across different hardware providers, different system architectures.
For example, the Groq system allows you to build a multiprocessor from their chips just by connecting them up. It's kind of a
glueless multiprocessor in that no additional hardware is required. In other words,
it doesn't use an additional external switch or anything like that. It literally uses just the
chips that are included in the system. So there's no additional complexity, so to speak, of building out these systems.
Like, for example, NVIDIA uses NVLink, and they have larger, you know,
fat tree topologies and fat tree networks that you can make from that.
And all of those introduce additional complexity into the system, as well as reducing its reliability.
And what I mean by that is, as you start to kind of grow these systems,
you can think of it as they become a little bit brittle in that you have to keep everything
functioning at the same time. I think of it as, you know, it's kind of like trying to pull the
plow. If you're trying to pull a plow, do you want to pull it with one big ox or do you want to try
to pull it with 7,000 chickens? And when you're
working at scale, right, you have 7,000 chickens and you have to keep them all
pulling and synchronized in the same direction. And that's difficult. And you start to bump into
what I call Amdahl's law effects, right, where you start to hit diminishing returns. And it's one thing to
uncover some thread-level parallelism; it's totally something different to uncover 7,000-way thread
parallelism, or, you know, 7,000-way on just a single chip, for example. And then you put multiples
of these chips together, and now you're suddenly looking around trying to discover 70,000- or 100,000-way parallelism. That becomes very,
very untenable from a scalability standpoint. So there are some first principles here to
really look at how does the system scale and how do you make it so that when you scale from
one to 64, that you're getting good kind of linear scaling at those system sizes.
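A small worked example of the Amdahl's law effect Dennis mentions: even a modest serial fraction caps the speedup available from more parallel workers, which is why 7,000- or 100,000-way parallelism is so hard to exploit. The parallel fractions and worker counts below are illustrative.

```python
# A worked example of Amdahl's law: the serial fraction of a workload limits
# the speedup no matter how many workers (chickens) you add.

def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)

for p in (0.95, 0.99, 0.999):
    row = ", ".join(f"{n:>6}: {amdahl_speedup(p, n):6.1f}x" for n in (64, 7_000, 100_000))
    print(f"parallel fraction {p:.3f} -> {row}")
```

With a 95% parallel fraction, for instance, going from 64 to 100,000 workers only moves the speedup from roughly 15x to just under 20x, which is exactly the diminishing-returns effect he describes.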
Yeah, I appreciate that. I love that metaphor, because that is kind of what I was trying to get
at. Like, you know, we're hearing sort of conflicting messages from the industry,
with some people saying, oh, it's not a problem, we'll just lash together a bunch of tiny nodes or
chickens, and other people saying, no, no, no, no, no,
we need big, strong oxen here in order to pull this forward. Is it a matter, is it just simply
a question of the data types being used? I mean, are there some data types, for example, that are
going to need an ox instead of a bunch of chickens? That's a great question. Yeah, unfortunately, there's some truth to that, right? Larger data types like FP32 and FP16 or BFloat16.
You just think about the energy consumption
of an 8-bit ALU operation compared to a 16-bit
and a 32-bit operation.
If you look at an ADD instruction, and this is really into the
weeds, so I apologize, but I think it's important. If you look at this kind of an eight-bit versus
16-bit, even though the data type is twice as wide, it's quadratically more complex in terms
of its area and power consumption. In other words, when you plop down an FP16 ALU, it's four times as expensive as plopping down an 8-bit ALU.
And an FP32 is, again, double that.
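As a toy calculation of the rule of thumb above, the sketch below scales multiply-accumulate cost quadratically with operand width. Real ALU area and energy depend heavily on the implementation (adders, for instance, scale closer to linearly), so the ratios printed here are illustrative only.

```python
# A toy calculation of the quadratic-cost rule of thumb described above.
# Baseline and ratios are illustrative; real datapath costs vary by design.

def relative_mac_cost(width_bits: int, baseline_bits: int = 8) -> float:
    """Relative area/energy of a multiply-accumulate unit under a quadratic model."""
    return (width_bits / baseline_bits) ** 2

for width in (8, 16, 32, 64):
    print(f"{width:2d}-bit MAC: ~{relative_mac_cost(width):4.0f}x the cost of an 8-bit MAC")
```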
And so these things have a way of really compounding and adding complexity and area to the design and ultimately to the system in which they're operating. And we see a lot of these trade-offs happening
in practice, right? Tenstorrent, NVIDIA supporting 4-bit data types. We're going down
toward the low end. It reminds me of a lecture Richard Feynman gave that was titled,
"There's Plenty of Room at the Bottom," right? And that kind of applies to data types as well. Of course, that wasn't what he was talking about
at the time.
He was talking about nanotechnology
and really exploring the spaces of quantum effects
and what has emerged from that.
But really it applies to data types as well.
And that's kind of the fun thing.
If you take this to its absurdity, to its extreme, you can go down to a single bit, right?
Binary net.
You can take this down to just using essentially the sign bit as the encoding.
And surprisingly, it works, right?
It's all part of a trade-off.
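Here is a minimal sketch of that single-bit encoding: weights reduced to their sign, plus one per-tensor scale factor, compared against the full-precision dot product. It illustrates the encoding only; it makes no claim about accuracy on real models.

```python
# A minimal sketch of sign-bit (binary) weight encoding, as in binary neural
# networks: keep only +1/-1 per weight plus a single scale factor.
import numpy as np

rng = np.random.default_rng(7)
weights = rng.normal(size=256)
activations = rng.normal(size=256)

scale = np.mean(np.abs(weights))
binary_weights = np.sign(weights)            # values in {-1, +1}

full = weights @ activations
approx = scale * (binary_weights @ activations)
print(f"full precision: {full:8.3f}")
print(f"sign-bit only:  {approx:8.3f}")
```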
But those kinds of things are really, you know, there's a lot of hyperparameters
that are available to the model designers, and sometimes it's a little overwhelming. You almost
want to have kind of a Costco model where you walk in and you only have four choices to pick from,
not 4,000, where you can tune your hyperparameters in a way that is fairly straightforward and reasonable to reason about the correctness
and the performance of the model. So it was a while back, but something you said really struck
me. And as we're talking about data types and things, part of my head jumps to use cases
and how we actually use these data models. And there was a phrase you used, I think it was
non-destructive bomb detonation. And I just want to take a second to roll back to that,
because I think you're just talking about simulating explosions, right? But the turn
of phrase was really interesting. And I wonder if there's a spectrum of use cases that follow along,
what can potentially use sign-bit encoding versus FP32? And then kind of what are the
barriers there? What's that spectrum look like from real world enterprise applications?
Yeah, and it's a great question. And I think it's actually an ongoing and fertile area of
research, right, because we're trying to discover and find the new areas in which we can really apply these new techniques.
For example, using GAN models, GANs can be used to take a 2D picture.
For example, I forget what NVIDIA calls this technique,
but they take a 2D picture of it,
and they basically render it as a 3D model in the environment.
And it allows you to kind of fake a bunch of things
and interpolate a bunch of results.
And that's really bringing to bear new ways of applying this
and new ways of applying physics rendering and exploring
the way that we think about and really model the world around us,
right? Instead of modeling the world around us using the classical Newtonian physics,
we're going to be teaching a model how it works. And our ground truth is empirical
physics, right? It's kind of like experimental physics is really the ground truth. Everything kind of stems and emerges from that in the same
way. And I think that's really kind of a nice regime to be operating in, and that is your
ground truth. And kind of everything else that we see as part of our reality is really an emergent property of those fundamental rules.
And we see that with AI and training with these large models. It's really, you know,
this whole concept of software 2.0 is really kind of an exciting, you know, time to kind of see where its niche applications are as well as its broad applications.
And the example of bomb detonation, as one example, really applies to a bunch of
different physical spaces, anywhere where you have a physical representation of the world that you
want to take and decompose into its three-dimensional components. And a good example of this is, I think, climatology, right?
We take climatology code, we break up the globe,
we break up the world into its 3D components,
we model everything in terms of its pressure, temperature, and volume,
and its various physical properties.
And we see that same kind of thing,
and we can bring to bear these same kind of techniques to be able to simulate climatology.
And I'm really quite excited about what machine learning can bring to bear for climatology.
And for example, one exciting application that I'm looking forward to, we've had a lot of wildfires in the United States. And one thing I think is an exciting application of AI is
the old proverb, if there's smoke, there's fire. Well, we can detect this smoke using
satellite imagery and then be able to run a machine learning model to actually do smoke
detection. Think of it as a global scale smoke detector, right? And it's kind of your early warning way
of being able to hone in and pinpoint
where the hotspot is literally in this case.
And I think we're running into all sorts of new applications
where not only does it allow us
to be a better guardian of the globe,
better steward of our resources,
but be able to improve our manufacturing processes by
bringing to bear some of these techniques that we apply at a large scale. We can apply them at
smaller scales in manufacturing, for example, looking for defects in a bottling plant,
for example. So there's a lot of different ways to kind of replace the physical processes that would have been computed with these new machine learning models that provide us with many of those same benefits. Same great taste, less filling is kind of the way to look at it.
That's a very Wisconsin thing. So it seems to me the takeaway from this conversation is essentially
that all these technologies, whether it's HPC or bigger, better, badder tensor processors
or distributed architecture or reduced precision or ML, tuning ML, all of these things,
the takeaway is that we're going to be using more data,
bigger data, bigger models,
and we're gonna be able to tackle things
that we never thought we could tackle before,
like global climate prediction and things like that,
that were just too big,
but we're building systems that can
process them. Is that the takeaway that I should get from this conversation?
Yeah. I mean, there's several good analogies about how data is like the petroleum or the oil
resources, and that labeling of data is like the refinery, right? And those who have the resources
here, the oil and the ability to refine it,
are in a unique position to take advantage of that energy density, so to speak.
And that's a great example because I think that if you look at the forerunners and
the thought leaders in this space, it's research communities like Google,
DeepMind, and Microsoft Research that have these huge
corpora of data, and they have the ability to test out their ideas at scale before they
deploy them.
And like I said, the data and having access to data at your
fingertips is really the powerful component.
And that allows you to take advantage of, in this new software 2.0 kind of world, to be able to build better models, train better models, to be able to deploy those models at scale.
And that is a really strategic advantage.
Well, I do appreciate that.
And I think that that's actually a great way to leave the conversation.
We do have to wrap up here.
But before we go, we do have a fun little tradition of asking our guests to answer three
questions for which they are totally unprepared.
And so, Dennis, it's your turn.
And I have a few questions here.
Now, they're pretty much the same questions week to week.
But I do try to shuffle them up and pick some that are appropriate or, you know, at least something that I would get a
fun answer for. So, Dennis, it's time. As a reminder, there's no right or wrong answer.
It's just sort of a let's predict the future here. So, here we go. First things first.
In terms of assistants, you did mention, you know, voice assistants and latency and things like that.
So how long will it be until we have a conversational voice-based AI that can pass
the Turing test and fool an average person into thinking they're talking to another human?
Wow, great question. Conversational AI is taking some great
strides forward. I think Jarvis is what NVIDIA calls their new conversational
chatbot kind of technology. And it's really bringing to bear new techniques, including
tone and inflection and sentiment. Instead of
sentiment analysis, you can think of it as sentiment expression, expressing sentiment
through voice. And that's making it even more difficult to be able to detect kind of subtle
intonation that only comes, frankly, from human vocal cords, right? And that kind of subtle inflection.
Well, those kinds of micro expressions and subtleties
are now starting to be able to be expressed
and controlled through AI.
And I think it's becoming increasingly difficult
and we will see more technology and more models
that are essentially trying to ferret out the deep fakes,
right? And the better this gets, the more difficult it becomes. So bringing to bear
some new techniques to be able to detect and, you know, identify ground truth so that we as
consumers of these things can have confidence that what we're looking at is indeed, you know, either generated from the real-world surroundings, or if it's just a figment of some, you know, machine learning model. But I think we're maybe two years away from having really the kind of sentiment analysis
and sentiment expression that will make
that type of conversational chatting with AI
almost imperceptible in terms of the ability to tell
if it's real or if it's Memorex,
I guess to borrow an expression from the 80s.
All right. New question here. Will we ever see a Hollywood-style fake mind? In other words,
Mr. Data or a similar AI? Great question. So will we ever bring to bear kind of an android, so to speak, that would be able to kind of articulate and encompass?
I actually think that we will. And I think there's strides happening in the neural interface area that will allow us to take those steps perhaps sooner than we ought to.
But I think we will see tighter integration with that kind of computer brain interface.
And largely, I think the initial applications will be medically inspired and use cases will be for, for example, for treating Parkinson's or epilepsy
or other aspects of degenerative diseases that can be staved off through technology applications.
So I do believe that within the next five years, we will have that tighter integration. And by that time, we may see Elon Musk walking around with a third eye or something like
that to kind of encompass that.
All right, final one.
We talked big, so let's talk small.
How small will machine learning get?
Can we have ML-powered household appliances?
How about toys or even disposable ML devices?
Fantastic question. So when your toothbrush has ML embedded in it, I think we've arrived,
right? You take all the ubiquitous things that you do in life. And when your comb tells you that
you've only, you know, brushed your hair 13 times rather than the traditional 27 times on average,
then maybe we've got problems as a society.
But I think it's certainly getting to that point, right?
I bought a toothbrush that actually has AI in it,
which is a little scary, right?
They actually characterize your brushing habits,
your tendencies, and they, in a sense,
kind of turn your mouth into a set of features
that they can look at.
And that's a little alarming at some point,
but I think that that's just, that's coming, right?
And before long, we will have them integrated
into our glasses.
And in fact, we see this today,
augmented glasses with
a variety of electronics embedded in them. And these will become our kind of viewport into the
world. And they will be not only recognizing and identifying objects for us, but they'll be able
to extract details that we, our brains, may not be able to detect, for example,
infrared or other electromagnetic frequencies
that we can't endemically pick up.
So I think it's an exciting new area
that we will see greater and greater integration
of machine learning into smaller and smaller form factors
to the point where you're going to say, hey, Siri,
and your glasses are going to wake up and respond to that. You heard it here first, folks:
your mouth is a feature store. Get ready for it. That's right. So thank you so much, Dennis,
for joining us today on Utilizing AI. Where can people connect with you and follow your thoughts on enterprise AI and other topics? Look me up on LinkedIn. You can find me at Dennis Abts, A-B-T-S, on LinkedIn.
And that's my preferred social media outlet. So I look forward to connecting with you on that.
How about you, Chris? What have you been up to lately?
Yeah, lots of fun stuff. I also enjoy having conversations on LinkedIn. So find me there.
You can also follow me on Twitter at ChrisGrundemann. Or for all things that I've been
working on and doing lately, chrisgrundemann.com always has the latest.
All right. Very good, Chris. Thank you. And you can find me at techfieldday.com. And you can find
my writing at gestaltit.com. And as I mentioned, you can also connect with me
on most social media networks at SFoskett.
One thing I will call attention to is next week
is AI Field Day number two.
So if you join us on Thursday and Friday,
the 27th and 28th,
you'll see some great presentations
about enterprise tech
and enterprise applications of artificial intelligence.
So please do head over to techfieldday.com next week and check that out. So thank you for listening
to Utilizing AI. If you enjoyed this discussion, please do remember to subscribe, rate, and review,
as always. And please, of course, share this discussion with the rest of the AI community.
This podcast is brought to you by gestaltit.com, your home for IT coverage from across the
enterprise.
For show notes and more episodes, go to utilizing-ai.com, or you can find us on Twitter at utilizing
underscore AI.
Thanks, and we'll see you next time.