Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 2x20: Expanding ML Models Beyond Current Limits with Groq
Episode Date: May 18, 2021

Machine learning models have grown tremendously in recent years, with some having hundreds of billions of data points, and we wonder how big they can get. How do we deploy even bigger models, whether it's in the cloud or using captive infrastructure? Models are getting bigger and bigger, then are distilled and annealed, and then grow bigger still. In this episode, Dennis Abts of Groq discusses the scalability of ML models with Stephen Foskett and Chris Grundemann. HPC architecture and concepts are coming to the enterprise, enabling us to work with unthinkable amounts of data. But we are also reducing the precision and complexity of models to reduce their size. The result is that businesses will be able to work with ever-larger data sets in the future.

Three Questions
How long will it take for a conversational AI to pass the Turing test and fool an average person?
Will we ever see a Hollywood-style "artificial mind" like Mr. Data or other characters?
How small can ML get? Will we have ML-powered household appliances? Toys? Disposable devices?

Guests and Hosts
Dennis Abts, Chief Architect at Groq. Connect with Dennis on LinkedIn or on Twitter @DennisAbts.
Chris Grundemann, Gigaom Analyst and Managing Director at Grundemann Technology Solutions. Connect with Chris on ChrisGrundemann.com or on Twitter at @ChrisGrundemann.
Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett.

Date: 5/18/2021
Tags: @SFoskett, @ChrisGrundemann, @DennisAbts, @GroqInc
Transcript
Welcome to Utilizing AI, the podcast about enterprise applications for machine learning,
deep learning, and other artificial intelligence topics. Each episode brings experts in enterprise
infrastructure together to discuss applications of AI in today's data center. Today, we're
discussing expanding ML models beyond the current limits with Dennis Abts of Groq.
Hi, I'm Dennis Abts, the chief architect at Groq. Nice to be here.
And I'm your co-host, Chris Grundemann. I'm a consultant, coach, and mentor.
And you can check out everything I do at chrisgrundemann.com.
And I'm Stephen Foskett, organizer of AI Field Day and publisher of Gestalt IT.
You can find me on Twitter and other social media networks at sfoskett.
So, Dennis, one of the things that we've talked about quite a lot on the podcast in the last two seasons is the fact that AI models are expanding in terms of the amount of data that they're utilizing, as well as the reach of this data
and location of this data. But it seems that there are natural limits to the size of ML models.
In other words, there's a point where you can have too much data to make it work. What are
those limits? How do we decide when an ML model is just too big?
That's a great question. I think there's really two ways to approach it.
And as we've seen in the last several years, it's been really this voracious growth of models as they get trained.
And specifically, for example, natural language
processing models like GPT-2 and GPT-3 are great examples, where the corpus of data and the number
of parameters is in the hundreds of billions, and where the amount of time it takes to train
a model like that is measured in weeks, and millions of dollars just in energy
costs, not to mention the scientist cost and the fine-tuning and
all the things that go into getting it right, and the firefighting that goes into bringing
a model to convergence so that you can deploy it in the cloud.
And by deploying it, I mean you can actually take that model now and the parameters
that you've learned and actually put it to good use. And that deployment process is often one that
can be very expensive, right? So there's this process of distillation that we go through and
it ends up being kind of a seesaw, like a sawtooth kind of waveform where the models are really big,
and then we shrink them with distillation, and then we train new models that are even bigger,
and then we shrink those. And that process has been kind of this almost like simulated
annealing that we've been going through over the last several years, as we increasingly grow models into the trillions
of parameters and what that means for increasing amounts of data.
And that's really some of the limits of scalability.
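To make the distillation step Dennis describes concrete, here is a minimal sketch of Hinton-style knowledge distillation, where a smaller student model is trained to match the softened outputs of a larger teacher. The function names, temperature, and toy logits are illustrative only, not anything specific to Groq or to GPT-scale models.

```python
# A minimal sketch of distillation: a small "student" is trained to match the
# softened output distribution of a large "teacher". Temperature and sizes are
# illustrative, not taken from the episode.
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student), scaled by T^2 as in Hinton et al. (2015)
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12)), axis=-1)
    return (temperature ** 2) * kl.mean()

# Toy usage: one batch of logits over a 10-class output.
teacher = np.random.randn(8, 10) * 3.0
student = np.random.randn(8, 10)
print(distillation_loss(student, teacher))
```

In practice this distillation term is usually combined with the ordinary cross-entropy loss on the hard labels, which is what lets the shrunken student retain most of the teacher's behavior.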
And we see a number of big systems today.
Right now, there's a system that Nvidia makes that's called Selene.
They build it as a reference,
basically a reference supercomputer.
And that serves as an example of a large scale system
that one could build to be able to do training on it.
And right now, Selene, for example,
I believe is rated one of the top five supercomputers
in the world based on, you know, just its performance
on LINPACK. But it really demonstrates just the onerous amount of
computational power that you need to bring to bear in order to train some of these really large-scale
models. So we talked about both, you know, the size of the model in bits and bytes, and a nice example where we can train this model,
and then to use it, you actually have to infer on it. And it gets used in a variety of places. But
for example, just using it to do email completion. For example, if you're typing an email,
and it's going to complete the remainder of the sentence, and that's a query into the
predictive model.
And if that, for example, if you do that inference
and if it takes eight seconds for that inference to complete
that's not very practical if you're trying
to actually do something in real time
or quasi real time to be able to use it.
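As a rough illustration of why an eight-second completion breaks real-time use, here is a toy latency-budget check. The per-token latencies, token count, and the 300 ms target are made-up numbers for illustration, not figures from the episode or from any particular deployment.

```python
# A toy illustration of the real-time constraint described above. All numbers
# are hypothetical; real per-token latencies and SLA budgets vary widely by
# model, hardware, and deployment.

def completion_latency_ms(num_tokens: int, ms_per_token: float, overhead_ms: float = 20.0) -> float:
    """Rough end-to-end latency for generating a completion token by token."""
    return overhead_ms + num_tokens * ms_per_token

SLA_MS = 300.0  # hypothetical response-time target for an email-completion feature

for ms_per_token in (5.0, 50.0, 400.0):
    latency = completion_latency_ms(num_tokens=20, ms_per_token=ms_per_token)
    verdict = "OK" if latency <= SLA_MS else "too slow for interactive use"
    print(f"{ms_per_token:6.1f} ms/token -> {latency:8.1f} ms total ({verdict})")
```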
So in other words, there's some scalability issues
that creep into the deployment of these models.
And those usually arise as a result of service level agreements, in that the cloud provider or the
application provider has some goals about the response time that they want to be able to
deliver. And those kinds of scalability and computational complexities bump into those kinds of
service level agreements. For example, if it takes eight seconds for
it to complete a sentence in your completion, it makes it really impractical to use that as a
large-scale deployment and a usable tool, right? So there's been more interest in scalable
deployment on the inference side. In fact, if you've seen the keynote from Jensen at GTC,
one of the areas that they're growing into is essentially multi-GPU inference to try to reduce the response time. In other words,
when you scale your application, there's a number of ways you can scale it. You can scale it
so that you want to do larger problem sizes. In other words, kind of strong scaling. You want to
take the amount of work that you do and increase it so that you can get more work done.
Or you can try to scale it up so that you can get more work done per unit of time. In other words, you want to get more
throughput. And so there's a variety of ways you can kind of try to scale your problems so that
you can fit usually within some fixed time horizon, as an example. I think many people would have
assumed that this conversation
was just going to go straight to deep learning and talking about training. But it's interesting
that you're going to inferencing. How large are we going to see the models on the inferencing side
get in practical respects? Because the things that you're bringing up are going to be true almost no
matter what the infrastructure is. I mean, we've all thought about this with, you know, AI in the cloud where,
you know, there's only so much data that you can transfer. There's only so much, you know,
latency that you can tolerate. So that does seem to put a limit on the size of inferencing.
That's right. And for a very practical reason, these
tend to be kind of like liquid.
Liquid fills the size of the container.
Data is the same kind of thing.
It fills the size of the GPU or the TPU
in which it's running, right?
And so these kinds of things will be deployed.
And the container, so to speak, is essentially the memory footprint that we can fit that model in. For
example, if it's 40 gigabytes of GPU memory that you have to fit in, that's kind of the container
you have to live within. And so practitioners and model designers have to live within the containers that they have.
And in a sense, you go to war with the weapons you have, not the ones you want, right? Ideally, we'd
love to have, you know, hundreds of gigabytes available on each node, but in reality that's
just not the case. And so given those kinds of constraints with smaller memory footprints,
that really constrains the size of models that you can deploy.
And deploying models is really a scalable question, a question of scalability.
Because when you deploy the model, you're going to be deploying it, for example, if it's a translation model, you're going to be deploying it in different languages, English, French,
German, and so forth. And that takes up compute resources, as well as model designer resources,
people's time, site reliability engineers, and so forth.
And as we've discussed quite a few times as well, we're starting to see models being deployed from an inferencing perspective on all sorts of hardware and all sorts of locations. So we're not just talking about,
you know, 40 gigabyte GPUs in the data center here. I mean, we're talking about smaller devices,
we're talking about edge computing, we're even talking about mobile devices. I mean,
is that the topic that you're bringing up? Are you really focused on the sort of centralized processing?
So it's a great point in that scalability really affects a variety of things, right?
As a model designer, if you want to bring to bear a new model and deploy it at the edge,
well, that might be anything that's running an ARM processor from a printer, for example,
an embedded processor in a printer or Raspberry Pi at the low end,
all the way up to a supercomputer or a cloud instance, a large memory cloud instance. So
there's a variety and a wide breadth and scale of different hardware options on which these models
are deployed. And often these kind of state-of-the-art models really get deployed on state-of-the-art hardware, right?
These are really very typically computationally intensive kind of problems.
And the rule of thumb I typically use is for training, it's generally 8 to 10 times more footprint in terms of model complexity and model size just to carry all the intermediate
gradients that are part of the model. Because after all, it is gradients that make the world
go around. It's not money that makes the world go around. It's gradients that makes the world go
around. Or so I tell my daughters anyway. And so it's really kind of a fascinating spot that we find ourselves in building these models and then deploying them in all sorts of different areas.
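A back-of-the-envelope version of the 8-to-10x rule of thumb Dennis mentions might look like the following. The byte counts assume FP16 weights plus gradients and FP32 Adam optimizer state, which is one common accounting; activation memory is ignored, and none of this is a Groq-specific figure.

```python
# A back-of-the-envelope sketch of the training-footprint rule of thumb above.
# Multipliers are common rough estimates (FP16 weights + gradients + FP32
# master weights and Adam moments); activations are ignored and depend on
# batch size and architecture.

def inference_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory just to hold the weights for inference (e.g., FP16 = 2 bytes)."""
    return num_params * bytes_per_param / 1e9

def training_footprint_gb(num_params: float) -> float:
    """Weights (2 B) + gradients (2 B) + FP32 master weights and Adam moments (12 B)."""
    return num_params * (2 + 2 + 12) / 1e9

params = 1.75e11  # a GPT-3-scale parameter count
print(f"inference: ~{inference_footprint_gb(params):,.0f} GB")
print(f"training:  ~{training_footprint_gb(params):,.0f} GB "
      f"({training_footprint_gb(params) / inference_footprint_gb(params):.0f}x)")
```

Under these assumptions the training footprint works out to roughly 8x the inference footprint, which lines up with the rule of thumb above.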
And to your point, there's places for which it just doesn't make economic sense to deploy it at scale on some form factors and some platforms where it just simply
may not make sense given the requirements and given the constraints of that edge configuration,
for example. Yeah, that's certainly fascinating. And to both of those points, it sounds like
in this case anyway, in all cases, right, in many cases, anyway, scalability is both scaling up and scaling down.
And there's a limit to both.
I've definitely, I've personally never worked in an environment that was completely unconstrained resource-wise, right?
There's always been some kind of limit, whether it was dollars or space or cooling or power or something that was a limiting factor.
I'm sure there may be some deployments somewhere that are beyond that, but I've never built one.
So within that, I mean, obviously some of those things are some of the factors that are obviously challenges with scaling both up or down.
Maybe you could talk a little bit more about other current challenges with scalability in general.
Yeah, we find ourselves in this brave new world of what I'll call converged HPC,
which is really, I think of it as take all your conventional high-performance computing type of codes.
I mean, you know, old things that are built on Fortran 77,
all the way up to more modern climatology codes and COVID models, for example.
And there's a lot of discussion in the HPC community around, you know,
really how expensive that 64-bit data type is.
And really this converged high-performance computing is really looking at HPC from the perspective of what can you do with these lower resolution data types that are, for example, 8-bits or 16-bits wide to supplant or act as a surrogate or proxy for these wider data types,
these 64-bit values, where if you need the dynamic range, that's fine, and they're there,
and you can use them. But if you don't, if you really don't need all the precision that those
64 bits are delivering you, there's a very real performance cost, power cost, and area cost of
deploying models with those kinds of data types.
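Here is a minimal sketch of that precision trade-off: the same values held as 64-bit floats, 16-bit floats, and 8-bit integers, showing the storage savings and the error each narrower type introduces. The symmetric single-scale quantization is just one simple scheme, chosen for illustration.

```python
# A minimal sketch of the precision trade-off described above. Purely
# illustrative; real quantization schemes are more sophisticated.
import numpy as np

rng = np.random.default_rng(0)
x64 = rng.normal(size=1_000_000).astype(np.float64)

# 16-bit floating point: keeps dynamic range, loses mantissa precision.
x16 = x64.astype(np.float16)

# 8-bit integers: simple symmetric quantization with a single scale factor.
scale = np.abs(x64).max() / 127.0
x8 = np.clip(np.round(x64 / scale), -127, 127).astype(np.int8)
x8_dequant = x8.astype(np.float64) * scale

for name, approx, nbytes in [("float16", x16.astype(np.float64), x16.nbytes),
                             ("int8   ", x8_dequant, x8.nbytes)]:
    err = np.max(np.abs(approx - x64))
    print(f"{name}: {x64.nbytes // nbytes}x smaller than float64, max abs error {err:.2e}")
```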
And so what we're finding is that instead of, for example, if you're doing finite element analysis,
or if you're doing, for example, non-destructive detonation of a bomb, computing
the Navier-Stokes equations to compute all the different elemental interactions of all the particles is very computationally expensive. But we're bringing
to bear machine learning models in a way that allows us to predict what the end result of,
for example, an explosion might be. So you don't have to do all the part-by-part finite element
analysis. You can just kind of go to the punchline.
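A minimal sketch of that surrogate idea, under the assumption that you can afford to run the expensive solver offline to generate training data: a small learned model then predicts the end result directly. The toy "simulation" function and the scikit-learn regressor are stand-ins, not the codes or models discussed in the episode.

```python
# A minimal sketch of an ML surrogate for an expensive physics calculation:
# run the real solver offline to build a training set, then let a small model
# "go to the punchline" at inference time.
import numpy as np
from sklearn.neural_network import MLPRegressor

def expensive_simulation(x: np.ndarray) -> np.ndarray:
    """Stand-in for a costly solver; imagine thousands of solver iterations here."""
    return np.sin(3 * x[:, 0]) * np.exp(-x[:, 1] ** 2) + 0.5 * x[:, 2]

rng = np.random.default_rng(42)
X_train = rng.uniform(-1, 1, size=(5000, 3))   # sampled simulation inputs
y_train = expensive_simulation(X_train)        # run the real solver offline

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
surrogate.fit(X_train, y_train)

X_test = rng.uniform(-1, 1, size=(1000, 3))
pred = surrogate.predict(X_test)               # cheap prediction of the end result
print("mean abs error vs. real solver:", np.mean(np.abs(pred - expensive_simulation(X_test))))
```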
And that's really an exciting thing as we move forward with this converged HPC. We're going to
see several orders of magnitude speed up in what I'll call classical simulation or conventional
simulation tools where you're able to use both conventional classical methods as well as machine learning models where
you're predicting where to go to next, for example. A good example of this is state space exploration
where you might be exploring the state space of a fluid dynamics model and in one part of that
state space it's metastable. So you don't want to go into that part.
And so you can use your predictive models, your machine learning models as a sort of scout to
make sure that you're exploring the state space in a productive way. And you don't end up encroaching
in some strange little corner of the model that pushes you into a metastable weird state where,
for example, the physics break down and they don't work anymore.
So there's all sorts of really creative ways and neat ways that are being brought to bear that I think land squarely in this area of converged HPC that certainly admit to the scalability problems. Obviously, you know, large HPC machines that are national resources
like Oak Ridge's Summit supercomputer, those are tens of thousands of nodes, and they're
national resources with many NSF scholars using them for a variety of computational science from
biology to COVID and drug research, for example. So it's really an exciting time,
not only from machine learning,
but from high-performance computing
and really this intersection, if you will,
this Venn diagram over where those two intersect.
As you can imagine,
there's quite a lot of commonality between the two.
It's really interesting what you're saying
about the sort of collision of HPC
with artificial intelligence, because we've certainly
seen a collision from HPC into enterprise infrastructure as well. So you mentioned Summit,
for example. Summit showed the architecture that has now been deployed by NVIDIA with Grace, in terms of
creating an NVLink-linked CPU sidecar for GPU machine learning tasks. And I think that it is sort of
indicative of how the entire industry is moving, not even just artificial intelligence industry,
but the entire enterprise compute industry. We've been talking, for example, about the CXL
technology, which will allow us to scale compute resources
outside the box and so on. But it seems sort of a paradox because on the one hand, we are figuring
out ways of bringing HPC to bear to make our computers bigger. But on the other hand, we're
actually taking the same HPC concepts to build, I don't know, smaller, bigger compute,
if you kind of get where I'm going with that.
So, you know, you've got Summit with all its, you know,
thousands and thousands of GPUs,
and then we're taking that architecture
and building something with tens of GPUs.
So it seems really interesting
that we're building bigger systems in a smaller way.
We're building sort of smaller instantiations of bigger systems.
And one of the things, too, that you brought up was this whole idea that we're starting to discover that some applications perhaps don't need the precision that we had been using previously.
We've seen studies of that.
And we, of course, heard from companies like Intel and recently BrainShift talking about how you can use lower precision. Does this sort of put a monkey wrench in your
scalability claims? Could it be that we're going to solve the scalability problem simply by
using lower precision ML on smaller systems? Very good points. In fact, you're right.
The additional precision that we get
comes from the hidden state,
the hidden layers in the model, right?
So you can think of it as really a trade-off.
We're not losing that precision necessarily.
We're replacing it with different types of state
that capture the essential qualities of what that
quote variable would be capturing if it were in a 64-bit data type. But you're absolutely right
in that we're building these large heterogeneous supercomputers, but we're also, we have a need to
carve those up into smaller units of, call it smaller granularities,
smaller atomic units of compute that we can deploy at a smaller scale, but more of them.
For example, when you do deploy your model, you're going to want to deploy multiple instances of it
so that you can have throughput, obviously. And that it takes cost and power and resources
to be able to do that as you point out.
So yeah, there is definitely kind of this,
these push and pull kind of compromise and trade offs
we'll call it for lack of a better word.
And this is where different architectures can shine
and show their ability
to scale. You kind of pointed out the NVIDIA GPUs more recently have the ability to scale up and to
create larger, what's called a shared address space, a PGAS address space, a partitioned global
address space that they all share. And that allows them to be able to essentially do that, you know,
what I talked about earlier, biting off more, being able to process a larger chunk and be able to do more
capable things.
It reminds me of, you know, one of the expressions from Jim Smith, who was a computer
architect at Cray Research.
He used to say that if you have a vector problem, you're always best served to use
a vector processor on a vector problem. And the same thing really applies for a tensor problem. If you have tensors, you
want to use a tensor processor for the same reasons that make graphics processors really good at doing
graphics, right? If you're pushing polygons around a screen, there's nothing better than a
graphics processor for that. So it really brings to bear some really interesting architectural tradeoffs that you see in the marketplace across different hardware providers, different system architectures.
For example, the Groq system allows you to build a multiprocessor from their chips just by connecting them up. It's kind of a
glueless multiprocessor in that no additional hardware is required. In other words,
it doesn't use an additional external switch or anything like that. It literally uses just the
chips that are included in the system. So there's no additional complexity, so to speak, of building out these systems.
Like, for example, NVIDIA uses NVLink, and they have larger, you know,
fat tree topologies and fat tree networks that you can make from that.
And all of those introduce additional complexity into the system, as well as reducing its reliability.
And what I mean by that is, as you start to kind of grow these systems,
you can think of it as they become a little bit brittle in that you have to keep everything
functioning at the same time. I think of it as, you know, it's kind of like trying to pull the
plow. If you're trying to pull a plow, do you want to pull it with one big ox or do you want to try
to pull it with 7,000 chickens? And when you're
working at scale, right, you have 7,000 chickens and you have to keep them all
pulling and synchronized in the same direction. And that's difficult. And you start to bump into
what I call Amdahl's law effects, right, where you start to hit diminishing returns. And it's one thing to
uncover some thread-level parallelism; it's totally something different to uncover 7,000-way thread
parallelism, or, you know, 7,000-way on just a single chip, for example. And then you put multiples
of these chips together, and now you're suddenly looking around trying to discover 70,000- or 100,000-way parallelism. That becomes very,
very untenable from a scalability standpoint. So there are some first principles here to
really look at how does the system scale and how do you make it so that when you scale from
one to 64, that you're getting good kind of linear scaling at those system sizes.
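A small worked example of the Amdahl's law effect Dennis mentions: even a modest serial fraction caps the speedup available from more parallel workers, which is why 7,000- or 100,000-way parallelism is so hard to exploit. The parallel fractions and worker counts below are illustrative.

```python
# A worked example of Amdahl's law: the serial fraction of a workload limits
# the speedup no matter how many workers (chickens) you add.

def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)

for p in (0.95, 0.99, 0.999):
    row = ", ".join(f"{n:>6}: {amdahl_speedup(p, n):6.1f}x" for n in (64, 7_000, 100_000))
    print(f"parallel fraction {p:.3f} -> {row}")
```

With a 95% parallel fraction, for instance, going from 64 to 100,000 workers only moves the speedup from roughly 15x to just under 20x, which is exactly the diminishing-returns effect he describes.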
Yeah, I appreciate that. I love that metaphor, because that is kind of what I was trying to get
at. Like, you know, we're hearing sort of conflicting messages from the industry,
with some people saying, oh, it's not a problem, we'll just lash together a bunch of tiny nodes or
chickens, and other people saying, no, no, no, no, no,
we need big, strong oxen here in order to pull this forward. Is it a matter, is it just simply
a question of the data types being used? I mean, are there some data types, for example, that are
going to need an ox instead of a bunch of chickens? That's a great question. Yeah, unfortunately, there's some truth to that, right? Larger data types like FP32 and FP16 or BFloat16.
You just think about the energy consumption
of an 8-bit ALU operation compared to a 16-bit
and a 32-bit operation.
If you look at an ADD instruction, and this is really into the
weeds, so I apologize, but I think it's important. If you look at this kind of an eight-bit versus
16-bit, even though the data type is twice as wide, it's quadratically more complex in terms
of its area and power consumption. In other words, when you plop down an FP16 ALU, it's four times as expensive as plopping down an 8-bit ALU.
And an FP32 is, again, double that.
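As a toy calculation of the rule of thumb above, the sketch below scales multiply-accumulate cost quadratically with operand width. Real ALU area and energy depend heavily on the implementation (adders, for instance, scale closer to linearly), so the ratios printed here are illustrative only.

```python
# A toy calculation of the quadratic-cost rule of thumb described above.
# Baseline and ratios are illustrative; real datapath costs vary by design.

def relative_mac_cost(width_bits: int, baseline_bits: int = 8) -> float:
    """Relative area/energy of a multiply-accumulate unit under a quadratic model."""
    return (width_bits / baseline_bits) ** 2

for width in (8, 16, 32, 64):
    print(f"{width:2d}-bit MAC: ~{relative_mac_cost(width):4.0f}x the cost of an 8-bit MAC")
```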
And so these things have a way of really compounding and adding complexity and area to the design and ultimately to the system in which they're operating. And we see a lot of these trade-offs happening
in practice, right? Tenstorrent, NVIDIA supporting 4-bit data types. We're going down
toward the low end. It reminds me of a lecture Richard Feynman gave that was titled,
"There's Plenty of Room at the Bottom," right? And that kind of applies to data types as well. Of course, that wasn't what he was talking about
at the time.
He was talking about nanotechnology
and really exploring the spaces of quantum effects
and what has emerged from that.
But really it applies to data types as well.
And that's kind of the fun thing.
If you take this to its absurdity, to its extreme, you can go down to a single bit, right?
Binary net.
You can take this down to just using essentially the sign bit as the encoding.
And surprisingly, it works, right?
It's all part of a trade-off.
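Here is a minimal sketch of that single-bit encoding: weights reduced to their sign, plus one per-tensor scale factor, compared against the full-precision dot product. It illustrates the encoding only; it makes no claim about accuracy on real models.

```python
# A minimal sketch of sign-bit (binary) weight encoding, as in binary neural
# networks: keep only +1/-1 per weight plus a single scale factor.
import numpy as np

rng = np.random.default_rng(7)
weights = rng.normal(size=256)
activations = rng.normal(size=256)

scale = np.mean(np.abs(weights))
binary_weights = np.sign(weights)            # values in {-1, +1}

full = weights @ activations
approx = scale * (binary_weights @ activations)
print(f"full precision: {full:8.3f}")
print(f"sign-bit only:  {approx:8.3f}")
```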
But those kinds of things are really, you know, there's a lot of hyperparameters
that are available to the model designers, and sometimes it's a little overwhelming. You almost
want to have kind of a Costco model where you walk in and you only have four choices to pick from,
not 4,000, where you can tune your hyperparameters in a way that is fairly straightforward and reasonable to reason about the correctness
and the performance of the model. So it was a while back, but something you said really struck
me. And as we're talking about data types and things, part of my head jumps to use cases
and how we actually use these data models. And there was a phrase you used, I think it was
non-destructive bomb detonation. And I just want to take a second to roll back to that,
because I think you're just talking about simulating explosions, right? But the turn
of phrase was really interesting. And I wonder if there's a spectrum of use cases that follow along,
what can potentially use sign-bit encoding versus FP32? And then kind of what are the
barriers there? What's that spectrum look like from real world enterprise applications?
Yeah, and it's a great question. And I think it's actually an ongoing and fertile area of
research, right, because we're trying to discover and find the new areas in which we can really apply these new techniques.
For example, using GAN models, GANs can be used to take a 2D picture.
For example, I forget what NVIDIA calls this technique,
but they take a 2D picture of it,
and they basically render it as a 3D model in the environment.
And it allows you to kind of fake a bunch of things
and interpolate a bunch of results.
And that's really bringing to bear new ways of applying this
and new ways of applying physics rendering and exploring
the way that we think about and really model the world around us,
right? Instead of modeling the world around us using the classical Newtonian physics,
we're going to be teaching a model how it works. And our ground truth is empirical
physics, right? It's kind of like experimental physics is really the ground truth. Everything kind of stems and emerges from that in the same
way. And I think that's really kind of a nice regime to be operating in, and that is your
ground truth. And kind of everything else that we see as part of our reality is really an emergent property of those fundamental rules.
And we see that with AI and training with these large models. It's really, you know,
this whole concept of software 2.0 is really kind of an exciting, you know, time to kind of see where its niche applications are as well as its broad applications.
And the example of bomb detonation, as one example, really applies to a bunch of
different physical spaces, anywhere where you have a physical representation of the world that you
want to take and decompose into its three-dimensional components. And a good example of this is, I think, climatology, right?
We take climatology code, we break up the globe,
we break up the world into its 3D components,
we model everything in terms of its pressure, temperature, and volume,
and its various physical properties.
And we see that same kind of thing,
and we can bring to bear these same kind of techniques to be able to simulate climatology.
And I'm really quite excited about what machine learning can bring to bear for climatology.
And for example, one exciting application that I'm looking forward to, we've had a lot of wildfires in the United States. And one thing I think is an exciting application of AI is
the old proverb, if there's smoke, there's fire. Well, we can detect this smoke using
satellite imagery and then be able to run a machine learning model to actually do smoke
detection. Think of it as a global scale smoke detector, right? And it's kind of your early warning way
of being able to hone in and pinpoint
where the hotspot is literally in this case.
And I think we're running into all sorts of new applications
where not only does it allow us
to be a better guardian of the globe,
better steward of our resources,
but be able to improve our manufacturing processes by
bringing to bear some of these techniques that we apply at a large scale. We can apply them at
smaller scales in manufacturing, for example, looking for defects in a bottling plant,
for example. So there's a lot of different ways to kind of replace the physical processes that would have been computed with these new machine learning models that provide us with many of those same benefits. Same great taste, less filling is kind of the way to look at it.
That's a very Wisconsin thing. So it seems to me the takeaway from this conversation is essentially
that all these technologies, whether it's HPC or bigger, better, badder tensor processors
or distributed architecture or reduced precision or ML, tuning ML, all of these things,
the takeaway is that we're going to be using more data,
bigger data, bigger models,
and we're gonna be able to tackle things
that we never thought we could tackle before,
like global climate prediction and things like that,
that were just too big,
but we're building systems that can
process them. Is that the takeaway that I should get from this conversation?
Yeah. I mean, there's several good analogies about how data is like the petroleum or the oil
resources, and that labeling of data is like the refinery, right? And those who have the resources
here, the oil and the ability to refine it,
are in a unique position to take advantage of that energy density, so to speak.
And that's a great example because I think that if you look at the forerunners and
the thought leaders in this space, it's research communities like Google,
DeepMind, and Microsoft Research that have these huge
corpora of data, and they have the ability to test out their ideas at scale before they
deploy them.
And like I said, the data and having access to data at your
fingertips is really the powerful component.
And that allows you to take advantage of, in this new software 2.0 kind of world, to be able to build better models, train better models, to be able to deploy those models at scale.
And that is a really strategic advantage.
Well, I do appreciate that.
And I think that that's actually a great way to leave the conversation.
We do have to wrap up here.
But before we go, we do have a fun little tradition of asking our guests to answer three
questions for which they are totally unprepared.
And so, Dennis, it's your turn.
And I have a few questions here.
Now, they're pretty much the same questions week to week.
But I do try to shuffle them up and pick some that are appropriate or, you know, at least something that I would get a
fun answer for. So, Dennis, it's time. As a reminder, there's no right or wrong answer.
It's just sort of a let's predict the future here. So, here we go. First things first.
In terms of assistants, you did mention, you know, voice assistants and latency and things like that.
So how long will it be until we have a conversational voice-based AI that can pass
the Turing test and fool an average person into thinking they're talking to another human?
Wow, great question. Conversational AI is taking some great
strides forward. I think Jarvis is what NVIDIA calls their new conversational
chatbot kind of technology. And it's really bringing to bear new techniques, including
tone and inflection and sentiment. Instead of
sentiment analysis, you can think of it as sentiment expression, expressing sentiment
through voice. And that's making it even more difficult to be able to detect kind of subtle
intonation that only comes, frankly, from human vocal cords, right? And that kind of subtle inflection.
Well, those kinds of micro expressions and subtleties
are now starting to be able to be expressed
and controlled through AI.
And I think it's becoming increasingly difficult
and we will see more technology and more models
that are essentially trying to ferret out the deep fakes,
right? And the better this gets, the more difficult it becomes. So bringing to bear
some new techniques to be able to detect and, you know, identify ground truth so that we as
consumers of these things can have confidence that what we're looking at is indeed, you know, either generated from the real-world surroundings, or if it's just a figment of some, you know, machine learning model. But I think we're maybe two years away from having really the kind of sentiment analysis
and sentiment expression that will make
that type of conversational chatting with AI
almost imperceptible in terms of the ability to tell
if it's real or if it's Memorex,
I guess to borrow an expression from the 80s.
All right. New question here. Will we ever see a Hollywood-style fake mind? In other words,
Mr. Data or a similar AI? Great question. So will we ever bring to bear kind of an android, so to speak, that would be able to kind of articulate and encompass?
I actually think that we will. And I think there's strides happening in the neural interface area that will allow us to take those steps perhaps sooner than we ought to.
But I think we will see tighter integration with that kind of computer brain interface.
And largely, I think the initial applications will be medically inspired and use cases will be for, for example, for treating Parkinson's or epilepsy
or other aspects of degenerative diseases that can be staved off through technology applications.
So I do believe that within the next five years, we will have that tighter integration. And by that time, we may see Elon Musk walking around with a third eye or something like
that to kind of encompass that.
All right, final one.
We talked big, so let's talk small.
How small will machine learning get?
Can we have ML-powered household appliances?
How about toys or even disposable ML devices?
Fantastic question. So when your toothbrush has ML embedded in it, I think we've arrived,
right? You take all the ubiquitous things that you do in life. And when your comb tells you that
you've only, you know, brushed your hair 13 times rather than the traditional 27 times on average,
then maybe we've got problems as a society.
But I think it's certainly getting to that point, right?
I bought a toothbrush that actually has AI in it,
which is a little scary, right?
They actually characterize your brushing habits,
your tendencies, and they, in a sense,
kind of turn your mouth into a set of features
that they can look at.
And that's a little alarming at some point,
but I think that that's just, that's coming, right?
And before long, we will have them integrated
into our glasses.
And in fact, we see this today,
augmented glasses with
a variety of electronics embedded in them. And these will become our kind of viewport into the
world. And they will be not only recognizing and identifying objects for us, but they'll be able
to extract details that we, our brains, may not be able to detect, for example,
infrared or other electromagnetic frequencies
that we can't endemically pick up.
So I think it's an exciting new area
that we will see greater and greater integration
of machine learning into smaller and smaller form factors
to the point where you're going to say, hey, Siri,
and your glasses are going to wake up and respond to that. You heard it here first, folks:
your mouth is a feature store. Get ready for it. That's right. So thank you so much, Dennis,
for joining us today on Utilizing AI. Where can people connect with you and follow your thoughts on enterprise AI and other topics? Look me up on LinkedIn. You can find me at Dennis Abts, A-B-T-S, on LinkedIn.
And that's my preferred social media outlet. So I look forward to connecting with you on that.
How about you, Chris? What have you been up to lately?
Yeah, lots of fun stuff. I also enjoy having conversations on LinkedIn. So find me there.
You can also follow me on Twitter at ChrisGrundemann. Or for all things that I've been
working on and doing lately, chrisgrundemann.com always has the latest.
All right. Very good, Chris. Thank you. And you can find me at techfieldday.com. And you can find
my writing at gestaltit.com. And as I mentioned, you can also connect with me
on most social media networks at SFoskett.
One thing I will call attention to is next week
is AI Field Day number two.
So if you join us on Thursday and Friday,
the 27th and 28th,
you'll see some great presentations
about enterprise tech
and enterprise applications of artificial intelligence.
So please do head over to techfieldday.com next week and check that out. So thank you for listening
to Utilizing AI. If you enjoyed this discussion, please do remember to subscribe, rate, and review,
as always. And please, of course, share this discussion with the rest of the AI community.
This podcast is brought to you by gestaltit.com, your home for IT coverage from across the
enterprise.
For show notes and more episodes, go to utilizing-ai.com, or you can find us on Twitter at utilizing
underscore AI.
Thanks, and we'll see you next time.