The AI Daily Brief: Artificial Intelligence News and Analysis - AI Is Transforming How We Interact With Computers
Episode Date: March 2, 2024
A reading and discussion of the changing nature of the relationship between humans and computers, inspired by https://www.theinformation.com/articles/how-ai-will-change-our-relationship-with-computers...?rc=jrwr4u
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're looking at how AI is going to change the relationship between people and computers.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to breakdown.network for more information about our YouTube, our Discord, and our newsletter.
Welcome back to the AI Breakdown.
It is, of course, a Long Reads episode of the Breakdown.
And today, the inspiration piece that we will be drawing from is "How AI Will Change Our Relationship With Computers."
It was published in The Information and was written by venture capitalist Vinod Khosla,
who has been one of the loudest proponents for artificial intelligence.
The key idea is right there in a subheader.
We've always adapted to software.
Now, AI is enabling software to adapt to humans instead.
So I'm going to read a few sections of this, and then we'll have a discussion.
Vinod writes: There's a lot of buzz about AI hardware and gadgets.
I want to discard these terms.
They're misleading.
What you call gadgets or hardware, I see as facilitators of a new era in which low-latency voice
becomes the primary way users interact with smart AI. These devices are support infrastructure.
This isn't just about devices; it's about a fundamental shift in human-computer interaction.
Artificial intelligence will spur two fundamental changes in our relationship with technology.
The first is that voice, already the most natural interface for human interaction, will become a
dominant interface. Imagine latency reduced to less than half a second, a stark contrast to the sluggishness
of touch-based devices. Even silent voice is on the table: vocalizing commands without a sound,
an especially useful option in a public setting like a cafe. Silent voice is mouthing the words
without allowing sound to come out of your mouth. Technologies to detect such silent speech
will allow one to privately dictate in public places without anyone being able to listen.
The second revolution is in how apps will adapt to us. No longer will we need to learn to
navigate through apps like Uber or complex systems like those of SAP and Oracle. Thus far,
we've always adapted to software, learning its intricacies, remembering layered menus, and so
forth to communicate with machines. Training to use complex apps is commonplace. Now, AI is enabling
software to adapt to humans instead. This will lead to new types of hardware, designed primarily for
voice interaction, with the computer learning the human language and the human. Yes, there still might be
a screen for certain visual tasks, but the core interaction will shift to voice, be it silent or audible.
From there, Vinod gets into a number of the different projects which are reportedly working in this area.
He talks about reports of iPhone designer Jony Ive teaming up with Sam Altman, how another set of
ex-Apple engineers launched the Humane Pin, and, even more, how the Rabbit R1 launch has brought
this concept of a new type of interaction to a much larger audience. Vinod also points out that when
it comes to how we're going to start adopting these interactions, it's not necessarily just
going to be because of work. He talks about the cumbersome workflow he currently uses when hiking
to identify plants with an app, which could be replaced by simply pointing a device powered by
AI at the plant and asking, "What is this plant?" He also makes a number of arguments about why device
creators might be excited about a voice paradigm. One, he says, it's significantly more affordable
than a traditional smartphone. He also argues that it's simpler and more efficient than typing
or touching a screen. He describes how different Rabbit's Large Action Model (LAM) is from other modalities.
He writes:
LAMs learn to use software the way humans do, rather than communicating with an app through
an application programming interface as traditional software does. Imagine someone peeking over
your shoulder as you swipe on your phone and learning those patterns. That's a LAM, or whatever
it might be called in the future. It's a complete inversion of the traditional paradigm,
and means that ultimately we won't have to interact with the software because the AI will do so on our
behalf. The biggest benefit that Vinod sees to this is actually a deeply personal and human psychological
benefit, a shift from a paradigm in which the devices that live in our pockets are designed to
distract us, to capture our attention, to keep our attention, to a set of devices that are designed
to specifically save us time, minimize distractions, and get us back to interacting with the real
world. Now, when it comes to where this innovation is likely to come from, it will perhaps
not surprise you that a VC thinks that it's got to be from the startups. Effectively, he says,
because this is not the evolution of an iPhone 16 to an iPhone 17, but a total reimagining from the
ground up of the human-computer relationship, startups that can think anew and act totally differently
are in the best place to figure it out. He writes: Precisely because this next phase is not about
hardware or gadgets, but a complete overhaul of how humans and computers interact, we'll be adding
devices like Rabbit's R1 to our repertoire. These devices are inventing a human-centered and agent-based
future enabled by new AI technology. It's not just an incremental change. Apple brought us the world of
"there's an app for that." Rabbit-like devices will bring us the world of "there's an AI for that,"
accessible through a voice-driven personal agent that can run apps for you. So that's a summary of the
piece. Like I said, you should go read it on The Information, which is by far the best source of
especially breaking news in the AI space. I think shifting the mental paradigm from thinking about
this category as AI hardware to something that's about a much more fundamental exploration and
shift of the relationship between humans and computers is the right idea. It's why it's harder to
imagine shifting behavior to something like the Rabbit because it's not incremental, as Vinod points out.
One thing that he doesn't discuss is how AI is still a little bit behind where it needs to be
for some of these systems to really come into play. For example, we've recently had people get their
hands on Groq, G-R-O-Q, which runs LLMs massively faster than the experience that people are used
to with things like Gemini and ChatGPT. Reducing the latency in
a profound way is going to be a key step towards having natural voice-mediated interactions
with a device like the Rabbit or whatever else comes like it. The other pillar of this is obviously
the agent future. Now, it's very clear that most of the big labs are betting on some version of
that future coming to fruition. OpenAI has not been cagey about the fact that GPTs and the custom
GPT Store were a very first baby step towards something like that. However, if you look at where a lot of
the startups talking about agents are, it's pretty clear to
me that there is not yet a really strong sense of which use cases for agents, which functions in
people's lives, they're most likely to automate and outsource to an agent first. You see a ton
of companies trying to do generalist AI agent type activities. And the reason for that is that until
people start actually using them, we're not really going to be able to know or predict what people
will use them for. I actually tend to be more skeptical of a generalist personalized AI assistant
type of future, where everyone has a little EA in their pocket that does things for them. But I think
that lots of people will find very specific use cases for agents that save them time and become part
of their normal flows. Without knowing what those use cases are, again, it's really hard to know
what the right type of device or form factor will be. There's also an interesting countervailing
trend to the future that Vinod is imagining, which I don't think actually contradicts it, but it is
worth noting. In 2023, in the wake of ChatGPT, every new AI startup was experimenting
with a totally natural language sort of interface,
where we did away with all the buttons and all the controls that normal software had,
and instead had you just talk to software.
What you started to see towards the second half of last year, though,
and certainly into this year,
is more and more custom-purpose AI software suites
returning and actually integrating natural language interfaces where it makes sense
with more traditional software-type controls that are about enabling that particular use case.
You're seeing this a lot, for example, in the visual realm with design software.
Take a startup or platform whose entire purpose is to help someone create the best product shots,
which, for example, a company called Flair AI is doing.
Because it's such a specific use case, there are a lot of controls in their software
that make more sense to give you specific guidance around and to interact with like traditional software,
than to just have a natural language input, as something like Midjourney or DALL-E 3 does,
which suggests to me a hybrid future,
where certain behaviors that are cumbersome, laborious, annoying, or inefficient in their
current smartphone-mediated or laptop-mediated modality get swept up into voice-controlled,
natural language-controlled agent-AI-type experiences. But then other things will still be mediated
by screens, by controls that aren't just about natural language. There will still be software where
the trade-off to learn complex interactions will be
worth whatever challenge it presents. Whatever the case, it's going to be an exciting and diverse and
dynamic future, and I'll be sharing everything I learn about it with all of you as we go.
For now, though, that is going to do it for today's AI Breakdown.
Until next time, peace.
