The AI Daily Brief: Artificial Intelligence News and Analysis - AI Is Transforming How We Interact With Computers

Episode Date: March 2, 2024

A reading and discussion of the changing nature of the relationship between humans and computers, inspired by https://www.theinformation.com/articles/how-ai-will-change-our-relationship-with-computers...?rc=jrwr4u ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI.  Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on The AI Breakdown, we're looking at how AI is going to change the relationship between people and computers. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our YouTube, our Discord, and our newsletter. Welcome back to The AI Breakdown. It is, of course, a long reads episode of the Breakdown. And today, the inspiration piece that we will be drawing from is "How AI Will Change Our Relationship With Computers." It was published in The Information and was written by venture capitalist Vinod Khosla, who has been one of the loudest proponents for artificial intelligence.
Starting point is 00:00:44 The key idea is right there in a subheader. We've always adapted to software. Now, AI is enabling software to adapt to humans instead. So I'm going to read a few sections of this, and then we'll have a discussion. Vinod writes, there's a lot of buzz about AI hardware and gadgets. I want to discard these terms. They're misleading. What you call gadgets or hardware, I see as facilitators of a new era in which low-latency voice
Starting point is 00:01:09 becomes the primary way users interact with smart AI. These devices are support infrastructure. This isn't just about devices, it's about a fundamental shift in human-computer interaction. Artificial intelligence will spur two fundamental changes in our relationship with technology. The first is that voice, already the most natural interface for human interaction, will become a dominant interface. Imagine latency reduced to less than half a second, a stark contrast to the sluggishness of touch-based devices. Even silent voice is on the table, vocalizing commands without a sound, an especially useful option in a public setting like a cafe. Silent voice is mouthing the words without allowing sound to come out of your mouth. Technologies to detect such silent speech
Starting point is 00:01:46 will allow one to privately dictate in public places without anyone being able to listen. The second revolution is in how apps will adapt to us. No longer will we need to learn to navigate through apps like Uber or complex systems like those of SAP and Oracle. Thus far, we've always adapted to software, learning its intricacies, remembering layered menus, and so forth to communicate with machines. Training to use complex apps is commonplace. Now, AI is enabling software to adapt to humans instead. This will lead to new types of hardware, designed primarily for voice interaction, with the computer learning the human language and the human. Yes, there still might be a screen for certain visual tasks, but the core interaction will shift to voice, be it silent or audible.
Starting point is 00:02:24 From there, Vinod gets into a number of the different projects which are reportedly working in this area. He talks about reports of iPhone designer Jony Ive teaming up with Sam Altman, how another set of ex-Apple engineers launched the Humane Pin, and even more, how the Rabbit R1 launch has brought this concept of a new type of interaction to a much larger audience. Vinod also points out that when it comes to how we're going to start adopting these interactions, it's not necessarily just going to be because of work. He talks about the cumbersome workflow he currently uses when hiking to identify plants with an app, which could be replaced by simply pointing a device powered by AI at the plant and asking, "What is this plant?" He also makes a number of arguments about why device
Starting point is 00:03:00 creators might be excited about a voice paradigm. One, he says, it's significantly more affordable than a traditional smartphone. He also argues that it's simpler and more efficient than typing or touching a screen. He describes how different Rabbit's large action model is from other modalities. He writes, LAMs learn to use software the way humans do, rather than communicating with an app through an application programming interface as traditional software does. Imagine someone peeking over your shoulder as you swipe on your phone and learning those patterns. That's a LAM, or whatever it might be called in the future. It's a complete inversion of the traditional paradigm.
Starting point is 00:03:30 It means that ultimately we won't have to interact with the software because the AI will do so on our behalf. The biggest benefit that Vinod sees to this is actually a deeply personal and human psychological benefit, a shift from a paradigm in which the devices that live in our pockets are designed to distract us, to capture our attention, to keep our attention, to a set of devices that are designed to specifically save us time, minimize distractions, and get us back to interacting with the real world. Now, when it comes to where this innovation is likely to come from, it will perhaps not surprise you that a VC thinks that it's got to be from the startups. Effectively, he says, because this is not the evolution of an iPhone 16 to an iPhone 17, but a total reimagining from the
Starting point is 00:04:11 ground up for the human-computer relationship, startups that can think anew and act totally differently are in the best place to figure it out. He writes, precisely because this next phase is not about hardware or gadgets, but a complete overhaul of how humans and computers interact, we'll be adding devices like Rabbit's R1 to our repertoire. These devices are inventing a human-centered and agent-based future enabled by new AI technology. It's not just an incremental change. Apple brought us the world of "there's an app for that." Rabbit-like devices will bring us the world of "there's an AI for that," accessible through a voice-driven personal agent that can run apps for you. So that's a summary of the piece. Like I said, you should go read it on The Information, which is by far the best source of,
Starting point is 00:04:49 especially breaking news in the AI space. I think shifting the mental paradigm from thinking about this category as AI hardware to something that's about a much more fundamental exploration and shift of the relationship between humans and computers is the right idea. It's why it's harder to imagine shifting behavior to something like the Rabbit because it's not incremental, as Vinod points out. One thing that he doesn't discuss is how AI is still a little bit behind where it needs to be for some of these systems to really come into play. For example, we've recently had people get their hands on Groq, G-R-O-Q, which runs LLMs massively faster than the experience that people are used to with things like Gemini and ChatGPT. Reducing the latency in
Starting point is 00:05:29 a profound way is going to be a key step towards having natural voice-mediated interactions with a device like the Rabbit or whatever else comes like it. The other pillar of this is obviously the agent future. Now, it's very clear that most of the big labs are betting on some version of that future coming to fruition. OpenAI has not been cagey about the fact that GPTs and the custom GPT store were a very first baby step towards something like that. However, if you look at where a lot of the startups talking about agents are, it's pretty clear to me that there is not yet a really strong sense of which use cases for agents, which functions in people's lives, they're most likely to automate and outsource to an agent first. You see a ton
Starting point is 00:06:10 of companies trying to do generalist AI agent type activities. And the reason for that is that until people start actually using them, we're not really going to be able to know or predict what people will use them for. I actually tend to be more skeptical of a generalist personalized AI assistant type of future, where everyone has a little EA in their pocket that does things for them. But I think that lots of people will find very specific use cases for agents that save them time and become part of their normal flows. Without knowing what those use cases are, again, it's really hard to know what the right type of device or form factor will be. There's also an interesting countervailing trend to the future that Vinod is imagining, which I don't think actually contradicts it, but it is
Starting point is 00:06:49 worth noting. In 2023, in the wake of ChatGPT, every new AI startup was experimenting with a totally natural language sort of interface, where we did away with all the buttons and all the controls that normal software had, and instead had you just talk to software. What you started to see towards the second half of last year, though, and certainly into this year, is more and more custom-purpose AI software suites returning and actually integrating natural language interfaces where it makes sense
Starting point is 00:07:16 with more traditional software-type controls that are about enabling that particular use case. You're seeing this a lot, for example, in the visual realm with design software. Take a startup or a platform whose entire purpose is to help someone create the best product shots, which, for example, a company called Flair AI is doing. Because it's such a specific use case, there are a lot of controls in their software that make more sense for them to give you specific guidance around and interact with, like traditional software, than to just have a natural language input, as does something like Midjourney or DALL-E 3, which suggests to me a hybrid future.
Starting point is 00:07:50 One where certain behaviors that are cumbersome, laborious, annoying, inefficient, in their current smartphone-mediated modality or laptop-mediated modality get swept up into voice-controlled, natural language-controlled, agent-AI-type experiences. But then other things will still be mediated by screens, by controls that aren't just about natural language. There will still be software where the trade-off to learn complex interactions will be worth whatever challenge it presents. Whatever the case, it's going to be an exciting and diverse and dynamic future, and I'll be sharing everything I learn about it with all of you as we go. For now, though, that is going to do it for today's AI Breakdown.
Starting point is 00:08:28 Until next time, peace.
