The AI Daily Brief: Artificial Intelligence News and Analysis - Are ChatGPT Plugins Overhyped?
Episode Date: May 31, 2023
ChatGPT Plugins are one of the most hyped things in AI -- but are they overhyped? In this episode, NLW explores how the default interfaces of the internet have changed over time, the issues with the current implementations of ChatGPT Plugins, and what it means for the future. Check out The Cognitive Revolution, the perfect AI interview complement to The AI Breakdown: https://link.chtbl.com/TheCognitiveRevolution The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're asking if ChatGPT plugins are actually overhyped.
Before that on the Breakdown Brief, China is talking AI regulation, a federal judge bans ChatGPT
from his courtroom, and much, much more. The AI Breakdown is a daily video and podcast about
the most important news and discussions in AI. Like, subscribe and share, and go to
breakdown.network for more information.
Welcome back to the AI Breakdown Brief. All the AI headline news you need in five minutes or less.
Quick reminder that all of this can also be delivered in newsletter form.
You can get that at theaibreakdown.beehiiv.com.
Every morning I do what I call the AI Breakdown First Five, which has basically the news that I often then cover on this brief.
Now, first up today, we head to China.
At a meeting yesterday, President Xi Jinping called for, quote,
dedicated efforts to safeguard political security and improve the security governance of Internet data and artificial intelligence.
Now, the statements and the broader meeting context seem to
suggest that China is nervous, at least on some level, about AI being harnessed for spying and
sabotage. Now, in many ways, China has been thinking about AI for much longer than the West has.
In April, for example, the Chinese internet regulator published some draft rules for generative
AI that, while not calling out any specific company, made it pretty clear that they were pretty
worried about the rise of LLMs in the United States. Some parts of the proposed regulation are
things that might find their way into U.S. or EU regulations as well, things like anti-discrimination, as well
as transparency around how models are trained, but simultaneously it's clear that China is trying
to contort AI into something that reinforces their existing system. They want people who use
generative AI tools to sign up with their real identities, and they also say that the content
that AI generates should, quote, reflect the core values of socialism. Heading back over to the US,
we're definitely starting to see just how much AI might reshape markets and businesses within them.
HP's CEO Enrique Lores was on with Jim Cramer and basically said that they were reimagining the PC
from the ground up on the basis of how people were going to interact with AI applications.
So, for example, integrating tools that better and more quickly analyze data and spreadsheets was one
of the things that he mentioned.
Lores said, I've been in this industry for many, many years, Jim.
I have never seen an opportunity like this to really drive innovation and drive new type of customer
needs that we think are going to be fundamental.
Another small but significant update: Microsoft Teams is rolling out support for AI-based intelligent
meeting recaps.
You've probably seen a ton of different tools that are trying to
bring AI transcription and AI-powered note-taking into the Zoom or conference call experience.
Well, it looks like these companies are just building that natively into the existing platforms.
This isn't a huge story on its own, but I do think it reflects what we're likely to see over the
next 12 months, which is every company that has any sort of experience that touches the consumer
or the enterprise, figuring out if and how AI can be integrated for higher productivity.
A quick update to yesterday's story all about Nvidia's big announcements:
the company did indeed cross the $1 trillion market cap line yesterday, albeit temporarily.
Nvidia closed the day trading just shy of that trillion-dollar mark at around $992 billion,
but at this point its importance to the current stock market narrative couldn't be clearer.
Here's one from the why-not files. Google DeepMind has introduced Barkour,
which is a benchmark for quadrupedal robots.
The TechCrunch article here jokes that maybe they worked backwards from the name Barkour,
and I couldn't agree more.
Now, the interesting thing here is less the robot itself,
more that this is a new benchmark system for researchers to be able to determine how closely their
robotics models actually mirror the real-world example that they're building off of.
TechCrunch writes, given that these machines are inspired by animals, the research team determined
that real animals would provide the best performance analog for their robotic counterparts.
That meant setting up an obstacle course in the lab and having a dog run it.
Performance is rated on a scale of zero to one, a simple binary to determine whether the robot can
successfully cross the space in the 10 seconds or so it takes for a similarly sized dog to do so.
The org says that Barkour has proven an effective benchmark even in the face of the inevitable
unexpected events and hardware issues.
Now, moving on. Yesterday, we discussed a 30-year veteran lawyer in New York who had thrown
himself on the mercy of the courts after it was revealed that basically all of the cases that
he cited as precedent for his client's lawsuit against a big airline were invented
by ChatGPT.
Well, Texas federal judge Brantley Starr has added a requirement that any attorney now
appearing in his court must attest that, quote,
no portion of the filing was drafted by generative artificial intelligence
or that it was checked by a human being.
This includes quotations, citations, paraphrased assertions,
and legal analysis.
Again, TechCrunch writes,
as summary is one of AI's strong suits
and finding and summarizing precedent or previous cases
is something that has been advertised as potentially helpful in legal work,
this may end up coming into play more often than expected.
From the memorandum, quote,
these platforms in their current state are prone to hallucinations
and bias. On hallucinations, they make stuff up, even quotes and citations. Another issue is
reliability or bias. While attorneys swear an oath to set aside their personal prejudices, biases,
and beliefs to faithfully uphold the law and represent their clients, generative artificial
intelligence is the product of programming devised by humans who did not have to swear such an oath.
As such, these systems hold no allegiance to any client, the rule of law, or the laws and Constitution
of the United States, or, as addressed above, the truth. Unbound by any sense of duty, honor, or justice,
programs act according to computer code rather than conviction, based on programming rather than
principle. Any party believing a platform has the requisite accuracy and reliability for legal
briefing may move for leave and explain why. Now, hold aside any of the jokey headlines. I actually
think that this is pretty much exactly how these tools are going to find their way to being integrated
into existing industries. It's not like overnight ChatGPT is going to disrupt everything.
There's going to be a messy process of integration that involves humans figuring out where the
boundaries of how they can trust it are. This sort of approach that clearly and articulately understands
the challenges, but also leaves space for these systems to be used as tools under the right circumstances,
seems like a pretty decent starting point to move forward.
All right, guys, that's it for today's
AI Breakdown Brief. If you're enjoying it, please like, subscribe and share, and I will be back soon
with the main AI Breakdown.
Hey guys, before the main AI Breakdown, I wanted to tell you about
another AI podcast that I've been absolutely loving recently, and that is The Cognitive Revolution.
The Cognitive Revolution features interviews with the people who are actually pushing the bleeding edge of AI forward.
It's entrepreneurs, researchers, thinkers, basically the people who are building critical tools and infrastructure
that are going to shape technology, the economy, and the broader collective human experience.
Now, as you know, the AI Breakdown is a daily news show, and for me, I love having TCR to go to to get my big think interview fix.
The show is hosted by my friend Erik Torenberg, who I've known for years and years,
and his co-host Nathan Labenz,
who spends a ton of time actually exploring and testing
and just generally finding out about the product
so that he can go to a much more significant level of conversations with their guests.
Just for example, Nathan last year was invited to be a red teamer for OpenAI for GPT-4 for two months.
The AI space moves incredibly fast.
If you're looking for great interviews with the most interesting people at the frontiers of AI,
I can't recommend The Cognitive Revolution highly enough.
Are ChatGPT plugins actually kind of overhyped?
That haram statement is exactly what we're exploring in today's AI Breakdown.
Welcome back to the AI Breakdown.
Today we are exploring whether ChatGPT plugins are overhyped,
and it's a bit broader than that.
If you spend any time in and around AI,
it is almost for sure that you've seen some thread like the one right behind me right now.
Joss Singh writes,
people in their 20s are making $240,000 plus a year
only using ChatGPT.
Here are seven ChatGPT plugins you can use to start your business.
Another example, Akash Gupta says ChatGPT plugins are a superpower, but 99% of people
haven't explored them.
I tried out all 134 released in the seven days since launch.
Here are the top 10 ChatGPT plugins to 2X your productivity.
Now, as I always point out, I am not calling out specific individual creators.
This is a template that works for Twitter engagement.
If you're going to blame someone, blame the algorithms.
But the point of this type of content is to, one, make ChatGPT plugins feel like the most transformative thing around,
and two, to make you feel like you are missing out if you don't understand them.
To assess whether ChatGPT plugins are overhyped or underhyped or appropriately hyped,
I think we actually need to take a step back and talk about ChatGPT in general.
ChatGPT is at core two separate things.
The first thing it is is a large language model, GPT-3.5 when it was released, followed by GPT-4 a couple months later.
But it's not just an LLM.
It is also an interface that sits on top of that and allows people to interact.
Before ChatGPT, we had had both LLMs and chatbots,
but they hadn't been combined in this particular way.
And it was that combination, that interface experience that gave regular people access
to the power of AI that really was the transformative thing.
Now, we know that when ChatGPT was released,
it became the fastest growing startup in history,
hitting 100 million users in about five weeks,
which was much, much faster than the previous record holder, TikTok, which hit 100 million users in about nine months.
This was not something that even OpenAI's executives expected.
Co-founder and president Greg Brockman told Fortune, I'll admit that I was on the side of like,
I don't know if this is going to work.
Mira Murati, the CTO, said this was definitely surprising.
Even the idea of releasing this to the public wasn't something that the company was convinced that they were going to do.
The point that I'm trying to make is that the particular implementation of this LLM in the form of a chatbot
was the magic that made it accessible to regular people.
It was not just a technology experience per se.
It was an interface experience.
It wasn't even a particularly sophisticated or good interface experience,
but the idea of a text box that you could chat with in a web browser
was, it turns out, the thing that it took to make this type of generative AI come to life.
Now, I think it bears repeating that we are really conditioned to particular experiences
by these services that we interact with on the internet.
For a very long time, one of those default experiences was the news feed that was at the
center of basically all social networking applications. It's dating myself, but I still remember when
they announced that feed a couple years after Facebook was first launched. Before that, you basically
just browsed around to different people's profiles that you wanted to look at. On September 6th,
2006, Facebook announced the news feed, and it did so to much user chagrin. Users complained that it
violated their privacy, that it was much too intrusive, that it documented every moment with time stamps
in a way that they didn't like. There were even calls to boycott Facebook. Now, of course, we know that
that news feed, the aggregation of activity from social networks, became the default experience
for basically every other social networking application as well. However, that was not the last time that
the default interface of the feed would evolve. Twelve years after the introduction of that
news feed, there was another moment of people getting frustrated with their social networking apps,
which was when people started pushing back after Instagram was showing random posts from people
they didn't follow in their feeds. Now, at that point, people had gotten used to the fact that these
platforms had immense control over which content of the people they followed that they were going
to surface and in what order and in what sequence. But the idea that a network could just plop
down content from people that you weren't subscribed to was incredibly foreign. Fast forward again,
and this idea of a network showing you content from people you didn't actually proactively
follow reaches its apotheosis in TikTok, where the default experience that people have of that
app is the For You page. The For You page is entirely algorithmically curated. It is based on things like
you interacting and engaging with videos, how long you like and watch certain videos, all of which
creates effectively a dossier profile on you that is used by the algorithm to recommend the next
piece of content. Yes, of course, you can toggle over to see videos just from people you follow,
but that is by far the secondary experience for most TikTok users. The point is that over time,
the default experiences that we have of the internet shift and change. As they do, they retrain our
expectations in ways that can be hugely significant and, of course, lucrative for the companies behind them.
And that brings us to another default interface of the internet, maybe the most default interface on the internet, which is the Google search.
For even longer than social networks have been with us, Google's search has been the way that we query the internet for information.
You search for whatever it is that you're looking for at that time, and you have a set of blue links to follow with maybe some ads at the top.
The entire internet has been structured around this.
All content, content discovery, and SEO platforms are based on the idea of the supremacy of this being the default and core
way that people try to get information from the web. In fact, I don't think it would
be a stretch to argue that there are in fact two default experiences for the internet, one being
whatever the norms around the feed of social networks are, and the other being Google search.
And that's why even in its most nascent proto form that came out last November, ChatGPT represented
such a serious threat to Google's core business. What Google recognized is that potentially
they were witnessing the birth of a new default way that people might start to experience
the internet. Instead of going to Google.com, typing a question and seeing the little links to follow,
people might get comfortable with this new all-powerful oracle that could bring things from the
internet back to them in ways that were closer in some ways to the experience of asking an expert
and having them tell you in plain language. Now, of course, the initial experience of ChatGPT had
some major gaps: one, that it couldn't access the internet, since it was trained on data up to late
2021, but no further; and two, that because of that, it couldn't really interact with other
internet-based services that people might use. In other words, for the time being, ChatGPT in that
version wasn't actually a threat to Google. It was more that it could evolve into one. Even in its
basic, neutered state, people were starting to shift their behaviors from searching on Google to
asking ChatGPT the same questions. Part of that was the speed with which it could bring
back answers. Part of that was the interface. Part of that was the modality of feeling like you could
ask a question in natural language and have the answer come back in natural language.
But of course, ChatGPT was not going to sit still. And in the last couple weeks, we've had
a number of major, major updates. One is that browse-the-internet features are now officially
an option for how you can use ChatGPT. This was cemented last week when, at Microsoft Build,
it was revealed that all users, not just premium users, but all users would have access to the
Browse with Bing version of ChatGPT. Now, all of a sudden, people didn't need to limit their
queries just to what happened before 2021, but could get up-to-date information as well,
thanks to these browse features. However, the second big update is something called plugins.
Plugins effectively extend the ChatGPT experience into new realms. They allow people to access
specific data sets, such as public equities data, crypto market data, academic research papers,
specific PDFs, and more from that ChatGPT interface. And there is, of course, a ton of
promise. Who wouldn't want all of this additional functionality embedded into this tool that was
becoming such a growing and important part of so many of our lives? However, for as many of those
breathless threads as there have been on Twitter, it didn't take long for some people to realize
that plugins had some fairly significant issues. Wharton professor Ethan Mollick wrote,
here's an example of why plugins are overhyped for now. Wolfram is capable of amazing things,
but GPT fails to use it successfully most of the time. The other plugins are incredibly limited in
what they can do and fail often. More recently, Ethan again wrote,
Today, ChatGPT struggles, as it does with plugins, to use the web well.
Others have pointed out that there are real serious UI issues. We'll get into this in a moment,
but there's no easy way to search for them based on what you're looking for. Instead,
you have to click through page after page after page of plugins. Mashable recently wrote an
article called five ChatGPT plugins that do what they promise. Amongst the sea of duds,
there are some good and funny plugins that work as advertised. They write, OpenAI has
finally given ChatGPT the eyes and ears necessary to truly take advantage of its premier generative
AI-based chatbot. The new feature is still in beta, and it shows, with some plugins that are
unable to do what they were built to do in the first place. So here are what I think are the three
biggest problems with ChatGPT plugins right now. The first is a really simple one. They're only
available for Plus users. That means you have to shell out $20 a month just to get access to any of
these tools. Now, this one I'm not overly worried about because you have to imagine it's going to change
over time, but for right now it remains a barrier. Number two,
the search experience of trying to find the plugin that you actually need is absolutely terrible right now.
So here's how you would go about it. You would toggle up to GPT-4 and scroll down to Plugins Beta.
From there, you would see which plugins you had enabled, or you could scroll all the way down and go to the plugin store.
And this is where the experience really just gets subpar.
Plugins are shown in these sets of eight, and you can toggle between popular, new, all, or installed.
Each plugin has a tiny little description of less than 20 words, and you can't click to learn any more.
You just have to install them.
Now, if you have a particular use case, like you're trying to find a plugin that would allow you to read directly from PDFs
and ask ChatGPT to summarize a PDF that you wanted to upload, there's no way to search for that functionality.
You can't type PDF in some search.
Instead, you have to click through set of eight by set of eight by set of eight, looking hopefully for the plugin that would actually solve your problem.
This is obviously a hugely, hugely suboptimal way to do plugin discovery.
The third problem is the experience of actually using the plugins.
Once you've installed the plugins that you want, you have to manually select up to three
that you can enable at any given time, which kind of means that you have to know or at least
guess at how to use them in advance of actually using them.
I found over and over that it's very easy to install a plugin and hope that it's going to
work, only to have ChatGPT not recognize it or for you to have to go do some other
setup somewhere else.
For me, however, the biggest thing comes back to this idea of interface, and it's why I took so much time on the interface to begin with.
The fourth problem isn't so much a problem as it is an open question.
The ChatGPT bet, and what plugins enable, is this idea that this sort of chat interface becomes one of the default interfaces for interacting with the internet.
There are some categories of experience or query or question where that makes obvious sense to me.
For example, the plugin that I've found myself using the most by far is a plugin called XPapers.
It allows ChatGPT to pull from and read papers that are in the arXiv.org data set.
And given how much of the frontier of AI and these related technologies lives in these research papers,
giving ChatGPT the ability to read and summarize those papers directly is hugely valuable for me as a content creator.
This is a type of experience and information query that fits perfectly within this chat-based paradigm.
The question is whether everything else will. Do people, for example, as we're watching here, want to search for jobs in the ChatGPT interface? Or are they inevitably going to go to a jobs website and browse because it's hard for them to know exactly what might spark interest when they're looking in that way? Are people going to dump their shopping lists into ChatGPT and have it plug into Instacart? Or again, are they going to want to browse Instacart directly to go find things that they're looking for? Are people going to want to build apps from this type of chat-based interface? Or are there other
types of developer environments that are going to be better suited to it? The answer for all of those
could be yes. It could be that we discover that chat-based interfaces are just the default way that we interact
with the internet going forward. But it would be unsurprising to me if we discover that there are
ultimately certain types of experiences that fit well within this and certain types of experiences that do not.
You're already seeing other startups, even in the chatbot space, even using GPT models,
try to figure out different types of experiences for different types of queries. Perplexity, for example,
is one that I've been absolutely loving for research use cases. I recently asked its Copilot version,
what are important terms for a podcast sponsorship contract as a way to test it out. It showed the
steps it was taking, including which articles it was pulling from and considering, and then came up
with a recommendation or a response that had actual sourcing so I could go dig into any particular
piece or see where it was pulling that information from. From there, I was able to ask a follow-up,
could you please write a sample contract, including all of the above? Perhaps an even better example
is how Google is starting to integrate generative AI into its Google search,
effectively creating something that is a hybrid between the traditional Google search experience
that people know and are used to with this type of chatbot experience.
This was at the very center of their Google I/O
conference a couple weeks ago, and people are starting to get access to what they're calling
their Search Generative Experience, which integrates the type of shopping features
that wouldn't make sense in a ChatGPT context right now, at least as it's currently designed.
And so we come back to the question of whether ChatGPT
plugins are overhyped. As is so often the case, the answer is yes and no. The yes part of the overhyped
answer is that there are big blaring, glaring issues with plugins that make them far, far from
their most effective. They're only for paid users. They're impossible to search for. They have very
limited information to describe themselves when you do find them. You kind of have to know how to use
them and query them. Some of them require you to set things up off-site, such as the PDF services,
and some of them just flat out don't work for the thing that you expected them to do.
At the same time, others are like literal magic and absolutely extend the ChatGPT experience
that has seemed so transformative into new realms and domains.
When they are designed well and when they are used well, these plugins can make ChatGPT
even more performant.
And I think that in many ways, the only thing that they really suffer from right now is perhaps
mistaken expectations about how evolved they are.
They are still very, very much a beta feature.
OpenAI just made the decision that it was better to let people try these things out,
even if they didn't have good tools for searching for exactly the ones that they wanted.
And I don't know many people who are excited about ChatGPT who would have preferred to wait.
I told you about how I used XPapers, and in the next couple days,
I will do another video about the ChatGPT plugins that I'm actually finding useful.
But in the meantime, I'd encourage us, one, to contextualize this in terms of where this feature is
relative to ChatGPT's evolution,
two, to understand what it might mean for the long term of interfaces on the internet,
and three, to remember that ultimately hype threads are exactly that.
They get engagement by convincing us that everything is wonderful and that we're missing out,
not by having nuanced opinions.
Anyways, guys, that is it for today's AI Breakdown.
Hopefully this was interesting or useful, a little bit of internet history in there.
If you're enjoying the AI Breakdown, please like, subscribe, and share.
Check out the newsletter version and the podcast.
And until next time, peace.
