The AI Daily Brief: Artificial Intelligence News and Analysis - How Big A Deal Is GPT-3.5 Fine-Tuning From OpenAI?
Episode Date: August 24, 2023OpenAI has released finetuning for GPT-3.5 Turbo. It's expensive, and for some comes with privacy concerns, but also opens up many new applications. Before that on The Brief: Meta's new AI translation... model, Nvidia earnings hopes and a new approach to AI agents. Today's Sponsor: Supermanage - AI for 1-on-1's - https://supermanage.ai/breakdown ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Discussion (0)
Today on the AI breakdown, we're looking at OpenAI's new fine-tuning for GPT3.5 Turbo.
Before that on the brief, a new translation model from meta and a new approach to AI agents.
The AI breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to Breakdown.network for more information about our newsletter, our YouTube, and our Discord.
Welcome back to the AI breakdown brief, all the AI headline news you need in around five minutes.
One of the great promises of artificial intelligence is the ability to
break down linguistic barriers. There is no reason in a world of advanced artificial intelligence
that people should be divided because they can't understand each other. However, of course,
to bring that future into reality, people have to go out and build the models. Yesterday,
Meta announced Seameless M4T, which is a multimodal AI model for speech and text translations
that works across nearly 100 languages. So what's really interesting about this model is that it
is multimodal and multilingual. It was trained on not only speech data, it was trained on both
text and audio. And because of that, it has speech recognition for nearly 100 languages,
speech to text translation for those languages, speech to speech translation, text to text translation,
and text to speech translation. Now, in addition to the model itself being open sourced under
a research license, meta is also releasing the dataset that it was trained on called Seameless Align,
which they say is the biggest open multimodal translation data set to date that includes 270,000 hours of mind speech and text alignments.
Now, this is not metas first entrance into this space.
Last year, they released a model called No Language Left Behind, which was a text-to-text machine translation model supporting 200 languages,
NLLB has since become integrated into Wikipedia as one of the translation providers.
And then earlier this year, they also unveiled their massively multilingual speech, which is a speech recognition model,
for more than 1,100 languages. TechCrunch described the training in somewhat more detail.
They said,
Researchers aligned 443,000 hours of speech with texts and created 29,000 hours of speech to
speech alignments, which taught Seamless M4T how to transcribe speech to text, translate text,
generate speech from text, and even translate words spoken in one language into words in another
language. Meta claims that on internal benchmarks, Seamless M4T performed better against background
noises and speaker variations in speech-to-text tasks compared to the current state-of-the-art speech
transcription model. And Meta basically thinks that's because of the rich combination of speech
and text data that is combined in the training dataset. Now, one of the big concerns with any
AI model is the extent to which it brings to bear or even amplifies the biases of the data
set that it was trained upon. Meta has found some of this. For example, in a white paper that they
published alongside the announcement, Meda said that the model, quote, overgeneralizes to masculine
forms when translating from neutral terms. And what's more, in the absence of gender information,
it translates the masculine form about 10% of the time, which they believe might be due to a, quote,
overrepresentation of masculine lexica in the training data. Now, overall, they argue that Seamless M4T
doesn't add a, quote, outsized amount of toxic text in its translations, but some languages
seem more of a problem with that, which led meta to create a filter for toxicity in the public demo.
Now, the bigger issue, TechCrunch says, is the, quote, loss of lexical richness which can result from
AI translators being overused. They write, unlike AI, human interpreters make choices unique to them when
translating one language into another. They might explicate, normalize, or condense, and summarize,
creating fingerprints known informally as translation ease. AI systems might generate more, quote-unquote,
accurate translations, but those translations could be coming at the expense of translation
variety and diversity. That's probably why meta advises against using seamless M4T for long-form
translation and certified translations, like those recognized by government agencies and translation
authorities. Meta also discourages deploying seamless M4T for medical or legal purposes, where a
mistranslation has a much higher cost. In fact, it can be life or death. Anyways, I think this is an
extremely valuable and potent area of AI research, and I'm really glad to see more efforts focused
here. Next up, we move over to markets for just a moment. Anyone who has been watching the stock market
this year knows that there has been this weird dichotomy between, on the one hand, an incredibly bad macro
drop, right? We've had banking crises in the U.S., continued interest rate hikes from the Federal
Reserve, and recently China has gone into outright deflation, and yet the stock market keeps chugging
along largely on the back of big tech and more specifically AI companies. Now the rally is
starting to look shaky, and it seems to many that the market has run out of steam, including
potentially the AI stocks. For that reason, Nvidia's earnings report, which comes out after the bell
closes today, is being viewed with much broader anticipation than it might otherwise have been. Now, people are
expecting good things from that report. InVedia was up 2% on the day, which puts it up 14% on the week
and over 300% on the year, said JJ Kinnahan, the CEO of IG North America. It's not often that
the fate of the market rests in the hand of just one stock, but it very much feels like that
is what's going on at the moment. One other company that outperforms slightly was China's
by-do. They beat quarterly revenue estimates leading to a small bump in their stock price,
and markets also seem to respond well to their CEO and co-founder talking about how much of
their emphasis was on generative AI going forward.
Interestingly, though, as Reuters points out, the company is still waiting for Chinese
regulators' approval for a large-scale rollout of its chat GPT like ErnieBot to the public.
Now, Baidu's CEO says that he does see authorities as becoming more supportive of generative
AI. There's clearly a very different sense of nervousness when it comes to releasing those
models to the public. Lastly, I want to close on an area that has been one of the hot-button
topics in the AI space this year, which is, of course, autonomous AI agents.
Around about April, there was so much buzz around baby AGI and AutoGPT.
And really the idea of these things, what captured people's imaginations, was the idea that
all you had to do was articulate an end goal.
And then agents would figure out what steps they needed to take to do it and then go take
those steps, including potentially spinning up other AIs to actually get those steps done.
Now, in practice, things haven't really worked exactly that way.
And so I thought it was notable when Emberra, a startup in the space, wrote a tweet thread about
how they're evolving. Ember founder Zach Tratter writes, Ember was one of the first AI agent's
startups. Today we are renaming AI agents to AI commands and narrowing our focus away from
autonomous agents. While autonomous agents took off in popularity, we found they were often
unreliable for work, inefficient, and unsafe. Zach continues, first off, a primer on autonomous
agents. Give them a task, give them memory, give them access to tools. They decide on the fly
how to accomplish your task and remove you from this process because it's autonomous.
This has some problems. Failure rate is high. Even when they're successful, they're often costly and inefficient.
Full autonomy removes humans from being in control. Agentic controls are weak. Security concerns are very real.
Autonomy and access to your private data equals a challenge. As we've seen with chat GPT, the way machine learning experts create safety is via reinforcement learning.
But regular end users, even engineers, cannot assess safety based on model weights. And prompt injection remains a large attack factor.
We all need basic functional controls. Emberra exists to help professionals accomplish.
work, not create AGI. By design, Emberra is a professional AI assistant where you stay in control.
You can always know what it will do, why, and how. I believe AI agents make sense for applications
where each agent combines long-term memory and high autonomy. The work by NVIDIA and A16Z on AI-town-like
environments produces interesting behavior, though its practical utility for professionals appears
low. We care about making humans as great as they can be. When you work, your personality
and humanity should be core, not diminished by NIs. You represent yourself still, but you're able to
become a superpowered version of yourself. You are not a black box. Salespeople care about
closing deals and crafting outbound. Support teams care about accessing knowledge, and managers
care about performance and people. Professional should be the ones making the important decisions
in steering AI, not letting it run loose and hoping for the best. Now, I think this is a super
smart shift from the perspective of them as a startup. I think honing in on the fact that AI can
be really good at tasks, but has a hard time with overall solving end-to-end problems and sequencing,
is a good insight to build around when it comes to bringing something to the market that's actually
useful for professionals right now. I think that this is an area where there's a little bit of a mismatch
between, on the one hand, the developer's dream and the excitement and enticement of something that
really was a true practical autonomous agent, and the reality of what most people might need or actually
adopt, which is something that kind of just lets them scale their own labor. Now, I do think that what
people build in startups has a big impact on how markets evolve. And so seeing autonomous agent
start-start-to-shift to this more task or, as they put it, command-focused model,
could have interesting implications for how the AI agent space develops in the months to come.
Anyways, thought it was an interesting little footnote, something to keep in mind.
But that is going to do it for today's brief.
Appreciate you listening as always, and I'll be back soon with the main AI breakdown.
Before we get into the main AI breakdown, I want to tell you about today's sponsor,
Supermanage.
If you work in a professional setting, you probably have some version of a one-on-one meeting,
either with the people that work for you or the people that you work with.
Unfortunately, all too often, those one-on-one meetings become glorified catch-up calls.
Don't you wish you could jump right to the stuff that really matters?
That's where SuperManage comes in.
Supermanage AI magically distills your team's public Slack channels into a real-time brief on any employee, any time.
Catch up on contributions, work in progress, challenges they're facing, sentiment,
everything you need to show up ready for a truly meaningful conversation.
And it's completely free.
Visit supermanage.AI forward slash breakdown today to start making the most of your one-on-ones.
And thanks again to Supermanage for sponsoring the AI breakdown.
Welcome back to the AI breakdown.
Today we are talking about the latest update from OpenAI and putting it in the context of the larger evolution of the AI space,
particularly as it relates to enterprise companies and the decisions they're making about how to implement artificial intelligence models.
First, the specific update. OpenAI has now released fine-tuning for GBT 3.5 Turbo.
Now, fine-tuning is basically exactly what it sounds like. It means taking a pre-trained
model, in this case, GPT3.5 Turbo, and adjusting the pre-trained model's parameters
slightly to make it perform better on a specific task. In this instance, the use case that
OpenAI is imagining is companies and developers bringing their own data to the table
to better customize GBT3.5 Turbo for their specific needs.
The company writes,
This update gives developers the ability to customize models that perform better for their use cases and run these custom models at scale.
Early tests have shown a fine-tuned version of GPT3.5 Turbo can match or even outperform based GPT4 level capabilities on certain narrow tasks.
As with all our APIs, data sent in and out of the fine-tuning API is owned by the customer and is not used by OpenAI or any other organization to train other models.
Now, we'll come back to that question around privacy and data integrity in just a moment.
So what are some of the use cases? The announcement post gives a few. They write, in our private
beta, fine-tuning customers have been able to make meaningful improvement model performance across
common use cases such as improved steerability. Fine-tuning allows businesses to make the model
follow instructions better, such as making outputs terse or always responding in a given language.
For instance, developers can use fine-tuning to ensure that the model always responds in German
when prompted to use that language.
A second common use case is reliable output formatting.
Fine-tuning improves the model's ability to consistently format responses.
A crucial aspect for applications demanding a specific response format such as code
completion or composing API calls.
A developer can use fine-tuning to more reliably convert user prompts into high-quality
JSON snippets that can be used with their own systems.
A third common use case is custom tone.
Fine-tuning they write is a great way to hone the qualitative feel of the model output
such as its tone, so it better fits the voice of businesses' brands.
Now, the other things that OpenAI points out is that fine-tuning can enable businesses to shorten
their prompts while having similar performance. This, as we'll see in a moment, becomes really
important given the cost. And also, they've increased the number of tokens that fine-tuning
with GBT 3.5 Turbo can handle. It's up to 4K tokens, which was double their previous fine-tuned
models. OpenAI writes that early testers have reduced prompt size by up to 90% by fine-tuning
instructions into the model itself, speeding up each API call and cutting costs. So obviously
what this is about ultimately is businesses and developers better being able to use this model
to get outcomes that are specific to their business and that rely on their specific data.
The optimistic view of this is that it's hugely significant and unlocks a huge variety
of use cases that weren't available before. Invidia's Dr. Jim Fan writes,
OpenAI's most significant product update since the App Store, GPT3.5, fine-tuning API.
This will be the largest Laura as a service ever.
Laura, in this case, stands for low-rank adaptation.
And as Jim concludes, quote,
I'm expecting a barrage of new applications from all walks of life out there.
So this is great, right?
OpenAI is giving businesses and developers the ability to do more with their tools.
It's one of the leading platforms to build on.
Seems likely to be a big success, right?
Well, there are some concerns.
Cambridge AI postdoc Dr. Ahmed Zaidi says,
fine-tuning is essential for broader AI application adoption,
but I think OpenAI got this wrong. And Dr. Offmed then points to two different reasons. The first,
he says, is data privacy. He argues no company will feel comfortable uploading their data to fine-tune the
model. The second is ROI versus cost. Companies and individuals are paying for training and inference
while using their own data seems like a bad deal. All upside for OpenAI, but not necessarily for
companies. Now, on this first point, the privacy point, OpenAI's developer relations lead Logan
actually responded. He wrote,
Many of the world's leading and most sophisticated companies trust open AI and use our services.
Security and privacy are critical to us. Dr. Ahmed responded, I hear you and I genuinely hope I'm
wrong. Building language models for over a decade, what chat GPT has done for language model
streetcred is great. However, it does seem more in a different world now in terms of data and
ownership at an enterprise level. Ultimately, the market will tell us if this strategy works or not.
And this is, I think, the really interesting thing that this brings up. One of the big
business questions in this space is to what extent enterprises and big companies will work with
third-party platforms such as OpenAI or whether they'll spin up solutions on their own. To the extent
that the pattern of the past holds true, companies will likely try to spin up their own solutions
and then ultimately shift over to the winners in the third-party space that the market agrees and
decides to trust. This is sort of the position of Antonosica who writes,
Prediction. Fine tuning is going to be huge. Most companies training their own foundation models right now
will regret it, since fine tuning APIs will become so much better. In the same way that companies that
hire tons of deep learning engineers for computer vision models, et cetera, before they had the data
maturity for it, regretted not just waiting for cloud APIs. So basically the argument here is,
there's going to be a real temptation for enterprises to spin up their own models using their own
resources. This is something we've been following closely, right? The proliferation of advanced open
source models has made the calculus around this feel just a little bit different. It's had impact on
the startup sector, which hasn't necessarily had the uptake from enterprises that they assume they would,
and it obviously has big implications for how the industry develops. One of the interesting
patterns that sort of reflects this as well is that the big enterprise software providers are
increasingly offering models that are something of a hybrid, where instead of just pushing one proprietary
model like OpenAI's GPT, they're creating a sandbox or environment in which enterprises can
customize any of the various models for their specific needs. This is basically Amazon's approach with
Bedrock. And more recently, Microsoft has been giving signals that, despite their huge investment in
OpenAI, they might be doing something similar, partnering with companies like Databricks to help
companies customize their own solutions. The other factor in this is, of course, cost. This is a lot more
expensive. That was the second part of Dr. Ahmed's tweet, and that's been a big part of the discussion as
well. Wireless Anon writes, GP2 fine-tuning pricing is pretty bad, might push even more people to
open-source models. Now, Logan's response to that was saying, the upside is being able to create a
better product experience for your users. Companies invest in this all the time. Fine-tuning is no different.
Basically, OpenAI's argument is that even if it's expensive, it's worth it. But of course,
companies aren't going to be looking at these prices in a vacuum. They're going to be comparing it
to other approaches to the problem, including adapting open source models. Right now, we just don't know.
Like I said, I think if patterns hold true, enterprises are likely to try a lot of custom
spun-up solutions and then shift over to whatever third parties end up being dominant,
just because that's the natural pattern of software.
Is it possible, however, that the concerns around data privacy in this case are so high
that the risk of leaks is too great ultimately for many to choose a third-party service?
It does seem possible.
Now, one more perhaps related, perhaps unrelated note on the open source front,
the information is reporting that Salesforce is leading the financing of HuggingFace at more than a
$4 billion valuation.
HuggingFace basically helps companies store and use AI software across a huge variety of open
source models.
The information writes,
HuggingFace is on pace to generate more than $30 million in revenue annually.
Amazon Web Services, Microsoft IBM, and others pay HuggingFace for steering its users to their
respective cloud computing services.
Founded in 2016, HuggingFace also charges developers for an enterprise version of its
repository for machine learning models.
The startup, which has more than 200 employees, says more than 10,000 companies use its
free or paid repositories.
It hosts at least several hundred thousand AI models, including popular open source large
language models such as meta platforms Lama 2.
The funding by Salesforce suggests it may view Hugging Face as a potential future acquisition.
While Salesforce is best known for software used by sales professionals and for the Slack chat
app, it also sells an array of services for software developers.
Now, while it may seem like some of this stuff is a little bit in the weeds, I actually think
that business model conversations have a huge and deterministic impact on how the AI industry is going
to evolve. If every enterprise in the world prioritizes privacy to the highest degree and only wants
to customize their own models, the AI space is going to look very different than if they all
choose to work with third parties like OpenAI. It'll have ramifications on how startups build,
what startups build, and ultimately ramifications on the tools we use. But for now, in this stage
where it's all about competition and seeing what sticks, it's great to have this new capacity of
GPT 3.5 turbo fine tuning available, and I'm excited to see what companies build next with it.
That's going to do it for today's AI breakdown. If you're enjoying, please like, subscribe,
and share. And until next time, peace.
