The AI Daily Brief: Artificial Intelligence News and Analysis - How Big A Deal Is GPT-3.5 Fine-Tuning From OpenAI?

Starting point is 00:00:00 Today on the AI breakdown, we're looking at OpenAI's new fine-tuning for GPT3.5 Turbo. Before that on the brief, a new translation model from meta and a new approach to AI agents. The AI breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our newsletter, our YouTube, and our Discord. Welcome back to the AI breakdown brief, all the AI headline news you need in around five minutes. One of the great promises of artificial intelligence is the ability to break down linguistic barriers. There is no reason in a world of advanced artificial intelligence that people should be divided because they can't understand each other. However, of course,

Starting point is 00:00:44 to bring that future into reality, people have to go out and build the models. Yesterday, Meta announced Seameless M4T, which is a multimodal AI model for speech and text translations that works across nearly 100 languages. So what's really interesting about this model is that it is multimodal and multilingual. It was trained on not only speech data, it was trained on both text and audio. And because of that, it has speech recognition for nearly 100 languages, speech to text translation for those languages, speech to speech translation, text to text translation, and text to speech translation. Now, in addition to the model itself being open sourced under a research license, meta is also releasing the dataset that it was trained on called Seameless Align,

Starting point is 00:01:27 which they say is the biggest open multimodal translation data set to date that includes 270,000 hours of mind speech and text alignments. Now, this is not metas first entrance into this space. Last year, they released a model called No Language Left Behind, which was a text-to-text machine translation model supporting 200 languages, NLLB has since become integrated into Wikipedia as one of the translation providers. And then earlier this year, they also unveiled their massively multilingual speech, which is a speech recognition model, for more than 1,100 languages. TechCrunch described the training in somewhat more detail. They said, Researchers aligned 443,000 hours of speech with texts and created 29,000 hours of speech to

Starting point is 00:02:07 speech alignments, which taught Seamless M4T how to transcribe speech to text, translate text, generate speech from text, and even translate words spoken in one language into words in another language. Meta claims that on internal benchmarks, Seamless M4T performed better against background noises and speaker variations in speech-to-text tasks compared to the current state-of-the-art speech transcription model. And Meta basically thinks that's because of the rich combination of speech and text data that is combined in the training dataset. Now, one of the big concerns with any AI model is the extent to which it brings to bear or even amplifies the biases of the data set that it was trained upon. Meta has found some of this. For example, in a white paper that they

Starting point is 00:02:46 published alongside the announcement, Meda said that the model, quote, overgeneralizes to masculine forms when translating from neutral terms. And what's more, in the absence of gender information, it translates the masculine form about 10% of the time, which they believe might be due to a, quote, overrepresentation of masculine lexica in the training data. Now, overall, they argue that Seamless M4T doesn't add a, quote, outsized amount of toxic text in its translations, but some languages seem more of a problem with that, which led meta to create a filter for toxicity in the public demo. Now, the bigger issue, TechCrunch says, is the, quote, loss of lexical richness which can result from AI translators being overused. They write, unlike AI, human interpreters make choices unique to them when

Starting point is 00:03:26 translating one language into another. They might explicate, normalize, or condense, and summarize, creating fingerprints known informally as translation ease. AI systems might generate more, quote-unquote, accurate translations, but those translations could be coming at the expense of translation variety and diversity. That's probably why meta advises against using seamless M4T for long-form translation and certified translations, like those recognized by government agencies and translation authorities. Meta also discourages deploying seamless M4T for medical or legal purposes, where a mistranslation has a much higher cost. In fact, it can be life or death. Anyways, I think this is an extremely valuable and potent area of AI research, and I'm really glad to see more efforts focused

Starting point is 00:04:04 here. Next up, we move over to markets for just a moment. Anyone who has been watching the stock market this year knows that there has been this weird dichotomy between, on the one hand, an incredibly bad macro drop, right? We've had banking crises in the U.S., continued interest rate hikes from the Federal Reserve, and recently China has gone into outright deflation, and yet the stock market keeps chugging along largely on the back of big tech and more specifically AI companies. Now the rally is starting to look shaky, and it seems to many that the market has run out of steam, including potentially the AI stocks. For that reason, Nvidia's earnings report, which comes out after the bell closes today, is being viewed with much broader anticipation than it might otherwise have been. Now, people are

Starting point is 00:04:45 expecting good things from that report. InVedia was up 2% on the day, which puts it up 14% on the week and over 300% on the year, said JJ Kinnahan, the CEO of IG North America. It's not often that the fate of the market rests in the hand of just one stock, but it very much feels like that is what's going on at the moment. One other company that outperforms slightly was China's by-do. They beat quarterly revenue estimates leading to a small bump in their stock price, and markets also seem to respond well to their CEO and co-founder talking about how much of their emphasis was on generative AI going forward. Interestingly, though, as Reuters points out, the company is still waiting for Chinese

Starting point is 00:05:18 regulators' approval for a large-scale rollout of its chat GPT like ErnieBot to the public. Now, Baidu's CEO says that he does see authorities as becoming more supportive of generative AI. There's clearly a very different sense of nervousness when it comes to releasing those models to the public. Lastly, I want to close on an area that has been one of the hot-button topics in the AI space this year, which is, of course, autonomous AI agents. Around about April, there was so much buzz around baby AGI and AutoGPT. And really the idea of these things, what captured people's imaginations, was the idea that all you had to do was articulate an end goal.

Starting point is 00:05:55 And then agents would figure out what steps they needed to take to do it and then go take those steps, including potentially spinning up other AIs to actually get those steps done. Now, in practice, things haven't really worked exactly that way. And so I thought it was notable when Emberra, a startup in the space, wrote a tweet thread about how they're evolving. Ember founder Zach Tratter writes, Ember was one of the first AI agent's startups. Today we are renaming AI agents to AI commands and narrowing our focus away from autonomous agents. While autonomous agents took off in popularity, we found they were often unreliable for work, inefficient, and unsafe. Zach continues, first off, a primer on autonomous

Starting point is 00:06:33 agents. Give them a task, give them memory, give them access to tools. They decide on the fly how to accomplish your task and remove you from this process because it's autonomous. This has some problems. Failure rate is high. Even when they're successful, they're often costly and inefficient. Full autonomy removes humans from being in control. Agentic controls are weak. Security concerns are very real. Autonomy and access to your private data equals a challenge. As we've seen with chat GPT, the way machine learning experts create safety is via reinforcement learning. But regular end users, even engineers, cannot assess safety based on model weights. And prompt injection remains a large attack factor. We all need basic functional controls. Emberra exists to help professionals accomplish. work, not create AGI. By design, Emberra is a professional AI assistant where you stay in control.

Starting point is 00:07:18 You can always know what it will do, why, and how. I believe AI agents make sense for applications where each agent combines long-term memory and high autonomy. The work by NVIDIA and A16Z on AI-town-like environments produces interesting behavior, though its practical utility for professionals appears low. We care about making humans as great as they can be. When you work, your personality and humanity should be core, not diminished by NIs. You represent yourself still, but you're able to become a superpowered version of yourself. You are not a black box. Salespeople care about closing deals and crafting outbound. Support teams care about accessing knowledge, and managers care about performance and people. Professional should be the ones making the important decisions

Starting point is 00:07:54 in steering AI, not letting it run loose and hoping for the best. Now, I think this is a super smart shift from the perspective of them as a startup. I think honing in on the fact that AI can be really good at tasks, but has a hard time with overall solving end-to-end problems and sequencing, is a good insight to build around when it comes to bringing something to the market that's actually useful for professionals right now. I think that this is an area where there's a little bit of a mismatch between, on the one hand, the developer's dream and the excitement and enticement of something that really was a true practical autonomous agent, and the reality of what most people might need or actually adopt, which is something that kind of just lets them scale their own labor. Now, I do think that what

Starting point is 00:08:31 people build in startups has a big impact on how markets evolve. And so seeing autonomous agent start-start-to-shift to this more task or, as they put it, command-focused model, could have interesting implications for how the AI agent space develops in the months to come. Anyways, thought it was an interesting little footnote, something to keep in mind. But that is going to do it for today's brief. Appreciate you listening as always, and I'll be back soon with the main AI breakdown. Before we get into the main AI breakdown, I want to tell you about today's sponsor, Supermanage.

Starting point is 00:09:00 If you work in a professional setting, you probably have some version of a one-on-one meeting, either with the people that work for you or the people that you work with. Unfortunately, all too often, those one-on-one meetings become glorified catch-up calls. Don't you wish you could jump right to the stuff that really matters? That's where SuperManage comes in. Supermanage AI magically distills your team's public Slack channels into a real-time brief on any employee, any time. Catch up on contributions, work in progress, challenges they're facing, sentiment, everything you need to show up ready for a truly meaningful conversation.

Starting point is 00:09:33 And it's completely free. Visit supermanage.AI forward slash breakdown today to start making the most of your one-on-ones. And thanks again to Supermanage for sponsoring the AI breakdown. Welcome back to the AI breakdown. Today we are talking about the latest update from OpenAI and putting it in the context of the larger evolution of the AI space, particularly as it relates to enterprise companies and the decisions they're making about how to implement artificial intelligence models. First, the specific update. OpenAI has now released fine-tuning for GBT 3.5 Turbo. Now, fine-tuning is basically exactly what it sounds like. It means taking a pre-trained

Starting point is 00:10:13 model, in this case, GPT3.5 Turbo, and adjusting the pre-trained model's parameters slightly to make it perform better on a specific task. In this instance, the use case that OpenAI is imagining is companies and developers bringing their own data to the table to better customize GBT3.5 Turbo for their specific needs. The company writes, This update gives developers the ability to customize models that perform better for their use cases and run these custom models at scale. Early tests have shown a fine-tuned version of GPT3.5 Turbo can match or even outperform based GPT4 level capabilities on certain narrow tasks. As with all our APIs, data sent in and out of the fine-tuning API is owned by the customer and is not used by OpenAI or any other organization to train other models.

Starting point is 00:10:58 Now, we'll come back to that question around privacy and data integrity in just a moment. So what are some of the use cases? The announcement post gives a few. They write, in our private beta, fine-tuning customers have been able to make meaningful improvement model performance across common use cases such as improved steerability. Fine-tuning allows businesses to make the model follow instructions better, such as making outputs terse or always responding in a given language. For instance, developers can use fine-tuning to ensure that the model always responds in German when prompted to use that language. A second common use case is reliable output formatting.

Starting point is 00:11:32 Fine-tuning improves the model's ability to consistently format responses. A crucial aspect for applications demanding a specific response format such as code completion or composing API calls. A developer can use fine-tuning to more reliably convert user prompts into high-quality JSON snippets that can be used with their own systems. A third common use case is custom tone. Fine-tuning they write is a great way to hone the qualitative feel of the model output such as its tone, so it better fits the voice of businesses' brands.

Starting point is 00:11:59 Now, the other things that OpenAI points out is that fine-tuning can enable businesses to shorten their prompts while having similar performance. This, as we'll see in a moment, becomes really important given the cost. And also, they've increased the number of tokens that fine-tuning with GBT 3.5 Turbo can handle. It's up to 4K tokens, which was double their previous fine-tuned models. OpenAI writes that early testers have reduced prompt size by up to 90% by fine-tuning instructions into the model itself, speeding up each API call and cutting costs. So obviously what this is about ultimately is businesses and developers better being able to use this model to get outcomes that are specific to their business and that rely on their specific data.

Starting point is 00:12:36 The optimistic view of this is that it's hugely significant and unlocks a huge variety of use cases that weren't available before. Invidia's Dr. Jim Fan writes, OpenAI's most significant product update since the App Store, GPT3.5, fine-tuning API. This will be the largest Laura as a service ever. Laura, in this case, stands for low-rank adaptation. And as Jim concludes, quote, I'm expecting a barrage of new applications from all walks of life out there. So this is great, right?

Starting point is 00:13:05 OpenAI is giving businesses and developers the ability to do more with their tools. It's one of the leading platforms to build on. Seems likely to be a big success, right? Well, there are some concerns. Cambridge AI postdoc Dr. Ahmed Zaidi says, fine-tuning is essential for broader AI application adoption, but I think OpenAI got this wrong. And Dr. Offmed then points to two different reasons. The first, he says, is data privacy. He argues no company will feel comfortable uploading their data to fine-tune the

Starting point is 00:13:32 model. The second is ROI versus cost. Companies and individuals are paying for training and inference while using their own data seems like a bad deal. All upside for OpenAI, but not necessarily for companies. Now, on this first point, the privacy point, OpenAI's developer relations lead Logan actually responded. He wrote, Many of the world's leading and most sophisticated companies trust open AI and use our services. Security and privacy are critical to us. Dr. Ahmed responded, I hear you and I genuinely hope I'm wrong. Building language models for over a decade, what chat GPT has done for language model streetcred is great. However, it does seem more in a different world now in terms of data and

Starting point is 00:14:08 ownership at an enterprise level. Ultimately, the market will tell us if this strategy works or not. And this is, I think, the really interesting thing that this brings up. One of the big business questions in this space is to what extent enterprises and big companies will work with third-party platforms such as OpenAI or whether they'll spin up solutions on their own. To the extent that the pattern of the past holds true, companies will likely try to spin up their own solutions and then ultimately shift over to the winners in the third-party space that the market agrees and decides to trust. This is sort of the position of Antonosica who writes, Prediction. Fine tuning is going to be huge. Most companies training their own foundation models right now

Starting point is 00:14:48 will regret it, since fine tuning APIs will become so much better. In the same way that companies that hire tons of deep learning engineers for computer vision models, et cetera, before they had the data maturity for it, regretted not just waiting for cloud APIs. So basically the argument here is, there's going to be a real temptation for enterprises to spin up their own models using their own resources. This is something we've been following closely, right? The proliferation of advanced open source models has made the calculus around this feel just a little bit different. It's had impact on the startup sector, which hasn't necessarily had the uptake from enterprises that they assume they would, and it obviously has big implications for how the industry develops. One of the interesting

Starting point is 00:15:24 patterns that sort of reflects this as well is that the big enterprise software providers are increasingly offering models that are something of a hybrid, where instead of just pushing one proprietary model like OpenAI's GPT, they're creating a sandbox or environment in which enterprises can customize any of the various models for their specific needs. This is basically Amazon's approach with Bedrock. And more recently, Microsoft has been giving signals that, despite their huge investment in OpenAI, they might be doing something similar, partnering with companies like Databricks to help companies customize their own solutions. The other factor in this is, of course, cost. This is a lot more expensive. That was the second part of Dr. Ahmed's tweet, and that's been a big part of the discussion as

Starting point is 00:16:04 well. Wireless Anon writes, GP2 fine-tuning pricing is pretty bad, might push even more people to open-source models. Now, Logan's response to that was saying, the upside is being able to create a better product experience for your users. Companies invest in this all the time. Fine-tuning is no different. Basically, OpenAI's argument is that even if it's expensive, it's worth it. But of course, companies aren't going to be looking at these prices in a vacuum. They're going to be comparing it to other approaches to the problem, including adapting open source models. Right now, we just don't know. Like I said, I think if patterns hold true, enterprises are likely to try a lot of custom spun-up solutions and then shift over to whatever third parties end up being dominant,

Starting point is 00:16:42 just because that's the natural pattern of software. Is it possible, however, that the concerns around data privacy in this case are so high that the risk of leaks is too great ultimately for many to choose a third-party service? It does seem possible. Now, one more perhaps related, perhaps unrelated note on the open source front, the information is reporting that Salesforce is leading the financing of HuggingFace at more than a $4 billion valuation. HuggingFace basically helps companies store and use AI software across a huge variety of open

Starting point is 00:17:11 source models. The information writes, HuggingFace is on pace to generate more than $30 million in revenue annually. Amazon Web Services, Microsoft IBM, and others pay HuggingFace for steering its users to their respective cloud computing services. Founded in 2016, HuggingFace also charges developers for an enterprise version of its repository for machine learning models. The startup, which has more than 200 employees, says more than 10,000 companies use its

Starting point is 00:17:33 free or paid repositories. It hosts at least several hundred thousand AI models, including popular open source large language models such as meta platforms Lama 2. The funding by Salesforce suggests it may view Hugging Face as a potential future acquisition. While Salesforce is best known for software used by sales professionals and for the Slack chat app, it also sells an array of services for software developers. Now, while it may seem like some of this stuff is a little bit in the weeds, I actually think that business model conversations have a huge and deterministic impact on how the AI industry is going

Starting point is 00:18:03 to evolve. If every enterprise in the world prioritizes privacy to the highest degree and only wants to customize their own models, the AI space is going to look very different than if they all choose to work with third parties like OpenAI. It'll have ramifications on how startups build, what startups build, and ultimately ramifications on the tools we use. But for now, in this stage where it's all about competition and seeing what sticks, it's great to have this new capacity of GPT 3.5 turbo fine tuning available, and I'm excited to see what companies build next with it. That's going to do it for today's AI breakdown. If you're enjoying, please like, subscribe, and share. And until next time, peace.

The AI Daily Brief: Artificial Intelligence News and Analysis - How Big A Deal Is GPT-3.5 Fine-Tuning From OpenAI?

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.