The AI Daily Brief: Artificial Intelligence News and Analysis - Google Bard Gets Major Update While OpenAI Races to Multimodal

Episode Date: September 19, 2023

Google Bard announced a big update including a deep integration with Google Workspace apps. At the same time, The Information reports that OpenAI is racing to beat Google to release a multimodal model... of GPT-4. Before that on the Brief, Microsoft AI researchers accidentally expose 38 terabytes of private data. TAKE OUR SURVEY ON EDUCATIONAL AND LEARNING RESOURCE CONTENT: https://bit.ly/aibreakdownsurvey ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI.  Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Breakdown, we're looking at Google Bard's big new announcements and updates. Before that on the Brief, a major data breach from Microsoft as it accidentally gives access to terabytes of private company and employee data through a misconfigured GitHub repository. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our Discord, our newsletter, and our YouTube channel. Welcome back to the AI Breakdown Brief. All the AI headline news you need in around five minutes. We start with a story that shows just how difficult data security really can be in this
Starting point is 00:00:39 new world. Yesterday, news broke that Microsoft AI researchers had accidentally exposed 38 terabytes of internal sensitive data, ranging from passwords to private keys to even backups of personal computers, when they accidentally uploaded it to a GitHub repository along with open source code. Now, the discovery was made by online security startup Wiz. So what actually happened? It sounds like a GitHub repository to which Microsoft had uploaded open source code and AI models for image recognition was configured incorrectly. Basically, to get the repository, they were given an Azure storage URL. However, as TechCrunch put it, Wiz found that the URL was configured to grant permissions on the entire storage account, exposing additional private data
Starting point is 00:01:22 by mistake. And it wasn't just a little bit of data. In fact, it was 38 terabytes of data. In addition to the passwords, personal computer backups, and private keys that were mentioned before, it also included more than 30,000 internal Microsoft Teams messages spread across hundreds of different Microsoft employees. Basically, this was a small misconfiguration, which allowed for full control rather than read-only permissions, and apparently had exposed this data going back all the way to 2020. While the story just broke yesterday, Wiz said that it actually shared its findings first with Microsoft all the way back on June 22nd, and within two days, Microsoft had removed the overly permissive shared access signature, or SAS token, in the URL. Now, according to a post-mortem blog post by Microsoft,
Starting point is 00:02:03 their security response center said that, quote, no customer data was exposed and no other internal services were put at risk because of this issue. What's more, Microsoft said that because of all of this, it expanded what's called GitHub's secret scanning service, which is a tool for monitoring public open source code uploaded to GitHub that might have accidentally exposed credentials or other secrets with this type of overly permissive token. So, as Wiz put it on Twitter, AI's potential is limitless, but data security is paramount. The incident serves as a reminder: as AI evolves, developers and researchers who rely on data sharing must prioritize securing sensitive information. Next up today, just a little conversation between the Israeli Prime Minister,
Starting point is 00:02:40 OpenAI President and co-founder Greg Brockman, AI safety expert and Professor Max Tegmark, and of course the owner of X, Elon Musk himself. Now, while the event was nominally billed as an AI roundtable, it very quickly became a conversation about anti-Semitism and censorship, which makes sense, given the battles that Elon Musk has recently been fighting with the American organization the Anti-Defamation League. Indeed, the conversation was ultimately so unfocused on artificial intelligence that basically no follow-up headline even made note of what they said. One thing that is making headlines is SoftBank's growing plans to get deeper into the artificial intelligence space. SoftBank is seen by many as one of the big progenitors of the boom, or
Starting point is 00:03:18 depending on who you ask, the bubble, in tech startups over the last decade. The company has been up and down, but in the wake of the Arm IPO, the most successful IPO in the last two years, it appears that SoftBank is eager to deploy tens of billions of dollars into the artificial intelligence space. The Financial Times reports that one of the targets that they have in mind is, in fact, ChatGPT creator OpenAI. According to FT sources, not only is SoftBank interested in an investment in OpenAI, but they're also potentially exploring a broader strategic partnership as well. That said, the same sources say that SoftBank is diligencing a significant number of substantial AI investments, including those in direct rivals of OpenAI. The sources said that they
Starting point is 00:03:56 had also made a preliminary approach to buy a UK-based AI chipmaker called Graphcore. According to the FT, Masayoshi Son, who said in June he was a heavy user of ChatGPT, has developed a close relationship with OpenAI's chief executive Sam Altman. Son has described Altman as, quote, one of the key people on Earth and said he speaks to him almost every day. Now, some viewed this report as a sign that if Goldman Sachs is right and we're not yet in an AI bubble, well, with SoftBank's help, a bubble might be coming right around the corner. Bubble or not, AI startups continue to command significant financing rounds, with the latest nine-figure venture investment going to Writer.
Starting point is 00:04:31 Writer has raised a $100 million Series B, although the company is not yet in that unicorn billion dollar valuation class. Writer CEO May Habib wrote, dominant design is emerging for enterprise-grade generative AI internal applications. It's an LLM closely coupled with knowledge graphs and AI guardrails, supporting multiple generative AI use cases. Turns out it's all the stuff after setting up an LLM that's the most work, and Writer solves those problems in a single integrated platform. High accuracy, plus fast time to value, plus ultimate security is what enterprises are looking for. In an email to TechCrunch,
Starting point is 00:05:04 Habib also said, many enterprises are still just scratching the surface on generative AI, mostly building internal company XGPT type applications. The harder, more impactful use cases require a lot more know-how on retrieval augmented generation, data gathering and cleaning, and workflow construction, and they're realizing that that's 90% of the work. That's the part that Writer makes much easier. Now, in addition to just being a sign of, A, how much venture capitalists still see open space for enterprise AI companies, and B, Writer's reflections on what the enterprise AI market is actually looking for and evolving towards, one other little detail I noticed from the TechCrunch reporting was that they wrote that Writer, quote, claims to have trained its fine-tunable
Starting point is 00:05:41 models on business writing that isn't copyrighted. A key point at a time when the copyright status of AI generated works in the U.S. remains somewhat nebulous. Lastly, today, we are about due for a full show on the implications and the integration of generative AI into the financial space. And on that note, Morgan Stanley has just announced the launch of what they call their AI at Morgan Stanley Assistant, a tool for financial advisors, which gives them access to a database of more than 100,000 research reports and documents. Now, this AI powered assistant for financial advisors was built on top of software from OpenAI, but also trained on this proprietary database of more than 100,000 documents. According to the firm, quote, it's just the first in a series of solutions based on generative AI planned by the bank.
Starting point is 00:06:22 The firm is piloting a tool called Debrief that automatically summarizes the content of client meetings and generates follow-up emails. You know, at some point soon, we're going to have to create an AI Breakdown bingo card, where on any given week, we can go through the news and see how many of the most common threads get hit. New rumor from Google or OpenAI over here. New industry that implements a custom-built LLM over there. And of course, enterprise AI software financing rounds everywhere. But for now, that is going to do it for today's episode of the AI Breakdown Brief. Next up is the main AI Breakdown.
Starting point is 00:06:55 Hey guys, one more quick note before we dive in. I so appreciate everyone who has taken the time to fill in our educational content survey. The TLDR, for those of you who haven't, is that we are considering launching a number of different types of AI educational content to better help you retrain, re-skill, and get to where you want to be when it comes to AI, including some really different ideas, such as an AI learning community that would exist on Discord or some other shared space. But I really need feedback to know what people actually need and what people most want. And if you are willing to take less than one minute to give me that feedback, I would be ever grateful. Just go to bit.ly slash AI breakdown survey.
Starting point is 00:07:32 And like I said, it'll take you less than a minute to fill that out. Thanks in advance, and let's get to today's show. Welcome back to the AI Breakdown. Today we are talking about the biggest conversation that we've had over the past month, which is, of course, Google versus OpenAI when it comes to the battle for artificial intelligence. But today we don't just get to talk in terms of new rumors or scoops from The Information, although we do have one of those, of course. Instead, we get to talk about actual feature updates. At 9:15 this morning, Google CEO Sundar Pichai tweeted, we're adding extensions to Bard, so you can connect it to your favorite Google apps, including Gmail, Drive, and Docs for even deeper collaboration. We're also updating how we validate the claims in Bard's responses with an improved Google It button and more. So let's dig into these updates and then go explore, A, what people are thinking about them, B, how they relate to news from OpenAI, and C, what it says for the state of the industry.
Starting point is 00:08:27 All right, so Google's post this morning: Bard can now connect to your Google apps and services, and that is the really big part of this. The company writes, one of the biggest benefits of Bard, an experiment to collaborate with generative AI, is that it can tailor its responses to exactly what you need. Today, they say, we're rolling out Bard's most capable model yet. And the biggest piece of that is Bard Extensions. With extensions, they write, Bard can find and show you relevant information from the Google tools you use every day,
Starting point is 00:08:52 like Gmail, Docs, Drive, Google Maps, YouTube, and Google Flights and hotels, even when the information you need is across multiple apps and services. Now, one of the examples they give is using Bard to help with the job application process. They write, you could ask Bard to, quote, find my resume titled June 2023 from my Drive and summarize it to a short paragraph personal statement. What I think is really interesting about this is a couple of things. First, in many ways, this is a personal version of the trend that we're seeing in the enterprise AI space as well, which is of course that if generation one of these tools blew people away just
Starting point is 00:09:25 with the incredible capacities they had and the ability to help with things that technology just hadn't been able to help with before, the next generation feels very, very much about how much more helpful it can be if it has access to one's personal and private information. As we've discussed frequently on this show, this positions companies that enterprises already trust really well. Because if you are some big enterprise that's already dealing with the implications of new technology, not having to switch vendors or service providers, and being able to just work with companies that you already trust with your data, is a huge benefit. Similarly, those who use Google's suite of tools tend to really use Google's suite of tools. As a for example,
Starting point is 00:10:05 at any given time, I have something like five Gmail accounts open. I actively use at least three different Drive accounts based on different podcasts and different projects. Whenever possible, I use Google Docs and other features that live in Drive as a way to organize and share information with outside parties. And so for someone like me, having Bard actually integrated with everywhere that I already keep my information around these businesses just seems like a very differentiated use case. Now that said, I'm not exactly sure what I'm going to use it for yet. Will it end up just being a more powerful search tool that actually can go across these different accounts? Or will there be new workflows that I experiment with that change how I do things now?
Starting point is 00:10:43 I guess these are sort of the questions that everyone's asking themselves about AI in general. Now, the other part of this announcement is all about Bard's veracity, and the ability for users to actually double-check that the information they're getting from Bard is actually accurate. The company writes, when you click the G icon, Bard will read the response and evaluate whether there is content across the web to substantiate it. When a statement can be evaluated, you can click on the highlighted phrases and learn more about supporting or contradicting information found by search. Jack Krawczyk, who works on Bard, really emphasized this set of features in his announcement tweet about the new update. He writes, Bard now is the only language model to actively admit how confident it is in its response, and it admits when it makes a mistake. He goes on, Bard is the first language model to actively admit how confident it is in its response. Using the updated Google It button, you can now see confident claims in green, along with a link, and lower certainty, and even, gasp, when it makes a mistake, in orange.
Starting point is 00:11:36 These capabilities are coming worldwide, in English to start. We'll bring more languages and partners soon. Now, Jack also points out one more feature, which somehow gets buried because there's so much else going on, which is that images in prompts and responses are now live in over 40 languages. In other words, there is a subtle but clear expansion of multimodality in this new update. Now, when it comes to what gets people on AI Twitter excited about news stories, it tends to be updated technological capacities. In other words, you're going to see a lot more retweets about the forthcoming
Starting point is 00:12:06 Gemini, which might be able to outcompete GPT-4, than you will about a set of features that improve the usability of Bard. But when it comes to the day-to-day of actually using these tools, these types of integrations are hugely significant in how this battle is going to shake out. Basically, every company in the AI space that isn't OpenAI and that already has a relationship with customers is heavily leaning on the relationship that it already has with those customers to try to get a leg up in the AI battle. Certainly that is the case, as we can see from this latest Bard update that sits inside this hugely popular suite of Workspace tools, but it also seems like it's going to be a big part of Apple's strategy as well. The neural engine that Apple unveiled as part
Starting point is 00:12:46 of their iPhone 15 event is being used today to help with a new gesture-based interaction system for the Apple Watch, but feels much more like an attempt to bring enough computing power to mobile devices to actually run powerful personalized AI models on device without having to mess around with the cloud. This is, of course, something that Apple has done over and over again in other sectors, but it seems like they're trying to bring it to artificial intelligence as well. Now, I mentioned that this isn't the strategy for OpenAI because, of course, they don't have existing relationships with customers. They're starting from the ground up with the people who are using their tools, such as ChatGPT. Yesterday, in a surprise to no one, we got reports that OpenAI is, behind the scenes,
Starting point is 00:13:23 working very diligently to try to get out ahead of Google when it comes to launching a multimodal LLM. Jon Victor from The Information writes, as fall approaches, Google and OpenAI are locked in a good old-fashioned software race, aiming to launch the next generation of large language models, multimodal. The Information throws it back to its previous scoop from last week that Google is testing Gemini with a small group of outside companies, but according to Victor's sources, OpenAI is trying to get out their next multimodal model faster. Victor writes, the Microsoft-backed startup is racing to integrate GPT-4, its most advanced LLM, with multimodal features akin to what Gemini will offer. OpenAI previewed those features when it launched GPT-4 in March, but didn't make them
Starting point is 00:14:01 available except to one company, Be My Eyes, that created technology for people who were blind or had low vision. Six months later, the company is preparing to roll out the features, known as GPT Vision, more broadly. The question, of course, is what took OpenAI so long if they had these features back six months ago, which amounts to an eternity in this highly contested AI space? One issue that we've heard in the past was reports that OpenAI had been hamstrung by access to compute, and so their plans to launch multimodal in 2023 had gotten delayed because of the overall AI chip shortage. According to Victor's newer reporting, though, there was also another issue. Quote, what took OpenAI so long? Mostly concerns about how the new vision features could be used
Starting point is 00:14:37 by bad actors, such as impersonating humans by solving CAPTCHAs automatically, or perhaps tracking people through facial recognition. But OpenAI's engineers seemed close to satisfying legal concerns around the new technology. Now, the one more throwaway line around OpenAI's plans from this report was this. Quote, OpenAI might follow up GPT Vision with an even more powerful multimodal model codenamed Gobi. Unlike GPT-4, Gobi is being designed as multimodal from the start. It doesn't sound like OpenAI has started training the model yet, so it's too soon to know if Gobi could eventually become GPT-5. If you are a regular listener of this show, you will know that I think that GPT-5 is primarily being held up by a regulatory process right now,
Starting point is 00:15:15 in which OpenAI anticipates, or maybe has been explicitly told, that models that are more advanced than their GPT-4 standard are likely to be subject to some sort of licensing regime, even though that licensing regime is yet to be fully articulated. Now, the last interesting thing from this Information article is just a reminder that to the extent that this battle comes down to multimodal, that could be a boon for Google. Specifically, they point out that Google has, through its ownership of YouTube, an incredible trove of data related to audio and visual information, and so might naturally have a leg up just based on that.
Starting point is 00:15:44 As Rundown AI founder Rowan Cheung tweeted, rumor, Gemini is being trained on YouTube video transcripts. This is genuinely fascinating. The amount of knowledge and data on YouTube is massive. I can't wait to try it. So when will we learn more? Well, there was that announcement about a week and a half ago from Sam Altman that on November 6, OpenAI is hosting their first ever developer conference in San Francisco. He did go to pains to say that there wouldn't be GPT-5 or 4.5 or anything like that, but still
Starting point is 00:16:10 he said, I think people would be very happy. Even before this, one of people's top predictions was that we would see a multimodal GPT-4 model. McKay Wrigley writes, some predictions for OpenAI's Developer Day on November 6th: meaningful GPT-4 cost reduction, fine-tuning for GPT-4, UI for fine-tuning, multimodal GPT-4 goes live, DALL-E 3, ChatGPT API, rethinking of plugins. I bet I hit on at least three. You have to think after this reporting from OpenAI and just the general pressure from Google, that multimodal GPT-4 has to be pretty high on that list.
Starting point is 00:16:42 We are heading into a really interesting period in the development of LLMs. It's interesting from a business perspective to see if anyone can actually get to parity with or even ahead of OpenAI, which has been the dominant force in the space since the launch of ChatGPT. It's interesting from a policy perspective, given everything that I was just saying about how models that are more advanced than GPT-4 seem like they might be caught up in a regulatory dragnet and a new, as yet undeveloped, licensing regime. And of course, there's also a really interesting technical question. Gemini seems poised to test the question of whether just making LLMs bigger does continue to improve their capacity.
Starting point is 00:17:19 Insider yesterday published a piece, Massive LLMs like Google's forthcoming Gemini could be a rare breed as generative AI enters a downsizing period. The piece says going bigger and bigger has looked like an unlikely path forward for some time, with OpenAI CEO Sam Altman suggesting earlier this year that, quote, we're at the end of the era where it's going to be these like giant, giant models. The reasons for that? One is the expense.
Starting point is 00:17:41 For example, Altman said that the cost of training GPT-4 was more than $100 million. Second and third are concerns about data: in some cases, hallucinations and biases, and in other cases, copyrighted information. And beyond that, although not mentioned in this Insider piece, there's just a lot more experimentation with where real gains are going to come from and whether it is exclusively based on ever larger training datasets. Seems likely that that's not the only vector that will lead to the leading models in the future. In any case, bringing it back to today's news,
Starting point is 00:18:08 for anyone who is already an AI user and already a Google Workspace user, forget all of that future Gemini versus OpenAI Gobi competition stuff. These updates are likely valuable right now. I know I am certainly going to experiment with them, and I will report back on what I find. For now, however, that is going to do it for today's AI Breakdown. I appreciate you guys listening or watching as always. And until next time, peace.
