The AI Daily Brief: Artificial Intelligence News and Analysis - Microsoft's Top Secret AI for Spy Agencies

Episode Date: May 9, 2024

Microsoft unveils a new top-secret AI system designed for US intelligence agencies. This video explores the implications of a fully isolated LLM for analyzing classified data, addressing security concerns, and potential ethical issues surrounding AI in military and intelligence applications. It also discusses contrasting views on AI adoption within the US military, highlighting worries about decision-making complexity and potential safety risks. ** Check out the hit podcast from HBS Managing the Future of Work https://www.hbs.edu/managing-the-future-of-work/podcast/Pages/default.aspx Join Superintelligent at https://besuper.ai/ -- Practical, useful, hands on AI education through tutorials and step-by-step how-tos. Use code podcast for 50% off your first month! ** ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, Microsoft releases an AI for spy agencies. Before that on the brief, Mistral is raising at a $6 billion valuation. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Check out the link to our Discord in the show notes to join the conversation. Welcome back to the AI Daily Brief Headlines edition, all the AI headline news you need in around five minutes. We kick off today with a follow-up to a story from yesterday. In the main part of our episode yesterday, we did a whole section
Starting point is 00:00:36 on all the various OpenAI news, including their release of what they called the Model Spec. Now, the idea of this Model Spec was to make it clearer to people which behaviors that they were seeing in ChatGPT were supposed to be there versus which behaviors were genuinely outside and were not supposed to be there. I had given some of the first-level impressions, but in the hours following publishing, people started to home in on one particular little section. When discussing how ChatGPT should respond, one of the prerogatives OpenAI listed was not to respond with not-safe-for-work content, which OpenAI said included erotica, extreme gore, slurs, and unsolicited profanity. Now, the commentary that OpenAI included with that was, we believe developers and users
Starting point is 00:01:17 should have flexibility to use our services as they see fit, so long as they comply with our usage policies. We're exploring whether we can responsibly provide the ability to generate NSFW content in age-appropriate contexts through the API and ChatGPT. We look forward to better understanding user and societal expectations of model behavior in this area. Of course, the way that this has been captured in headlines is like this one from Firstpost: ChatGPT maker OpenAI now wants to make ethical and responsible porn, exploring ways to do it. I read it slightly differently. Basically, this is an obvious capacity of AI. In fact, it's one that has a lot of people worried. Deepfake porn of real people has already been a huge issue, and we're only just getting started.
Starting point is 00:01:58 At the same time, basically every new technology goes through a period where people try to figure out how it can be used for porn. And so it's not like there's no financial incentive for OpenAI to explore this. I think the most telling phrase is societal expectations of model behavior in this area, as in I wouldn't expect OpenAI to be on the vanguard of this, but nor do they want to cut themselves off from future possibilities. I actually think Andy at Nexuist on Twitter isn't totally far off when they characterize this as the how-do-we-eat-Character-AI play. Character AI is an extremely popular AI service where people are spending hours every day talking with different characters, some real, some totally imagined from scratch, some based on real historical figures. And of course,
Starting point is 00:02:40 these conversations are not all innocent. Anyway, I think it's an interesting sub-story, added to the long list of questions that society is going to have to answer when it comes to AI, and one that's frankly likely to show up in the regulatory sphere faster than almost any others. Next up, reports recently had been that Mistral was out raising another round, with the rumors that I had seen putting the valuation at around $5 billion, which of course is up from the $2 billion valuation that they raised at in December. But now it appears that they are close to raising a roughly $600 million round at a $6 billion valuation. The Wall Street Journal writes that General Catalyst and Lightspeed, who are existing investors in Mistral,
Starting point is 00:03:17 are expected to be some of the biggest investors in this new round. We've discussed extensively on this show what a very specific game investing in frontier models is. The last time we talked about it was in the context of xAI's rumored $6 billion raise at a $24 billion valuation, where we talked about how if you want to play in this space, you kind of have to be willing to pay the price the market sets. There really are just a very small handful of companies, including now Mistral, that are contenders at the state of the art, and for now they're still commanding an incredible premium. I think that even as you see a wave of consolidation among AI startups, this handful of frontier model contenders is going to continue to be
Starting point is 00:03:55 able to write pretty much whatever valuations they want. Now, over in the land of big tech, TikTok has announced that they are adding an AI-generated label to third-party content. TikTok had already been applying an AI-generated tag to content that had been made using TikTok's dedicated AI tools, but now they're applying that same label to content that comes in from other platforms. The Verge writes, TikTok will detect when images or videos are uploaded to its platform containing metadata tags indicating the presence of AI-generated content and comes through partnerships
Starting point is 00:04:24 with Adobe's Content Authenticity Initiative, as well as the Coalition for Content Provenance and Authenticity. Now, it's worth noting that this still is all about metadata that is added to AI-generated content rather than a detection system that tries to guess when those metadata tags aren't there, and so some people will, of course, have questions around how much impact this is actually likely to have.
Starting point is 00:04:44 Still, I think most people would be in the camp of it's better than nothing, although then again, in the U.S., who knows how long TikTok will even be here. Anyways, that is going to do it for today's AI Daily Brief Headlines edition. Next up, the main daily brief. Today's podcast is brought to you by Plumb. You've played around with prompts, found ones that work great, but now what? With Plumb, your best ideas don't stay stuck in the playground.
Starting point is 00:05:06 Their end-to-end AI builder lets you effortlessly take your top-performing prompts and turn them into production-ready features. Product, design, and engineering can all collaborate in Plumb's intuitive interface, giving you the confidence to deploy AI that delivers real value to your users. Stop letting your best prompts collect dust. Check out useplumb.com, that's Plumb with a B, to ship them with Plumb today. As a listener of this show, I have a strong feeling you like to stay up to date on all things
Starting point is 00:05:32 artificial intelligence, including its impact on the workforce, which is why I highly recommend checking out Managing the Future of Work, the chart-topping business podcast from Harvard Business School. HBS professors Bill Kerr and Joe Fuller talk to business leaders, technologists, and policymakers grappling with the forces like AI, globalization, and demographic shifts that are reshaping the nature of work. Recent guests include IBM's CHRO, Nickle LaMoreaux, on how Big Blue is adopting AI, Morningstar CEO Kunal Kapoor on how AI can raise the investment IQ, Microsoft Corporate Vice President Jared
Starting point is 00:06:03 Spataro on how the tech giant is experimenting its way from AI assistants to autonomous agents, and many other prominent movers in business and the workforce ecosystem. So don't miss out. Follow Managing the Future of Work on Apple Podcasts, Spotify, or wherever you're listening now. Hello, AI friends. Today I want to tell you about our platform, Superintelligent. In short, it's a platform for useful, practical, immediately applicable AI learning. We have nearly 400 video tutorials, each of which comes with step-by-step how-tos, and the idea is to get you actually using these AI tools we talk about every day in a matter of minutes to actually solve problems, create new opportunities, and just do really cool things.
Starting point is 00:06:42 To learn more and subscribe, go to besuper.ai. And if you do decide to subscribe, use code podcast for 50% off your first month. Again, that's besuper.ai. Welcome back to the AI Daily Brief. Today we are talking about a topic that is lurking just under the surface of a lot of conversations around AI and policy and regulation and geopolitics, which is, of course, the use of AI in the military and the intelligence establishment. The specific catalyst for having this conversation today is a new product from Microsoft that's effectively a top-secret generative AI service for U.S. spy agencies. Now, intelligence agencies are not strangers when it comes to LLMs.
Starting point is 00:07:21 It's pretty safe to say that every intelligence agency in the world, certainly those in the U.S., has been experimenting with this technology right from the very beginning. However, there has always been a particular challenge with that use case in that there is basically no area in the world that has greater data sensitivity than the intelligence world. Think about the disaster that could happen if, for example, U.S. intelligence services fed top-secret data into an LLM like OpenAI's, and that data somehow leaked into other people's use. So what Microsoft introduced this week was what they're advertising as the first LLM that operates fully separate from the internet.
Starting point is 00:07:55 Writes Bloomberg, most AI models, including OpenAI's ChatGPT, rely on cloud services to learn and infer patterns from data. But Microsoft wanted to deliver a truly secure system to the U.S. intelligence community. Said William Chappell, Microsoft's chief technology officer for strategic missions and technology, Microsoft has deployed a GPT-4-based model
Starting point is 00:08:13 and key elements that support that model onto a cloud with a, quote, air-gapped environment that is isolated from the internet. Chappell said that Microsoft has spent the last 18 months working on this system. As part of that, they had to overhaul an existing AI supercomputer based in Iowa. Chappell said, this is the first time we've ever had an isolated version, where isolated means it's not connected to the internet, and it's on a special network that's only accessible by the U.S. government. About 10,000 people have the clearance to access this AI, and Microsoft describes it
Starting point is 00:08:40 as static, meaning it can read files but not learn from them or from the internet. Again, Chappell said, you don't want it to learn on the questions that you're asking and then somehow reveal that information. Now, again, even though this might be an even more useful type of tool, it's not like intelligence agencies have been doing nothing. Last year, for example, the CIA launched something like ChatGPT that operated at unclassified levels, but as Sheetal Patel, the assistant director of the CIA for the Transnational and Technology Mission Center, told a conference last month, there's a race to get generative AI onto intelligence data. She said the first country to use generative AI for their intelligence would win the race, and she said, I want it to be us.
Starting point is 00:09:16 What's next is, of course, testing to make sure that this operates in the way that they hope it does. For now, the CIA and the Office of the Director of National Intelligence, which oversees America's 18 intelligence organizations, have not commented. Now, this conversation around the military and intelligence community's use of AI is something that I pay attention to fairly closely. About a week ago, there was an interesting discussion that came up, embodied by this Axios article, AI hits trust hurdles with U.S. military. Axios writes, some branches of the U.S. military are hitting the brakes on generative AI after decades of Department of Defense experiments with broader AI technology. This was based on an article in Foreign Affairs called Why the Military
Starting point is 00:09:53 Can't Trust AI. Large language models can make bad decisions and could trigger nuclear war. The piece was written by Max Lamparth, a fellow at Stanford's Center for International Security and Cooperation and the Stanford Center for AI Safety, and Jacquelyn Schneider, a Hoover fellow at the Hoover Institution, as well as the director of the Hoover Wargaming and Crisis Simulation Initiative. The first part of the article just goes through the recent history of generative AI and how the military started experimenting with it, but then suggests that recently there have been some issues. They write, despite the enthusiasm for AI and LLMs within the Pentagon, its leadership is worried about the risk that the technologies pose. Hackathons sponsored by the chief digital and artificial
Starting point is 00:10:30 intelligence office have identified biases and hallucinations, and recently the U.S. Navy published guidance limiting the use of LLMs, citing security vulnerabilities and the inadvertent release of sensitive information. They then talk about a series of war games that they held in an academic setting. The point of these war games was to see how human experts and LLMs made different decisions in the same scenarios. In other words, this wasn't humans playing against LLMs, it was humans against humans and LLMs against LLMs. They write, the game placed players in the midst of a U.S.-China maritime crisis as a U.S. government task force made decisions about how to use emerging technologies in the face of escalation. Players were given the same background documents and game rules as well as
Starting point is 00:11:08 identical PowerPoint decks, word-based player guides, maps, and details of capabilities. They then deliberated in groups of four to six to generate recommendations. On average, both the humans and the LLM teams made similar choices about big-picture strategy and rules of engagement. But as we changed the information the LLM received or swapped between which LLM we used, we saw significant deviations from human behavior. For example, one LLM we tested tried to avoid friendly casualties or collisions by opening fire on enemy combatants and turning a Cold War hot,
Starting point is 00:11:37 reasoning that using preemptive violence was more likely to prevent a bad outcome to the crisis. The problem, they said, was not that an LLM made worse or better decisions than humans, or that it was more likely to, quote-unquote, win the war game. It was rather that the LLM came to its decisions in a way that did not convey the complexity of human decision-making. LLM-generated dialogue between players had little disagreement and consisted of short statements of fact. It was a far cry from the in-depth argument so often a part of human wargaming. Now, it's important to note that while the article is called AI hits trust hurdles with U.S. military,
Starting point is 00:12:08 it's not actually military sources that are saying they're mistrusting of AI, it's this set of experts from Stanford. Still, there is enough of a pattern that it's worth noting. For example, Space Force paused the use of generative AI back in September of last year, and in June, the Navy's chief information officer, Jane Overslaugh Rathbun, concluded that while, quote, generative AI can be a force multiplier, commercial models have inherent security vulnerabilities and are not recommended for operational use cases. Meanwhile, another person affiliated with Stanford University, Marietje Schaake, wrote a different op-ed for the Financial Times called Military is the missing word in AI safety discussions. Government attempts to regulate the technology must look at its use on the battlefield.
Starting point is 00:12:46 She writes, Western governments are racing each other to set up AI safety institutes. The US, UK, Japan, and Canada have all announced such initiatives, while the U.S. Department of Homeland Security added an AI Safety and Security board to the mix only last week. Given this heavy emphasis on safety, it is remarkable that none of these bodies govern the military use of AI. Meanwhile, the modern-day battlefield is already demonstrating the potential for clear AI safety risks. Now, regardless of whatever conclusions she comes to in the piece, I think that the underlying point that a conversation about AI safety or AI in general is incomplete without discussing the military application is totally correct. It's something that I've
Starting point is 00:13:21 often pointed out on this show when discussing these regulatory conversations, while on the same day some U.S. military offices announce their latest thing with AI, just rapidly adopting it regardless of those conversations. Given how at the heart of geopolitical struggles AI increasingly is, I believe that you're going to see a lot more of this discussion about AI in the military and AI in intelligence agencies in the months and years to come. For now, though, that is going to do it for the AI Daily Brief. Until next time, peace.
