The AI Daily Brief: Artificial Intelligence News and Analysis - The More Workers Use AI, The More They Fear for Their Jobs

Starting point is 00:00:00 Today on the AI Breakdown, we're looking at the new OpenAI Preparedness Framework. Before that on the brief, the more workers use AI, the more scared they are it's going to impact their jobs. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our YouTube, our Discord, and our newsletter. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. Well, friends, you can tell it is that dead week before Christmas and New Year's, because the flow of AI stories is, I will admit, going to something of a trickle. Of course, just wait for OpenAI to drop 4.5 at the last minute, probably right after I've tried to hang up the microphone for a nice

Starting point is 00:00:45 little break. Anyways, today we are starting with a new survey from CNBC, which I have to say deeply, deeply resonates. The survey was a look at AI in the workplace, and there were a few really interesting notes. First is that people who use AI definitely reported that it helps their productivity. 72% of people who use AI say that it has made them more productive. When it comes to the generational breakdown of who is actually using AI, it's pretty much cut along age lines. 37% of Gen Zers have used AI in the workplace, 35% of millennials versus 25% of Gen Xers and 17% of baby boomers.

Starting point is 00:01:21 Now, interestingly, when it comes to different ethnic groups, white workers are actually the lowest group with only 23% of workers having used AI at the workplace, as opposed to 36% of Hispanic workers, 38% of black workers, and 41% of Asian employees. Overall, there's definitely concern around AI in the workplace. 42% of employees overall say they're concerned about how AI may impact their jobs. 38% of managers are very or somewhat concerned, and 44% of individual employees are very or somewhat concerned.

Starting point is 00:01:50 People who make the least are the most concerned in general. 47% of those making $50,000 a year or less are concerned, compared to 39% making between $50,000 and $99,000, and 36% making $100,000 or more. But here is the most interesting statistic in my estimation. 35% of people who don't use AI at work are concerned that it will impact their job. That number rises to 60% among people who have used AI at work. In other words, the more you use AI at work, the more concerned you are about it taking your job.

Starting point is 00:02:23 Now, there are two very different possible interpretations for this. One, which I think most of our heads go to first, is that people start to use these tools, they find out how powerful they are, and they see how many different parts of their job might be able to be automated, thus growing more concerned. That's definitely the implication of the reporting on the study at least. Now, I would also like to say, just to put a little bit of a wet blanket on that, that it could also be a correlation issue. In other words, the workers whose jobs are most likely to be able to be replaced by AI might also be the people who are most attracted to using AI. And so, it might not necessarily be that because they've used it,

Starting point is 00:03:02 they're more concerned, but in fact that because they're in a more concerning position, they're more likely to use it. Either way, I think it's really fascinating and speaks to what is going to be a huge conversation in the coming years, which is how AI will impact the workforce. Last thing to note about this is that it was a CNBC SurveyMonkey study, and it was completed really recently, between December 4th and 8th, among a national sample of nearly 8,000 workers in the U.S. Now, speaking of companies and AI, Accenture is no stranger to the artificial intelligence field, given that they have publicly announced billions of dollars of investment, retrofitting their operations to be able to service clients in this area. That said, the company's CEO, Julie Sweet,

Starting point is 00:03:44 thinks that most companies are not yet ready to deploy AI solutions, for two big reasons: data infrastructure and controls. On the data infrastructure side, she said, the thing that is going to hold it back, though, is most companies do not have mature data capabilities, and if you can't use your data, you can't use AI. Basically what Sweet is saying is that one of the biggest ways that AI can be useful for a company is when it's unleashed on that company's data. However, if that data isn't organized, if it isn't set up to be used easily, that limits the value that you can get from that use case for AI. Now, the other big area that she says will hold people back is around safety controls and risk. She said, we're still at the stage where most CEOs,

Starting point is 00:04:24 asked if there is someone in their organization who can tell them where AI is being used, what the risks are, and how they're being mitigated, the answer is still no. There is a gap between saying you're committed to responsible AI and having the programs that allow it to be real on the ground. Now, all of that said, it's very clear that Sweet and Accenture continue to be bullish. Indeed, when it comes to that gap that she just mentioned, she said, the good news is that people are not trying to leap over the gap. They are being careful in the rollout, and so it does limit in the short term some of the

Starting point is 00:04:50 scaling opportunities. She also said, in three to five years, we expect this to be a big part of our business. Overall, Accenture reported a huge jump in generative AI bookings. In the three months leading to November 30th, they saw $450 million in revenue from generative AI projects, as opposed to just $300 million over the previous six months. That three-month period was 50% higher, in other words, than the previous six before it. Now, as the Financial Times points out, that's still peanuts compared to their overall revenue of $64 billion annually, but it certainly shows the trajectory. Now, one of the concerns that some of those companies might have is around privacy. A new study from a group of Stanford researchers gives yet another reason to be concerned about

Starting point is 00:05:29 privacy when it comes to AI. So a group of three Stanford graduate students created something called the Predicting Image Geolocations project, or PIGEON. The goal was to identify locations on Google Street View, but to test out the full capability of the AI underneath, they also shared a set of personal photos that the AI had never seen before, and in the majority of cases, the AI was able to accurately guess where the photos had been taken. Now, like basically everything in the AI field, this could be a good or a bad thing. An NPR article points out two positive use cases as helping people identify the locations

Starting point is 00:06:02 in old snapshots from relatives, or allowing field biologists to conduct rapid surveys of entire regions for invasive plant species. However, the surveillance implications, the negative surveillance implications, I should say, are also fairly clear. Now, part of the reason that this is making news is that there's actually a whole sub-community of people who like trying to figure out where photos were taken or videos were taken with just minimal information. Maybe you've seen someone like Jose Monkey on TikTok, but there's also a game online called GeoGuessr, which is exactly what it sounds like. When you play, you're basically given a Google Street View, and you have to place a pin on the map guessing the location. Over 50 million people have played, and some have gotten really,

Starting point is 00:06:39 really good. This team of Stanford students trained their AI based on an existing system for analyzing images called CLIP and added their own dataset of around a half million Street View images, leading to really, really good performance. When the PIGEON system was done, it was correctly able to guess the country that a photo was in 95% of the time, and in general was able to pick a location within 25 miles of the actual photo. Then they tested it against someone named Trevor Rainbolt, who is a, quote, legend in GeoGuessr circles, and in the head-to-head competition, Trevor lost multiple rounds. Lastly, today, a little update from the realm of real-world AI.

Starting point is 00:07:14 That is, of course, how Elon Musk has described Tesla's fully self-driving cars. But Elon isn't the only one thinking about cars and AI, as TomTom and Microsoft have teamed up to create a, quote, fully integrated conversational driving assistant. TomTom is currently best known for its GPS platforms, and the idea of this new tool is to create something that the auto manufacturers can use in their infotainment systems. This is honestly a super, super obvious and useful use case for chatbots and something that some manufacturers have already been experimenting with. For example, Mercedes did a pilot with ChatGPT earlier this year, and when it comes to having a more sophisticated, locally aware AI, that car use case just makes a lot of sense. However, that will do it for today's AI Breakdown Brief.

Starting point is 00:07:56 Next up, the main AI Breakdown. Quickly, a brief word from today's sponsor. As a listener of this show, I suspect you like to stay up to date on all things AI and tech, which is why you have to check out the chart-topping podcast Web3 with a16z crypto. Produced by venture firm Andreessen Horowitz, Web3 with a16z is the perfect companion podcast to the AI Breakdown. Web3 with a16z crypto is your definitive resource for the future of the internet.

Starting point is 00:08:24 Whether you're interested in the convergence of AI and crypto or simply curious about what's next. If you need a place to start, they recently released an excellent episode with Stanford cryptography professor Dan Boneh and former Google X engineer Ali Yahya, in conversation with host Sonal Chokshi about the intersection of AI and crypto. From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all. I highly recommend checking it out, especially if you'd like to learn more about how AI and crypto will impact our everyday lives. Beyond crypto and AI, this show is for creators

Starting point is 00:08:55 seeking more ways to truly own their work, for business leaders trying to prepare for the future today, and for innovators exploring trending tech topics. Don't miss out. Follow Web3 with a16z crypto on Apple Podcasts, Spotify, or your favorite listening app. Hello, friends. One quick note before we get back to the rest of the episode: registration for January's AI Education Beta is now officially open. It's open until just Friday at 11:59 p.m. Eastern Time. You can find the link to learn more and register at bit.ly slash AI beta. Now, this is an experiment that I've been running all throughout December, in which every day I drop a new video tutorial or a case study and usually partner it with a challenge, the idea of

Starting point is 00:09:38 which is to get you learning about all of these new different AI tools, as well as specific strategies for the most frequently used like ChatGPT or DALL-E, and then get you actually testing them out in the real world with real use cases and hopefully applying them back to your personal or professional pursuits as well. The first month has gone incredibly well. People seem to be really liking the video content as well as the incredible community that's forming. And part of that is that it's a group of really serious people. This is a paid experience. It's $20 a month. Part of the reason for that is that I want you guys to judge this content on the basis of whether it's actually worth that much to you. And second, I wanted it to be full of really

Starting point is 00:10:13 serious people who are intent on applying AI to their lives in some real and significant way. Anyways, I would love to have more AI Breakdown listeners participate in January. Content will start on January 3rd after the end of the holiday season. And again, the link to find out more and to register is bit.ly slash AI beta. That's bit.ly slash AI beta. And now back to the show. Welcome back to the AI Breakdown. Today we are talking about OpenAI's new preparedness policy. And this is meant, I think, in many ways to address concerns or questions that people have had coming off of the board's firing and rehiring of Sam Altman, at least in terms of the date of its release now. However, I should note that there has been some amount of shift

Starting point is 00:10:59 in the way that OpenAI approaches this for the last few months. Basically, the way that OpenAI used to have it set up, there was a trust and safety team under someone named Dave Willner, who was a former Meta Platforms content moderation executive. For the past few months, they've been looking for a replacement, but it seems that that strategy has shifted, and now they are thinking about safety in three different ways. This new update on their preparedness team is the fullest articulation of how they're thinking about this problem so far. They write, the study of frontier AI risks has fallen far short of what is possible and where we need to be. To address this gap and systematize our safety thinking,

Starting point is 00:11:33 we are adopting the initial version of our preparedness framework. It describes OpenAI's process to track, evaluate, forecast, and protect against catastrophic risks posed by increasingly powerful models. So one of the really important things about this is that they've divided the world of safety into three different buckets. There's safety issues having to deal with current models, safety issues dealing with frontier models, and safety issues dealing with superintelligent models. Inside of OpenAI, these now all have different teams. The Superalignment team is the one that we've talked the most about. This was formed over the summer, and is co-led, or was co-led at least, by co-founder and

Starting point is 00:12:07 chief scientist Ilya Sutskever, as well as Jan Leike, although as of current, no one exactly knows what Ilya's future with the company is, given the fallout of that whole board blowup. Now, the team that focuses on these current models is called the Safety Systems team. And in fact, earlier this month, OpenAI updated how it was approaching that in a blog post from December 5th. They wrote, building on the many years of our practical alignment work and applied safety efforts, Safety Systems addresses emerging safety issues and develops new fundamental solutions to enable the safe deployment of our most advanced models and future AGI to make AI that is beneficial and trustworthy. Safety Systems stays closest to deployment risks, while our superalignment

Starting point is 00:12:44 team focuses on aligning superintelligence and our preparedness team focuses on safety assessments for frontier models. In collaboration, these teams span a wide spectrum of technical efforts tackling AI safety challenges at OpenAI. So when it comes to these questions of deployment for current models, what are some of the problems that this team thinks about? OpenAI lists: How do we detect unknown classes of harmful answers, actions, or usage? How do we maintain user privacy while ensuring safety? How do we best leverage diverse human expertise to guide AI safety? How do we build AI to be collaborative with users and safely take action on behalf of those users? Now, within the team itself, there are actually four subteams: safety engineering, model safety research, safety reasoning

Starting point is 00:13:23 research, and human-AI interaction. Safety engineering is exactly what it sounds like. It's the team that implements, as they call it, system-level mitigation into products. The model safety research team focuses on alignment with these models. The safety reasoning research team is taking a slightly different approach and thinking about how to build, quote, better safety and ethical reasoning skills into the foundation models and using these skills to enhance our moderation models. This isn't exactly the same, but this echoes Anthropic's Constitutional AI, where rather than trying to scale reinforcement learning from human feedback, or RLHF, Anthropic is trying to teach its models to reason about appropriate or ethical use cases based on a constitution that comes from a corpus of other constitutions and important ethical documents that people have written across the centuries.

Starting point is 00:14:06 The last subteam of the Safety Systems team is human-AI interaction. The way they describe it is, policy is the interface for aligning model behavior with desired human values, and we co-design policy with models and for models, and thus policies can be directly plugged into our safety systems. Now, I will say that I think OpenAI often does an admirable job of not getting caught in jargon, but that sentence literally says nothing. Anyways, on December 5th, it was a short post, but it set up this new post, which is, of course, more about this preparedness and the implications for frontier models. So what about the actual framework itself? There are a few dimensions of it.

Starting point is 00:14:38 The first, they write, we will run evaluations and continually update scorecards for our models. They write, we will evaluate all our frontier models, including at every 2x effective compute increase during training runs. We will push models to their limits. These findings will help us assess the risk of frontier models and measure the effectiveness of any proposed mitigations. Our goal is to probe the specific edges of what's unsafe to effectively mitigate the revealed risks. To track the safety level of our models, we will produce risk scorecards and detailed reports. So the four categories of risk that they define are cybersecurity, CBRN, which stands for chemical, biological, radiological, and nuclear risks, persuasion, and model autonomy. These scores for any given

Starting point is 00:15:18 model range from low to medium to high to critical. And ultimately, the overall model score is the highest risk score in any category, meaning that if CBRN, persuasion, and model autonomy were all low, but cybersecurity was critical, that would still mean that overall the model was categorized as critical. Now, of course, what does it mean to actually have risk in these categories? Well, that's the second part of their preparedness framework. They write, we will define risk thresholds that trigger baseline safety measures. We've defined thresholds for risk levels along the following initial tracked categories: cybersecurity, CBRN, persuasion, and model autonomy. Now, in terms of how they actually use these model scores, first of all, they're focused on

Starting point is 00:15:57 post-mitigation scores, i.e. not what the score is before they've done anything about it, but what the score is after they've done something about it. And basically what they're committing to in this is that only models with a post-mitigation score of medium or below can be deployed, and only models with a post-mitigation score of high or below can be developed further. So what they're saying is that if they find critical threats in any of these categories, and they can't get that down to at least just a high threat level after mitigation techniques, they're saying here that they would cease work on that model. And of course, they'll only actually push out models that have a post-mitigation score of medium or below. Now, who is making the decisions about all these things? Well, that gets to the next

Starting point is 00:16:35 part of this framework. And this is the one that's certainly been picked up the most by the press. OpenAI writes, we will establish a dedicated team to oversee technical work and an operational structure for safety decision-making. The preparedness team will drive technical work to examine the limits of frontier model capabilities, run evaluations, and synthesize reports. This technical work is critical to inform OpenAI's decision-making for safe model development and deployment. So basically, the job of the preparedness team is to do all of the legwork that gets people the information they need to make decisions. That is this preparedness team. But on top of that, they're also creating a cross-functional safety advisory group to, quote,

Starting point is 00:17:09 review all reports and send them concurrently to leadership and the board of directors. While leadership is the decision-maker, the board of directors holds the right to reverse decisions. So basically, there are four groups each with a different role. The preparedness team does the technical work to assess things. The safety advisory group makes recommendations. Leadership makes decisions about whether to deploy or continue working. And the board of directors has the right to reverse those decisions. Now, the relationship then between the leadership and the board of directors kind of makes it make sense why they're having this third-party safety advisory group there as well. The safety advisory group effectively acts as a nominally impartial layer on top of the preparedness team, where both sets

Starting point is 00:17:47 of decision makers, the leadership and then the potential reverse decisioning board of directors, have that same set of recommendations and have that same set of technical work underlying it from the safety advisory group and the preparedness team respectively. Now, my assumption reading this is that when they talk about decision making, they're talking about deployment and they're talking about the continued development. Basically, the two things they specified in the previous section of the preparedness report. But it'd be good to have that a little bit more clarified. Now, the last two notes of the preparedness framework are, we will develop protocols for added safety and outside accountability.

Starting point is 00:18:17 The preparedness team will conduct regular safety drills to stress test against the pressures of our business and our own culture. Some safety issues can emerge rapidly so we have the ability to mark urgent issues for rapid responses. And then secondly, we will help reduce other known and unknown safety risks. We will collaborate closely with external parties as well as internal teams like safety systems to track real-world misuse. We will also work with superalignment on tracking emergent misalignment risks.

Starting point is 00:18:39 We're also pioneering new research and measuring how risks evolve as models scale to help forecast risks in advance, similar to our earlier success with scaling laws. Finally, we will run a continuous process to try surfacing any emerging unknown unknowns. So my thoughts about this overall are that on a basic level the framework makes sense, right? You've got what is a coherent separation of these different types of risks into these different categories and teams, which makes sense from a focus perspective, but also does require, of course, that these teams work together well and communicate effectively. But let's assume that that can happen. The framework of risk across these four categories also makes sense to get a little bit clearer around what we're trying to identify when we're asking, should a model be deployed or should a model even continue being researched?
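
The scoring and gating rules described earlier in the episode can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not OpenAI's implementation: the names RiskLevel, CATEGORIES, overall_risk, can_deploy, and can_develop_further are hypothetical, and the only rules encoded are the ones stated in the episode (the overall score is the highest score in any category, deployment requires a post-mitigation score of medium or below, and further development requires high or below). How an individual category score is actually assigned is not covered here, since the episode does not say.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Hypothetical encoding of the four levels named in the episode."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# The four tracked categories named in the episode (identifiers are assumed).
CATEGORIES = ("cybersecurity", "cbrn", "persuasion", "model_autonomy")


def overall_risk(scores):
    """Overall score is the highest risk score in any tracked category."""
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    return max(scores[c] for c in CATEGORIES)


def can_deploy(post_mitigation_scores):
    """Deployable only if the post-mitigation overall score is medium or below."""
    return overall_risk(post_mitigation_scores) <= RiskLevel.MEDIUM


def can_develop_further(post_mitigation_scores):
    """Further development only if the post-mitigation overall score is high or below."""
    return overall_risk(post_mitigation_scores) <= RiskLevel.HIGH


# The example from the episode: three categories low, cybersecurity critical.
example = {
    "cybersecurity": RiskLevel.CRITICAL,
    "cbrn": RiskLevel.LOW,
    "persuasion": RiskLevel.LOW,
    "model_autonomy": RiskLevel.LOW,
}
```

With the example scorecard, overall_risk(example) is RiskLevel.CRITICAL, so both can_deploy and can_develop_further return False. Taking the maximum rather than an average means a single critical category cannot be diluted by low scores elsewhere.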

Starting point is 00:19:20 I think what I and a lot of people would like to know, although this might be proprietary and internal, is what those risk thresholds really are. What makes a cybersecurity risk move from low to medium and then medium to high and then high to critical? How is that measured? How is that determined? Same across CBRN, same across persuasion, same across model autonomy, especially because these things could get very subjective very quickly. Is there a benchmark for understanding how persuasive something is?

Starting point is 00:19:44 My guess is that there are. OpenAI tends to try not to be super hand-wavy about this, but that I think is the reasonable next set of questions and something I hope we get more information on over time. Not only because OpenAI is currently at the very forefront of this industry, but because the more that they share that sort of information, the more that other people have the chance to push back on it or suggest additions or updates or changes, that can actually inform how other people test their models as well. Still, it's good to have these frameworks start getting put into place, and I'll be interested to see how they evolve in the months to come. For now, that's going to do it for today's AI Breakdown.

Starting point is 00:20:16 Until next time, peace.
