The AI Daily Brief: Artificial Intelligence News and Analysis - The More Workers Use AI, The More They Fear for Their Jobs
Episode Date: December 19, 2023
A new study from CNBC shows that the American workers who use AI the most are the most scared of it impacting their employment. Also on today's show we look at OpenAI's new Preparedness plan.
Today's ...Sponsors: Listen to the chart-topping podcast 'web3 with a16z crypto' wherever you get your podcasts or here: https://link.chtbl.com/xz5kFVEK?sid=AIBreakdown
Interested in the January AI Education Beta program? Learn more and sign up here - https://bit.ly/aibeta
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're looking at the new OpenAI Preparedness Framework.
Before that on the brief, the more workers use AI, the more scared they are it's going to impact their jobs.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to breakdown.network for more information about our YouTube, our Discord, and our newsletter.
Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes.
Well, friends, you can tell it is that dead week before Christmas and New Year's, because the flow of AI stories
is, I will admit, slowing to something of a trickle. Of course, just wait for OpenAI to drop
4.5 at the last minute, probably right after I've tried to hang up the microphone for a nice
little break. Anyways, today we are starting with a new survey from CNBC, which I have to say
deeply, deeply resonates. The survey was a look at AI in the workplace, and there were a few really
interesting notes. First is that people who use AI definitely reported that it helps their productivity.
72% of people who use AI say that it has made them more productive.
When it comes to the generational breakdown of who is actually using AI,
it's pretty much cut along age lines.
37% of Gen Zers have used AI in the workplace,
35% of millennials versus 25% of Gen Xers and 17% of baby boomers.
Now, interestingly, when it comes to different ethnic groups,
white workers are actually the lowest group with only 23% of workers having used AI at the workplace,
as opposed to 36% of Hispanic workers,
38% of black workers, and 41% of Asian employees.
Overall, there's definitely concern around AI in the workplace.
42% of employees overall say they're concerned about how AI may impact their jobs.
38% of managers are very or somewhat concerned,
and 44% of individual employees are very or somewhat concerned.
People who make the least are the most concerned in general.
47% of those making 50,000 a year or less are concerned,
compared to 39% making between 50 and 99,000,
and 36% making 100,000 or more.
But here is the most interesting statistic in my estimation.
35% of people who don't use AI at work are concerned that it will impact their job.
That number rises to 60% among people who have used AI at work.
In other words, the more you use AI at work, the more concerned you are about it taking your job.
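As a rough check on those numbers, here is a minimal Python sketch that uses only the percentages quoted above. It assumes, and this is an assumption rather than something the survey write-up states, that the 42% overall figure is a simple blend of the 35% and 60% groups measured on the same question, so the implied share of AI users is only an estimate and is sensitive to rounding.

```python
# Back-of-the-envelope check using only the percentages quoted in the episode.
# Assumption: the overall figure is a weighted mix of the two groups below.
concern_non_users = 0.35  # share of non-users concerned about their jobs
concern_users = 0.60      # share of AI users concerned about their jobs
concern_overall = 0.42    # share of all employees concerned

# Absolute gap (percentage points) and relative gap (ratio) between the groups.
gap_points = (concern_users - concern_non_users) * 100
relative_ratio = concern_users / concern_non_users

# If overall = u * users + (1 - u) * non_users, solve for the user share u.
implied_user_share = (concern_overall - concern_non_users) / (
    concern_users - concern_non_users
)

print(f"Gap: {gap_points:.0f} percentage points")
print(f"Ratio: {relative_ratio:.2f}x")
print(f"Implied share of workers who have used AI: {implied_user_share:.0%}")
```

Under that assumption, the implied user share lands inside the range of the generational usage figures quoted earlier, which is at least a sanity check that the numbers hang together.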
Now, there are two very different possible interpretations for this.
One, which I think most of our heads go to first, is that people start to use these tools,
they find out how powerful they are, and they see how many different parts of their job might be
able to be automated, thus growing more concerned. That's definitely the implication of the
reporting on the study at least. Now, I would also like to say, just to put a little bit of a
wet blanket on that, that it could also be a correlation issue. In other words, the workers whose
jobs are most likely to be able to be replaced by AI might also be the people who are most
attracted to using AI. And so, it might not necessarily be that because they've used it,
they're more concerned, but in fact that because they're in a more concerning position,
they're more likely to use it. Either way, I think it's really fascinating and speaks to what is
going to be a huge conversation in the coming years, which is how AI will impact the workforce.
Last thing to note about this is that it was a CNBC SurveyMonkey study, and it was completed really
recently, between December 4th and 8th, among a national sample of nearly 8,000 workers in the U.S.
Now, speaking of companies and AI, Accenture is no stranger to the artificial intelligence field,
given that they have publicly announced billions of dollars of investment, retrofitting their operations
to be able to service clients in this area. That said, the company's CEO, Julie Sweet,
thinks that most companies are not yet ready to deploy AI solutions, for two big reasons:
data infrastructure and controls. On the data infrastructure side, she said,
the thing that is going to hold it back, though, is most companies do not have mature data capabilities,
and if you can't use your data, you can't use AI. Basically what Sweet is saying is that one of the biggest
ways that AI can be useful for a company is when it's unleashed on that company's data. However, if that data
isn't organized, if it isn't set up to be used easily, that limits the value that you can get
from that use case for AI. Now, the other big area that she says will hold people back is around
safety controls and risk. She said, we're still at the stage where most CEOs,
asked if there is someone in their organization who can tell them where AI is being used,
what the risks are, and how they're being mitigated, the answer is still no.
There is a gap between saying you're committed to responsible AI and having the programs that
allow it to be real on the ground.
Now, all of that said, it's very clear that Sweet and Accenture continue to be bullish.
Indeed, when it comes to that gap that she just mentioned, she said,
the good news is that people are not trying to leap over the gap.
They are being careful in the rollout, and so it does limit in the short term some of the
scaling opportunities. She also said, in three to five years, we expect this to be a big part of our
business. Overall, Accenture reported a huge jump in generative AI bookings. In the three months leading
to November 30th, they saw $450 million in bookings for generative AI projects, as opposed to just
$300 million over the previous six months. That three-month total was 50% higher, in other words,
than the total for the six months before it. Now, as the Financial Times points out, that's still peanuts
compared to their overall revenue of $64 billion annually, but it certainly shows the trajectory.
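To make that trajectory concrete, here is a small Python sketch of the arithmetic using only the figures quoted above. The annualized line assumes the latest three-month pace simply holds flat for a year, which is an illustrative assumption and not a forecast from Accenture or the Financial Times.

```python
# Figures quoted in the episode, in millions of US dollars.
recent_genai_musd = 450       # three months to November 30
prior_genai_musd = 300        # the six months before that
annual_revenue_musd = 64_000  # roughly $64 billion overall

# Total-to-total comparison: the "50% higher" claim.
increase = recent_genai_musd / prior_genai_musd - 1

# Per-month pace, since the two periods have different lengths.
recent_monthly = recent_genai_musd / 3
prior_monthly = prior_genai_musd / 6
pace_multiple = recent_monthly / prior_monthly

# Assumption: the latest quarterly pace stays flat for four quarters.
annualized_share = recent_genai_musd * 4 / annual_revenue_musd

print(f"Total vs. prior six months: {increase:.0%} higher")
print(f"Monthly pace: ${recent_monthly:.0f}M vs ${prior_monthly:.0f}M ({pace_multiple:.0f}x)")
print(f"Annualized, as a share of overall revenue: {annualized_share:.1%}")
```

The per-month comparison is the more telling one: the totals differ by 50%, but because the earlier period was twice as long, the monthly pace tripled.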
Now, one of the concerns that some of those companies might have is around privacy.
A new study from a group of Stanford researchers gives yet another reason to be concerned about
privacy when it comes to AI.
So a group of three Stanford graduate students created something called the Predicting
Image Geolocations project, or PIGEON.
The goal was to identify locations on Google Street View, but to test out the full
capability of the AI underneath, they also shared a set of personal photos that the AI had
never seen before, and in the majority of cases, the AI was able to accurately guess where the
photos had been taken. Now, like basically everything in the AI field, this could be a good or a
bad thing. An NPR article points out two positive use cases as helping people identify the locations
in old snapshots from relatives, or allowing field biologists to conduct rapid surveys
of entire regions for invasive plant species. However, the surveillance implications, the
negative surveillance implications, I should say, are also fairly clear as well. Now, part of the
reason that this is making news is that there's actually a whole sub-community of people who like trying
to figure out where photos were taken or videos were taken with just minimal information. Maybe you've
seen someone like Jose Monkey on TikTok, but there's also a game online called GeoGuessr, which is
exactly what it sounds like. When you play, you're basically given a Google Street View, and you have to
place a pin on the map guessing the location. Over 50 million people have played, and some have gotten really,
really good. This team of Stanford students trained their AI based on an existing system for
analyzing images called CLIP and added their own dataset of around a half million Street View images,
leading to really, really good performance. When the PIGEON system was done, it was correctly
able to guess the country that a photo was in 95% of the time, and in general was able to pick a
location within 25 miles of the actual photo. Then they tested it against someone named Trevor
Rainbolt, who is a, quote, legend in geoguessing circles, and in the head-to-head competition,
Trevor lost multiple rounds.
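For a sense of what "within 25 miles" means in practice, here is a minimal Python sketch that scores a guessed location against the true one using great-circle (haversine) distance. The episode does not say how the Stanford team measured distance, so the haversine formula, the function names, and the Earth radius constant are all illustrative assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_threshold(guess, truth, threshold_miles=25.0):
    """True if the guessed (lat, lon) is within threshold_miles of the truth."""
    return haversine_miles(*guess, *truth) <= threshold_miles


# Hypothetical points: a tenth of a degree of latitude is roughly 7 miles,
# a full degree is roughly 69 miles.
print(within_threshold((0.1, 0.0), (0.0, 0.0)))
print(within_threshold((1.0, 0.0), (0.0, 0.0)))
```

A game like GeoGuessr rewards exactly this kind of closeness, which is why a per-guess distance is a natural way to compare a model with a human player.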
Lastly, today, a little update from the realm of real-world AI.
That is, of course, how Elon Musk has described Tesla's fully self-driving cars.
But Elon isn't the only one thinking about cars and AI, as TomTom and Microsoft have teamed up to create a, quote, fully integrated conversational driving assistant.
TomTom is currently best known for its GPS platforms, and the idea of this new tool is to create something that the auto manufacturers can use in their infotainment systems.
This is honestly a super, super obvious and useful use case for chatbots and something that some manufacturers
have already been experimenting with.
For example, Mercedes did a pilot with ChatGPT earlier this year, and when it comes to having
a more sophisticated, locally aware AI, that car use case just makes a lot of sense.
However, that will do it for today's AI Breakdown Brief.
Next up, the main AI Breakdown.
Quickly a brief word from today's sponsor.
As a listener of this show, I suspect you like to stay up to date
on all things AI and tech, which is why you have to check out the chart-topping podcast Web3
with a16z crypto.
Produced by venture firm Andreessen Horowitz, Web3 with a16z crypto is the perfect companion podcast to
the AI Breakdown.
Web3 with a16z crypto is your definitive resource for the future of the internet,
whether you're interested in the convergence of AI and crypto or simply curious about what's
next.
If you need a place to start, they recently released an excellent episode with Stanford
Cryptography Professor Dan Boneh and former Google X engineer Ali Yahya,
in conversation with host Sonal Chokshi about the intersection of AI and crypto.
From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all.
I highly recommend checking it out, especially if you'd like to learn more about how AI and
crypto will impact our everyday lives. Beyond crypto and AI, this show is for creators
seeking more ways to truly own their work, for business leaders trying to prepare for the future
today, and for innovators exploring trending tech topics. Don't miss out. Follow Web3 with a16z
crypto on Apple Podcasts, Spotify, or your favorite listening app.
Hello, friends. One quick note before we get back to the rest of the episode,
registration for January's AI Education Beta is now officially open. It's open until just
Friday at 11:59 p.m. Eastern Time. You can find the link to learn more and register at
bit.ly/aibeta. Now, this is an experiment that I've been running all throughout December, in which every day
I drop a new video tutorial or a case study and usually partner it with a challenge, the idea of
which is to get you learning about all of these new different AI tools, as well as specific
strategies for the most frequently used like ChatGPT or DALL-E, and then get you actually testing
them out in the real world with real use cases and hopefully applying them back to your
personal or professional pursuits as well. The first month has gone incredibly well. People seem to be
really liking the video content as well as the incredible community that's forming. And part of that is
that it's a group of really serious people. This is a paid experience. It's $20 a month.
Part of the reason for that is that I want you guys to judge this content on the basis of
whether it's actually worth that much to you. And second, I wanted it to be full of really
serious people who are intent on applying AI to their lives in some real and significant way.
Anyways, I would love to have more AI Breakdown listeners participate in January. Content will start
on January 3rd after the end of the holiday season. And again, the link to find out more and to register
is bit.ly/aibeta. That's bit.ly/aibeta. And now back to the show.
Welcome back to the AI Breakdown. Today we are talking about OpenAI's new preparedness policy. And this is
meant, I think, in many ways to address concerns or questions that people have had coming off of the
board's firing and rehiring of Sam Altman, at least in terms of the date of its release now. However,
I should note that there has been some amount of shift
in the way that OpenAI has approached this over the last few months.
Basically, the way that OpenAI used to have it set up, there was a trust and safety team
under someone named Dave Willner, who was a former Meta Platforms content moderation executive.
For the past few months, they've been looking for a replacement, but it seems that that
strategy has shifted, and now they are thinking about safety in three different ways.
This new update on their preparedness team is the fullest articulation of how they're thinking
about this problem so far. They write, the study of frontier AI risks has fallen far short of
what is possible and where we need to be. To address this gap and systematize our safety thinking,
we are adopting the initial version of our preparedness framework. It describes OpenAI's process
to track, evaluate, forecast, and protect against catastrophic risks posed by increasingly
powerful models. So one of the really important things about this is that they've divided the
world of safety into three different buckets. There are safety issues having to do with current models,
safety issues dealing with frontier models, and safety issues dealing with superintelligent models.
Inside of OpenAI, these now all have different teams.
The Superalignment team is the one that we've talked the most about.
This was formed over the summer, and is co-led, or was co-led at least, by co-founder and
chief scientist Ilya Sutskever, as well as Jan Leike, although as of now, no one
exactly knows what Ilya's future with the company is, given the fallout of that whole board blowup.
Now, the team that focuses on these current models is called the Safety Systems team.
And in fact, earlier this month, OpenAI updated how it was approaching that in a blog post from
December 5th. They wrote, building on the many years of our practical alignment work and applied
safety efforts, safety systems addresses emerging safety issues and develops new fundamental solutions
to enable the safe deployment of our most advanced models and future AGI to make AI that is
beneficial and trustworthy. Safety systems stays closest to deployment risks, while our superalignment
team focuses on aligning superintelligence and our preparedness team focuses on safety assessments
for frontier models. In collaboration, these teams span a wide spectrum of technical efforts
tackling AI safety challenges at OpenAI. So when it comes to these questions of deployment for
current models, what are some of the problems that this team thinks about? OpenAI lists: How do we detect
unknown classes of harmful answers, actions, or usage? How do we maintain user privacy while ensuring
safety? How do we best leverage diverse human expertise to guide AI safety? How do we build AI to be
collaborative with users and safely take action on behalf of those users? Now, within the team itself,
there are actually four subteams: safety engineering, model safety research, safety reasoning
research, and human-AI interaction. Safety engineering is exactly what it sounds like. It's the team that
implements, as they call it, system-level mitigation into products. The model safety research team
focuses on alignment with these models. The safety reasoning research team is taking a slightly
different approach and thinking about how to build, quote, better safety and ethical reasoning
skills into the foundation models and using these skills to enhance our moderation models. This isn't
exactly the same, but this echoes Anthropic's Constitutional AI, where rather than trying to scale
reinforcement learning from human feedback, or RLHF,
Anthropic is trying to teach its models to reason about appropriate or ethical use cases based on a constitution that comes from a corpus of other constitutions and important ethical documents that people have written across the centuries.
The last subteam of the safety systems team is human-AI interaction.
The way they describe it is, policy is the interface for aligning model behavior with desired human values, and we co-design policy with models and for models, and thus policies can be directly plugged into our safety systems.
Now, I will say that I think OpenAI often does an admirable job of not getting caught in jargon,
but that sentence literally says nothing.
Anyways, on December 5th, it was a short post, but it set up this new post, which is, of course,
more about this preparedness and the implications for frontier models.
So what about the actual framework itself?
There are a few dimensions of it.
The first, they write, we will run evaluations and continually update scorecards for our models.
They write, we will evaluate all our frontier models, including at every 2x effective compute
increase during training runs. We will push models to their limits. These findings will help us assess
the risk of frontier models and measure the effectiveness of any proposed mitigations. Our goal is to
probe the specific edges of what's unsafe to effectively mitigate the revealed risks. To track the
safety level of our models, we will produce risk scorecards and detailed reports. So the four
categories of risk that they define are cybersecurity, CBRN, which stands for chemical, biological,
radiological, and nuclear risks, persuasion, and model autonomy. These scores for any given
model range from low to medium to high to critical. And ultimately, the overall model score is the
highest risk score in any category, meaning that if CBRN, persuasion, and model autonomy were all low,
but cybersecurity was critical, that would still mean that overall the model was categorized
as critical. Now, of course, what does it mean to actually have risk in these categories?
Well, that's the second part of their preparedness framework. They write, we will define risk
thresholds that trigger baseline safety measures. We've defined thresholds for risk levels along the
following initial tracked categories: cybersecurity, CBRN, persuasion, and model autonomy.
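The "highest score in any category wins" rule described above is easy to sketch in Python. This is only an illustration of the rule as summarized in this episode, not OpenAI's actual tooling; the `Risk` enum, the category keys, and the `overall_risk` function are hypothetical names.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk levels in increasing order of severity (hypothetical encoding)."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# The four tracked categories named in the framework.
CATEGORIES = ("cybersecurity", "cbrn", "persuasion", "model_autonomy")


def overall_risk(scorecard):
    """Return the overall score: the highest level across all categories."""
    missing = [c for c in CATEGORIES if c not in scorecard]
    if missing:
        raise ValueError(f"scorecard is missing categories: {missing}")
    return max(scorecard[c] for c in CATEGORIES)


# The example from the episode: three lows and one critical.
example = {
    "cybersecurity": Risk.CRITICAL,
    "cbrn": Risk.LOW,
    "persuasion": Risk.LOW,
    "model_autonomy": Risk.LOW,
}
print(overall_risk(example).name)  # CRITICAL
```

Taking the maximum rather than an average means one severe category can never be diluted by three benign ones.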
Now, in terms of how they actually use these model scores, first of all, they're focused on
post-mitigation scores, i.e. not what the score is before they've done anything about it, but what the
score is after they've done something about it. And basically what they're committing to in this,
is that only models with a post-mitigation score of medium or below can be deployed, and only
models with a post-mitigation score of high or below can be developed further. So what they're saying is
that if they find critical threats in any of these categories, and they can't get that down to
at least just a high threat level after mitigation techniques, they're saying here that they would
cease work on that model. And of course, they'll only actually push out models that have a post-mitigation
score of medium or below. Now, who is making the decisions about all these things? Well, that gets to the next
part of this framework. And this is the one that's certainly been picked up the most by the press.
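Before moving on, the two post-mitigation gates just described can be written down as a minimal Python sketch. It encodes only the thresholds as summarized in this episode (deploy at medium or below, keep developing at high or below); the names are hypothetical and nothing here reflects how OpenAI actually implements the policy.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk levels in increasing order of severity (hypothetical encoding)."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


def can_deploy(post_mitigation):
    """Deployment gate: post-mitigation score must be medium or below."""
    return post_mitigation <= Risk.MEDIUM


def can_continue_development(post_mitigation):
    """Development gate: post-mitigation score must be high or below."""
    return post_mitigation <= Risk.HIGH


# Walk through every level to show where each gate closes.
for level in Risk:
    print(
        f"{level.name:<8} deploy={can_deploy(level)} "
        f"develop={can_continue_development(level)}"
    )
```

Note that the gates apply to the score after mitigations, so a model that starts out critical can still be developed further if mitigations bring it down to high.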
OpenAI writes, we will establish a dedicated team to oversee technical work and an operational
structure for safety decision-making. The preparedness team will drive technical work to
examine the limits of frontier model capabilities, run evaluations, and synthesize reports.
This technical work is critical to inform OpenAI's decision-making for safe model development
and deployment. So basically, the job of the preparedness team is to do all of the legwork
that gets people the information they need to make decisions. That is this preparedness team.
But on top of that, they're also creating a cross-functional safety advisory group to, quote,
review all reports and send them concurrently to leadership and the board of directors. While leadership
is the decision maker, the board of directors holds the right to reverse decisions. So basically,
there are four groups each with a different role. The preparedness team does the technical work to assess
things. The safety advisory group makes recommendations. Leadership makes decisions about whether to
deploy or continue working. And the board of directors has the right to reverse those decisions.
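Those four roles can be sketched as a tiny Python data structure. This is a loose illustration of the flow as described in this episode, with the assumption that a board reversal simply flips leadership's decision; the `Review` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One pass through the four-role process (hypothetical model)."""
    technical_report: str            # produced by the preparedness team
    advisory_recommends_go: bool     # safety advisory group's recommendation
    leadership_decides_go: bool      # leadership is the decision maker
    board_reverses: bool = False     # the board may reverse leadership

    def final_outcome(self):
        """Leadership's decision stands unless the board reverses it."""
        if self.board_reverses:
            return not self.leadership_decides_go
        return self.leadership_decides_go


# Hypothetical case: advisers say no, leadership says go, the board reverses.
review = Review(
    technical_report="scorecard and evaluation summary",
    advisory_recommends_go=False,
    leadership_decides_go=True,
    board_reverses=True,
)
print(review.final_outcome())  # False
```

In this sketch the advisory recommendation is recorded but not binding, which matches the description above: it informs both leadership and the board without deciding anything itself.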
Now, the relationship between the leadership and the board of directors kind of makes it make
sense why they're having this third-party safety advisory group there as well. The safety advisory group
effectively acts as a nominally impartial layer on top of the preparedness team, where both sets
of decision makers, the leadership and then the potential reverse decisioning board of directors,
have that same set of recommendations and have that same set of technical work underlying it
from the safety advisory group and the preparedness team respectively. Now, my assumption reading
this is that when they talk about decision making, they're talking about deployment and they're talking
about the continued development. Basically, the two things they specified in the previous section
of the preparedness report. But it'd be good to have that a little bit more clarified.
Now, the last two notes of the preparedness framework are,
we will develop protocols for added safety and outside accountability.
The preparedness team will conduct regular safety drills to stress test against the pressures
of our business and our own culture.
Some safety issues can emerge rapidly so we have the ability to mark urgent issues for rapid
responses.
And then secondly, we will help reduce other known and unknown safety risks.
We will collaborate closely with external parties as well as internal teams like safety
systems to track real-world misuse.
We will also work with superalignment on tracking emergent misalignment risks.
We're also pioneering new research in measuring how risks evolve as models scale to help forecast risks in advance, similar to our earlier success with scaling laws.
Finally, we will run a continuous process to try surfacing any emerging unknown unknowns.
So my thoughts about this overall are that on a basic level the framework makes sense, right?
You've got what is a coherent separation of these different types of risks into these different categories and teams, which makes sense from a focus perspective, but also does require, of course, that these teams work together well and communicate effectively.
But let's assume that that can happen.
The framework of risk across these four categories also makes sense as a way to get a little bit clearer
around what we're trying to identify when we're asking, should a model be deployed or should a model
even continue being researched?
I think what I and a lot of people would like to know, although this might be proprietary and
internal, is what those risk thresholds really are.
What makes a cybersecurity risk move from low to medium and then medium to high and then high
to critical?
How is that measured?
How is that determined?
Same across CBRN, same across persuasion, same across model autonomy, especially because these things
could get very subjective very quickly. Is there a benchmark for understanding how persuasive something is?
My guess is that there is. OpenAI tends to try not to be super hand-wavy about this,
but that I think is the reasonable next set of questions and something I hope we get more information on over time.
Not only because OpenAI is currently at the very forefront of this industry, but because the more that they
share that sort of information, the more that other people have the chance to push back on it or suggest additions
or updates or changes, that can actually inform how other people test their models as well.
Still, it's good to have these frameworks start getting put into place,
and I'll be interested to see how they evolve in the months to come.
For now, that's going to do it for today's AI Breakdown.
Until next time, peace.
