Everyday AI Podcast – An AI and ChatGPT Podcast - EP 90: How To Tackle AI Privacy and Governance
Episode Date: August 29, 2023

Data privacy has become a major concern for businesses and individuals when it comes to using generative AI tools. What happens to our data? How can we make sure to govern AI properly? Today Katharina... Koerner from Tech Diplomacy Network joins us to discuss AI privacy and governance.

Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com

Timestamps:
[00:01:10] Daily AI news
[00:04:20] Understanding AI governance and privacy
[00:09:40] How to balance using AI but keeping privacy
[00:12:45] Privacy-enhancing technologies for secure data sharing
[00:16:00] AI's impact on democracy
[00:20:20] Can AI models be fully transparent?
[00:23:00] Are certain countries ahead in AI governance?

Topics Covered in This Episode:
- Importance of data privacy and concerns about AI tools
- Announcement of Google's new way of watermarking AI images
- Microsoft president's warning about the need for human control in AI
- Release of the enterprise version of ChatGPT by OpenAI
- Introduction of guest expert Katharina Koerner from the Tech Diplomacy Network
- General explanation of AI governance and privacy
- Globally accepted privacy principles and their application to AI systems
- Specific issues related to the origin of data used by large language models
- Differences in regulations around web scraping between the US and Europe

Keywords: AI, data privacy, AI tools, generative AI tools, AI images, deep mind, Synth ID, watermark, label images, Microsoft president, AI needs human control, tool AI news, AI governance, ChatGPT enterprise, OpenAI, ChatGPT, tech diplomacy network, privacy principles, data quality, data collection limitation, use limitation, transparency, accountability
Transcript
This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
Data privacy and AI aren't always two things that go hand in hand.
It's something that I think so many businesses and individuals are worried about when it comes
to using different generative AI tools is what happens with my data?
We're going to tackle that today with an expert in AI privacy, who I'm very excited to have on
the show. So if you're joining us live, please take part in the conversation, ask some questions.
I think it's going to be a very informative conversation. So my name is Jordan Wilson and this is
Everyday AI. It's a daily live stream podcast and free daily newsletter helping everyday people
like me, like you, not just make sense of what's going on in the world of AI, but how we can
actually make it work for us. All right. So before we get into our discussion on data privacy
and AI governance,
let's go ahead and go over what's happening in the world of AI news.
So we actually have a couple of things related to privacy and governance.
So starting off, Google just announced a new way that they are going to be watermarking AI images.
So they obviously announced this a couple months back at their annual conference.
But it was just released now, you know, within the last couple of hours, by DeepMind,
which is essentially Google's AI specialty arm, so to speak.
So this is a DeepMind product, and it's called SynthID,
and it will watermark and label images that have been created with AI.
So it's going to be interesting to see how that one plays out.
All right.
Next, the Microsoft president just said that AI needs human control to be safe.
All right.
So in an exclusive interview with CNBC, Microsoft president,
Brad Smith warned, and I quote, that AI has the potential to become both a tool and a weapon.
So this is obviously something we've heard about before.
But if you wanted to check out more on what Brad Smith said, make sure to check out that section in the newsletter.
All right.
And our third news story of the day.
It's definitely last but not least.
So ChatGPT Enterprise has been released.
So this was announced months ago, and we covered it on the show when it was announced.
But OpenAI has finally released the enterprise version of ChatGPT,
a much more locked-down version, kind of geared more toward data privacy.
So this is something that OpenAI has been working on for many months.
I don't even know what all of these terms mean.
It's SOC 2 compliant.
It's a certain level of compliance with data, I believe.
So a couple other details about this ChatGPT Enterprise.
Not everything's been released.
So at least right now, there's no price tag on it yet.
But we do know it will be faster with no caps or limits on GPT-4 and longer context windows
as well.
So exciting news on the enterprise front,
because I know that so many companies are a little hesitant with their data, and they don't necessarily want to hand it over to OpenAI or some of these larger companies.
So what a great transition piece for our experts for today.
We're going to be talking AI privacy and governance.
So I'm extremely excited to bring on our guests for today.
So please welcome to the show Katharina Koerner from the Tech Diplomacy Network.
Katharina, thank you for joining us. Thank you for inviting me. Great to be on the show.
Absolutely. This is going to be a fun one. This is such a topic that I think a lot of people are
talking about. But, Katharina, if you could, just start us off very general, kind of break down
this complex world of AI governance and privacy, just kind of so the everyday person can
understand. What does AI governance and privacy even mean to the rest of us?
Thank you. Thank you. Thank you for that question. Super important question. So I thought I would start with
mentioning that we have some globally accepted privacy principles, which are embedded in many, many privacy
regulations around the world, which are, by the way, popping up here and everywhere, like just getting
more and more. And of course, all of those privacy principles also apply to AI systems in case
they process personal data.
So, which are those?
So first of all, we have data quality, for example.
That means making sure the data used by AI systems is accurate and reliable.
So, for example, if you're teaching a robot to recognize different kinds of fruits,
if the data it learns from is full of mistakes, for example,
calling apples oranges, that wouldn't be very helpful.
So second one is data collection limitations.
So data minimization is a very important principle.
Always use just the data you really need to collect for a specific task
and do not gather extra information.
So for example, if you have a fitness app on your phone,
if it's asking for a location, but it doesn't need that to count your steps,
it's not following this principle.
Then we have the very important principle of purpose specification.
That means the data you collect and process as an organization
should only be used for the purpose it was collected for.
So let's say you sign up for a gaming app.
So the email you use for that gaming app to send you game-related stuff
should of course not be used to advertise other products.
Then we have use limitation.
This is meant in that way that you should not use data for anything else
other than the original purpose without permission.
So you have photos on your social media platform.
All of a sudden, those photos are used for something else.
That would be a breach of this principle.
And then what is super important, and I think a lot of people are very aware of this, is transparency.
This means being open about what data is collected, why it's collected, how it's used.
Let's say we order something in a restaurant or we have a recipe.
Of course, we want to know which ingredients are used.
Maybe I'm allergic to something.
So you trust it more if you know what was used.
And then lastly, I want to mention accountability.
If something goes wrong, there should be someone responsible.
So those are some privacy principles that also apply to AI systems.
And of course, as expected, there are some particular issues on top of that when it comes to AI.
So for example, a big issue is the origin of the data that large language
models use, the basis for services such as ChatGPT, as you mentioned. And here, usually the data
is crawled from the internet, and we have different regulations around the world. It's pretty
complex, the world of data scraping, I mean the regulations about web scraping. So in the United
States, in general, the law says if information is freely available in the public, businesses can use
it. So if you have, I don't know, if you wrote something on a poster on a public
bulletin board, anyone can read it, anyone can use it, except for when the website says you can't take
the information, or if it's behind a paywall, or if it's behind your credentials. Then you shouldn't.
I mean, you're not allowed to.
But in Europe, on the other hand, any information, if it's public or not, is considered personal
information that is protected by the GDPR.
So if someone wants to use personal data, they need a good reason and the permission.
It's like when you borrow a book from the library: if you don't ask, it's not allowed. So that's
something that is very, very relevant in the context of AI and LLMs. And should I, do you want to,
I mean, I could go on and on and on. I don't want to take over your show. No, no. I mean, I think
just right there we hit on so many different
points that are so important, you know, making sure data is accurate and reliable,
only using it for the correct purposes, which I think is super important.
But also, you know, transparency and, you know, understanding that different countries,
you know, have different rules and regulations.
I think it's important because I think, you know, even here on the show, a lot of times
we're talking about how things impact the U.S.
But, you know, as you see here, you know, Maybritt is joining us from Europe, you know,
Val right here with a comment is joining us from South America.
So we do have people tuning in from all over, you know, Bronwyn joining us from South Africa.
So thank you all for joining and thank you for your comments.
But one thing, Katharina, that I wanted to talk a little bit more about is kind of how, you know, companies or even individuals or just us as society, how do we strike a balance between leveraging AI's capabilities?
because we always hear about it and all of the things that generative AI can do or make is so exciting.
So how do we strike that balance between AI's capabilities and making sure data kind of remains how it should,
which is private or only the data that we are wanting a company to collect for the purposes?
So how do we strike that balance?
Well, that's a very, very good question to ask.
And I don't know if that
goes too far from the main topic
of your show, but I have done a lot of work
and research in privacy-enhancing technologies.
So that's usually the answer I give.
So a whole new field has emerged
with like a vibrant startup community.
Google is using it.
Microsoft is implementing it in their products,
something that is called privacy preserving machine learning and a huge research ecosystem as well.
So because it's true that with traditional privacy protections, often the utility of the data decreases.
It might even decrease the utility of the data while not even protecting the data very well,
which was the first intention.
So one example is the protection of personal health
information under HIPAA. It's a classic example. So that's the U.S. Health Insurance Portability
and Accountability Act. So there are a lot of entities covered under HIPAA. So if you process personal
health information, you have to comply with this law. It's a federal law in the United States.
And under HIPAA, personal health information should be de-identified for protecting patient
privacy. So to support research, reducing risk, promoting data sharing between hospitals so that we can
have better insights into this personal health information. And one accepted method for this is the Safe
Harbor method. And that involves removing 18 direct identifiers from the data, like name,
Social Security number, medical record numbers, et cetera. So that law was from the 90s. And they really put this
into law, which is unusual, because usually law is tech neutral.
So they formulate and phrase it in a way that, you know, it can go with time.
But HIPAA really said, strip those 18 identifiers from personal health information,
then you're more or less good to go.
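To make the Safe Harbor idea concrete, here is a minimal Python sketch of stripping direct identifiers from a record. The field names and the sample record are hypothetical, and only the three identifiers named in the conversation are listed, not the full set of 18.

```python
# Hypothetical field names; this is only a subset of the 18 Safe Harbor identifiers.
DIRECT_IDENTIFIER_FIELDS = {"name", "ssn", "medical_record_number"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without the listed direct identifiers."""
    return {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIER_FIELDS
    }


# Made-up example record, for illustration only.
patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "medical_record_number": "MRN-1",
    "diagnosis": "asthma",
    "year_of_visit": 2023,
}
deidentified = strip_identifiers(patient)
```

Only the diagnosis and the year of the visit survive in `deidentified`, which illustrates the trade-off discussed next: every removed field is also information a researcher can no longer use.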
But in fact, you strip so many identifiers for the reason of privacy protection
that you lose a lot of information.
Plus, it is not even good privacy protection because there are many re-identification
attacks that can be very successfully, you know,
conducted on this HIPAA-protected health data.
And so now here is this new ecosystem,
these new technologies and approaches that have emerged
over the last couple of years that can unlock the valuable
insights from the data while protecting privacy.
And those are, just to mention a few of them,
differential privacy.
So differential privacy is kind of,
you randomize responses in a data set.
So you cannot say 100% whether your data is in this data set or not.
You can mathematically prove that it's not possible to tell if your data was in the
dataset and contributed to the output, yes or no.
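One classic way to randomize responses is the randomized-response mechanism. The Python sketch below is an illustration under assumptions, not what any particular product does: each person reports the truth with probability 1/2 and a fair coin flip otherwise, which corresponds to a privacy parameter of ln 3. The population and the 30% "yes" rate are made up.

```python
import random


def randomized_response(true_answer: bool, rng: random.Random) -> bool:
    """Report the truth with probability 1/2, otherwise report a fair coin flip."""
    if rng.random() < 0.5:
        return true_answer
    return rng.random() < 0.5


def estimate_true_rate(reports: list[bool]) -> float:
    """Undo the noise in aggregate: P(report yes) = 0.5 * true_rate + 0.25."""
    observed_rate = sum(reports) / len(reports)
    return (observed_rate - 0.25) / 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_answers = [i < 30_000 for i in range(100_000)]  # made-up population, 30% "yes"
reports = [randomized_response(answer, rng) for answer in true_answers]
estimate = estimate_true_rate(reports)
```

No single report reveals what that person truly answered, yet `estimate` lands close to the real 30% because the noise averages out over many people.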
Then we have synthetic data.
So you build a new data set that mimics the patterns from the original data set,
and then you can use it.
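As a deliberately simple illustration of synthetic data, the Python sketch below fits a normal distribution to one made-up column and samples new values from it. Real synthetic data generators model far richer patterns, and sampling like this carries no formal privacy guarantee on its own; the column and helper names are hypothetical.

```python
import random
import statistics


def fit_gaussian(values: list[float]) -> tuple[float, float]:
    """Summarize the original column by its mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(mean: float, stdev: float, count: int, rng: random.Random) -> list[float]:
    """Draw new values that follow the fitted distribution instead of copying rows."""
    return [rng.gauss(mean, stdev) for _ in range(count)]


real_ages = [34, 45, 29, 61, 52, 38, 47, 55, 41, 36]  # made-up original data
mean, stdev = fit_gaussian(real_ages)
synthetic_ages = sample_synthetic(mean, stdev, 10_000, random.Random(0))
```

The synthetic column has roughly the same average and spread as the original, but none of its values is tied to a real person's row.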
Or there's something new called, I think it's a very poetic name, homomorphic encryption.
It's like, you know, it's on a maturity curve.
It's going up.
But it means that you encrypt data and you can still process and have, like, you know,
conduct computation on this encrypted data without decrypting it.
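To show what computing on encrypted data can look like, here is a toy Python sketch of the Paillier scheme, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. This is one example scheme chosen for illustration, the primes are far too small to be secure, and the function names are hypothetical.

```python
import math
import secrets

# Toy primes, far too small for real use; shown only to make the arithmetic visible.
P, Q = 1009, 1013
N = P * Q
N_SQUARED = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # decryption constant when the generator is N + 1


def encrypt(message: int) -> int:
    """Encrypt an integer in [0, N) with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext using the private values LAMBDA and MU."""
    x = pow(ciphertext, LAMBDA, N_SQUARED)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Add two plaintexts by multiplying their ciphertexts, without decrypting."""
    return (c1 * c2) % N_SQUARED


encrypted_total = add_encrypted(encrypt(1200), encrypt(345))
total = decrypt(encrypted_total)
```

Whoever runs `add_encrypted` only ever sees ciphertexts; only the holder of the private values can decrypt the total.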
Pretty similar is trusted execution environments.
That's a hardware solution where you have like a
secure enclave and the encrypted data goes in, only inside it's decrypted and processed and
computed upon and then it leaves the enclave in an encrypted way. Or we have secure multi-party
computation, so meaning you can have a common computation and output, but you will never know
what the input actually was. A classic example is often that you can conduct an analysis
of incomes. What's the average income, for example, or what's the disparity
between female and male incomes,
but you will not see what the actual input was.
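The average-income example can be sketched with additive secret sharing, one common building block of secure multi-party computation. The Python code below simulates all parties in a single process and assumes they follow the protocol honestly; the modulus, function names, and incomes are made up for illustration.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus, comfortably larger than any sum computed here


def split_into_shares(value: int, n_parties: int) -> list[int]:
    """Split a value into random shares that add up to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(inputs: list[int]) -> int:
    """Each party shares its input; only sums of shares are ever published."""
    n = len(inputs)
    # all_shares[i][j] is the share that party i hands to party j.
    all_shares = [split_into_shares(value, n) for value in inputs]
    # Party j adds up the shares it received and publishes just that partial sum.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


incomes = [52_000, 61_000, 48_500]  # made-up private inputs, one per party
average_income = secure_sum(incomes) / len(incomes)
```

Each individual share looks like a random number, so no party learns another party's income, yet the published partial sums still add up to the true total.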
So those are some of those privacy-enhancing technologies,
and they are politically supported globally.
So we have the U.S. National Strategy to Advance Privacy-Preserving
Data Sharing and Analytics, early this year.
I wrote an article somewhere, maybe we can post it or whatever.
Like, there's so much policy support for those PETs
because they're so promising, because some of them, as I mentioned,
even protect data in use.
So we do know the protection of data at rest.
So securing data when it's stored,
that's state of the art, and also when it's sent.
So you protect data in transit.
But some of those new technologies can also protect data while in use.
And that's a pretty big thing.
So we will get there.
And of course, like, yeah, please.
Wow.
No, I'm just saying like, I am, y'all,
I talk about AI
every single day. And I don't know about you all listening and tuning in. I am getting a first class
education right now from Katharina. Like I have so many notes. And if you're like me and if you can't
keep up with everything, you know, I'm trying to Google things. I'm writing things down. Don't worry.
We are going to be sharing everything in the newsletter. But I do have, you know, one more big question.
I think, Katharina, before maybe we can get to a comment or two. So if you do have a question,
make sure to get it in now.
But I want to talk about, you know, because you kind of talked about how data privacy
and AI governance is so much different in different countries.
But one thing that I think is really going to be on people's minds, especially here in the
U.S., is when we talk about AI's impact on democracy, right?
Because I think especially here in the U.S., we've already seen it, different political
groups using AI in ways it was probably not intended, to make it seem like a political
opponent maybe said something or did something they did not do. So I mean, can you just talk
just quickly about kind of the risks of AI just on democracy and how it might influence
public opinion elections and even just how we understand democracy? Yeah, I mean,
That's a big concern for sure.
And with the help of AI or machine learning, you for sure can aim or try to influence public opinion in elections.
So first of all, with election campaigns, for example, you can target specific groups with tailored messages.
I mean, that's nothing new, but you can identify, because you can analyze this vast amount of data in such a great way, you can identify potential swing voters or craft messages.
to really influence the decisions, micro-targeting,
so that all leads to manipulation.
Manipulation can also be achieved by, you know,
AI generated deep fake videos.
We had examples where, you know,
videos were made as if certain people had made those videos,
and that was not those people.
You have filter bubbles, more or less,
so AI algorithms can personalize content,
showing individuals information that only aligns with their existing beliefs.
We have, of course, security risks.
That's a big thing.
So AI can be used by malicious actors to hack into election systems,
manipulate voter registration data, disrupt the voting process itself.
And transparency and accountability, I think that's also a big topic also in this regard.
So the use of AI in elections and public discourse raises the questions
who is responsible for the AI algorithms that influence public opinion,
how can we ensure also in this context that AI is used ethically and responsibly
in the political sphere?
And this is why in general also with like, you know,
I'm sure you talked about hallucinations by LLMs or services like ChatGPT,
like wrong outputs.
I think that education is really so important.
I don't know exactly how it works that AI education will find
its way into our public education system. But I recently started an initiative just here for the
Bay Area. So any school in the Bay Area that wants me to come and explain AI to them, like to the kids,
I'm happy to do that because I think that's just so important. Just as not every single
book is a good book just because it's a book, right? And what's in that
book is not always right just because it's a book. And the same is true with actually anything that is on the internet
and any output by any system, you can never 100% trust.
You should always use your own, I don't know, common sense or prove it with different sources.
So we have to raise this awareness that we're still self-responsible for content that we get
from the internet and from really super cool applications like ChatGPT, et cetera.
Yeah, absolutely.
And I couldn't agree more with you, Katharina, about the need for better AI
education in the school systems. Absolutely, because it seems like so many schools, I just had an
episode on this recently. So many schools are, you know, shutting it down or not allowing, you know,
students to use it when, personally, I think that's not a good idea. But, okay, so we've covered so
many things, but I do hope we can get a question or two in from our audience. So Nancy saying,
good morning. Glad to be back on the live show. Great to have you, Nancy. Cecilia saying
data is indeed one of the most important assets to manage right after our relationships and strategies.
Couldn't agree more. So a question here, if you wanted to take this one, so Ben is saying,
thanks for the comments. And he's asking, is it realistic to expect AI models to be fully transparent
and explainable? I think that's a great question. What would you respond to on that one,
Katharina? Yes. Yes, yes, yes. Very good question. So I think for some
models or some applications, like white box applications, it is possible. So, I mean, the majority of
AI is machine learning. So if we have a simple, I don't know, recommender system and the model is a
decision tree, so I mean it's really a decision tree like, I don't know, you know,
apples or oranges, and then does it have a worm or not or whatever, so in the end you have a
healthy apple or whatever, this
is an explainable model.
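The fruit example can be written out as a tiny hand-built decision tree in Python. This is only a sketch of why such a model is explainable: the function name, features, and labels are hypothetical, and a real tree would be learned from data rather than written by hand.

```python
def classify_fruit(is_apple: bool, has_worm: bool) -> tuple[str, list[str]]:
    """Return a label together with the list of decisions that led to it."""
    decision_path = []
    if is_apple:
        decision_path.append("fruit is an apple")
        if has_worm:
            decision_path.append("it has a worm")
            return "unhealthy apple", decision_path
        decision_path.append("it has no worm")
        return "healthy apple", decision_path
    decision_path.append("fruit is not an apple")
    return "orange", decision_path


label, explanation = classify_fruit(is_apple=True, has_worm=False)
```

Every prediction comes with the exact sequence of checks that produced it, which is what makes this kind of model a white box.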
It already gets a little bit more complicated
if you have a model that is still simple but trickier,
like a random forest.
There we have hundreds of decision trees
and then you compare the outputs of those
and arrive at one model.
But of course we have black box models
and there are many approaches trying to solve
the explainability issue,
but I do not see that the issue is solved.
So I think in the future, or actually already right now, I think we will see more AI-specific applications, I mean, sector-specific applications of AI, so that models are built from the outset with explainability in mind.
So I know, like I recently came across one company focusing only on lender decisions and really having this approach from the very beginning.
And this is a very natural, a very, you know, foreseeable development, I would say, because we have those principles, privacy by design.
So you have to build privacy in the whole architecture from the design phase on.
Or we have security by design.
So now we have responsible AI by design.
So you have to build them from the beginning.
I think with a lot of models that have already been built,
you will probably not achieve full transparency and explainability.
So I'm totally with you here.
Wow.
What a great response tackling that from all possible angles.
So I think we have time for hopefully one more question here, Katharina.
So Maybritt asking, because you know, you talked already about how different countries are handling
privacy differently. So Maybritt's question is, do you find a specific country or area more
effective in their AI privacy and AI governance? Because obviously, you know, the EU and some
other, you know, countries throughout the world kind of govern their AI privacy and
data much differently than here in the USA, which essentially, at least for now, is self-governance
from the largest companies here in the U.S.
So what's your take?
Are there certain, you know, in your opinion, in your experience,
are there certain areas or countries that are maybe more effective
at data governance and AI privacy?
I mean, yes, of course, the first thought is the EU already has this gold standard of data
protection with the GDPR and is coming up with the EU AI Act,
which will also have extraterritorial effect.
because if you will offer your services in the EU,
you will also have to comply with the EU AI Act,
classifying AI systems into different risk categories, etc.
But, I mean, what is effective?
I mean, there's also this concern that
the new AI Act is so effective
that it actually might, you know, hinder,
hinder,
help me here with the word,
like, you know, the startup scene,
like, you know, make it more difficult.
Growth, yeah, it might stop
companies from being able to grow fast. Yeah.
Yeah, so I mean, what is effective?
And also I just wanted to mention, because
you talked about the U.S., as I mentioned,
we do already have a lot of regulations
also in the U.S. which apply to AI.
It's not that there is nothing here.
So we have a lot of sectoral law.
There are a lot of state laws now popping
up which tackle AI. A lot of AI issues are also regulated under privacy law. So it's not
nothing. It's just so complex. It's like a patchwork. So more effective, but I think more effective
would mean there's legal clarity. So I think if we have one law in the EU and maybe one law or
executive order or whatever in the United States on a federal level, that will make it more
effective and I think this will be coming. So this was a kind of, yeah, not so clear of an answer,
but, you know, some thoughts. No, I mean, but that's such a good point because, you know,
one person's definition of effective AI privacy and governance might mean to someone else,
you know, stunting business growth or, you know, keeping companies from fully scaling at the pace
to which they might want.
So we tackled so many important topics on today's show.
Katharina, thank you so much for joining us
because I know we went all over the horn.
Thank you for sharing your expertise and knowledge with us all.
Thank you so much for joining.
Thank you so much for having me.
Thank you, everyone.
All right.
And just as a quick reminder, like Val is saying,
looking for the newsletter on the topic,
don't worry, go to YourEverydayAI.com,
or if you're listening on the podcast, just look at the show notes.
You'll find a link there because we're going to be sharing some of the articles that
Katharina mentioned that she wrote.
And we had so much in this episode dealing with AI privacy and governance.
So don't worry, go sign up at YourEverydayAI.com.
So thank you for joining us.
And we hope to see you back again for another edition of Everyday AI.
Thanks, y'all.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
