Everyday AI Podcast – An AI and ChatGPT Podcast - EP 90: How To Tackle AI Privacy and Governance

Episode Date: August 29, 2023

Data privacy has become a major concern for businesses and individuals when it comes to using generative AI tools. What happens to our data? How can we make sure to govern AI properly? Today Katharina Koerner from Tech Diplomacy Network joins us to discuss AI privacy and governance.

Newsletter: Sign up for our free daily newsletter
More on this: Episode Page
Join the discussion: Ask Katharina and Jordan questions about AI and research
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Timestamps:
[00:01:10] Daily AI news
[00:04:20] Understanding AI governance and privacy
[00:09:40] How to balance using AI but keeping privacy
[00:12:45] Privacy-enhancing technologies for secure data sharing
[00:16:00] AI's impact on democracy
[00:20:20] Can AI models be fully transparent?
[00:23:00] Are certain countries ahead in AI governance?

Topics Covered in This Episode:
- Importance of data privacy and concerns about AI tools
- Announcement of Google's new way of watermarking AI images
- Microsoft president's warning about the need for human control in AI
- Release of the enterprise version of ChatGPT by OpenAI
- Introduction of guest expert Katharina Koerner from the Tech Diplomacy Network
- General explanation of AI governance and privacy
- Globally accepted privacy principles and their application to AI systems
- Specific issues related to the origin of data used by large language models
- Differences in regulations around web scraping between the US and Europe

Keywords:
AI, data privacy, AI tools, generative AI tools, AI images, DeepMind, SynthID, watermark, label images, Microsoft president, AI needs human control, tool AI news, AI governance, ChatGPT enterprise, OpenAI, ChatGPT, tech diplomacy network, privacy principles, data quality, data collection limitation, use limitation, transparency, accountability

Transcript
Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. Data privacy and AI aren't always two things that go hand in hand.
Starting point is 00:00:54 It's something that I think so many businesses and individuals are worried about when it comes to using different generative AI tools: what happens with my data? We're going to tackle that today with an expert in AI privacy, who I'm very excited to have on the show. So if you're joining us live, please take part in the conversation, ask some questions. I think it's going to be a very informative conversation. So my name is Jordan Wilson and this is Everyday AI. It's a daily live stream podcast and free daily newsletter helping everyday people like me, like you, not just make sense of what's going on in the world of AI, but how we can actually make it work for us. All right. So before we get in our discussion on data privacy,
Starting point is 00:01:42 and AI governance. Let's go ahead and go over what's happening in the world of AI news. So we actually have a couple of things related to privacy and governance. So starting off, Google just announced a new way that they are going to be watermarking AI images. So they obviously announced this a couple months back at their annual conference. But it was just released now, you know, within the last couple of hours, by DeepMind, which is essentially Google's AI specialty arm, so to speak. So this is a DeepMind product, and it's called SynthID,
Starting point is 00:02:19 and it will watermark and label images that have been created with AI. So it's going to be interesting to see how that one plays out. All right. Next, the Microsoft president just said that AI needs human control to be safe. All right. So in an exclusive interview with CNBC, Microsoft president Brad Smith warned, and I quote, that AI has the potential to become both a tool and a weapon. So this is obviously something we've heard about before.
Starting point is 00:02:52 But if you wanted to check out more on what Brad Smith said, make sure to check out that section in the newsletter. All right. And our third news story of the day, it might be last, but definitely not least. So ChatGPT Enterprise has been released. So this was announced months ago, and we covered it on the show when it was announced. But OpenAI has finally released the enterprise version of ChatGPT, a much more locked-down version, kind of geared more toward data privacy.
Starting point is 00:03:26 So this is something that OpenAI has been working on for many months. I don't even know what all of these terms mean. It's SOC 2 compliant. It's a certain level of compliance with data, I believe. So a couple other things, details about this ChatGPT Enterprise. Not everything's been released. So at least right now, there's no price tag on it yet. But we do know it will be faster with no caps or limits on GPT-4 and longer context windows
Starting point is 00:04:00 as well. So exciting news on the enterprise front, because I know that so many companies are a little hesitant with their data, and they don't necessarily want to hand it over to OpenAI or some of these larger companies. So what a great transition piece for our expert for today. We're going to be talking AI privacy and governance. So I'm extremely excited to bring on our guest for today. So please welcome to the show Katharina Koerner from the Tech Diplomacy Network. Katharina, thank you for joining us. Thank you for inviting me. Great to be on the show.
Starting point is 00:04:40 Absolutely. This is going to be a fun one. This is such a topic that I think a lot of people are talking about. But, Katharina, if you could, just start us off very general, kind of break down this complex world of AI governance and privacy, just kind of so the everyday person can understand. What does AI governance and privacy even mean to the rest of us? Thank you. Thank you for that question. Super important question. So I thought I would start with mentioning that we have some globally accepted privacy principles, which are embedded in many, many privacy regulations around the world, which are, by the way, popping up here and everywhere, like just getting more and more. And of course, all of those privacy principles also apply to AI systems in case
Starting point is 00:05:33 they process personal data. So, which are those? So first of all, we have data quality, for example. That means making sure the data used by AI systems is accurate and reliable. So, for example, if you're teaching a robot to recognize different kinds of fruits, if the data it learns from is full of mistakes, for example, calling apples oranges, that wouldn't be very helpful. So the second one is data collection limitation.
Starting point is 00:06:03 So data minimization is a very important principle. Always use just the data you really need to collect for a specific task and do not gather extra information. So for example, if you have a fitness app on your phone, if it's asking for your location, but it doesn't need that to count your steps, it's not following this principle. Then we have the very important principle of purpose specification. That means the data you collect and process as an organization
Starting point is 00:06:31 should only be used for the purpose it was collected for. So let's say you sign up for a gaming app. So the email you use for that gaming app to send you game-related stuff should of course not be used to advertise other products. Then we have use limitation. This means that you should not use data for anything else other than the original purpose without permission. So you have photos on your social media platform.
Starting point is 00:07:04 All of a sudden, those photos are used for something else. That would be a breach of this principle. And then what is super important, and I think a lot of people are very aware of this, is transparency. This means being open about what data is collected, why it's collected, how it's used. Let's say we order something in a restaurant or we have a recipe. Of course, we want to know which ingredients are used. Maybe I'm allergic to something. So you trust the meal more if you know what was used.
Starting point is 00:07:34 And then lastly, I want to mention accountability. If something goes wrong, there should be someone responsible. So those are some privacy principles that also apply to AI systems. And of course, as expected, there are some particular issues on top of that when it comes to AI. So for example, a big issue is the origin of the data that large language models use, the basis for services such as ChatGPT, as you mentioned. And here, usually the data is crawled from the internet, and we have different regulations around the world. It's pretty complex, the world of data scraping, I mean the regulations about web scraping. So in the United
Starting point is 00:08:18 States, in general, the law says if information is freely available in the public, businesses can use it. So if you have, I don't know, if you wrote something on a poster on a public bulletin board, anyone can read it, anyone can use it, except for when the website says you can't take the information, or if it's behind a paywall, or if it's behind your credentials, then you shouldn't. I mean, you're not allowed to. But in Europe, on the other hand, any personal information, whether it's public or not, is protected by the GDPR. So if someone wants to use personal data, they need a good reason and the permission.
Starting point is 00:08:55 It's like borrowing a book from the library: if you don't ask, it's not allowed. So that's something that is very, very relevant in the context of AI and LLMs. And should I, do you want to, I mean, I could go on and on and on. I don't want to, like, take over your show. No, no, I mean, I think just right there we hit on so many different points that are so important, you know, making sure data is accurate and reliable, only using it for the correct purposes, which I think is super important. But also, you know, transparency and, you know, understanding that different countries, you know, have different rules and regulations.
Starting point is 00:09:42 I think it's important because I think, you know, even here on the show, a lot of times we're talking about how things impact the U.S. But, you know, as you see here, you know, May Britt is joining us from Europe, you know, Val right here with a comment is joining us from South America. So we do have people tuning in from all over, you know, Bronwyn joining us from South Africa. So thank you all for joining and thank you for your comments. But one thing, Katharina, that I wanted to talk a little bit more about is kind of how, you know, companies or even individuals or just us as a society, how do we strike a balance between leveraging AI's capabilities? Because we always hear about it, and all of the things that generative AI can do or make are so exciting.
Starting point is 00:10:29 So how do we strike that balance between AI's capabilities and making sure data kind of remains how it should, which is private or only the data that we are wanting a company to collect for the purposes? So how do we strike that balance?
Starting point is 00:11:47 Well, that's a very, very good question to ask. And I don't know if that's too, goes too far from like the main topic
Starting point is 00:12:12 of your show, but I have done a lot of work and research in privacy-enhancing technologies. So that's usually the answer I give. So a whole new field has emerged with, like, a vibrant startup community. Google is using it. Microsoft is implementing it in its products, something that is called privacy-preserving machine learning, and a huge research ecosystem as well.
Starting point is 00:12:37 So because it's true that with traditional privacy protections, often the utility of the data decreases. It might even decrease the utility of the data while not even protecting the data very well, which was the first intention. So one example is the protection of personal health information under HIPAA. It's a classic example. So that's the US Health Insurance Portability and Accountability Act. So there are a lot of entities covered under HIPAA. So if you process personal health information, you have to comply with this law. It's a federal law in the United States. And under HIPAA, personal health information should be de-identified for protecting patient
Starting point is 00:13:23 privacy. So to support research, reducing risk, promoting data sharing between hospitals so that we can have better insights in this personal health information. And one accepted method for this is the Safe Harbor method. And that involves that you remove 18 direct identifiers from the data, like name, social security number, medical record numbers, et cetera. So that law was from the 90s. And they really put this into law, although usually law is tech neutral, so it's formulated and phrased in a way that, you know, it can go with time. But HIPAA really said, strip those 18 identifiers from personal health information, then you're more or less good to go.
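The Safe Harbor idea described here, stripping direct identifiers from a record, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names are hypothetical, and the set below lists just five identifier types, not the full list of 18 categories.

```python
# Hypothetical field names for illustration; the real Safe Harbor list
# covers 18 categories of identifiers, of which only a few appear here.
DIRECT_IDENTIFIERS = {"name", "ssn", "medical_record_number", "email", "phone"}


def deidentify(record):
    """Return a copy of the record without the listed direct identifiers."""
    return {key: value for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS}


# A made-up patient record, not real data.
patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "medical_record_number": "MRN-12345",
    "diagnosis": "asthma",
    "blood_pressure": "120/80",
}

clean = deidentify(patient)
print(clean)
```

The sketch also hints at the trade-off raised in the conversation: every stripped field is information lost for research, while the fields that remain can still be combined to re-identify a person.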
Starting point is 00:14:07 But in fact, you strip so many identifiers for the reason of privacy protection that you lose a lot of information. Plus, it is not even good privacy protection, because there are many re-identification attacks that can be very successfully, you know, conducted on this HIPAA-protected health data. And so now here is this new ecosystem, these new technologies and approaches that have emerged over the last couple of years that can unlock the valuable
Starting point is 00:14:41 insights from the data while protecting privacy. And those are, to mention a few of them, differential privacy. So differential privacy is kind of, you randomize responses in a dataset. So you cannot say 100% whether your data is in this dataset or not. You can mathematically prove that it's not possible to tell if your data was in the dataset and contributed to the output, yes or no.
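One classic way to randomize responses is the coin-flip "randomized response" survey technique. It is only one of several differential privacy mechanisms, and the sketch below picks it as an assumption rather than as the method the speaker had in mind. Each person answers truthfully only half of the time, so no single reported answer reveals the truth, yet the overall rate can still be estimated. The population size, the 30% true rate, and the function names are made up for the demo.

```python
import random


def randomized_response(truth, rng):
    """Report the true answer on heads, a random answer on tails."""
    if rng.random() < 0.5:       # first coin: heads, answer truthfully
        return truth
    return rng.random() < 0.5    # tails: a second fair coin decides


def estimate_true_rate(reports):
    """Invert the noise: P(report yes) = 0.5 * p + 0.25, so p = 2 * observed - 0.5."""
    observed = sum(reports) / len(reports)
    return 2 * observed - 0.5


rng = random.Random(0)                           # fixed seed for a repeatable demo
true_answers = [i < 3000 for i in range(10000)]  # 30% of people would say "yes"
reports = [randomized_response(t, rng) for t in true_answers]
estimate = estimate_true_rate(reports)
print(round(estimate, 3))
```

With this setup a "yes" report is three times as likely from a true "yes" as from a true "no", which corresponds to a privacy parameter of epsilon = ln 3 for each individual answer.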
Starting point is 00:15:10 Then we have synthetic data. So you build a new data set that mimics the patterns from the original data set, and then you can use it. or there's something new called, I think it's a very poetic name, homomorphic encryption. It's like, you know, it's on a maturity curve. It's going up. But it means that you encrypt data and you can still process and have, like, you know, conduct computation on this encrypted data without decrypting it.
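To make "computing on encrypted data without decrypting it" concrete, here is a toy sketch of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. Paillier is chosen here as an assumption for illustration; it is not named in the conversation, and it supports only addition, not arbitrary computation. The primes are tiny demo values and the code is not secure.

```python
import math
import random

# Tiny demo primes; real deployments use primes of 1024 bits or more.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
G = N + 1                                          # standard simple generator choice
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # modular inverse (Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < N under the public key (N, G)."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:                     # r must be invertible mod N
        r = random.randrange(1, N)
    return (pow(G, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext):
    """Recover the plaintext using the private values LAM and MU."""
    x = pow(ciphertext, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQ


c_sum = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(c_sum))  # the party that did the addition never saw 1200 or 345
```

The party holding only the ciphertexts can perform the addition, while only the holder of the private key can read the result.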
Starting point is 00:15:41 Pretty similar is trusted execution environments. That's a hardware solution where you also work, you have like a secure enclave, and the encrypted data goes in, only inside it's decrypted and processed and computed upon, and then it leaves the enclave in an encrypted way. Or we have secure multi-party computation, so meaning you can have a common computation and output, but you will never know what the input actually was. A classic example is often you can conduct an analysis of incomes. What's the average income, for example, or what's the disparity
Starting point is 00:16:21 between female and male incomes, but you will not see what the actual input was. So those are some of those privacy-enhancing technologies, and they are politically supported globally. So we have the US National Strategy to Advance Privacy-Preserving Data Sharing and Analytics, from early this year. I wrote an article somewhere, maybe we can post it or whatever. Like, there's so much policy support for those PETs because they're so promising, because some of them, as I mentioned,
Starting point is 00:16:47 even protect data in use. So we do know the protection of data at rest, so securing data when it's stored. That's state of the art. And when it's sent, you protect data in transit. But some of those new technologies can also protect data while in use. And that's a pretty big thing. So we will get there.
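The income example for secure multi-party computation mentioned above can be sketched with additive secret sharing, one common building block for such protocols. This is a simplified illustration under stated assumptions: the three incomes, the modulus, and the function name are made up, all parties are simulated in one process, and a real protocol would use cryptographically secure randomness and a network between the parties.

```python
import random

MODULUS = 2**61 - 1  # all arithmetic is done modulo a large number


def make_shares(secret, n_parties, rng):
    """Split a secret into n random-looking shares that sum to it mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


incomes = [52000, 61000, 75000]   # each value is known only to its owner
rng = random.Random(42)           # demo only; not secure randomness
n = len(incomes)

# Each person splits their income and hands one share to every party.
all_shares = [make_shares(income, n, rng) for income in incomes]

# Party j adds up the shares it received and publishes only that subtotal.
subtotals = [sum(all_shares[i][j] for i in range(n)) % MODULUS
             for j in range(n)]

total = sum(subtotals) % MODULUS
average = total / n
print(total, round(average, 2))
```

Any single share or subtotal looks like a random number, so no party learns another person's income, yet the published subtotals add up to the exact total.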
Starting point is 00:17:09 And of course, like, yeah, please. Wow. No, I'm just saying like, I am, y'all, I talk about AI every single day. And I don't know about you all listening and tuning in. I am getting a first class education right now from Katharina. Like I have so many notes. And if you're like me and if you can't keep up with everything, you know, I'm trying to Google things. I'm writing things down. Don't worry. We are going to be sharing everything in the newsletter. But I do have, you know, one more big question.
Starting point is 00:17:41 I think, Katharina, before maybe we can get to a comment or two. So if you do have a question, make sure to get it in now. But I want to talk about, you know, because you kind of talked about how data privacy and AI governance is so much different in different countries. But one thing that I think is really going to be on people's minds, especially here in the U.S., is when we talk about AI's impact on democracy, right? Because I think especially here in the U.S., we've already seen it, different political groups using AI in ways it was probably not intended for, to make it seem like a political
Starting point is 00:18:20 opponent maybe said something or did something they did not do. So I mean, can you just talk just quickly about kind of the risks of AI just on democracy and how it might influence public opinion, elections, and even just how we understand democracy? Yeah, I mean, that's a big concern for sure. And with the help of AI or machine learning, you for sure can aim or try to influence public opinion in elections. So first of all, with election campaigns, for example, you can target specific groups with tailored messages. I mean, that's nothing new, but because you can analyze this vast amount of data in such a great way, you can identify potential swing voters or craft messages to really influence their decisions, micro-targeting,
Starting point is 00:19:16 so that all leads to manipulation. Manipulation can also be achieved by, you know, AI-generated deepfake videos. We had examples where, you know, videos were made as if people had made those videos, and that was not those people. You have filter bubbles, more or less, so AI algorithms can personalize content,
Starting point is 00:19:42 showing individuals information that only aligns with their existing beliefs. We have, of course, security risks. That's a big thing. So AI can be used by malicious actors to hack into election systems, manipulate voter registration data, disrupt the voting process itself. And transparency and accountability, I think that's also a big topic also in this regard. So the use of AI in elections and public discourse raises the questions: who is responsible for the AI algorithms that influence public opinion,
Starting point is 00:20:15 how can we ensure also in this context that AI is used ethically and responsibly in the political sphere? And this is why in general also with like, you know, I'm sure you talked about hallucinations by LLMs or services like ChatGPT, like wrong outputs. I think that education is really so important. I don't know exactly how it works that AI education will find its way into our public education system. But I recently started an initiative just here for the
Starting point is 00:20:47 Bay Area. So any school in the Bay Area that wants me to come and explain AI to them, like to the kids, I'm happy to do that because I think that's just so important. Just as not every single book is a good book just because it's a book, right? And it's not always right what's in that book just because it's a book. And the same is true with actually anything that is on the internet and any output by any system: you can never 100% trust it. You should always use your own, I don't know, common sense or check it against different sources. So we have to raise this awareness that we're still self-responsible for content that we get from the internet and from really super cool applications like ChatGPT, et cetera.
Starting point is 00:21:31 Yeah, absolutely. And I couldn't agree more with you, Katharina, about the need for better AI education in the school systems. Absolutely, because it seems like so many schools, I just had an episode on this recently. So many schools are, you know, shutting it down or not allowing, you know, students to use it when, personally, I think that's not a good idea. But, okay, so we've covered so many things, but I do hope we can get a question or two in from our audience. So Nancy saying, good morning. Glad to be back on the live show. Great to have you, Nancy. Cecilia, saying, is indeed one of the most important assets to manage right after our relationships and strategies.
Starting point is 00:22:14 Couldn't agree more. So a question here, if you wanted to take this one, so Ben is saying, thanks for the comments. And he's asking, is it realistic to expect AI models to be fully transparent and explainable? I think that's a great question. What would you respond to on that one, Katharina? Yes. Yes, yes, yes. Very good question. So I think for some models or some applications, like white-box applications, it is possible. So, I mean, the majority of AI is machine learning. So if we have a simple, I don't know, recommender system, and the model is a decision tree, so, I mean, it's really a decision tree, like, I don't know, you know, apples or oranges, and then does it have a worm or not, or whatever, so in the end you have a
Starting point is 00:23:06 healthy apple or whatever. This is an explainable model. It already gets a little bit more complicated if you have a simple model that is still tricky, like a random forest. We have hundreds of decision trees, and then you compare the outputs of those and you find one model.
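The apples-and-oranges decision tree can be written out by hand to show why such a model counts as explainable: every prediction comes with the exact list of rules that produced it. The feature names and rules below are hypothetical, chosen only to mirror the spoken example.

```python
def classify_fruit(fruit):
    """Return (label, path), where path lists every rule that fired."""
    path = []
    if fruit["color"] == "orange":
        path.append("color is orange")
        return "orange", path
    path.append("color is not orange")
    if fruit["has_worm"]:
        path.append("has a worm")
        return "bad apple", path
    path.append("no worm")
    return "healthy apple", path


label, path = classify_fruit({"color": "red", "has_worm": False})
print(label, "because", " and ".join(path))
```

A random forest would run hundreds of such trees and take a vote, so there is no longer one short path to read off, which is why explanations get harder as models grow.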
Starting point is 00:23:28 But of course we have black-box models, and there are many approaches trying to solve the explainability issue, but I do not see that the issue is solved. So I think in the future, or actually already right now, I think we will see more AI-specific applications, I mean, sector-specific applications of AI, so that models are built from the outset with explainability in mind. So I know, like I recently came across one company focusing only on lending decisions and really having this approach from the very beginning on. And this is a very natural, a very, you know, foreseeable development, I would say, because we have those principles, privacy by design. So you have to build privacy in the whole architecture from the design phase on.
Starting point is 00:24:24 Or we have security by design. So now we have responsible AI by design. So you have to build them from the beginning. I think with a lot of models that have already been built, you will probably not achieve full transparency and explainability. So I'm totally with you here. Wow. What a great response tackling that from all possible angles.
Starting point is 00:24:44 So I think we have time for hopefully one more question here, Katharina. So May Britt asking, because you know, you talked already about how different countries are handling privacy differently. So May Britt's question is, do you find a specific country or area more effective in their AI privacy and AI governance? Because obviously, you know, the EU and some other, you know, countries throughout the world kind of govern their AI privacy and data much differently than here in the USA, which essentially, at least for now, is self-governance from the largest companies here in the U.S. So what's your take?
Starting point is 00:25:27 Are there certain, you know, in your opinion, in your experience, are there certain areas or countries that are maybe more effective at data governance and AI privacy? I mean, yes, of course, the first thought is the EU has already this gold standard of data protection with the GDPR and is coming up with the EU AI Act, which will also have extraterritorial effect, because if you offer your services in the EU, you will also have to comply with the EU AI Act,
Starting point is 00:25:59 classifying AI systems into different risk categories, etc. But, I mean, what is effective? I mean, there's also this concern that the new AI Act is so effective that it actually might, you know, hinder, help me here with the word, like, you know, the startup scene,
Starting point is 00:26:26 like, you know, make it more difficult. Growth? Yeah, it might stop companies from being able to grow fast. Yeah, yeah. So, I mean, what is effective? And also I just wanted to mention, because you talked about the US, we do already have a lot of regulations also in the US which apply to AI.
Starting point is 00:26:43 It's not that there is nothing here. So we have a lot of sectoral law. There are a lot of state laws now popping up which tackle AI. A lot of AI issues are also regulated under privacy law. So it's not nothing. It's just so complex. It's like a patchwork. So more effective, but I think more effective would mean there's legal clarity. So I think if we have one law in the EU and maybe one law or executive order or whatever in the United States on a federal level, that will make it more effective, and I think this will be coming. So this was a kind of, yeah, not so clear of an answer,
Starting point is 00:27:22 but, you know, some thoughts. No, I mean, but that's such a good point because, you know, one person's definition of effective AI privacy and governance might mean to someone else, you know, stunting business growth or, you know, keeping companies from fully scaling at the pace to which they might want. So we tackled so many important topics on today's show. Katharina, thank you so much for joining us because I know we went all over the horn. Thank you for sharing your expertise and knowledge with us all.
Starting point is 00:27:57 Thank you so much for joining. Thank you so much for having me. Thank you, everyone. All right. And just as a quick reminder, like Val is saying, looking for the newsletter on the topic, don't worry, go to YourEverydayAI.com, or if you're listening on the podcast, just look at the show notes.
Starting point is 00:28:13 You'll find a link there because we're going to be sharing some of the articles that Katharina mentioned that she wrote. And we had so much in this episode dealing with AI privacy and governance. So don't worry, go sign up at YourEverydayAI.com. So thank you for joining us. And we hope to see you back again for another edition of Everyday AI. Thanks, y'all.
Starting point is 00:28:46 And that's a wrap for today's edition of Everyday AI. Thanks for joining us.
Starting point is 00:29:20 If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
