Storage Developer Conference - #153: Data Preservation and Retention 101

Episode Date: September 14, 2021

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcasts. You are listening to SDC Podcast, Episode 153. Hello, everybody, and welcome to this session, Data Preservation and Retention 101. Thank you for being here, and welcome to this year's version of the Storage Developer Conference put on by the Storage Networking Industry Association.
Starting point is 00:01:01 And a little bit about myself before we begin. I've been involved with SNIA for quite a long time, and I currently co-chair the Data Protection and Privacy Committee within SNIA. I'm also involved with some other standards organizations that you see here, IEEE in addition to SNIA, and I'm a member of the American Bar Association Science and Technology Law Section, primarily focused on legal issues as they relate to privacy. So I have been a security and privacy professional for over 30 years in the industry, and I'm happy to be here today. So let's get started. So an abstract on what we're covering today. Hopefully that's why you're here. Basically we're going to cover preservation and retention and
Starting point is 00:02:05 unfortunately those terms get used interchangeably quite a bit, and sometimes incorrectly. So what we're going to do is really define the difference between preservation and retention. We'll talk about some issues and considerations as they relate to preservation and retention, we'll also cover some guidelines and best practices, and then lastly we're going to finish with some key takeaways. So grab a cup of coffee and let's dig right in. So first, let's define what a business record is.
Starting point is 00:02:41 This is really going to be important because if you take anything away from this session, it's really understanding that from a preservation and retention perspective, it's all about making sure that business records are what's preserved and retained. It's important to define what a business record is because if it's not a business record, then you don't really have to worry about appropriate preservation and retention of that data. So let's start by defining what a business record is.
Starting point is 00:03:20 An easy way to think about it is basically a record, and by the way, now that I've talked about business records, for the rest of this session I will probably just use the term record, and you'll know that I'm talking about a business record. So a record is documentary material, on any media really, that's created or received in the normal course of business. And there are specific pieces to that. By the way, when I say any media, that actually could even be paper. Today, though, we're actually referring to and will be discussing digital data, not necessarily business records that are on physical paper. So keep that in mind, because we have to set the appropriate context for what we're going to cover today. So the
Starting point is 00:04:16 business record is really, again, any documentary material, and then the key is really the first sub-bullet: that it's worth preserving. It's either temporary or permanent, but it's worth preserving because it provides evidence of the organization's policies, procedures, activities, and decisions, and typically has some amount of technical, administrative, historical, and/or legal value. And so, you know, a sort of tongue-in-cheek example at the end here is that not everything may be considered a business record. A funny example is the lunch menu, which may not necessarily be considered a business record, and you may not want to have to worry about appropriately preserving and maintaining it from a retention perspective. So this is really important. It's really the basis for what we're going to talk about today. So understand that when we talk about a business record, it's really all the stuff that you need to preserve within your organization because it has
Starting point is 00:05:31 value, and that value could be technical, administrative, historical, or maybe potentially some legal value, or a combination thereof. So keep that in the back of your mind as we go through this. So, you know, what the heck is the big deal? And that's sort of what we've covered already. But one of the ISO documents, this technical report that's mentioned here at the top, has kind of an interesting way of looking at a business record. As it defines it here, it constitutes the business memory of daily business action. So that's an interesting way of thinking about what a business record really is. So, you know,
Starting point is 00:06:11 now that we know that business records are what we want to appropriately preserve and retain, then it's about why are we doing it? Well, we have to make sure that we're achieving regulatory compliance. We'll talk about some of those regulations in a minute. We want to guard against maybe some adverse litigation that may happen in the future. We'll talk a little bit about that. And basically, these are records that support the current and maybe some future management decisions that you may want to make within your company. So to achieve this goal, the records have to be retained and, of course, appropriately preserved. And so that's what we're
Starting point is 00:06:58 going to talk about today. So I think we've set the groundwork now for what a business record is. And what we'll do is we'll now cover data preservation and retention. For the rest of the session, I will also typically talk about preservation and I will talk about retention. Just keep in the back of your mind also that I'm talking about data preservation and data retention. And again, the context being digital, in other words, it's stored on media. So first of all, we'll start with defining what data preservation is. And what data preservation is, is the process and operations involved in ensuring the ability to read, interpret, authenticate, protect, secure the data, but also the metadata throughout the lifecycle.
Starting point is 00:07:49 So this takes on a couple different focus areas. One focus area is usability. Well, we want to preserve data from a usability perspective. You have to be able to use the data, in other words, access it and all that. But there's also a focus area referred to as legal, and that is from a legal perspective, making sure that the data is there to address any evidentiary requirements in the case of potential litigation in the future. So those are sort of the major drivers behind preservation at sort of a high level. We'll talk a little bit about regulations in a minute.
Starting point is 00:08:37 So keep that in mind. And, again, I'll probably refer to data preservation as just preservation as we move ahead. So keep in the back of your mind that data preservation is the processes and operations as they relate to your data records. Now we're going to define what retention is. Think of data retention as really the definition of the policies for meeting the legal and business needs. And then it goes into preserving, obviously, the existence and integrity of the data for a specific period of time and/or until certain events have transpired. So it's all about those policies around why we're going to keep the data and, by the way, how long we're going to keep it, based on some regulations or maybe some other reasons that we're going to talk about in a minute.
Starting point is 00:09:31 So, you know, an interesting thing about retention in and of itself is that you're probably retaining the data for a combination of legal and business needs or potential legal issues. You want to make sure that you're compliant with the regulations that you're obligated to follow. We'll talk about what some of those regulations are in a minute. An interesting example that we cover here: let's say it's deemed that email is not a business record, which is actually
Starting point is 00:10:23 pretty unusual. But as a hypothetical example, you may want to keep email for six months. But if it's not a business record, then you might want to just eliminate the email after a very short period of time. If there is, in fact, an e-discovery event, meaning your company is presented with a lawsuit and it goes into what they call e-discovery, that means that you have to turn over evidentiary items, and that may include many different things.
Starting point is 00:11:07 So typically what will happen is a legal hold will be placed on all of your digital data that's deemed business records, although it's actually just about everything that they can grab, and then they're going to figure out later on if it's technically a business record or not. What opposing counsel will do is actually search on specific keywords based on whatever the lawsuit is about. And then those keyword searches will pull up a reference of all these different records that could be associated or help them with the given lawsuit. So that's sort of the back story of what happens in an e-discovery event in terms of things get placed on a legal hold. When we say the e-discovery event overrides normal policy,
Starting point is 00:12:12 what we mean is that as soon as you're hit with an e-discovery event, you are not allowed to start deleting data, even if that was part of your retention policy. So go back to the crazy example of email. Even if you said, hey, no, no, we're deleting this email because that was part of our retention policy, which we have set in stone as our policy for this type of business record, which in this example is email, as soon as the e-discovery event starts you're not allowed to delete anything. That's what we mean by the overriding of your normal retention policies. So keep that in the back of your mind. So a little bit on data preservation versus retention.
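To make the override concrete, here is a minimal Python sketch of a deletion check in which a legal hold always wins over the retention policy. The `Record` fields, the `may_delete` function, and the 180-day email retention period are illustrative assumptions, not anything taken from the slides or from a specific product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    """Hypothetical record descriptor used only for this illustration."""
    record_id: str
    created: date
    retention_days: int        # assumed to come from the retention policy
    legal_hold: bool = False   # assumed to be set when an e-discovery event starts


def may_delete(record: Record, today: date) -> bool:
    """Allow deletion only when retention has expired AND no legal hold applies."""
    if record.legal_hold:
        # The e-discovery event overrides the normal retention policy.
        return False
    return today >= record.created + timedelta(days=record.retention_days)


email = Record("msg-001", created=date(2020, 1, 1), retention_days=180)
before_hold = may_delete(email, date(2020, 8, 1))   # retention period has passed
email.legal_hold = True
after_hold = may_delete(email, date(2020, 8, 1))    # same date, but now on hold
print(before_hold, after_hold)
```

The point of the design is that the hold is checked first, so no expiry rule can ever authorize a deletion while a hold is in effect.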
Starting point is 00:13:08 Think of preservation as everything to do with the processes, setup, and procedures for maintaining that data. And think of retention as the actual policies that you set for your company to maintain your business records for a defined period of time. So retention has everything to do with policies, and preservation has everything to do with the procedures put in place to do the preservation based on the policies that you set up. That's a good way of keeping in the back of your mind what preservation is versus retention. So, a couple of interesting parts of preservation. You'll hear a lot about this concept of authenticity. So it's really, you know,
Starting point is 00:14:09 a property of an information object's content and metadata that identifies that it's currently what it originally was and verifies that its contents haven't changed. And that's really what authenticity is all about. So you have to have processes in place. How you do it, there's many different ways of doing this
Starting point is 00:14:31 from a technological standpoint and also from a procedural standpoint. But what you're doing when we talk about authenticity of the data is ensuring that the actual data, the content of the data, hasn't changed over a period of time. So, you know, what's involved? Usually you have some verification process, making sure that it is the actual original data. There's some auditing going on of the access to that data, in other words, who's accessing it and when, sort of like an audit trail, if you will. And then there's a whole bunch of different ways of detecting change. You can do hashing, you will typically set up audit
Starting point is 00:15:20 trails, and there's a whole bunch of other ways of detecting change to these business records that you need to preserve. So some of the activities when we talk about retention are things like metadata management. I've sort of mentioned the word metadata a couple of times so far. Metadata is really data about the data that you're storing. So, for example, metadata would be things like: what's the name of the file, who owns the file, or the business record if you will, how big is it, when was it last accessed, and so on. That's really the metadata. It's sort of the description of the data. And it's important to maintain the metadata,
Starting point is 00:16:17 just as important to maintain the metadata as it is the actual data itself. There's been some interesting legal disputes over the years that have gone to court that actually had to do with not necessarily the data itself, but improper modification of the metadata to make it look like something was done to the data when it really wasn't. So it's really making sure that you do all the appropriate things for not just the data, but also the metadata itself. So there's other activities. There's discovery.
Starting point is 00:16:57 There's classification of the data. That's very important, obviously. Control of information: how many copies, what geographic locations you're keeping them in, and versions. If there's migration involved, we'll talk about that a little bit later. There's this concept of services: there's preservation, protection, security, availability, integrity. We talked a little bit about authenticity already.
Starting point is 00:17:30 Lastly, the bottom middle one is disposition and deletion and has obviously things to do with deleting the data. Although when we talk about disposition, it's not necessarily doing an actual deletion of the data at the end. It could actually be doing something else with the data based on policies that are set up. For example, it could be putting the data into an archive. So that would be an example. Some people refer to it as a long-term archive, and that could be different things.
Starting point is 00:17:59 It could be different types of media and so on, which we'll talk about a little later. So from a records management standpoint, when we talk about retention, you classify the data, and then for each classification bucket of your data, you have to set up retention schedules. As part of the policy, you decide how long the data is going to be retained and what you're going to do with that data at the end of the retention period, and you make sure that you're doing all the right things with that data throughout its life cycle, including at the end, when you're either going to delete it or do an appropriate disposition of it. Records management is a very important piece of the retention life cycle.
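As a rough illustration of a retention schedule keyed by classification bucket, the Python sketch below maps each bucket to a retention period and an end-of-life action. The bucket names, the periods, and the `disposition_due` helper are all hypothetical; real values would come from your own policies and the regulations that apply to you.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Disposition(Enum):
    DELETE = "delete"
    ARCHIVE = "archive"   # disposition is not always deletion


@dataclass(frozen=True)
class Schedule:
    retention_years: Optional[int]   # None means retain indefinitely
    disposition: Disposition


# Hypothetical buckets and periods, for illustration only.
SCHEDULES = {
    "email": Schedule(1, Disposition.DELETE),
    "contract": Schedule(7, Disposition.ARCHIVE),
    "permanent": Schedule(None, Disposition.ARCHIVE),
}


def expiry_date(created: date, years: int) -> date:
    """Add whole years, falling back to Feb 28 for records created on Feb 29."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        return created.replace(year=created.year + years, day=28)


def disposition_due(classification: str, created: date, today: date) -> Optional[Disposition]:
    """Return the action to take if retention has ended, otherwise None."""
    schedule = SCHEDULES[classification]
    if schedule.retention_years is None:
        return None
    if today >= expiry_date(created, schedule.retention_years):
        return schedule.disposition
    return None


print(disposition_due("email", date(2019, 1, 1), date(2020, 6, 1)))
print(disposition_due("contract", date(2019, 1, 1), date(2020, 6, 1)))
```

Keeping the disposition action in the schedule, next to the retention period, reflects the idea that the policy decides both how long a record is kept and what happens to it at the end.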
Starting point is 00:19:06 So there's other pieces of the life cycle processes. So there's a bunch of processes that at a high level we would sort of classify it in the following ways. We would talk about, you know, first off, you know, appraising your data. You know, what do you have and how are you going to classify it? How are you going to stick it into multiple buckets? And then, of course, set up policies based on each of those respective data buckets. And then next, you're going to ingest the data. You bring it in in some digital form, whatever that happens to be based on your business environment.
Starting point is 00:19:45 You store the data. You preserve the data based on specific preservation actions. And then, of course, there's going to be access of that data throughout a given lifetime. And then lastly, at the end of its retention schedule as part of its life cycle, you will do a disposition of the data at the very end. So what are some of the issues and considerations? That's sort of what we'll talk about next. Well, why in the heck is preservation a problem in the first place?
Starting point is 00:20:15 What's the big deal? And who the heck cares? So first off, you know, very often data preservation is at the bottom, not always, but sometimes at the bottom of the IT hierarchy. It sometimes lacks adequate funding. You know, very often this is not a quote moneymaker or a profit center for a given organization. So it's something that you have to do, very often because of legal or compliance requirements that you have. And you're doing it, but, you know, you don't really want to spend a lot of time.
Starting point is 00:20:52 You don't want to spend a lot of effort. You certainly don't want to typically spend a lot of money doing it. And some people like to think of it as, you know, mitigating risk. It's almost like buying an insurance policy. And it depends on how much the organization is going to consider what that insurance policy or risk profile is going to mean to their business and how seriously they take it. So what are some of the drivers? Well, some of the drivers are relatively new,
Starting point is 00:21:24 and those are some of the compliance things that we'll talk about in a minute, some of the regulations and legal requirements that have been released in the last few years. GDPR is a great example. The state of California, just at the beginning of this year, 2020, enacted into law the CCPA, the California Consumer Privacy Act. So there's a bunch of new regulations coming out all the time, at the state, federal, and in some cases a global level, where it might affect you as an organization regardless of where you do business or where you're located. GDPR could potentially affect a corporation regardless of where it is, because all you have to do is be retaining, as part of your business, data that represents a European Union citizen. Well, you don't have to be a corporation that's based in the EU, the European Union, to have data that identifies European Union residents or citizens.
Starting point is 00:23:01 So that typically would then affect many companies, including those that are in countries outside the EU. So those are some examples of where the issues of compliance could actually be very, very far reaching. The last bullet here on why preservation is often a problem is one that's often overlooked, and that's the failure to collaborate. A lot of organizations, we find, have these silos of responsibility, and the groups or subgroups don't necessarily talk to each other. Some might refer to a group as the line of business within an organization, and some people call it the business unit, depending on the structure of the organization. When the different groups within the organization don't really collaborate, that could cause some problems.
Starting point is 00:24:06 So what are some of the regulations more specifically? I talked about a few of them already. GDPR was one of them. You know, certainly here in the U.S., you'll hear a lot about the SEC, which is the Securities and Exchange Commission. SOX stands for Sarbanes-Oxley. HIPAA is the Health Insurance Portability and Accountability Act. FRCP is the Federal Rules of Civil Procedure in the U.S., which you're required to deal with from an organizational standpoint when there's litigation. A good example is intellectual property litigation that might be against your company.
Starting point is 00:24:58 From a corporate governance perspective, sort of on the other end of the spectrum, maybe outside of the regulatory space, you may have within your business specific internal requirements. So things like your own IP or intellectual property that you want to obviously control and make sure that it doesn't get into the wrong hands from a competitive perspective. So you will have controls on the data that have nothing to do with external regulations at all. It could be just the fact that you want to maintain and keep secret your secret sauce, if you will, your intellectual property.
Starting point is 00:25:46 There's other things, like HR documents that have all kinds of PII, or personally identifiable information, and other documents as well that could fall under internal corporate governance and have nothing to do with external regulations. So from a practices standpoint, when we talk about preservation, it's important to understand what the requirements are. And unfortunately, if you don't know them and you're not sharing that information amongst the groups, that's not good. A lot of the companies that we run across still rely really just on backups, and that's not good.
Starting point is 00:26:25 You really want to have an entire preservation and retention policy architecture set up and not just rely on a backup or two. Certainly, if you record to tape, you then have to worry about things like losing it on a truck going somewhere on the side of a mountain, and that could be fraught with peril. That's not to say that using tape as a backup is necessarily bad. That's not the case at all. It's just that you want to make sure that you have other
Starting point is 00:27:06 preservation and retention architectures in place to make sure that you're doing the right things for all of these compliance issues that you have to deal with, which goes far beyond backups, which are typically done just to make sure you can restore in the case of a sudden emergency, be it an internal emergency or an external one like a hurricane. Some of the other issues that we see are issues of migration. So you have to migrate data very often. You know, if you have disk media technology that is only good for a certain period of time, you have to be able to migrate from the older technology to newer technology. That has to be set up as policies and procedures to make sure that it is done seamlessly, and also so that there's no interruption in service, things like that, to your organization.
Starting point is 00:28:14 And then, you know, what are the forces that require that migration to happen in the first place? So, for example, if an application changes, well, whether or not that forces that migration to happen in the first place. So, for example, if an application changes, well, whether or not that forces a migration, all of this must be planned. It's not that, you know, things don't happen. In fact, that's the problem. Things do happen. You just have to make sure it's appropriately planned for.
Starting point is 00:28:40 We talked about collaboration a little bit earlier, and collaboration is important from all these different groups within your organization. Only some are listed here, but just to give you a feel, legal, your records managers, and very often we don't use the term anymore, records managers. We used to call this a RIM, records information managers. There are actually RIM departments within organizations. RIM is still a very common term used in the government space. But, you know, other groups like IT, business operations, that business and operations might be your line of business
Starting point is 00:29:20 or business unit within a larger organization, for example. Then, of course, your security group or security department. And then archivists, which is, again, very commonly a department or a group of people within government entities, not necessarily in the private sector space, although that function might fall under either IT or a line of business or the business unit very often. Not always, but it'll change from organization to organization. It also changes from sector to sector in terms of industry, depending on how important that is. So another question that comes up is, hey, what about data loss? So, you know, data loss is a huge issue, especially in the world of storage.
Starting point is 00:30:18 You know, we could do a separate session just on this, because there's actually a lot to cover here. But just to skim the surface, there are things that I don't even have covered on the slides. For example, bit rot is a common one that we would talk about in the storage industry, and there are very common ways of dealing with issues like that. That issue has been around for many, many years, and there are many different error correction algorithms and technologies to deal with it. But the main thing here is how much data loss there is and when it's a problem. Regardless of what it is, how much it is, and when it's a problem, make sure that you are doing the right things to properly preserve that data, or the business records if you will, and then also make sure that that data is not just preserved appropriately with all the right processes and controls, but also that you're retaining it for a specific period of time, all based, again, on policies.
Starting point is 00:31:29 How long do you have to retain that specific type of data or business record? So from a security services standpoint, think about when we have talked about preservation and retention. Well, a lot of that, when we talk about things like authenticity and data integrity, that sort of falls under this whole heading of security services. And so some of those services are listed here. So identification and authentication services, you know, obviously confirms the identities of users as they go to access. Access control, you certainly want to set that up for preventing unauthorized access to the business records. There's security service that's referred to as data integrity services, ensures that records are not altered or destroyed in an unauthorized manner. Not that records are not altered or not necessarily that they're not destroyed because they are,
Starting point is 00:32:35 but the key being is that it's not done in an unauthorized manner because obviously you do have disposition of data, meaning it could be destroyed at the end, maybe. But this, we're talking about data integrity services, making sure none of that happens in an unauthorized manner. There's also this concept in the security space of data confidentiality service. You certainly want to make sure that records are not accessed by unauthorized folks. And then lastly, non-repudiation service. Well, you want to make sure that whoever the engaged parties are cannot deny involvement, meaning you're actually tracking the access to the data, modification of the data, and the list goes on and on,
Starting point is 00:33:25 so that in the case of any type of question, litigation, whatever the case may be, it's non-repudiated, meaning nobody can deny because you have the logs and the appropriate history there to prove what's been done from an involvement perspective. So a little bit about best practices and, you know, what do you want to make sure that you're doing. So from a retention perspective, we have this sort of a common term of you want to use applications that are preservation aware. And we used to use a term quite a bit. It's not used very much anymore.
Starting point is 00:34:15 We refer to it as ILM, or information lifecycle management, although the actual technology is still used. We just very often don't necessarily call it ILM anymore. But you certainly want to make sure that the applications that are taking care of the preservation and retention of the data are aware of what that life cycle is. If you remember, we talked about the life cycle earlier, where you're actually classifying data, you're storing it, you're accessing it. At the very end, it's finally going to be appropriately disposed of
Starting point is 00:35:02 or deleted or some type of disposition. And you want to make sure that the technology that you're using is doing all the right things from a lifecycle management perspective. Certainly you want to conduct the records inventory. You have to make sure that that's constantly adhered to. The interesting thing about the inventory too is making sure that that lines up with your retention policies because you need to make sure that the records are there for the life of the retention requirements of the data record or the business record.
Starting point is 00:35:49 So a couple of examples in the healthcare space, you might see data that's required to, let's say, for example, medical images. In some states, actually, medical images of a given patient have to be retained for the life of that patient. So it could be 80 to 100 years. Who knows? It's just however long that patient lives. In some states and for some regulations in the medical field, certain images, depending on what the regulation is and geographic location and what state and so on,
Starting point is 00:36:31 certain images can't be deleted ever. It has to be retained forever. So making sure that that inventory has the capability of keeping inventory of that data for whatever that retention period is. Sounds easy, but that's actually a little bit more complicated because if you think about it logically, it's okay to keep a record for a few years of what happens to a business record. It's a much different thing when you're dealing with a business record
Starting point is 00:37:09 that has to be around for 80 years or 100 years or longer in the case of forever. How do you deal with that? That's an interesting question. A lot of people will jokingly say, well, that's not going to be my problem because I won't be here. But certainly from a business perspective, you have to make sure that's appropriately accounted for and do the appropriate preparations to make sure it takes place. And lastly, identify vital records, publish, and then educate and implement. You have to make sure that not only do you set up the policies and procedures to make sure that you're doing all the right things for appropriate retention of data, it also must be adhered to through educating all the given groups within the business to make sure it's properly implemented.
Starting point is 00:38:01 And the other thing that we often talk about, and I mentioned it earlier, is that, yes, metadata counts. You have to make sure that you're doing all of the appropriate things from a preservation perspective on the metadata. Very often you'll see this term, especially in certain industries, ESI, or electronically stored information. You know, that's one of the bases of this session: we're talking about data that's electronically stored, or, if you want to use the term, digitally stored. And you want to make sure that you have the ability to ensure appropriate integrity not just for the data, but for the metadata as well. Controls have to be in place to make sure that that data is not altered inappropriately
Starting point is 00:39:08 or in an unauthorized manner. And then the metadata is important because I think I might have mentioned that there's been some interesting litigation taking place in the last 10 years or so that actually had to do with not necessarily the data, but the metadata and the fact that it was not appropriately cared for or protected so that it was not, in fact, immutable like it should have been. And there was some hanky-panky going on where they were trying to modify the metadata to make it look like something it wasn't from an access perspective. So metadata is important.
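One way to apply the hashing approach mentioned earlier to metadata as well as data is to compute a single digest over both, so that a change to either one is detected. The Python sketch below uses SHA-256 from the standard library; the `fingerprint` and `is_unchanged` names and the sample metadata fields are assumptions for illustration, and it presumes the recorded digest is itself stored somewhere protected from tampering.

```python
import hashlib
import json


def fingerprint(content: bytes, metadata: dict) -> str:
    """SHA-256 digest covering both the record content and its metadata."""
    # Sorting keys gives the same bytes for the same metadata every time.
    canonical_meta = json.dumps(metadata, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256()
    # Length prefix keeps the boundary between content and metadata unambiguous.
    digest.update(len(content).to_bytes(8, "big"))
    digest.update(content)
    digest.update(canonical_meta)
    return digest.hexdigest()


def is_unchanged(content: bytes, metadata: dict, recorded: str) -> bool:
    """Compare the current fingerprint with the one recorded at ingest."""
    return fingerprint(content, metadata) == recorded


content = b"quarterly report"
metadata = {"name": "q3.pdf", "owner": "finance", "last_accessed": "2020-09-01"}
recorded = fingerprint(content, metadata)

tampered = dict(metadata, last_accessed="2020-01-01")   # content untouched
print(is_unchanged(content, metadata, recorded))
print(is_unchanged(content, tampered, recorded))
```

Note that fields that legitimately change over time, such as a last-accessed timestamp, would in practice need either to be excluded from the digest or to trigger an authorized, audited re-fingerprinting.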
Starting point is 00:39:55 You've got to make sure that all the appropriate policies and procedures are around metadata just like the data. Another thing that's important is the best practices from a what to how and who. So what are the typical processes? I list them on the left-hand side. These are the things that have to be set in stone in terms of you run through the goals and your strategy, you have to make sure that you have C-level sponsorship within your organization on whatever these policies and procedures are going to be from a preservation and retention perspective to make sure that you have the backing from your C-level executives.
Starting point is 00:40:46 We've seen some interesting happenings in the corporate world in the last five to ten years based on certain litigation that's taken place where it's not just the C-level, but also the board level, that there's been some actions against those folks as well. So this is actually becoming more and more of a hot topic as things go on, mostly because of some of the issues we've been seeing in the legal realm as it relates to lawsuits and all kinds of different litigation, and not just from the fact that you're not preserving and retaining the data just because of one given legal compliance issue. Sometimes it has to do with other things like a data breach,
Starting point is 00:41:43 and that's a whole other subject, which we could probably do a separate session on just that in and of itself. So some of the typical frameworks, there's many different frameworks that can be used to assist with setting up the appropriate policies and procedures, things like service management. There's information governance frameworks to help with this. There's compliance and risk reduction frameworks. Information lifecycle management frameworks from multiple standards organizations that can be used to assist with setting up these appropriate policies and procedures for retention and preservation.
Starting point is 00:42:26 And lastly, the stakeholders. We sort of mentioned who some of these players are in the past in terms of groups within your organization, IT, your RIM, or Records and Information Management. That also could be, if not that, it could be part of the business, a specific business unit or LOB, line of business. Obviously, the legal department, your security group, finance, risk management, and the list goes on, actually. It could be other groups as well, but those are some of the more common ones. You also want to make sure that you're solving the disconnects. And this is, oddly enough, one of the, it seems kind of strange to even mention,
Starting point is 00:43:07 but I'll go through it again because it's really so important. That is that concept of this failure to collaborate. You really want to make sure that all stakeholders within the organization are assisting in setting the requirements, and that you have that C-level and board-level commitment so that they are going to put all the appropriate resources in front of you. That includes not just people and bodies to move things or to set up technology appropriately. It's also making sure that you have the appropriate resources from a financial perspective. So it's not just having a couple of bodies to do this stuff. It's making
Starting point is 00:43:52 sure that there's a commitment to setting it in place financially as well, because it's not cheap, as you can imagine, to do this in the right way. And lastly, reducing complexity. And this is very often a difficult one for companies and making sure that when you go through the classification process of what are the business records you have and then how do I classify them? What buckets do I put them in? You know, it certainly is best if you can have less buckets. It's very difficult to have a small number of buckets in most industries, unfortunately. So certainly less buckets is best, obviously reducing complexity in that way. But whatever you need to be retained for whatever buckets there are, you just have to do what you can, but try and keep it as simple as
Starting point is 00:44:46 possible. As simple as the business will allow you to based on the type of industry that you're in. And then, of course, do all the right things from an implementation standpoint, including deletion and disposition as appropriate at the end. We find that a lot of companies get caught at the very end when they seem to be doing all the right things. They're classifying their data. They're making sure that integrity is there. Nothing's mucking with the data. Nobody's accessing it who's not allowed to access it. They're doing all the right things, and what they're screwing up on is at the very end, well, what do I do now?
Starting point is 00:45:32 It's classified to be retained for five years, and I'm at the end of my five-year period. What do I do? Well, it's making sure you do the right things at the end. And very often it is as simple as a deletion, but you have to do that correctly because it's not just doing a delete command on a couple of records. It's applying all the appropriate policies and procedures to do the deletion, yes, but make sure that it's properly acknowledged and logged. So you have to be able to prove that not only did you delete it, but you kept it immutable for the period of time that it needed to be saved, and making sure then that you did the right things at the very end.
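As a rough illustration of "not just doing a delete command," the following sketch refuses to delete a record before its retention period ends and writes an audit entry either way. It is a simplified model under stated assumptions: the `Record` fields, the in-memory `store` dictionary, and the `dispose` function are all hypothetical stand-ins for whatever your records system actually provides.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone


@dataclass
class Record:
    record_id: str
    created: date
    retention_years: int


def retention_end(record: Record) -> date:
    """First day on which the record may be disposed of."""
    target_year = record.created.year + record.retention_years
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return date(target_year, 3, 1)


def dispose(record: Record, store: dict, audit_log: list,
            today: date, approved_by: str) -> bool:
    """Delete the record only after retention has expired; log every attempt."""
    entry = {
        "record_id": record.record_id,
        "retention_end": retention_end(record).isoformat(),
        "requested_on": today.isoformat(),
        "approved_by": approved_by,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    if today < retention_end(record):
        # Too early: keep the record and log the refused attempt.
        entry["action"] = "refused_retention_not_expired"
        audit_log.append(entry)
        return False
    # Retention has expired: remove the record and note whether it was actually present.
    entry["found_in_store"] = store.pop(record.record_id, None) is not None
    entry["action"] = "deleted"
    audit_log.append(entry)
    return True
```

The design choice here is that the log entry is written for refused attempts as well as completed deletions, so the log can later show both that the record was kept for its full period and that it was disposed of under an approved request.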
Starting point is 00:46:28 And it might be deleting and it might be archiving it to some other medium, be it optical or some other type of physical storage medium. It could be tape, whatever the case may be, and making sure that that's done. But it's also making sure, well, can you prove that in a court of law later on that you did it? And, oh, by the way, if you did do it and it's appropriately in a different place on a different media because that's what your policies and procedures dictated, prove that it is there. So very often it's a matter of making sure that you can prove that you did all the right things. And in the case of keeping things, can you appropriately retrieve that data
Starting point is 00:47:18 at the appropriate time if asked to in a court of law? And believe it or not, that happens. And so making sure that all the checks and balances are in place to not only abide by the policies and procedures, but make sure all the appropriate logging, classification, and sort of the checks of doing a check every once in a while, did that really work as planned? And can I really go through a process whereby I can prove that what I say happened really did happen?
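One common way to support the "can I prove that what I say happened really did happen" check is a hash-chained log, where each entry's hash depends on the entry before it. This is a minimal sketch of that general technique, not a method named in the session; `append_entry` and `verify_log` are hypothetical names, and a real deployment would also need to protect the log itself (for example on WORM storage).

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event."""
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify_log(log: list) -> bool:
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Running `verify_log` on a schedule is one way to do the periodic "did that really work as planned" check. Note that truncating entries from the end of the log is not detected by this sketch unless the latest hash is also recorded somewhere else.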
Starting point is 00:47:59 And that's where a lot of people get caught up at the very, very end of their life cycle. So keep that in mind. Also, a change in mentality. In the old days, for us old folks that can remember, it's that old archive mentality. You want to change that to a retention and preservation from the very, very beginning of the data life cycle. So, you know, think of it going from an event at the end of the information life cycle when we talk about disposition and really thinking about it as a requirement and a policy at the creation of the business record. So, you know, changing that mentality of, oh, yeah, this is just some disposition
Starting point is 00:48:47 of something I have to worry about at the very end, you're actually going to set up these policies at the very beginning of a given business record, and that will allow you to make sure that you're doing all the right things throughout the life cycle of a given business record. Oddly enough, it's something that is almost an afterthought for those that haven't gone through the entire process. And just keep in mind, I made a note here at the end that this doesn't affect legal hold. So if you do have some disposition policies in place
Starting point is 00:49:19 and you're hit with a legal hold, you're not allowed to continue to do the deletions during this legal hold process or what we call the e-discovery phase. Those disposition policies have to be suspended during the e-discovery phase. So just keep that in the back of your mind. So from a storage perspective, there's a lot we can talk about storage. In fact, because this is a storage developers conference,
Starting point is 00:49:48 we're almost obligated to talk about the actual storage media in and of itself. From a retention and preservation perspective, media is important because you have to set it up so that you're protected from things like unauthorized access, loss, tampering, destruction, theft, and all kinds of other things. You have to make sure that the media is physically and logically migratable, and also that the media has the right attributes. So if it's got to be WORM (write once, read many) for immutability purposes, then you make sure that the technology is set up to appropriately support that,
Starting point is 00:50:39 be it tape or optical or spinning disk or SSD or whatever the case may be. And then from a compliant infrastructure perspective, you want to make sure that you're using the appropriate management tools, their compliance-based applications, to do enterprise content management. If they're huge databases, which many times they are, you want to make sure that all those applications have the right management capabilities to do the appropriate preservation and retention capabilities that you really need once you set up your policies. And then, of course, make sure that the storage infrastructure has the necessary retention and preservation attributes. Again,
Starting point is 00:51:26 it's got to be able to support security, confidentiality, and the list goes on. I mentioned them all here. We talk about self-healing storage systems, of course, hopefully eliminating the need for a true physical migration. You can do it logically and hopefully do it online so that you don't have any downtime. So you want to be able to plan for that sort of migration. And then also, for obvious reasons, make sure this is all auditable and monitorable. You have to be able to monitor this stuff and then have all the appropriate audit features available to maintain compliance. So from a disposition and sanitization perspective, we talked a little bit about disposition before. So
Starting point is 00:52:16 at the end of a retention requirement there may be a need to dispose of the data. When we talk about disposition, disposition doesn't necessarily mean that you're destroying the data. It could be that you're archiving it or whatever the policy states. Now, there is sometimes a need for disposing of the data to the point where we call that media sanitization, because you want to reuse the technology, the media, for example, or you want to make sure that it's not able to be recovered by anybody else. And they refer to that as rendering access to the data infeasible for a given level of effort. That's by definition
Starting point is 00:53:00 what sanitization is. If you guys are interested, there's a great white paper that was written by the Security TWG (Technical Work Group) within SNIA that I mentioned here at the bottom of the slide that'll go into detail on sanitization and some of those capabilities. So some of the key takeaways: certainly you want to preserve and maintain what's required by your business and legal requirements. Again, what are you retaining? You're retaining the business records and only the business records. And that's probably the biggest takeaway here. We're always talking about business records and not everything necessarily. So then, you know, next, you know, you certainly want to create
Starting point is 00:53:40 and adhere to the appropriate best practices. You want to collaborate, you want to identify, classify, and set your requirements. The processes and procedures are going to be very, very important in making sure that everybody knows them and you're abiding by them. Lastly, this is probably also important, one of the more important ones is update and adapt. Your business is going to be changing.
Starting point is 00:54:04 Your regulatory requirements are going to change over time, whether you like it or not. You have to update your policies and adapt to those policies. And, again, it's education. It's retraining. It's making sure that collaboration is taking place throughout your organization continually. So please take a moment to rate the session.
Starting point is 00:54:26 We appreciate your feedback. We're always trying to do a better job as time goes on. We would love to hear from you. And thank you for attending this session and have a great day. Thanks for listening. If you have questions about the material presented in this podcast, be sure to join our developers mailing list by sending an email to developers-subscribe@snia.org. Here you can ask questions and discuss this topic further with your peers in the storage developer community. For additional information about the Storage Developer Conference, visit www.storagedeveloper.org.
