Data is created every day, and the sheer volume of it means many organizations struggle to manage their data, leading to:
- Unstructured data sprawl
- Difficulties maintaining compliance obligations
- Retaining sensitive data unnecessarily
Our Advice
Critical Insight
Focus your efforts on the data where the highest risk levels are hiding, and work towards implementing an automated process. Manual efforts will always carry the most risk.
Impact and Result
The best way to resolve the difficulties with building an effective data retention program is to:
- Identify your retention requirements
- Develop a retention schedule and risk profile for your data processes and types
- Use the above outputs to determine where the greatest risks lie and plan to reduce them as much as possible.
By focusing on the high-risk areas, you won't lose precious time managing data retention.
Build an Effective Data Retention Program
Treat the data risks that will derail your retention schedule
Analyst Perspective
Overcome retention challenges by identifying and treating high-risk data types.
Data retention is a challenge for many organizations. Ideally, we would be able to fully automate our retention and deletion of records and never think about it again. But even organizations that do data retention well are often forced to use some semi-automated or manual processes to adhere to their retention schedules. In other words, there is no perfect solution to resolve data-retention challenges once and for all.
However, by identifying the data types and processes which are most prone to failure (i.e. those that cannot be fully automated), we can explore options to reduce that risk as much as possible and make those remaining manual and semi-automated processes more manageable. By prioritizing our high-risk data flows, we will be able to work more efficiently and determine which data repositories need the most oversight.
Logan Rohde
Alan Tang
Isabelle Hertanto
EXECUTIVE BRIEF
Executive Summary
Your Challenge
Data is created every day, and the sheer volume of it means many organizations struggle to manage their data, leading to:
- Unstructured data sprawl
- Difficulties maintaining compliance obligations
- Retaining sensitive data unnecessarily
Because the problem continues to grow over time, many organizations struggle even to identify where the greatest data risks lie.
Common Obstacles
Data retention is a full-time job that usually receives less than part-time hours, meaning that organizations:
Taken together, these factors prevent many organizations from following their own retention schedule.
Info-Tech’s Approach
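The best way to resolve the difficulties with building an effective data retention program is to:
- Identify your retention requirements
- Develop a retention schedule and risk profile for your data processes and types
- Use the above outputs to determine where the greatest risks lie and plan to reduce them as much as possible
By focusing on the high-risk areas, you will manage your data retention processes more efficiently.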
Info-Tech Insight
Focus your efforts on data with the highest risk levels, and work towards implementing an automated process. Manual efforts will always carry the most risk.
Your challenge
This research is designed to help organizations that are looking to:
Cost of a data breach in 2021: $4.24 million ($161 per record) (Source: IBM & Ponemon)
Common obstacles
These barriers make this challenge difficult to address for many organizations:
- Lack of visibility into data flows
- No established system to classify data by type and sensitivity
- Need to satisfy multiple (conflicting) regulations
- No way to determine the retention-related risk of data repositories
- Administrative overhead in managing a manual data-deletion process
Organizations must manage impossibly high volumes of data dispersed across multiple internal and external repositories, making it difficult to maintain visibility and control over data flows and retention obligations.
42.5% — Annual increase in data volume for organizations.
70% — Percentage of data distributed across edge and cloud repositories. (Source: Seagate, 2020)
Info-Tech’s methodology to build an effective data retention program
Phase | 1. Set your governance requirements | 2. Complete data retention schedule and risk assessment | 3. Manage manual data retention processes |
Phase steps | 1.1 Identify data retention laws and regulations | 2.1 Develop a data retention schedule | 3.1 Identify cases where manual retention is necessary |
The greatest risks will be in data repositories without automation
Prioritize the riskiest data
Focus your efforts on data with the highest risk levels, and work towards implementing an automated process; manual efforts will always carry the most risk.
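As a rough illustration of this prioritization, the sketch below orders repositories so that manual processes are reviewed first, then semi-automated ones, then fully automated ones. The repository names and the three-level `Automation` scale are hypothetical placeholders, not part of the blueprint's tooling.

```python
from enum import IntEnum


class Automation(IntEnum):
    """Hypothetical scale: a higher value means more manual effort, so more risk."""
    AUTOMATED = 0
    SEMI_AUTOMATED = 1
    MANUAL = 2


# Hypothetical repositories and how their retention/deletion is carried out.
repositories = {
    "CRM database": Automation.AUTOMATED,
    "Shared network drive": Automation.MANUAL,
    "Email archive": Automation.SEMI_AUTOMATED,
}


def review_order(repos: dict[str, Automation]) -> list[str]:
    """Return repository names with the most manual (riskiest) processes first."""
    return sorted(repos, key=lambda name: repos[name], reverse=True)


for name in review_order(repositories):
    print(f"{name}: {repositories[name].name}")
```

In practice the ordering would also weigh data sensitivity and volume; automation level is used alone here only to keep the sketch short.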
Link retention to governance
Successful data retention is closely linked with security governance, compliance, and data classification. Without these guardrails, most organizations struggle to establish a reliable data retention schedule.
Retention schedules do not delete data on their own
A retention schedule is necessary, but having one won’t ensure retention-related risks are managed effectively. Rather, the key lies in identifying risky data processes, types, and repositories, and finding solutions to lower those risks.
Not everything is automatable
Some manual deletion should be expected. Very few retention programs run on automation alone. Manual deletion is manageable provided we have a plan to deal with it.
Two kinds of regulation
Identify conflicts in your obligations. Some regulations and laws dictate that data must be retained, while others demand that data be deleted once it is no longer in active use. You may have to make a judgment call regarding the most appropriate retention period. This should be done in conjunction with your legal counsel.
Find your data’s first instance
Establish a single source of truth for your data. This will allow you to go to the source and delete the first instance of the data (as per your retention schedule), and then plan to purge the secondary, tertiary, etc. instances on a regular basis.
Key deliverable:
Blueprint deliverables
Each step of this blueprint is accompanied by supporting deliverables to help you accomplish your goals:
- Data Retention Policy Template
- Data Retention RACI
Blueprint benefits
IT Benefits
Business Benefits
Measure the value of this blueprint
This blueprint helps organizations to:
Key exercises:
- 1.1 Identify and document data retention laws and regulations
- 1.2 Identify data types and sensitive data
- 1.3 Identify data repositories
- 1.4 Develop a data retention policy
- 1.5 Determine roles and responsibilities
- 2.1 Complete data retention schedule
- 2.2 Plan to address risky data types
- 3.1 Identify cases where manual retention is necessary
Info-Tech Project Value
$45.00 | Average hourly wage of a privacy and compliance officer |
x 760 hours = $34,200 | Average total time to complete the following data retention related projects |
Using this Blueprint:
10 hours (GI Calls)
$34,200 - $2,250 = $31,950 | Estimated cost and time savings from this blueprint |
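The arithmetic behind these figures can be checked with a short sketch. The wage, hours, and $2,250 figure come from the text above; the 50-hour total is only what $2,250 implies at $45.00 per hour and is an inference, since the text itemizes just the 10 hours of GI calls.

```python
HOURLY_WAGE = 45.00            # average hourly wage stated above
BASELINE_HOURS = 760           # hours to complete the projects without the blueprint
COST_WITH_BLUEPRINT = 2_250.00 # cost subtracted in the savings line above

baseline_cost = HOURLY_WAGE * BASELINE_HOURS
savings = baseline_cost - COST_WITH_BLUEPRINT

# Inferred, not stated in the source: hours implied by the $2,250 figure.
implied_hours_with_blueprint = COST_WITH_BLUEPRINT / HOURLY_WAGE

print(f"Baseline cost: ${baseline_cost:,.2f}")
print(f"Estimated savings: ${savings:,.2f}")
print(f"Implied hours with blueprint: {implied_hours_with_blueprint:.0f}")
```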
Info-Tech offers various levels of support to best suit your needs
DIY Toolkit | Guided Implementation | Workshop | Consulting |
"Our team has already made this critical project a priority, and we have the time and capability, but some guidance along the way would be helpful." | "Our team knows that we need to fix a process, but we need assistance to determine where to focus. Some check-ins along the way would help keep us on track." | "We need to hit the ground running and get this project kicked off immediately. Our team has the ability to take this over once we get a framework and strategy in place." | "Our team does not have the time or the knowledge to take this project on. We need assistance through the entirety of this project." |
Diagnostics and consistent frameworks used throughout all four options
Guided Implementation
A Guided Implementation (GI) is a series of calls with an Info-Tech analyst to help implement our best practices in your organization.
A typical GI is 4 to 8 calls over the course of 1 to 3 months.
What does a typical GI on this topic look like?
Phase 1 | Phase 2 | Phase 3 |
Call #1: Gather data retention requirements | Call #2: Draft data retention schedule; Call #3: Identify risky data | Call #4: Prioritize data repositories for risk treatment; Call #5: Determine where manual process is necessary |
Phase 1
Set your governance requirements
This phase will walk you through the following activities:
- Identify and document data retention laws and regulations
- Identify data types and sensitive data
- Identify data repositories
- Develop a data retention policy
- Determine roles and responsibilities
Outcomes of this phase
- Awareness of applicable laws and regulations
- Understanding of the relevant data classifications and data types
- Knowledge of all data repositories and locations
- Formalized data retention policy
- Consensus of individual accountabilities and responsibilities for data retention across the organization
1.1 Identify and document data retention laws and regulations
1 hour
Input: Ask participants to identify and document applicable data retention laws and regulations. Identify relevant retention requirements from the laws and regulations.
Output: Documented list of data retention laws, regulations and relevant requirements
Materials: Whiteboard/flip charts, Sticky notes, Pen/marker
Participants: IT representative, Security officer, Privacy officer, Legal counsel, Senior management team (optional), Business representative (optional)
- Bring together relevant stakeholders from the organization. This can include those mentioned in the participants list.
- Identify applicable laws and regulations.
- Identify articles that set forth data retention requirements.
- Identify data types that are regulated by the laws.
- Identify the specific data retention requirements.
- Document all the information above in the table below.
Law/Regulation | Article | Data Type | Retention Requirement |
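If you prefer to keep this table in a machine-readable form, a minimal sketch might look like the following. The four fields mirror the table columns; the sample rows paraphrase the example laws cited in this section, and the `NOT_STATED` placeholder marks article references that this section does not provide. The class and helper names are illustrative assumptions.

```python
from dataclasses import dataclass

NOT_STATED = "(not stated here)"  # placeholder for articles this section does not cite


@dataclass(frozen=True)
class RetentionRequirement:
    law: str
    article: str
    data_type: str
    requirement: str

    def as_row(self) -> str:
        """Render the entry in the same column order as the table."""
        return " | ".join((self.law, self.article, self.data_type, self.requirement))


# Sample rows paraphrased from the example laws in this section.
register = [
    RetentionRequirement(
        "EU - GDPR", "Recital 39, Article 5(1)(e), Article 17", "Personal data",
        "Store no longer than necessary for the processing purpose",
    ),
    RetentionRequirement(
        "US - HIPAA security rules", NOT_STATED, "Policies, procedures, and records",
        "Maintain for 6 years",
    ),
    RetentionRequirement(
        "Norway - Regulation 1107/2018", NOT_STATED, "Workplace camera recordings",
        "Delete within 7 days (up to 30 days if likely to be handed to law enforcement)",
    ),
]


def regulated_data_types(entries: list[RetentionRequirement]) -> list[str]:
    """List each regulated data type once, alphabetically."""
    return sorted({entry.data_type for entry in entries})


for entry in register:
    print(entry.as_row())
```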
Data retention laws and regulations
- Determine which laws or regulations you are currently subject to or will be obligated to comply with in the future. If you are not subject to anything now, align your target-state compliance objectives with the most restrictive regulation currently in place. This will set you up to handle any new laws passed in your jurisdiction.
- Consider planned expansion into new markets and how data sovereignty or data residency laws influence data retention rules.
- Review your privacy program to identify the laws or regulations that dictate how data containing personally identifiable information (PII) should be retained or deleted.
Info-Tech Insight
Beware of conflicts in your obligations. Some regulations and laws dictate that data must be retained, while others demand that data be deleted once it is no longer in active use. You may have to make a judgment call regarding what the right retention period is. This should be done in conjunction with your legal counsel.
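One way to surface such conflicts early is to compare the minimum and maximum retention periods that apply to the same data type. The sketch below is illustrative only: the two rules, their periods, and the data type are hypothetical, and a flagged conflict is a prompt for legal review, not a resolution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obligation:
    source: str
    data_type: str
    min_days: Optional[int] = None  # data must be kept at least this long
    max_days: Optional[int] = None  # data must be deleted by this age


def find_conflicts(obligations: list[Obligation]) -> list[tuple[Obligation, Obligation]]:
    """Pair every 'keep at least' rule with a 'delete by' rule it cannot coexist with."""
    conflicts = []
    for keep in obligations:
        if keep.min_days is None:
            continue
        for drop in obligations:
            if drop.max_days is None or drop.data_type != keep.data_type:
                continue
            if keep.min_days > drop.max_days:
                conflicts.append((keep, drop))
    return conflicts


# Hypothetical obligations, for illustration only.
obligations = [
    Obligation("Hypothetical Rule A", "invoice records", min_days=7 * 365),
    Obligation("Hypothetical Rule B", "invoice records", max_days=3 * 365),
    Obligation("Hypothetical Rule B", "marketing contacts", max_days=2 * 365),
]

for keep, drop in find_conflicts(obligations):
    print(f"{keep.data_type}: {keep.source} requires {keep.min_days} days, "
          f"but {drop.source} requires deletion by {drop.max_days} days")
```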
Examples of laws that set forth requirements for data retention
An organization must not retain personal information for a period longer than necessary to fulfil the purposes described in the notice or to comply with applicable laws.
- EU - GDPR: GDPR Recital 39, Article 5(1)(e), and Article 17 stipulate that personal data should not be stored for longer than is necessary for the purposes for which the personal data are processed; personal data may be stored for longer periods insofar as the personal data will be processed solely for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes.
- Canada - The Privacy Act: Personal information concerning an individual that has been used by a government institution for an administrative purpose shall be retained by the institution: a) for at least two years following the last time the personal information was used for an administrative purpose unless the individual consents to its disposal; and, b) where a request for access to the information has been received, until such time as the individual has had the opportunity to exercise all his rights under the Act.
- US - HIPAA security rules: Maintain 6 years of policies, procedures, and records.
- Germany - Federal Data Protection Act (BDSG): According to the German Federal Data Protection Act (BDSG), personal data shall be erased if they are processed for own purposes, as soon as knowledge of them is no longer needed to carry out the purpose for which they were recorded.
- Norway - Regulation 1107/2018: Norway Regulation 1107/2018 on camera surveillance in the workplace (“the Camera Surveillance Regulation”) stipulates that camera recordings must be deleted no later than 7 days after the recordings have been made and may only be stored for up to 30 days if it is likely that the recordings will be handed to law enforcement agencies in connection with the investigation of criminal offences.
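A time-bound rule like the Norwegian camera example translates naturally into a deletion deadline. The sketch below uses the 7-day and 30-day limits quoted above; the function names are hypothetical, and how the regulation counts days (for example, whether the recording day is included) is an assumption that should be confirmed with legal counsel.

```python
from datetime import date, timedelta

STANDARD_LIMIT_DAYS = 7          # default limit quoted above
LAW_ENFORCEMENT_LIMIT_DAYS = 30  # extended limit when a handover to law enforcement is likely


def deletion_deadline(recorded_on: date, handover_likely: bool = False) -> date:
    """Latest date a recording may be kept, assuming simple calendar-day counting."""
    limit = LAW_ENFORCEMENT_LIMIT_DAYS if handover_likely else STANDARD_LIMIT_DAYS
    return recorded_on + timedelta(days=limit)


def is_overdue(recorded_on: date, today: date, handover_likely: bool = False) -> bool:
    """True when the recording has passed its deletion deadline."""
    return today > deletion_deadline(recorded_on, handover_likely)


recorded = date(2024, 3, 1)
print(deletion_deadline(recorded))                        # 2024-03-08
print(deletion_deadline(recorded, handover_likely=True))  # 2024-03-31
print(is_overdue(recorded, today=date(2024, 3, 10)))      # True
```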
1.2 Identify data types and sensitive data
1 hour
Input: Ask participants to identify data types within the organization. Ask participants to identify personal data. Ask participants to document data sensitivity and classification level (optional).
Output: Documented data types, personal data, data sensitivity and classification level (optional)
Materials: Whiteboard/flip charts, Sticky notes, Pen/marker
Participants: IT representative, Security officer, Privacy officer, Legal counsel, Senior management team (optional), Business representative (optional)
- Bring together relevant stakeholders from the organization. This can include those mentioned in the participants list.
- Identify and document data types within the organization.
- Identify and document types of personal data.
- Identify and document data sensitivity and classification level (optional).
- Document all the information above in the Data Retention Schedule and Risk Identification Tool.
Download the Data Retention Schedule and Risk Identification Tool
Data types and sensitive data
- Identifying data types will help you organize your data in groups for which general retention periods can be determined, allowing you to deal with larger chunks of data rather than individual records. These groups should be formed based on similarity of content and policy requirements (e.g. the data must be held for the minimum period).
- Data sensitivity will factor into decisions around how long a given data type should be retained, as well as the level of protection it will need while it is retained. Remember, certain types of data, like intellectual property, will need to be retained indefinitely. But these data types are also highly sensitive and will always require a higher degree of protection.
- Personal data retention is an integral part of the overall data retention program.
- Draw from your data classification scheme and use your predefined record types and data sensitivity levels to set retention requirements for the retention schedule.
- Draw from your information security and privacy programs to help you identify PII and other risky data more quickly.
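To illustrate how grouping works, the sketch below assigns each record to a data type and looks up retention and sensitivity at the group level instead of per record. Only the intellectual property entry (retained indefinitely, highly sensitive) reflects the text above; the other data type, its 730-day period, and all record IDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataType:
    name: str
    sensitivity: str
    contains_personal_data: bool
    retention_days: Optional[int]  # None means retain indefinitely


# Schedule keyed by data type; retention is decided once per group.
schedule = {
    "intellectual_property": DataType("Intellectual property", "high", False, None),
    # Hypothetical entry for illustration only.
    "customer_contact": DataType("Customer contact details", "medium", True, 730),
}

# Hypothetical records, each assigned to one data type.
record_types = {
    "design-spec-001": "intellectual_property",
    "contact-4471": "customer_contact",
    "contact-4472": "customer_contact",
}


def retention_label(record_id: str) -> str:
    """Describe the retention period a record inherits from its data type."""
    data_type = schedule[record_types[record_id]]
    if data_type.retention_days is None:
        return "indefinite"
    return f"{data_type.retention_days} days"


for record_id in record_types:
    print(record_id, "->", retention_label(record_id))
```

Changing a retention period then means editing one schedule entry, and every record in that group follows.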
Info-Tech Insight
Mistakes do happen. Be sure to review the records within each data type to ensure no important legal or regulatory stipulations will interfere with your plans to treat all this data the same.
Personal data retention is part of the overall data retention program
Examples of personal data include:
Traditional PII: Personally identifiable information | Personal data: Any information relating to an identified or identifiable person | Sensitive personal data: Special categories of personal data (some regulations, like GDPR, expand their scope to include these) |
Full name (if not common) | First, middle, and last names | Biometrics data: retinal scans, voice signatures, or facial geometry |
Home address | IP address | Health information: patient identification number or health records |
Date of birth | Email address or another online identifier | Political opinions |
Social Security number | Social media post | Trade union membership |
Banking information | Location data | Sexual orientation and/or gender identity |
Passport number | Photograph | Religious and/or philosophical beliefs |
Etc. | Etc. | Ethnic origin and/or race |