Build a Minimum Viable Product for Data Classification With Microsoft 365

Resources are the primary obstacle to getting a foothold in O365 governance, whether the shortfall is funding or FTEs.
Data is segmented and is difficult to analyze when you can’t see it or manage the relationships between sources.
Organizations expect results early and quickly, but building a proper data classification framework can take more than two years, and the business can't wait that long.

Our Advice

Critical Insight

Data classification is the lynchpin to ANY effective governance of O/M365 and your objective is to navigate through this easily and effectively and build a robust, secure, and viable governance model.
Start your journey by identifying what and where your data is and how much data you have. You need to understand what sensitive data you have and where it is stored before you can protect it or govern that data.
Ensure there is a high-level leader who is the champion of the governance objective.

Impact and Result

The least-complex sensitivity labels are the building blocks of compliance and security in your data management schema; they are your foundational steps.

Build a Minimum Viable Product for Data Classification With Microsoft 365 Research & Tools

1. Build a Data Classification MVP for M365 Deck – A guide for how to build a minimum-viable product for data classification that end users will actually use.

Discover where your data resides, what governance helps you do, and what types of data you're classifying. Then build your data and security protection baselines for your retention policy, sensitivity labels, workload containers, and both forced and unforced policies.

Kick-start your governance with data classification users will actually use!

Executive Summary

Info-Tech Insight

Creating an MVP gets you started in data governance
Information protection and governance are not something you do once and then you are done. It is a constant process where you start with the basics (a minimum-viable product or MVP) and enhance your schema over time. The objective of the MVP is reducing obstacles to establishing an initial governance position, and then enabling rapid development of the solution to address a variety of real risks, including data loss prevention (DLP), data retention, legal holds, and data labeling.
Define your information and protection strategy
The initial strategy is to look across your organization and identify your customer data, regulatory data, and sensitive information. A successful data protection strategy includes lifecycle management, risk management, data protection policies, and DLP. Keep all key stakeholders in the loop, keep track of all available data, and conduct a risk analysis early. Remember, data is your highest-valued intangible asset.
Planning and resourcing are central to getting started on MVP
A governance plan and governance decisions are your initial focus. Create a team of stakeholders that includes IT and business leaders (including Legal, Finance, HR, and Risk), and ensure there is a top-level leader who champions the governance objective. That objective is to keep your data safe, secure, and not prone to leakage or theft, and to maintain confidentiality where it is warranted.

Your Challenge

Today, the amount of data companies are gathering is growing at an explosive rate. New tools are enabling unforeseen channels and ways of collaborating.
Combined with increased regulatory oversight and reporting obligations, this makes the discovery and management of data a massive undertaking. IT can’t find and protect the data when the business has difficulty defining its data.
The challenge is to build a framework that can easily categorize and classify data yet still provides enough regulatory compliance and granularity to be useful. It also has to be done now, because tomorrow is too late.

Common Obstacles

Data governance has several obstacles that impact a successful launch, especially if governing M365 is not a planned strategy. Below are some of the more common obstacles:

Resources are the primary obstacle to starting O365 governance, whether it is funding or people.
Data is segmented and is difficult to analyze when you can’t see it or manage the relationships between sources.
Organizations expect results early and quickly, but building a “proper data classification framework” is a 2+ year project, and the business can't wait that long.

Info-Tech’s Approach

Start with the basics: build a minimum-viable product (MVP) to get started on the path to sustainable governance.
Identify what and where your data resides, how much data you have, and understand what sensitive data needs to be protected.
Create your team of stakeholders, including Legal, records managers, and privacy officers. Remember, they own the data and should manage it.
Categorization comes before classification, and discovery comes before categorization. Use easy-to-understand terms like high, medium, or low risk.

Info-Tech Insight

Data classification is the lynchpin to any effective governance of O/M365 and your objective is to navigate through this easily and effectively and build a robust, secure, and viable governance model. Start your journey by identifying what and where your data is and how much data you have. You need to understand what sensitive data you have and where it is stored before you can protect or govern it. Ensure there is a high-level leader who is the champion of the governance objectives. Data classification fulfills the governance objectives of risk mitigation, governance and compliance, efficiency and optimization, and analytics.

Questions you need to ask

Four key questions to kick off your MVP.

1. Know Your Data

Do you know where your critical and sensitive data resides and what is being done with it?

Trying to understand where your information is can be a significant project.

2. Protect Your Data

Do you have control of your data as it traverses across the organization and externally to partners?

You want to protect information wherever it goes, using encryption and similar controls.

3. Prevent Data Loss

Are you able to detect unsafe activities and prevent the sharing of sensitive information?

Data loss prevention (DLP) is the practice of detecting and preventing data breaches, exfiltration, or unwanted destruction of sensitive data.

4. Govern Your Data

Are you using multiple solutions (or any) to classify, label, and protect sensitive data?

Many organizations use more than one solution to protect and govern their data, making it difficult to determine if there are any coverage gaps.

Classification tiers

Build your schema.

Info-Tech Insight

Deciding on how granular you go into data classification will chiefly be governed by what industry you are in and your regulatory obligations – the more highly regulated your industry, the more classification levels you will be mandated to enforce. The more complexity you introduce into your organization, the more operational overhead both in cost and resources you will have to endure and build.

Microsoft MIP Topology

Microsoft Information Protection (MIP), which is Microsoft’s Data Classification Services, is the key to achieving your governance goals. Without an MVP, data classification will be overwhelming; simplifying is the first step in achieving governance.

A diagram of multiple offerings all connected to 'MIP Data Classification Service'. Circled is 'Sensitivity Labels' with an arrow pointing back to 'MIP' at the center.
(Source: Microsoft, “Microsoft Purview compliance portal”)

Info-Tech Insight

The least-complex sensitivity labels are the building blocks of compliance and security in your data management schema; they are your foundational steps.

MVP RACI Chart

Data governance is a "takes a whole village" kind of effort.

Clarify who is expected to do what with a RACI chart.

	End User	M365 Administrator	Security/Compliance	Data Owner
Define classification divisions		R	A
Apply classification label to data – at point of creation	A	R
Apply classification label to data – legacy items		R	A
Map classification divisions to relevant policies		R	A
Define governance objectives		R	A
Backup		R		A
Retention		R		A
Establish minimum baseline		A	R
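
A RACI chart is easy to get subtly wrong, for example by leaving an activity without an accountable owner. The sketch below is one way to check the chart above for that. The column placement of each R and A is inferred from the flattened table, so treat the assignments as an assumption, and the function name `raci_problems` is hypothetical.

```python
# Roles from the RACI chart header.
ROLES = ("End User", "M365 Administrator", "Security/Compliance", "Data Owner")

# Assignments transcribed from the chart; column positions are an assumption.
RACI = {
    "Define classification divisions": {"M365 Administrator": "R", "Security/Compliance": "A"},
    "Apply classification label to data - at point of creation": {"End User": "A", "M365 Administrator": "R"},
    "Apply classification label to data - legacy items": {"M365 Administrator": "R", "Security/Compliance": "A"},
    "Map classification divisions to relevant policies": {"M365 Administrator": "R", "Security/Compliance": "A"},
    "Define governance objectives": {"M365 Administrator": "R", "Security/Compliance": "A"},
    "Backup": {"M365 Administrator": "R", "Data Owner": "A"},
    "Retention": {"M365 Administrator": "R", "Data Owner": "A"},
    "Establish minimum baseline": {"M365 Administrator": "A", "Security/Compliance": "R"},
}


def raci_problems(chart):
    """Return a list of problems; an empty list means the chart is consistent."""
    problems = []
    for activity, assignments in chart.items():
        unknown = sorted(set(assignments) - set(ROLES))
        if unknown:
            problems.append(f"{activity}: unknown role(s) {unknown}")
        codes = list(assignments.values())
        # Each activity should have exactly one Accountable party...
        if codes.count("A") != 1:
            problems.append(f"{activity}: expected exactly one 'A', found {codes.count('A')}")
        # ...and at least one Responsible party doing the work.
        if codes.count("R") < 1:
            problems.append(f"{activity}: no 'R' assigned")
    return problems


if __name__ == "__main__":
    for line in raci_problems(RACI) or ["RACI chart is consistent"]:
        print(line)
```

Running the check whenever the chart changes keeps "who is accountable" from silently disappearing as activities are added.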

What and where your data resides

Data types that require classification.

M365 Workload Containers

Email Attachments	Site Collections, Sites	Sites	Project Databases
Contacts	Teams and Group Site Collections, Sites	Libraries and Lists	Sites
Metadata	Libraries and Lists	Documents, Versions	Libraries and Lists
Teams Conversations	Documents, Versions	Metadata	Documents, Versions
Teams Chats	Metadata	Permissions: Internal Sharing, External Sharing	Metadata
	Permissions: Internal Sharing, External Sharing	Files Shared via Teams Chats	Permissions: Internal Sharing, External Sharing

Info-Tech Insight

Knowing where your data resides will ensure you do not miss any applicable data that needs to be classified. These are examples of the workload containers; you may have others.

Discover and classify on-premises files using AIP

AIP helps you manage sensitive data prior to migrating to Office 365:

Use discover mode to identify and report on files containing sensitive data.
Use enforce mode to automatically classify, label, and protect files with sensitive data.

Can be configured to scan:

SMB files
SharePoint Server 2016, 2013

Map your network and find over-exposed file shares.
Protect files using MIP encryption.
Inspect the content in file repositories and discover sensitive information.
Classify and label files per MIP policy.

Azure Information Protection scanner helps discover, classify, label, and protect sensitive information in on-premises file servers. You can run the scanner and get immediate insight into risks with on-premises data. Discover mode helps you identify and report on files containing sensitive data (Microsoft Inside Track and CIAOPS, 2022). Enforce mode automatically classifies, labels, and protects files with sensitive data.
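
To make the difference between the two modes concrete, here is a minimal sketch of the idea. It is not the AIP scanner and does not call any Microsoft API: the `discover` and `enforce` functions, the `.txt` filter, and the simple Social Security Number pattern are all illustrative assumptions. A report-only pass finds files with sensitive matches, and a second step decides which label each flagged file would receive.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical pattern for SSN-like strings. Real sensitive information types
# are richer (checksums, keyword proximity, confidence levels).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def discover(root: Path) -> dict:
    """Report-only pass: map each text file (relative path) to its match count."""
    findings = {}
    for path in sorted(root.rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = len(SSN_PATTERN.findall(text))
        if hits:
            findings[str(path.relative_to(root))] = hits
    return findings


def enforce(findings: dict, label: str = "Confidential") -> dict:
    """Stand-in for enforce mode: record the label each flagged file would get."""
    return {name: label for name in findings}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "hr.txt").write_text("Employee SSN 123-45-6789", encoding="utf-8")
        (root / "memo.txt").write_text("Lunch is at noon", encoding="utf-8")
        report = discover(root)
        print("discover:", report)
        print("enforce:", enforce(report))
```

Starting with the report-only pass mirrors the advice above: get insight into where sensitive data sits before anything is labeled or protected automatically.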

Info-Tech Insight

Any asset deployed to the cloud must have approved data classification. Enforcing this policy is a must to control your data.

Understanding governance

Microsoft Information Governance

Information Governance

Retention policies for workloads
Inactive and archive mailboxes

Records Management

Retention labels for items
Disposition review

Retention and Deletion

Connectors for Third-Party Data

Information governance manages your content lifecycle using solutions to import, store, and classify business-critical data so you can keep what you need and delete what you do not. Backup should not be used as a retention methodology since information governance is managed as a “living entity” and backup is a stored information block that is “suspended in time.”

Records management uses intelligent classification to automate and simplify the retention schedule for regulatory, legal, and business-critical records in your organization. It is for that discrete set of content that needs to be immutable.

(Source: Microsoft, “Microsoft Purview compliance portal”)

Retention and backup policy decision

Retention is not backup.

Info-Tech Insight

Retention is not backup. Retention means something different: “the content must be available for discovery and legal document production while being able to defend its provenance, chain of custody, and its deletion or destruction” (AvePoint Blog, 2021).

Microsoft Responsibility (Microsoft Protection) Weeks to Months	Customer Responsibility (DLP, Backup, Retention Policy) Months to Years
Loss of service due to natural disaster or data center outage	Loss of data due to departing employees or deactivated accounts
Loss of service due to hardware or infrastructure failure	Loss of data due to malicious insiders or hackers deleting content
Short-term (30 days) user error with recycle bin/version history (including OneDrive “File Restore”)	Loss of data due to malware or ransomware
Short-term (14 days) administrative error with soft-delete for groups, mailboxes, or service-led rollback	Recovery from prolonged outages
	Long-term accidental deletion coverage with selective rollback

Understand retention policy

What are retention policies used for? Why do you need them as part of your MVP?

Do not confuse retention labels and policies with backup.

Remember: “retention [policies are] auto-applied whereas retention label policies are only applied if the content is tagged with the associated retention label” (AvePoint Blog, 2021).

E-discovery tool retention policies are not turned on automatically.

Retention policies are not a backup tool – when you activate this feature you are unable to delete anyone.

“Data retention policy tools enable a business to:

“Decide proactively whether to retain content, delete content, or retain and then delete the content when needed.
“Apply a policy to all content or just content meeting certain conditions, such as items with specific keywords or specific types of sensitive information.
“Apply a single policy to the entire organization or specific locations or users.
“Maintain discoverability of content for lawyers and auditors, while protecting it from change or access by other users. […] ‘Retention Policies’ are different than ‘Retention Label Policies’ – they do the same thing – but a retention policy is auto-applied, whereas retention label policies are only applied if the content is tagged with the associated retention label.

“It is also important to remember that ‘Retention Label Policies’ do not move a copy of the content to the ‘Preservation Holds’ folder until the content under policy is changed next.” (Source: AvePoint Blog, 2021)
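
The distinction quoted above, that a retention policy is auto-applied while a retention label policy only applies to tagged content, can be sketched as a small model. This is an illustration only, not the Microsoft 365 implementation: the class names, the label "Legal-7y", the day counts, and the "longest retention wins" rule in `effective_retention_days` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    location: str                          # e.g. "SharePoint" or "Exchange"
    retention_label: Optional[str] = None  # set only if someone tagged the item


@dataclass
class RetentionPolicy:
    """Auto-applied: covers every item in the locations it targets."""
    locations: set
    retain_days: int

    def applies_to(self, item: Item) -> bool:
        return item.location in self.locations


@dataclass
class RetentionLabelPolicy:
    """Applies only to items tagged with the associated retention label."""
    label: str
    retain_days: int

    def applies_to(self, item: Item) -> bool:
        return item.retention_label == self.label


def effective_retention_days(item: Item, policies) -> Optional[int]:
    """Assumed rule for this sketch: the longest applicable retention wins."""
    periods = [p.retain_days for p in policies if p.applies_to(item)]
    return max(periods) if periods else None


if __name__ == "__main__":
    policies = [
        RetentionPolicy(locations={"SharePoint", "Exchange"}, retain_days=365),
        RetentionLabelPolicy(label="Legal-7y", retain_days=2555),
    ]
    tagged = Item("contract.docx", "SharePoint", retention_label="Legal-7y")
    untagged = Item("notes.docx", "SharePoint")
    print(effective_retention_days(tagged, policies))
    print(effective_retention_days(untagged, policies))
```

The untagged item is still covered by the location-wide policy, which is why an auto-applied retention policy is a sensible first step for an MVP; label policies add precision later, once users are tagging content.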

Definitions

Data classification is a focused term used in the fields of cybersecurity and information governance to describe the process of identifying, categorizing, and protecting content according to its sensitivity or impact level. In its most basic form, data classification is a means of protecting your data from unauthorized disclosure, alteration, or destruction based on how sensitive or impactful it is.

Once data is classified, you can create policies; sensitive information types, trainable classifiers, and sensitivity labels function as inputs to policies. Policies define behaviors, such as whether there is a default label, whether labeling is mandatory, which locations the label applies to, and under what conditions. A policy is created when you configure Microsoft 365 to publish or automatically apply sensitive information types, trainable classifiers, or labels.

Sensitivity label policies show one or more labels to Office apps (like Outlook and Word), SharePoint sites, and Office 365 groups. Once published, users can apply the labels to protect their content.

Data loss prevention (DLP) policies help identify and protect your organization's sensitive info (Microsoft Docs, April 2022). For example, you can set up policies to help make sure information in email and documents is not shared with the wrong people. DLP policies can use sensitive information types and retention labels to identify content containing information that might need protection.
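
As a rough illustration of how a sensitive information type feeds a DLP decision, the sketch below detects credit card numbers with a pattern plus the Luhn checksum and then picks an action. It is not how Microsoft 365 implements DLP; the function names, the pattern, and the rule "block external sharing, warn on internal sharing" are assumptions chosen for the example.

```python
import re

# Candidate runs of 13-19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit strings that look like card numbers and pass the checksum."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.append(digits)
    return hits


def dlp_action(text: str, external_recipient: bool) -> str:
    """Hypothetical policy: block external sharing of card data, warn internally."""
    if not find_card_numbers(text):
        return "allow"
    return "block" if external_recipient else "warn"


if __name__ == "__main__":
    message = "Please charge card 4111 1111 1111 1111 for the renewal."
    print(dlp_action(message, external_recipient=True))
    print(dlp_action(message, external_recipient=False))
```

The checksum step matters: matching on digit patterns alone would flag order numbers and phone lists, and noisy warnings are a quick way to lose end-user buy-in.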

Retention policies and retention label policies help you keep what you want and get rid of what you do not. They also play a significant role in records management.

Data examples for MVP classification

Examples of the type of data you consider to be Confidential, Internal, or Public.
This will help you determine what to classify and where it is.

Internal Personal, Employment, and Job Performance Data

Social Security Number
Date of birth
Marital status
Job application data
Mailing address
Resume
Background checks
Interview notes
Employment contract
Pay rate
Bonuses
Benefits
Performance reviews
Disciplinary notes or warnings

Confidential Information

Business and marketing plans
Company initiatives
Customer information and lists
Information relating to intellectual property
Invention or patent
Research data
Passwords and IT-related information
Information received from third parties
Company financial account information
Social Security Number
Payroll and personnel records
Health information
Self-restricted personal data
Credit card information

Internal Data

Sales data
Website data
Customer information
Job application data
Financial data
Marketing data
Resource data

Public Data

Press releases
Job descriptions
Marketing material intended for general public
Research publications
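
Some items in the lists above appear under more than one heading (Social Security Number and job application data, for instance). One simple way to handle that, sketched below, is to let the most restrictive tier win. The ranking, the mapping of the personal and employment list to the Internal tier, the default tier for unknown items, and the function names are all assumptions for illustration; only a few of the listed examples are included.

```python
# Higher rank means more restrictive.
RANK = {"Public": 0, "Internal": 1, "Confidential": 2}

# A subset of the examples above. The personal/employment list is treated as
# Internal here, which is an assumption of this sketch.
EXAMPLES = {
    "Confidential": ["Social Security Number", "Credit card information", "Health information"],
    "Internal": ["Social Security Number", "Job application data", "Sales data"],
    "Public": ["Press releases", "Job descriptions"],
}


def build_lookup(examples: dict) -> dict:
    """Map each data element to its most restrictive tier."""
    lookup = {}
    for tier, names in examples.items():
        for name in names:
            key = name.lower()
            current = lookup.get(key)
            if current is None or RANK[tier] > RANK[current]:
                lookup[key] = tier
    return lookup


def classify(name: str, lookup: dict, default: str = "Internal") -> str:
    """Look up a data element; unknown items fall back to the default tier."""
    return lookup.get(name.strip().lower(), default)


if __name__ == "__main__":
    lookup = build_lookup(EXAMPLES)
    for element in ("Social Security Number", "Sales data", "Press releases"):
        print(element, "->", classify(element, lookup))
```

Resolving overlaps toward the stricter tier keeps the schema at three easy-to-explain levels without leaving sensitive items under-protected.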

New container sensitivity labels (MIP)

	Public	Private
Privacy	Membership to group is open; anyone can join; “Everyone except external guest” ACL on site; content available in search to all tenants	Only owner can add members; no access beyond the group membership until someone shares it or changes permissions
	Allowed	Not Allowed
External guest policy	Membership to group is open; anyone can join “Everyone except external guest” ACL onsite; content available in search to all tenants	Only owner can add members No access beyond the group membership until someone shares it or changes permissions

What users will see when they create or label a Team/Group/Site

Table of what users will see when they create or label a team/group/site highlighting 'External guest policy' and 'Privacy policy options' as referenced above.
(Source: Microsoft, “Microsoft Purview compliance portal”)

Info-Tech Insights

Why you need sensitivity container labels:

Manage privacy of Teams Sites and M365 Groups
Manage external user access to SPO sites and teams
Manage external sharing from SPO sites
Manage access from unmanaged devices

Data protection and security baselines

Data Protection Baseline

“Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline” (Microsoft Docs, June 2022). This baseline assessment has a set of controls for key regulations and standards for data protection and general data governance. This baseline draws elements primarily from NIST CSF (National Institute of Standards and Technology Cybersecurity Framework) and ISO (International Organization for Standardization) as well as from FedRAMP (Federal Risk and Authorization Management Program) and GDPR (General Data Protection Regulation of the European Union).

Security Baseline

The final stage in M365 governance is security. You need to implement a governance policy that clearly defines storage locations for certain types of data and who has permission to access it. You need to record and track who accesses content and how they share it externally. “Part of your process should involve monitoring unusual external sharing to ensure staff only share documents that they are allowed to” (Rencore, 2021).

Info-Tech Insights

Controls are already in place to set data protection policy. This assists in the MVP activities.
Finally, you need to set your security baseline to ensure proper permissions are in place.

Prerequisite baseline

Security

MFA or SSO to access from anywhere, any device
Banned password list
BYOD sync with corporate network

Users

Sign out inactive users automatically
Enable guest users
External sharing
Block client forwarding rules

Resources

Account lockout threshold
OneDrive
SharePoint

Controls

Sensitivity labels, retention labels and policies, DLP
Mobile application management policy

Building baselines

Sensitivity Profiles: Public, Internal, Confidential; Subcategory: Highly Confidential

Microsoft 365 Collaboration Protection Profiles

Sensitivity	Public	External Collaboration	Internal	Highly Confidential
Description	Data that is specifically prepared for public consumption	Not approved for public consumption, but OK for external collaboration	External collaboration highly discouraged and must be justified	Data of the highest sensitivity: avoid oversharing, internal collaboration only
Label details	No content marking; no encryption; public site; external collaboration allowed; unmanaged devices: allow full access	No content marking; no encryption; private site; external collaboration allowed; unmanaged devices: allow full access	Content marking; encryption; private site; external collaboration allowed but monitored; unmanaged devices: limited web access	Content marking; encryption; private site; external collaboration disabled; unmanaged devices: block access
Teams or Site details	Public Team or Site; open discovery; guests are allowed	Private Team or Site; members are invited; guests are allowed		Private Team or Site; members are invited; guests are not allowed
DLP	None		Warn	Block

Please Note: Global/Compliance Admins go to the 365 Groups platform, the compliance center (Purview), and Teams services (Source: Microsoft Documentation, “Microsoft Purview compliance documentation”)
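
The profile table can also be captured as data, which makes it easier to review and to reason about what happens when content is shared externally. The sketch below is a hedged illustration: the `ProtectionProfile` fields and the `external_share_outcome` rule are assumptions layered on the table, and the blank DLP cell for External Collaboration is kept as `None` rather than filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProtectionProfile:
    content_marking: bool
    encryption: bool
    site_privacy: str
    external_collaboration: str
    unmanaged_devices: str
    dlp: Optional[str]  # None where the table leaves the DLP cell blank


# Values transcribed from the collaboration protection profiles table.
PROFILES = {
    "Public": ProtectionProfile(False, False, "public", "allowed", "full access", "none"),
    "External Collaboration": ProtectionProfile(False, False, "private", "allowed", "full access", None),
    "Internal": ProtectionProfile(True, True, "private", "allowed but monitored", "limited web access", "warn"),
    "Highly Confidential": ProtectionProfile(True, True, "private", "disabled", "block access", "block"),
}


def external_share_outcome(sensitivity: str) -> str:
    """Hypothetical rule: what happens when labeled content is shared externally."""
    profile = PROFILES[sensitivity]
    if profile.external_collaboration == "disabled" or profile.dlp == "block":
        return "block"
    if profile.dlp == "warn":
        return "warn"
    return "allow"


if __name__ == "__main__":
    for name in PROFILES:
        print(name, "->", external_share_outcome(name))
```

Keeping the profiles in one reviewable structure helps stakeholders such as Legal and Risk sign off on the baseline before anything is configured in the tenant.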

Info-Tech Insights

Building baseline profiles will be a part of your MVP. You will understand what type of information you are addressing and label it accordingly.
Sensitivity labels are a way to classify your organization's data in a way that specifies how sensitive the data is. This helps you decrease risks in sharing information that shouldn't be accessible to anyone outside your organization or department. Applying sensitivity labels allows you to protect all your data easily.

MVP activities

PRIMARY ACTIVITIES

Define Your Governance
The objective of the MVP is reducing barriers to establishing an initial governance position, and then enabling rapid progression of the solution to address a variety of tangible risks, including DLP, data retention, legal holds, and labeling. Decide on your classification labels early.
Data Discovery and Management
AIP (Azure Information Protection) scanner helps discover, classify, label, and protect sensitive information in on-premises file servers. You can run the scanner and get immediate insight into risks with on-premises data.
Baseline Setup
Building baseline profiles will be a part of your MVP. You will understand what type of information you are addressing and label it accordingly. Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline.
Default M365 Settings
Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline. This baseline assessment has a set of controls for key regulations and standards for data protection and general data governance.

SUPPORT ACTIVITIES

Retention Policy
Retention policy is auto-applied. Decide whether to retain content, delete content, or retain and then delete the content.
Sensitivity Labels
Automatically enforce policies on groups through labels; classify groups.
Workload Containers
M365: SharePoint, Teams, OneDrive, and Exchange, where your data is stored for labels and policies.
Unforced Policies
Written policies that are not enforceable by controls in Compliance Manager, such as an acceptable use policy.
Forced Policies
Restrict sharing controls to outside organizations. Enforce prefix or suffix to group or team names.

Outputs: Categorization, Classification, MVP
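
The forced policy of enforcing a prefix or suffix on group or team names is simple enough to sketch. In a real tenant this would be a configuration setting rather than custom code; the function below only illustrates the rule, and the function name and the "PRJ-" and "-INT" values are hypothetical.

```python
def apply_naming_policy(requested: str, prefix: str = "", suffix: str = "") -> str:
    """Return the requested group or team name with the required prefix and suffix.

    The function is idempotent: a name that already complies is returned unchanged.
    """
    name = requested.strip()
    if prefix and not name.startswith(prefix):
        name = prefix + name
    if suffix and not name.endswith(suffix):
        name = name + suffix
    return name


if __name__ == "__main__":
    # Hypothetical convention: project teams get a "PRJ-" prefix and a suffix
    # that signals the sensitivity of the container.
    print(apply_naming_policy("Website Redesign", prefix="PRJ-", suffix="-INT"))
```

Encoding sensitivity in the name is a design choice, not a requirement; it gives users a visible cue about the container they are working in even before they look at its label.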

ACME Company MVP for M/O365

PRIMARY ACTIVITIES

Define Your Governance
Focus on ability to use legal hold and GDPR compliance.
Data Discovery and Management
Three classification levels (public, internal, confidential), which are applied by the user when data is created. Same three levels are used for AIP to scan legacy sources.
Baseline Setup
All data must at least be classified before it is uploaded to an M/O365 cloud service.
Default M365 Settings
Turn on templates 1 8 the letter q and the number z

SUPPORT ACTIVITIES

Retention Policy
Retention policy is auto-applied. Decide whether to retain content, delete content, or retain and then delete the content.
Sensitivity Labels
Automatically enforce policies on groups through labels; classify groups.
Workload Containers
M365: SharePoint, Teams, OneDrive, and Exchange, where your data is stored for labels and policies.
Unforced Policies
Written policies that are not enforceable by controls in Compliance Manager, such as an acceptable use policy.
Forced Policies
Restrict sharing controls to outside organizations. Enforce prefix or suffix to group or team names.

Outputs: Categorization, Classification, MVP

About Info-Tech

Info-Tech Research Group is the world’s fastest-growing information technology research and advisory company, proudly serving over 30,000 IT professionals.

We produce unbiased and highly relevant research to help CIOs and IT leaders make strategic, timely, and well-informed decisions. We partner closely with IT teams to provide everything they need, from actionable tools to analyst guidance, ensuring they deliver measurable results for their organizations.

What Is a Blueprint?

A blueprint is designed to be a roadmap, containing a methodology and the tools and templates you need to solve your IT problems.

Each blueprint can be accompanied by a Guided Implementation that provides you access to our world-class analysts to help you get through the project.

Authors

John Donovan

John Annand

Contributors

Björn Erkens, Product Owner, Rencore governance

Search Code: 98939
Last Revised: June 21, 2022

TAGS:

Governance, M365, O365, MVP, Minimum Viable Product, Schema, Data Classification, Azure Purview, eDiscovery, AIP, DLP, Policies, Sensitivity Labels, Retention Policies, Office 365, LFBP
