Data governance
Key Takeaways:
- The top priorities of data governance include data quality, security, compliance, and business value.
- While data governance is about security and quality and involves rules and policies, it’s also meant to help teams (including marketing) access and unlock data value.
- The tools organizations use to manage data matter, and they can make governance easier or more difficult.
What is data governance?
Data governance is the rules, policies, standards, and strategy a company uses to ensure its data is secure, that employees can access and use the data they need, and that data stays updated and accurate. A data governance strategy clearly defines the roles and responsibilities within an organization, including who oversees data and has access to it, and dictates how data can be used.
Data governance provides the structure for collecting, storing, accessing, and using data. Strong governance leads to better data visibility and control: with data protection policies in place, companies can democratize data knowing that only the right people can access it.
Data governance is key to customer data infrastructure because it defines how customer data should be structured and stored.
How organizations store data is also a crucial part of data governance. Keeping customer data in a central location like a cloud data warehouse gives organizations a single source of truth for their data. With a data warehouse, an organization can analyze and understand its data and activate the data from the warehouse. With the right controls in place, the warehouse helps teams safely and securely access data and use it to propel the business forward. That’s strong governance in action.
Why is data governance important?
Data governance is important because it ensures data is stored and used appropriately. It upholds goals like:
- Data quality - Data governance helps organizations keep data updated and accurate so it can be used effectively.
- Data security - Data governance policies ensure data stays protected from leaks, hacks, or attacks. Access control, including user roles, plays an important part.
- Compliance - Data governance provides a framework to ensure data aligns with local and international privacy regulations.
- Business value - Data governance protects data from bad actors and ensures employees who need data can access it to drive improvements and business value, including operational efficiency, higher conversion rates, and more revenue.
When an organization has poor (or non-existent) data governance practices, it leaves its data (including its customers’ private and sensitive information) vulnerable to hackers and major cybersecurity risks. But poor governance also risks reputational damage and significant fines imposed by major data regulations like GDPR and CCPA:
- In effect since May 2018, the General Data Protection Regulation (GDPR) is a European Union regulation that imposes requirements for how and why businesses collect personal data. Fines for the most serious violations can reach 20 million euros or 4% of global annual turnover, whichever is higher.
- The California Consumer Privacy Act (CCPA), which went into effect in 2020, is a California state regulation that upholds consumers’ rights to know and access the personal data that companies collect from them and learn which third parties can access the data. Penalties for non-compliance include fines and potential consumer lawsuits. CCPA raises the bar for organizations regarding identity resolution because fulfilling consumers’ access and deletion requests in practice requires a single view of each customer’s data.
These two regulations are far from the only data regulations, but they’re two of the most prominent. Companies need to understand which regulations apply to their data collection, storage, and access, and build regulatory compliance into their data governance strategy to avoid financial strain and broken trust with customers.
How does data governance differ from data management?
Data governance and data management are closely related, but they’re distinct concepts.
Data governance practices relate to the defined structures, policies, and procedures that dictate how data is handled. In contrast, data management is the implementation of those rules within an organization in real-time. As an analogy, think of data governance as a “recipe” (with detailed ingredients and instructions), while data management is like the process of making the recipe.
Data management is broader than governance. It involves not just the policies and standards that governance is concerned with, but also the architecture and data models used for processing, storage, and security of customer data. For example, data management applies to the processes teams use to activate data with a packaged or composable CDP to launch marketing campaigns.
What are the benefits of data governance?
Data governance practices are essential for any organization, but especially those with vast amounts of customer data, including personal or sensitive information. These governance initiatives offer valuable benefits like the following:
- Ensures data quality by breaking down data silos and centralizing data for a single source of truth, which provides accuracy and fosters collaboration from teams across the company.
- Maintains data security by defining policies and procedures that protect sensitive information. This includes access controls and standards for preventing threat actors from hacking and compromising data.
- Upholds regulatory compliance by implementing processes that help organizations stay in good standing, avoid risks, and prevent non-compliance with potential future regulations. This might include creating guidelines for deleting and updating data, ensuring consumer access to their data, or defining how data can be used.
- Drives business value and contributes to better decision-making by ensuring teams that need access to data (like marketing, sales, or customer support) have the access they need.
- Improves operational efficiency by reducing inaccurate or duplicate data entries through centralized and democratized data management. Data governance practices can prevent multiple people from completing the same data-related task twice or making a decision based on outdated information.
- Reduces data-related risks through documented processes to prevent data exposure to unauthorized viewers (both within and outside a company).
What are the drawbacks and challenges of data governance?
While data governance is an important tool, it can be difficult for organizations to implement and can create operational friction. These challenges include:
- Data bottlenecks, which slow down innovation and project completion because security protocols limit data access for certain teams
- Significant overhead due to the upfront investments of both time and cost to implement a framework
- Scaling the data governance program as the organization and its data grow, and adapting policies to meet the organization’s needs over time
- Balancing data security, compliance, and access control with teams’ needs to access and manage data for business use
- Limiting responsibility for data governance to just one team, like IT, rather than a cross-functional data governance team
- Convincing teams and stakeholders that the importance of data governance outweighs any operational challenges
- Garnering support for the idea that data governance isn’t just about rules and controls, but about unlocking value and access in a secure way
These challenges can slow organizations down as their teams seek to leverage data to drive business value. For instance, if the proper access controls aren’t created, a marketing team might not have access to the data they need to reach and convert customers.
On the other hand, too many teams having access to data is a security or compliance risk and can result in breaches or major fines.
That’s why organizations need a thoughtful data governance strategy that both protects and democratizes data.
Data governance use cases
Data governance is implemented differently across industries. Here are a few examples of how it can look in practice:
- The consumer packaged goods (CPG) industry generates lots of consumer data, especially through online orders and its supply chain. Brands need strong data governance practices to ensure data security and quality across both big-box vendors and individual consumers. That way, their teams can access and use their data to improve supply chain operations, personalize marketing, and embrace omnichannel sales.
- Finance is a heavily regulated industry, especially around data. For instance, the Fair Credit Reporting Act (FCRA) protects information collected by consumer reporting agencies and limits what data can be associated or linked (for example, contact data to credit scores). Institutions or fintech brands that don’t comply face penalties and potential lawsuits under the FCRA. A data governance framework helps companies in the finance industry stay in good standing by securely storing and managing data and dictating how teams leverage data.
- In industries like sports and entertainment, consumers aren’t just buyers — they’re loyal fans. They provide their data to buy tickets, stay updated, or express their love of the brand. Data governance for sports organizations helps them uphold data privacy for their fans while also personalizing outreach to sell tickets or merchandise. Sports brands also need to protect players’ privacy and information, as coaches use tech to collect data on performance.
What is a data governance framework?
A data governance framework is the structure of rules, policies, standards, processes, and tools that make up data governance for an organization. All of the pieces work together, and when teams and employees implement these pieces in tandem, data stays secure, accurate, compliant, and accessible.
Components of a data governance framework
Some of the key components in a data governance framework include:
- Data policies - These well-defined rules clarify the requirements for data usage in the organization, including how it is stored, used, and shared, and who can access it.
- Data quality standards - Data quality standards define how an organization will maintain its data to ensure that it’s accurate, up-to-date, complete, and reliable. These standards may define constraints like how long data is stored, how often it is updated, and how it’s cleansed, validated, or monitored.
- Data stewardship - Data stewardship defines the roles and responsibilities for managing data. It also involves standards for data access to ensure that the appropriate teams can leverage data to drive business value.
- Metadata management - Metadata is data about data, including its context, structure, and origin. Metadata management involves its organization and storage, which needs to follow similar security, integrity, compliance, and quality standards as the data itself.
- Security measures - Security measures are all the safeguards a company puts in place within its data governance strategy to keep data from being destroyed, disseminated, or accessed by unauthorized users. These policies prevent data leaks and attacks and involve access controls, authorization, encryption, and monitoring practices.
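To make the access-control side of these components concrete, here is a minimal Python sketch of a role-based read policy. The dataset names, role names, and the `can_read` helper are all hypothetical and chosen only for illustration. In practice, an organization would usually enforce rules like these through its warehouse's or governance tool's own permission system rather than application code.

```python
from dataclasses import dataclass

# Hypothetical policy table: dataset name -> roles allowed to read it.
READ_POLICY = {
    "customer_profiles": {"data_steward", "marketing_analyst"},
    "payment_details": {"data_steward"},
}


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset


def can_read(user: User, dataset: str) -> bool:
    """Return True only if one of the user's roles may read the dataset."""
    # Datasets without an explicit policy are denied by default.
    allowed = READ_POLICY.get(dataset, set())
    return bool(user.roles & allowed)


marketer = User("ana", frozenset({"marketing_analyst"}))
print(can_read(marketer, "customer_profiles"))  # True
print(can_read(marketer, "payment_details"))    # False
```

Denying access to any dataset that has no policy entry is a deliberate choice in this sketch: it keeps new datasets private until a data owner decides who should see them.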
Some organizations build their data governance framework around the Medallion Architecture because it guides policies around data storage. This structure helps consumers know what to expect regarding their data. The Medallion Architecture has three tiers:
- The Bronze Tier, which contains raw data brought from external sources and systems.
- The Silver Tier, which contains cleansed and conformed data for self-serve analytics and machine learning.
- The Gold Tier, which contains data that’s ready for consumption in project-specific databases.
This tiered structure allows organizations to manage their data governance by ensuring secure access to teams that need it and allowing them to flexibly control access based on how the data will be used. Meanwhile, companies can reassure consumers that their data is safe in every tier.
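The flow between tiers can be sketched in a few lines of Python. This is an illustrative toy, not a reference implementation: the record fields (`email`, `order_total`) and the `to_silver` and `to_gold` helpers are assumptions, and real pipelines would typically run as warehouse or lakehouse jobs rather than over in-memory lists.

```python
# Bronze: raw records as they might arrive from a source system.
bronze = [
    {"email": " Ana@Example.com ", "order_total": "40.00"},
    {"email": "ana@example.com", "order_total": "10.50"},
    {"email": "", "order_total": "5.00"},               # missing email
    {"email": "bo@example.com", "order_total": "n/a"},  # unparseable amount
]


def to_silver(rows):
    """Cleanse and conform: normalize emails, parse amounts, drop bad rows."""
    silver = []
    for row in rows:
        email = row["email"].strip().lower()
        try:
            total = float(row["order_total"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if email:
            silver.append({"email": email, "order_total": total})
    return silver


def to_gold(rows):
    """Aggregate to a consumption-ready view: total spend per customer."""
    spend = {}
    for row in rows:
        spend[row["email"]] = spend.get(row["email"], 0.0) + row["order_total"]
    return spend


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'ana@example.com': 50.5}
```

Because each tier is a separate dataset, access rules can differ per tier. For example, an organization might restrict raw Bronze data to engineers while opening the Gold view to business teams.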
Data governance best practices
How can an organization implement data governance as efficiently and effectively as possible? Here are some of the data governance best practices industry leaders recommend:
Define roles and responsibilities
One of the core data governance principles is to establish clear data ownership. For instance, data stewards oversee the quality and security of particular datasets, and it’s critical to establish these roles as part of defining policies and data structures.
Consider designating someone to oversee data democratization, who advocates for agility and access within data governance initiatives — that way, data can drive business and stakeholder value.
While data governance doesn’t have to be implemented top-down, the chief data officer (or a similar role) should drive the effort forward and educate stakeholders about the value of data governance.
Educate teams on the real-time benefits of governance
Because governance involves controls and restrictions, implementation often encounters internal resistance. But data leaders can overcome those challenges by focusing on how governance serves business goals. While data governance does improve an organization’s defensive posture, it’s also intended to let organizations unlock their data and do more with it. Focus on educating teams about how governance can provide greater data access.
Create data quality standards
Data quality standards ensure everyone knows the expectations and goals for collecting and storing data. They’re not just a way of controlling the data but also a way of keeping it high-quality to benefit the business.
For example, one framework that organizations can use is the FAIR Guiding Principles, which were originally published for scientific data management and stewardship. Data should be:
- Findable
- Accessible
- Interoperable
- Reusable
These guidelines can help an organization uphold data quality requirements to ensure that the data supports the business goals.
Implement regular data governance training
Data governance is an ongoing process. Create a cadence of data governance training, both for those who implement the framework and for those who access and leverage data. This training should remind teams of procedures and requirements while updating them on any changes to regulations or to how data is stored.
Balance agility with structure
Build a data governance program that includes policies around data access as well as security. Integrate these two goals so that stakeholders don’t see governance as burdensome or restrictive but as a way to access data safely and strategically.
When done right, governance is about unlocking data and fostering secure collaboration to use it. This may require a structure like the Medallion Architecture that affords some teams freedom in test environments while imposing stricter policies on production environments.
Build on existing data management
Organizations can drive smoother governance implementation by highlighting and building on existing processes. Rather than taking a top-down approach to governance, leverage teams’ current approaches to data. This can make governance seem less demanding, so it becomes an extension of what users already do.
Find opportunities to automate
To avoid overwhelming teams, look for ways to automate manual processes, especially when it comes to large volumes of data and enterprise data governance. Use artificial intelligence and other emerging technologies to automate and streamline data quality, stewardship, and access control policies. Try to make the user experience as intuitive as possible so that governance becomes embedded in the data engineering lifecycle.
How to implement data governance
Following a set data governance process can help organizations ensure an effective data governance program:
Orient teams to the concept of governance
Start by showing teams — those that will implement the framework and teams that access the data — why data governance is important and the long-term value that will come from it.
Establish policies
Create policies for the data with input from multiple business units as well as security, compliance, and quality leaders. Policy types might include security, access, privacy, retention, and quality.
Create data management processes
Define the processes that teams will use to maintain data in keeping with the established policies, such as ongoing processes to clean and remove data or confirm access and authorization.
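As one example of such a process, a retention policy can be expressed as a small routine that separates records still within their retention window from those due for removal. The sketch below is a minimal illustration: the dataset names, the retention periods, and the `apply_retention` helper are hypothetical, and actual retention windows should come from the organization's policies and applicable regulations.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention policy: dataset name -> days a record may be kept.
RETENTION_DAYS = {"web_events": 90, "support_tickets": 365}


def apply_retention(records, dataset, now):
    """Split records into (kept, expired) using the dataset's retention window."""
    cutoff = now - timedelta(days=RETENTION_DAYS[dataset])
    kept = [r for r in records if r["created_at"] >= cutoff]
    expired = [r for r in records if r["created_at"] < cutoff]
    return kept, expired


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
events = [
    {"id": 1, "created_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
kept, expired = apply_retention(events, "web_events", now)
print([r["id"] for r in kept], [r["id"] for r in expired])  # [1] [2]
```

Returning the expired records instead of deleting them silently lets a team log or review what is about to be removed before the deletion actually runs.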
Monitor data quality
Use data quality tools that can observe issues and ensure that all standards and requirements are continually met.
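Dedicated tools handle this at scale, but the underlying checks are simple to picture. The following Python sketch computes two common measures, completeness and freshness, over a handful of records. The required fields, the 30-day freshness threshold, and the `quality_report` function are assumptions made for this example and do not reflect any specific tool's behavior.

```python
from datetime import date

# Hypothetical standards: fields every record needs, and how recent
# the last update must be for a record to count as fresh.
REQUIRED_FIELDS = ("customer_id", "email", "updated_on")
MAX_AGE_DAYS = 30


def quality_report(rows, today):
    """Return the share of rows that are complete and the share that are fresh."""
    total = len(rows)
    if total == 0:
        return {"completeness": 0.0, "freshness": 0.0}
    complete = sum(
        1 for row in rows if all(row.get(field) for field in REQUIRED_FIELDS)
    )
    fresh = sum(
        1
        for row in rows
        if row.get("updated_on")
        and (today - row["updated_on"]).days <= MAX_AGE_DAYS
    )
    return {"completeness": complete / total, "freshness": fresh / total}


rows = [
    {"customer_id": "c1", "email": "a@example.com", "updated_on": date(2024, 5, 20)},
    {"customer_id": "c2", "email": None, "updated_on": date(2024, 2, 1)},
    {"customer_id": "c3", "email": "c@example.com", "updated_on": date(2024, 1, 10)},
    {"customer_id": "c4", "email": "d@example.com", "updated_on": date(2024, 5, 30)},
]
report = quality_report(rows, today=date(2024, 6, 1))
print(report)  # {'completeness': 0.75, 'freshness': 0.5}
```

A team could run a check like this on a schedule and raise an alert when either score falls below the threshold set in its data quality standards.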
Build a cadence of ongoing communication
After the initial implementation, conduct regular audits to ensure that data governance is successful and is solving problems for the organization. Keep track of poor data quality and security issues, and update policies to avoid them in the future.
Roles that help manage and maintain data governance
Data governance teams are not siloed within one department — a wide range of stakeholders and roles participate in its implementation. These roles include:
- Data steward - Data stewards are responsible for the quality and security of specific datasets within an organization, and they play a central role in implementing and enforcing data governance policies and procedures. They collaborate with other roles like data owners and custodians to uphold data requirements.
- Compliance manager - The data compliance manager is responsible for understanding current laws, regulations, and standards as they apply to the organization’s data. They set and enforce policies that help the organization stay compliant and monitor upcoming changes to regulations. They work closely with legal and regulatory teams.
- Solutions architect - A solutions architect oversees and implements an organization’s tools for data governance, including platforms for data storage, compliance, and monitoring. Before implementation, they gauge organizational needs across teams and address those needs as they design the components of the data governance program.
- Data owner - Typically a senior-level role, data owners make decisions about data access and security, and they oversee many of the policies involved with data governance. They’re ultimately responsible for setting many requirements within the data governance program, so their decisions impact every other stakeholder.
- Data democratization lead - This role ensures teams have appropriate access to data. They help break down silos, bottlenecks, and barriers that would prevent employees from leveraging data, and they prioritize self-service data access and data literacy.
- Data custodian - Data custodians play a central role in storing and transporting data, and they implement the technical elements involved in data management. They’re responsible for ongoing data protection and access in keeping with data governance standards and policies. While the custodian role shares some similarities with the steward role, custodians are more responsible for the database structure, environment, and security controls.
- Data governance committee - Some companies appoint a data governance committee, which brings together individuals from various teams and business units who provide input on policies and standards. They help to set standards and resolve issues with the data governance program.
Data governance tools and vendors
A data governance program requires a robust tech stack with governance tools that support the organization’s policies and processes. Here are some of the key categories of vendors:
- Data catalog tools - Data catalog tools use metadata to help organizations keep an inventory of all data assets so users can quickly and easily access data. Two of the industry-leading tools in this category are Atlan and Collibra. Atlan supports different data experiences and interfaces for different user types, and Collibra offers a user-friendly search experience for data. Additionally, Informatica offers AI-powered data catalog and data management capabilities.
- Data monitoring - Data monitoring tools help organizations review and track data to ensure its quality so that it can power strategic decision-making. One of the most common data monitoring tools is Datadog, which IT and DevOps teams use to monitor databases and identify bottlenecks to improve performance.
- Compliance tools - Data compliance software puts policies in place to keep organizations compliant with regulatory requirements like GDPR. The governance software does this by gathering user consent when collecting data, dictating how long data can be stored, and ensuring consumers know how their data is being used. Alation is a solution that offers both data catalog capabilities and compliance by detecting personally identifiable information (PII) and helping organizations manage privacy policies.
- Data discovery tools - Data discovery tools enable organizations to analyze large datasets and uncover patterns that let them more effectively use their data. Data warehousing solutions play an important role in data discovery, along with solutions like Atlan which also support data catalog capabilities.
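To give a rough sense of what PII detection involves, here is a simplified Python sketch that flags and masks two kinds of sensitive values with regular expressions. It does not represent how any vendor named above works. The patterns and the `find_pii` and `mask_pii` helpers are hypothetical, and production tools rely on far more robust detection than a pair of regexes.

```python
import re

# Hypothetical, deliberately simple patterns for two kinds of PII.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text):
    """Return the sorted names of the PII types detected in the text."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


def mask_pii(text):
    """Replace each detected value with a placeholder such as [EMAIL]."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}]", text)
    return text


note = "Contact ana@example.com, SSN 123-45-6789"
print(find_pii(note))  # ['email', 'us_ssn']
print(mask_pii(note))  # Contact [EMAIL], SSN [US_SSN]
```

Scanning free-text fields this way can help a team decide which columns need stricter access policies or masking before the data is shared more widely.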
In some cases, customer data infrastructure tools may provide guidance on data governance.
Data governance implementation may also involve a solution that grants teams self-serve access to data without compromising its security. For instance, a composable CDP can give marketing teams the ability to drive value from data (with personalized marketing campaigns and targeted outreach) while it stays in the data warehouse. This ensures safe and secure data management, in keeping with strong data governance, while teams can leverage the data to drive value.