Data clean room
Key Takeaways:
- Data clean rooms remove all PII from customer data so collaborators can share, access, and analyze it without breaching data privacy regulations.
- Many data warehouse providers offer data clean room solutions and frameworks as a service.
- There are several types of data clean rooms available, including clean rooms as a service, media data, private data, and walled garden data clean rooms.
What is a data clean room?
A data clean room is a secure environment to share, combine, and analyze customer data.
Unlike traditional data-sharing methods, which typically involve sending raw datasets in spreadsheets, a data clean room protects sensitive customer information and restricts which users can access the data insights.
Data clean rooms remove all personally identifiable information (PII) from customer data so collaborators can share, access, and analyze it without breaching data privacy regulations, including the California Consumer Privacy Act (CCPA) and the European Union’s General Data Protection Regulation (GDPR).
Multiple company divisions or separate companies can use data clean rooms for secure data collaboration and data sharing. Data brokers and aggregators, advertising platforms, and companies across industries all benefit from data clean rooms.
Where did data clean rooms come from?
Privacy concerns and regulations like CCPA and GDPR created the need for data clean rooms starting in the mid-2010s. Marketing teams, advertising companies, and any organization sharing data — especially global companies — need a secure and compliant way to do so.
Several large companies started using data clean rooms in 2017. Google released its Ads Data Hub in May 2017 to share online advertising data. Facebook began using infrastructure similar to a data clean room in June 2017 to share data with some customers.
The phasing out of third-party cookies in the 2020s also accelerated the need for data enhancement solutions that allow companies to share and analyze customer data without revealing confidential information.
How does a data clean room work?
A data clean room provides full control over how data is consumed, combined, and analyzed. Data clean rooms enforce strict security measures and privacy protections, such as encrypting PII, so that no data or information tied to a specific user is accessible.
All types of companies can use a data clean room. The following steps are typically involved in accessing and using a data clean room:
- User authorization - Any organization or user must pass a detailed authorization process to ensure that only approved users can access the data clean room.
- Data preparation - All parties must agree on what data they will share and how it will be used. They must prepare their first-party data for upload and include a common marker or field by which the data clean room can identify matching entries and merge data.
- Data ingestion - Participating parties upload their datasets into the data clean room. An organization’s first-party data platform or data warehouse is commonly the source of this data.
- Anonymization - The data clean room applies privacy protection measures, including anonymizing the data to protect sensitive information.
- Aggregation - The data clean room matches entries across the uploaded datasets on the common field and merges them to enrich each entry.
- Analysis - Approved users collaborate on how to analyze the data and identify what insights they want to gain from it.
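The anonymization, aggregation, and analysis steps can be sketched in a few lines of Python. This is a minimal illustration, not how any particular clean room product works. The salt value, the minimum group size, the field names, and the sample records are all assumptions made for the example. Note that salted hashing is pseudonymization rather than full anonymization, so real clean rooms layer further protections on top of it.

```python
import hashlib
from collections import defaultdict

# Assumptions for illustration only: a salt both parties agreed on in advance,
# and a minimum group size below which results are suppressed.
SALT = "shared-secret-agreed-by-both-parties"
MIN_GROUP_SIZE = 2


def pseudonymize(email: str) -> str:
    """Normalize an email address and replace it with a salted SHA-256 hash."""
    normalized = email.strip().lower()
    return hashlib.sha256((SALT + normalized).encode("utf-8")).hexdigest()


def prepare(rows, key_field):
    """Drop the raw match field and keep only its hash as 'match_key'."""
    prepared = []
    for row in rows:
        out = {k: v for k, v in row.items() if k != key_field}
        out["match_key"] = pseudonymize(row[key_field])
        prepared.append(out)
    return prepared


def match_and_aggregate(left, right, group_field, value_field):
    """Join two prepared datasets on match_key and report group-level totals.

    Groups with fewer than MIN_GROUP_SIZE matched customers are left out,
    so a result can never describe a single individual.
    """
    right_by_key = {row["match_key"]: row for row in right}
    groups = defaultdict(list)
    for row in left:
        partner = right_by_key.get(row["match_key"])
        if partner is not None:
            groups[row[group_field]].append(partner[value_field])
    return {
        group: {"customers": len(values), "total": sum(values)}
        for group, values in groups.items()
        if len(values) >= MIN_GROUP_SIZE
    }


# Hypothetical first-party datasets from two collaborators.
retailer = [
    {"email": "Ana@example.com", "segment": "wellness"},
    {"email": "bo@example.com", "segment": "wellness"},
    {"email": "cy@example.com", "segment": "apparel"},
]
advertiser = [
    {"email": "ana@example.com ", "ad_clicks": 3},
    {"email": "bo@example.com", "ad_clicks": 1},
    {"email": "cy@example.com", "ad_clicks": 5},
    {"email": "di@example.com", "ad_clicks": 2},
]

report = match_and_aggregate(
    prepare(retailer, "email"),
    prepare(advertiser, "email"),
    group_field="segment",
    value_field="ad_clicks",
)
print(report)  # {'wellness': {'customers': 2, 'total': 4}}
```

The apparel segment has only one matched customer, so it is dropped from the report. Suppressing small groups is one common way to keep aggregate results from pointing back to an individual.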
What types of data go into a data clean room?
Data clean rooms can contain any number of datasets and data types depending on what each party seeks to learn.
It is a best practice to upload only data that is relevant to the analysis goal. For example, a retailer could offer a range of product types but use a data clean room to analyze its health and wellness products specifically. In this case, the retailer will upload only its health and wellness product data and will leave out demographic details that are irrelevant to the analysis goal. If it is assessing female consumer behaviors, for example, it can leave out records for other demographic groups.
The data clean room can identify trends in the data and help organizations easily recognize connections in their customer data and make more informed decisions.
Everything spanning zero-party data, first-party data, and metadata can go into a data clean room. A few data categories can include:
- Advertising or marketing data - Data and interactions from advertising campaigns, including demographics, channels, click-through rates, and conversions.
- E-commerce or retail data - Customer purchase behavior data, including demographics, shopping preferences, and sales.
- Government data - Citizen data, including census data, public health information, and voting information.
- Healthcare data - Anonymized patient data, including clinical trial results or health-related information.
- Social media data - Information from social media networks, including engagement networks, social interactions, user-generated content, and more.
Data clean rooms and cloud data warehouses
An enterprise data warehouse is a centralized repository to gather and store customer data. Data warehouses often serve as the source of data for a clean room.
Many data warehouse providers offer data clean room solutions and frameworks as a service, including AWS Clean Rooms, Google Cloud Platform BigQuery clean rooms, Databricks Clean Rooms, and Snowflake Global Data Clean Room, among others.
Types of data clean rooms
There are several types of data clean rooms available, some of which are tailored to specific use cases or industries. A traditional data clean room stores all data in a single physical location, and a distributed clean room stores all data in the cloud.
The common data clean room types include:
- Clean rooms as a service - Many data clean rooms exist as a service, meaning they are provided by independent vendors or platforms. These clean rooms are often the most flexible with what data can be used and how users can configure the clean room. Examples include AppsFlyer, Habu, InfoSum, LiveRamp, and Snowflake.
- Media data clean rooms - Media data clean rooms are provided by media and advertising companies, often through partnerships with a clean room as a service. NBCUniversal’s Audience Insights Hub is one example.
- Private data clean room - A private data clean room is built from scratch by an organization to analyze its data or work with data partners. Any organization can create its own data clean room; however, doing so requires advanced software engineering knowledge and carries high risk if the clean room is not built correctly.
- Walled garden data clean rooms - Walled gardens operate within a closed ecosystem and are offered by big tech providers. These clean rooms provide information from specific platforms, typically without the ability to combine data from other sources. Examples of walled garden data clean rooms include Amazon Marketing Cloud (AMC), Facebook Advanced Analytics (FAA), and Google Ads Data Hub (ADH).
Why should an organization use a data clean room?
Organizations use data clean rooms to make sense of large datasets and form a more complete picture of their audience and customer behavior. Data often exists in silos. Without cookies, it is harder for companies to track their customer behavior across the web.
Marketing and sales teams especially need the ability to augment their customer data with that from other organizations for improved customer segmentation and to inform their cross-channel marketing and omnichannel marketing strategies.
Retail and e-commerce companies, for example, can use data clean rooms to collaborate with advertisers and media companies to better understand how their advertising campaigns result in direct sales. By combining their first-party data with that from other companies or vendors, retailers can enrich their customer profiles and gain a deeper understanding of their customer journey.
For companies across industries, data clean rooms offer the following benefits:
- First-party data monetization - Organizations can offer to exchange or sell data to give other companies a competitive advantage without breaching privacy regulations.
- Data-driven customer engagement - Organizations gain a deeper understanding of their customer behavior and track how activities across online and offline channels drive or hinder customer engagement.
- Richer customer journeys - Organizations can deliver more engaging customer journeys faster and more accurately by combining datasets and analyzing them using a data clean room.
Data clean room use cases
Data clean rooms empower advertising and marketing teams to execute more informed campaigns. Specifically, data clean rooms enable:
Audience analysis
Use customer datasets to identify trends, common behaviors, and purchase patterns. Data clean rooms can accelerate customer persona development and help organizations create more accurate customer groups leveraging past purchasing behavior.
A clothing retailer selling to high-wealth individuals, for example, could collaborate with a complementary retailer also targeting this niche audience in a data clean room to improve each brand’s understanding of their audience and identify cross-sell or co-marketing opportunities.
Measurement and attribution
Track customer activities across channels to understand if specific investments, like a paid social media ad or targeted email, led to the desired outcome or a customer sale.
A telecom company, for example, could use a walled garden data clean room to understand the typical journey its customers take after seeing an ad on Facebook, including whether the ad resulted in a sale.
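As a rough sketch of the arithmetic behind this kind of measurement, the snippet below compares the purchase rate of customers who saw an ad with the rate of those who did not. The pseudonymous IDs and the three sets are hypothetical, and the comparison shows correlation only. It does not prove the ad caused the sales.

```python
def conversion_rate(audience: set, purchasers: set) -> float:
    """Share of the audience that also appears in the purchasers set."""
    if not audience:
        return 0.0
    return len(audience & purchasers) / len(audience)


# Hypothetical pseudonymous customer IDs matched inside the clean room.
all_customers = {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"}
exposed = {"u1", "u2", "u3", "u4"}      # saw the ad
purchasers = {"u1", "u2", "u3", "u5"}   # bought afterwards

unexposed = all_customers - exposed

exposed_rate = conversion_rate(exposed, purchasers)      # 3 of 4 bought
unexposed_rate = conversion_rate(unexposed, purchasers)  # 1 of 4 bought
lift = exposed_rate - unexposed_rate

print(exposed_rate, unexposed_rate, lift)  # 0.75 0.25 0.5
```

Only the three rates leave the clean room in this sketch. The customer-level sets stay inside it.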
Profile enrichment and identity resolution
Augment existing customer profiles with data from other organizations or departments, which supports identity resolution. Data clean rooms can fill in the gaps on customer profiles to help advertisers, marketers, and sales teams better approach each individual or group them in a customer persona.
A credit card company, for example, can collaborate with a hotel chain through a data clean room to better understand their customer travel behaviors and identify the right travel benefits to promote with specific customer segments.
Challenges of data clean rooms
There are several potential challenges and drawbacks when implementing and operating a data clean room.
The initial hurdle — which is not unique to data clean rooms — is data lifecycle management and the ongoing need for clean datasets. Many organizations lack a strategy to continually assess their data quality, address duplicate entries, and create processes to ensure data stays accurate. This requires a significant, albeit useful, investment to first clean and standardize the company data before sending it to the data clean room, as well as ongoing governance and training to maintain the data quality.
The common data clean room challenges and barriers include:
- Data standardization - Building on the hurdle of cleaning the company data, organizations often manage many different data formats. This can complicate the data-cleaning process.
- Partner identification - Organizations must identify valuable partners they can trust to share their data. Outside of media and advertising company partners, it can be difficult for companies to identify other organizations willing to trade data.
- Privacy concerns - Although a data clean room will anonymize data, there is still a risk that sensitive information could be exposed if a company fails to properly prepare its data for ingestion.
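The standardization and deduplication work described above might look like the following before any data is uploaded. This is a simplified sketch: the field names, the list of accepted date formats, and the sample records are assumptions, and a real pipeline would cover many more fields and edge cases.

```python
from datetime import datetime

# Assumption: the source systems use only these three date formats.
# A value such as "03/04/2023" is ambiguous between month-first and day-first,
# so treating slashes as month-first here is a convention chosen for the example.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_date(raw: str) -> str:
    """Return the date in ISO 8601 form (YYYY-MM-DD), trying each known format."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")


def standardize(record: dict) -> dict:
    """Trim and lowercase the email, and convert the signup date to ISO 8601."""
    return {
        "email": record["email"].strip().lower(),
        "signup_date": normalize_date(record["signup_date"]),
    }


def deduplicate(records: list) -> list:
    """Keep the first record seen for each email address."""
    seen = {}
    for record in records:
        seen.setdefault(record["email"], record)
    return list(seen.values())


# Hypothetical raw records pulled from different internal systems.
raw = [
    {"email": " Ana@Example.com", "signup_date": "03/15/2023"},
    {"email": "ana@example.com", "signup_date": "2023-03-15"},
    {"email": "bo@example.com", "signup_date": "02.01.2024"},
]

clean = deduplicate([standardize(r) for r in raw])
for row in clean:
    print(row)
```

The two records for Ana collapse into one after standardization, which is only possible because the email and date were normalized first.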
Data clean room alternatives
There are two primary alternative solutions to data clean rooms, and each presents inherent advantages and challenges.
- Browser-based tracking - Companies can ask customers and website visitors to allow specific levels of tracking. This provides real-time insights into browser history, searches, ad clicks, and other activities. Browser-based tracking can monitor activities across devices while respecting user anonymity. The challenge, however, is earning the opt-in from website visitors. Many websites immediately ask visitors to accept or decline the use of cookies or alternative tracking methods.
- Universal IDs - Universal IDs replace the need for third-party cookies by using email addresses or other markers to create a unique ID for specific users. Universal IDs track user activity across platforms and devices; however, organizations must implement privacy measures to protect customer anonymity and avoid legal issues.
What's the difference between a data clean room and a CDP?
Customer data platforms (CDPs) and composable CDPs allow organizations to easily collect and manage first-party data. CDPs do not anonymize data, however, which introduces security risks if organizations want to enter data partnerships with other companies.
A data clean room works with a CDP to extract data, anonymize it, and enforce security protocols. The data clean room also enables multiple companies to share and combine data, whereas a CDP is focused on just one organization’s data.
How to implement a data clean room
Before implementing a data clean room, an organization must understand the purpose and goal of the data clean room.
Representatives from the executive leadership, security, IT, and data science teams should collaborate to align on the implementation goals and oversee the process. For example:
- The chief security officer and their team will identify the necessary security measures that need to be in place in the data clean room.
- The chief data officer or the data team will identify the data sources for the data clean room and implement procedures to prepare that data for extraction.
- The legal team or legal counsel will advise on any legal considerations and verify that the organization’s plan will adhere to all privacy laws.
- The chief information officer will advise on the implementation and movement of enterprise data in or out of the business.
- The chief sales or marketing officer will advise on the use cases and expected ROI regarding the need for a clean room.
After the team is aligned on the data clean room goals, available resources, necessary security protocols, and data practices, the team can then identify the appropriate infrastructure for the data clean room (as explained in the above section on types of data clean rooms) and create the essential documentation to govern its use.
If the data clean room goal is to partner with other organizations for data enrichment, then the partners must agree on what data each party will share and which insights they seek to gain.
Once the data clean room process is in place, it’s critical to perform ongoing audits and assessments to ensure the data clean room meets its intended use and identify issues that should be addressed.