A data management platform (DMP) collects, organizes, and activates data used to target digital ads. The DMP is a crucial component of the modern ad tech stack, built to target audiences by interest, behavioral, or demographic attributes.
Publishers use DMPs to facilitate audience-enabled targeting based on first-party data that they collect. DMPs can also provide third-party data that they collect themselves or license from other companies.
How do DMPs collect data?
Publishers can feed data collected on their websites and apps into DMPs through file uploads, APIs, or pixels. DMPs can collect data for a client to use themselves (first-party data) or aggregate data from multiple sources to package and sell (third-party data).
What data points do DMPs collect?
DMPs typically collect device or browser identifiers, with each identifier representing an individual. These identifiers include browser cookies, mobile identifiers like Apple IDFA or Android AAID, and even connected TV (CTV) device identifiers like Roku RIDA and Samsung TIFA.
IP addresses are another commonly collected identifier. They are rising in popularity for activating CTV data since device identifiers on smart TV platforms are still nascent. An IP address provides a semi-stable user identifier usable across all CTV devices.
Publishers are also starting to base first-party data collection on PPIDs (publisher-provided identifiers) tied to user logins. These are unique identifiers assigned by a publisher to any user that logs into their website or app. PPIDs are gaining usage as an alternative to mobile identifiers and cookies.
How is data organized on a DMP?
DMP customers organize identifiers into "segments" of audiences, and they classify these segments based on behavior, demographics, or interest. Examples of these segment types are below:
An e-commerce site can place a user's identifier into a "did not check out" segment if the user added an item to a cart, but did not finish checking out. The segment is behavioral since it is based on an action or behavior.
If a news site conducts voluntary surveys of its audience, it can learn things like age, gender, or household income. Publishers can then tie these demographic data points to specific users.
If a user mainly browses articles about finance on a publisher's website or app, then that publisher could place the user in a "financially savvy" interest-based segment.
Some DMPs also offer tools to create compound segments using AND / OR expressions.
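As a rough illustration of how AND / OR expressions over segments could work, the sketch below models each segment as a set of identifiers and combines them with set operations. The segment names, identifiers, and helper functions are hypothetical and not taken from any particular DMP.

```python
# Hypothetical segment membership: segment name -> set of identifiers.
segments = {
    "did_not_check_out": {"id-1", "id-2", "id-3"},
    "financially_savvy": {"id-2", "id-3", "id-4"},
    "hh_income_over_100k": {"id-3", "id-5"},
}


def segment_and(*names):
    """Identifiers that belong to every named segment (AND)."""
    return set.intersection(*(segments[name] for name in names))


def segment_or(*names):
    """Identifiers that belong to at least one named segment (OR)."""
    return set.union(*(segments[name] for name in names))


# "did not check out" AND ("financially savvy" OR "HH income > $100k")
compound = segments["did_not_check_out"] & segment_or(
    "financially_savvy", "hh_income_over_100k"
)
```

Here `compound` holds only the identifiers that abandoned a cart and also fall into at least one of the other two segments.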
How do DMPs activate data for targeting?
The data on a DMP is useless without pushing it to an ad platform for activation and targeting. DMPs typically integrate with either a DSP (Demand Side Platform) or SSP (Supply Side Platform) and allow their customers to "push" data to these destinations so they can apply the data to campaigns for targeting.
Once DMP clients organize data into segments, the DMP will push out two different files to their integration partners: a taxonomy file and an audience data file. These data files are either tab-separated (TSV) or comma-separated (CSV).
Note: Some DMPs require ad platforms to use a segment metadata API rather than a taxonomy file. These API endpoints typically offer the same data points found in taxonomy files.
The taxonomy file typically contains up to three columns: segment ID, segment name, and price (CPM). Each row represents an individual segment.
|segment ID||segment name||price|
|456||HH Income > $100k||$1.50|
|789||My First-Party Segment|
The segment ID is a numerical or alphanumerical value assigned to a segment in a DMP. Ad platforms reference this same segment ID value in the audience data file (see "audience data file" below) to match audiences (identifiers) to targeted segments.
The segment name is the human-readable name of a segment expressed as a string. For example, "Auto Enthusiasts" or "Household Income > $100k".
Ad platforms display the segment name to end-users in the platform's UI, usually on a campaign targeting page or in reporting.
DMPs will push a CPM value with a segment if they, or the partner who owns the data, expect to be compensated for the segment's use. The price is displayed in the ad platform UI to end-users and used to calculate billing.
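To make the taxonomy file concrete, here is a minimal sketch of how an ad platform might parse one. It assumes a tab-separated file with a header row named `segment_id`, `segment_name`, and `price`; real DMPs may use different headers, no header at all, or a different price format, so treat these names as hypothetical.

```python
import csv
import io
from decimal import Decimal

# Hypothetical TSV taxonomy file; the header names are assumptions.
TAXONOMY_TSV = (
    "segment_id\tsegment_name\tprice\n"
    "456\tHH Income > $100k\t$1.50\n"
    "789\tMy First-Party Segment\t\n"
)


def parse_taxonomy(text, delimiter="\t"):
    """Return {segment_id: (segment_name, cpm_or_None)}."""
    taxonomy = {}
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        # A missing or empty price means no data fee is expected.
        raw_price = (row.get("price") or "").strip().lstrip("$")
        cpm = Decimal(raw_price) if raw_price else None
        taxonomy[row["segment_id"]] = (row["segment_name"], cpm)
    return taxonomy


taxonomy = parse_taxonomy(TAXONOMY_TSV)
```

Segment IDs are kept as strings because they can be alphanumeric, and prices use `Decimal` to avoid floating-point rounding in billing.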
Audience Data File
The audience data file contains the mapping of audiences to segment IDs. "Audiences" are the individual identifiers used to identify a user.
The first column is the identifier and the second column is the segment ID (or multiple segment IDs). Each row of the file represents a unique audience to segment mapping.
DMPs usually deliver separate files for each identifier type but sometimes also send one single file containing all identifier types.
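A minimal sketch of reading an audience data file follows. It assumes a headerless tab-separated file where multiple segment IDs in the second column are comma-separated; that inner separator and the sample identifiers are assumptions, since each DMP defines its own format.

```python
import csv
import io

# Hypothetical audience data file: identifier <TAB> segment IDs.
AUDIENCE_TSV = (
    "device-aaa\t456\n"
    "device-bbb\t456,789\n"
)


def parse_audience_file(text, delimiter="\t", segment_separator=","):
    """Return {identifier: set of segment IDs}."""
    mappings = {}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        identifier, segment_field = row[0], row[1]
        segment_ids = {
            s.strip() for s in segment_field.split(segment_separator) if s.strip()
        }
        # Merge rows in case the same identifier appears more than once.
        mappings.setdefault(identifier, set()).update(segment_ids)
    return mappings


mappings = parse_audience_file(AUDIENCE_TSV)
```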
Device IDs or identifiers vary based on the platform from which the audience originated. Some DMPs refer to identifiers as MAIDs (mobile advertiser IDs) or IFAs (identifiers for advertisers). The most common identifiers and associated platforms are below:
|Android||AAID / GAID|
|Android / Google TV||AAID|
|IP Address||All CTV Platforms|
The cookie ID delivered in desktop audience data files depends on the cookie syncing setup between the DMP and the ad platform.
DMPs deliver audience data files via either FTP or Amazon S3 and typically drop them on a daily cadence.
Full refresh vs. Incremental
There are two types of audience data files: full refresh and incremental.
Full Refresh Files
In the aptly named "full refresh" files, DMPs refresh all audience to segment ID mappings every delivery.
If a mapping that existed in a prior delivery no longer exists in a new delivery, that audience no longer belongs to that segment. This implies that the ad platform should remove the audience from the segment.
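The implied removal in a full refresh can be sketched as a comparison between the previous and the newest delivery. The function name and the in-memory dictionaries are hypothetical; a production system would likely do this comparison in a database or batch job.

```python
def diff_full_refresh(previous, current):
    """Compare two full refresh deliveries.

    Both arguments map identifier -> set of segment IDs.
    Returns (to_add, to_remove) as sets of (identifier, segment_id) pairs.
    """
    to_add, to_remove = set(), set()
    for identifier in previous.keys() | current.keys():
        old = previous.get(identifier, set())
        new = current.get(identifier, set())
        to_add |= {(identifier, s) for s in new - old}
        # Anything missing from the new delivery is implicitly removed.
        to_remove |= {(identifier, s) for s in old - new}
    return to_add, to_remove


previous = {"a": {"456", "789"}, "b": {"456"}}
current = {"a": {"456"}, "c": {"789"}}
to_add, to_remove = diff_full_refresh(previous, current)
```

In this example identifier `a` drops out of segment 789 and identifier `b` disappears entirely, so both mappings land in `to_remove`.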
Incremental Files
Incremental deliveries only include new identifier to segment ID mappings. A delivery is incremental because the DMP is only adding to (incrementing) the data set.
When an incremental delivery includes an identifier to segment mapping, the ad platform should add that audience to the segment. Ad platforms should only remove an audience once its mapping reaches the TTL threshold.
TTL or "Time to live" is a mechanism in computer programming or networking that defines the lifespan of data. DMPs and ad platforms agree to a TTL policy that is applicable across all data deliveries. The TTL policy outlines how long an audience should exist in a segment.
This policy is needed for incremental deliveries since there is no implication of removing an audience like in full refreshes. If a TTL is thirty days, then the ad platform should expire an audience from a segment thirty days after the DMP added an audience to an incremental delivery.
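The sketch below shows one way an ad platform could apply a TTL policy to incremental deliveries. The `IncrementalStore` class is hypothetical, and it assumes that a mapping's clock restarts whenever the DMP delivers that mapping again; an actual TTL agreement may define this differently.

```python
from datetime import datetime, timedelta

TTL = timedelta(days=30)  # assumed policy agreed between DMP and ad platform


class IncrementalStore:
    """Hypothetical in-memory store for incremental audience deliveries."""

    def __init__(self, ttl=TTL):
        self.ttl = ttl
        # (identifier, segment_id) -> time of the latest delivery
        self.added_at = {}

    def apply_delivery(self, mappings, delivered_at):
        """Add every mapping in the delivery; nothing is ever removed here."""
        for identifier, segment_ids in mappings.items():
            for segment_id in segment_ids:
                self.added_at[(identifier, segment_id)] = delivered_at

    def expire(self, now):
        """Drop mappings whose age has reached the TTL and return them."""
        expired = [k for k, t in self.added_at.items() if now - t >= self.ttl]
        for key in expired:
            del self.added_at[key]
        return set(expired)

    def segments_for(self, identifier):
        return {s for (i, s) in self.added_at if i == identifier}


store = IncrementalStore()
store.apply_delivery({"a": {"456"}}, datetime(2024, 1, 1))
store.apply_delivery({"b": {"456"}}, datetime(2024, 1, 20))
expired = store.expire(datetime(2024, 1, 31))  # thirty days after the first delivery
```

After the call to `expire`, identifier `a` has aged out of segment 456 while identifier `b` remains.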
Some DMPs do support audience deletes in audience data files. DMPs usually express deletes as a third column in the audience data files. The second column lists the segments to add an audience to, and the third column lists the segments to remove the audience from.
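A three-column row with deletes could be applied along these lines. The comma separator inside each column and the rule that a removal wins when a segment appears in both columns are assumptions made for this sketch.

```python
def split_segments(field, separator=","):
    """Split a column value into a set of non-empty segment IDs."""
    return {s.strip() for s in field.split(separator) if s.strip()}


def apply_row(store, row):
    """Apply one parsed row: [identifier, segments_to_add, segments_to_remove].

    `store` maps identifier -> set of segment IDs and is updated in place.
    The third column is optional.
    """
    identifier = row[0]
    adds = split_segments(row[1]) if len(row) > 1 else set()
    removes = split_segments(row[2]) if len(row) > 2 else set()
    current = store.setdefault(identifier, set())
    current |= adds
    current -= removes  # assumed: removal wins if a segment is in both columns


store = {"device-aaa": {"456"}}
apply_row(store, ["device-aaa", "789", "456"])  # add 789, remove 456
apply_row(store, ["device-bbb", "456"])         # no delete column
```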
The ad platform will store the identifier to segment mapping for use in campaign targeting.
How is audience data used for targeting?
A DSP or SSP will make audience segments available as targeting options in their platform. The segment names, extracted from the taxonomy file, will be displayed as the values available for targeting. Users can then apply audience segments to a campaign the same way they would add any targeting values such as country or device OS.
Once an ad platform receives an ad request, it will read the cookie or device identifier sent in an ad request and match the identifier to its associated audience segments.
The platform will determine a campaign is eligible to serve if the audience initiating the ad request matches the segments applied to a campaign.
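The matching step can be sketched as a lookup followed by a set intersection. The sketch assumes a campaign is eligible when the identifier belongs to at least one of the campaign's targeted segments; platforms may also support stricter all-of or exclusion rules. The store, campaign names, and function are hypothetical.

```python
def eligible_campaigns(identifier, audience_store, campaigns):
    """Return names of campaigns whose targeted segments include the user.

    audience_store: identifier -> set of segment IDs
    campaigns: campaign name -> set of targeted segment IDs
    """
    user_segments = audience_store.get(identifier, set())
    return [
        name
        for name, targeted in campaigns.items()
        if targeted & user_segments  # at least one shared segment
    ]


audience_store = {"device-aaa": {"456"}}
campaigns = {"luxury-autos": {"456"}, "travel-deals": {"789"}}

matched = eligible_campaigns("device-aaa", audience_store, campaigns)
unmatched = eligible_campaigns("device-zzz", audience_store, campaigns)
```

An identifier the platform has never seen simply matches no audience-targeted campaigns.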
What's the difference between first-party and third-party data?
The terms "first-party" and "third-party" denote the ownership of the data relative to the entity using that data. If a company is using data it collected itself, then it is "first-party" data. If a company uses data from another company, it is "third-party" data.
There are typically two different types of first-party data: advertiser first-party data and publisher first-party data.
Publisher First-Party Data
Publisher First-Party Data is data collected directly by a publisher from its owned and operated apps and websites. For example, publishers could assign users to interest segments based on the articles they read or videos they watched.
Publishers could then enrich deals they sell with their first-party data. Inventory enriched with unique data only accessible through a publisher can elevate the CPM of potential campaigns.
Advertiser First-Party Data
If an advertiser collects the data and uses it to target their campaigns, it is Advertiser First-Party Data. Advertisers collect this data based on consumer purchase behavior or actions users take on their e-commerce websites and apps.
One example of how advertisers can use this data is retargeting. Advertisers can use their first-party data to retarget a user with ads of items similar to what they previously bought or viewed.
Third-party data is a collection of data, often from multiple sources, that is packaged by interest, demographic, or behavioral intent and sold to other companies.
Data brokers and even the DMPs themselves collect and aggregate data to make it available for anyone to use for a fixed cost.
Data brokers can collect data through various direct or indirect methods. They can directly collect consumer browsing history via pixels embedded on partner websites. Data brokers can also buy demographic information from credit reporting companies or web activity from ISPs for interest and behavior data.
Smart TVs create a treasure trove of Automatic Content Recognition (ACR) data for some companies. Whether you watch TV from an over-the-air antenna or stream from Netflix, most Smart TVs now monitor every piece of content and ad you watch. This data is then transmitted, organized, stored, and sold to the highest bidder.
Advertisers can license third-party data and apply it to their ad campaigns but must pay the data broker a specific CPM rate.
For example, a data broker may collect data from travel websites and create a "travel-enthusiasts" segment set at a $2 CPM. An advertiser can apply this segment to their campaign, but they must pay $2 to the data broker for every thousand impressions delivered. So if the advertiser delivers 100,000 impressions on a campaign, they owe $200.
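The arithmetic in this example is simply impressions divided by one thousand, multiplied by the segment CPM. A small sketch, with a hypothetical `data_fee` helper that uses `Decimal` to keep currency math exact:

```python
from decimal import Decimal


def data_fee(impressions, segment_cpm):
    """Data fee owed for a segment: impressions / 1000 * CPM."""
    return Decimal(impressions) / Decimal(1000) * Decimal(str(segment_cpm))


# 100,000 impressions against a $2 CPM segment comes to 200 dollars.
fee = data_fee(100_000, "2.00")
```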
Most DMPs surprisingly rely on the ad platform using the data to self-report usage. Relying on the honor system in today's highly programmatic advertising world may seem quite odd, but it is the path of least resistance.
Even if DMPs built out infrastructure to support real-time reporting via a pixel, they would still rely on the ad platform to properly pass segment IDs applicable to a given impression, which still relies on the honor system.
Ad platforms are responsible for paying any data fees associated with a segment. Ad platforms usually pass data fees onto advertisers or publishers, depending on the business arrangement. Billing reports may include separate line items for data fees that represent (impressions / 1000 * segment CPM).
The DSP or SSP then pays out these data fees to the DMP. The DMP retains the revenue earned or passes along payment to data partners if they are acting as a clearinghouse or conduit to pass data to the ad platforms.
What is the future of DMPs?
Device identifiers and cookies are the crucial foundational values required to onboard and target data, and they may soon go extinct. Apple is effectively killing their IDFA, Google is figuring out how to deprecate cookies in Chrome, and Connected TV device IDs may be the only identifiers left standing.
So will DMPs survive this identifier-apocalypse? Maybe some of them.
Some DMPs are trying to adopt identifiers directly collected by publishers and advertisers to serve as a replacement for cookies and device identifiers.
The ad tech industry is coalescing around a hashed email address as the identifier of choice to form the connective tissue between DMPs and ad platforms.
Adopting hashed email addresses as a universal identifier means that in the future, publishers may require users to provide an email to access content, ensuring that they can provide data targeting capabilities to their advertisers.
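For a sense of what a hashed email identifier might look like, the sketch below normalizes an address and hashes it with SHA-256. The choice of SHA-256 and the trim-and-lowercase normalization are assumptions for illustration; the actual rules depend on the identity solution a publisher or DMP adopts.

```python
import hashlib


def hash_email(email):
    """Return a hex digest of a normalized email address.

    Assumed normalization: strip surrounding whitespace and lowercase.
    Assumed hash function: SHA-256.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Both spellings normalize to the same address and so hash to the same value.
same = hash_email(" User@Example.com ") == hash_email("user@example.com")
```

Normalizing before hashing matters because two parties can only match on the hash if they produce identical input bytes.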
No matter the eventual solution, DMPs will have to adapt to a more privacy-conscious world. Without an identifier to tie audiences together, advertisers may continue shifting budgets to walled gardens that can provide precise audience targeting through massive amounts of logged-in users.