Contextual data providers face an existential threat

After Google announced its intention to deprecate third-party cookies in Chrome, contextual data providers undoubtedly rejoiced and went to work singing the praises of their solutions. And why would they not?

How will advertisers accurately target users based on their interests without cookies to track their browsing behavior? Contextual targeting is an existing and sensible solution for interest-based targeting without violating user privacy.

Contextual targeting offers the capability to maintain some form of advanced targeting on the web that advertisers desire in the absence of cookies. Advertisers can use contextual signals to target their campaigns based on keywords found on a web page, rather than tracking users across the web with cookies to target their interests.

But Ryan Barwick of Morning Brew recently broke a story about how publishers may not be so happy with this arrangement.

The Ozone Project, Local Media Consortium, and the Association of Online Publishers—groups representing The Guardian, Vice, the BBC, McClatchy, and others—have raised concerns that some ad-tech firms may have scraped and sold publisher data unfairly, possibly infringing upon a publisher’s “intellectual property.”

Uh oh. Publishers are rightfully concerned about how contextual data vendors can scrape their page content, package their URLs into targeting segments and sell that data for profit.

These publishers are particularly upset about brand safety/verification vendors that run an adjacent contextual targeting business. The publishers behind this movement understand the need for independent verification, but they have not agreed to let those same companies sell what they consider proprietary data.

Publishers are carefully considering their first-party data strategies now that third-party cookies are going away. The value of publisher first-party data has increased dramatically as user identifiers slowly fade to sweet oblivion.

Last week, I wrote about how seller-defined audiences allow publishers to capitalize on their valuable first-party audience data without sacrificing user privacy. The IAB Tech Lab also considered eliminating data leakage in the seller-defined audiences spec — in other words, not allowing others to build targetable audience segments using available publisher data.

Now we know that some publishers believe that the content and metadata of their pages represent valuable privacy-safe data that they must protect.

How does contextual advertising work?

Ad servers can target contextually based on self-declared signals in an ad request. Publishers can pass categories and keywords in ad requests that essentially serve as glorified key-value pairs. If an ad server sees keyword="sports" in an ad request, then the ad server can allow any line items targeted to sports to serve.
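To make the key-value idea concrete, here is a minimal sketch of how an ad server might filter line items against the keywords in an ad request. The `LineItem` class, the `eligible_line_items` function, and the campaign names are all hypothetical, and real ad servers layer priorities, pacing, and frequency caps on top of this kind of match.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LineItem:
    """Hypothetical line item; not any real ad server's data model."""
    name: str
    # Keywords this line item targets. Empty means untargeted (run of site).
    keywords: set[str] = field(default_factory=set)


def eligible_line_items(request_keywords: set[str], line_items: list[LineItem]) -> list[LineItem]:
    """Return the line items whose keyword targeting matches the ad request."""
    normalized = {keyword.lower() for keyword in request_keywords}
    return [
        item for item in line_items
        # Untargeted items always qualify; targeted items need an overlap.
        if not item.keywords or item.keywords & normalized
    ]


line_items = [
    LineItem("sports-drink-campaign", {"sports"}),
    LineItem("bank-campaign", {"finance"}),
    LineItem("house-ad"),
]

# The publisher passed keyword="Sports" in the ad request.
matches = eligible_line_items({"Sports"}, line_items)
print([item.name for item in matches])
# ['sports-drink-campaign', 'house-ad']
```

The key point is that the publisher declares the signal here, so the publisher controls it.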

In the programmatic world, when a DSP receives an OpenRTB (ORTB) bid request, it receives a page URL that points to the exact page where an ad would run.
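In OpenRTB 2.x, that URL travels in the `page` attribute of the `site` object. The sketch below pulls it out of a trimmed, made-up bid request. The domain is a placeholder, and stripping the query string before lookup is my own assumption about how a DSP might normalize URLs (some sites identify articles by query parameters, where this would be the wrong choice).

```python
from __future__ import annotations

import json
from urllib.parse import urlsplit

# Trimmed, illustrative bid request. Real requests carry many more fields.
raw_bid_request = """
{
  "id": "example-request-1",
  "imp": [{"id": "1", "banner": {"w": 300, "h": 250}}],
  "site": {
    "domain": "example-news.test",
    "page": "https://example-news.test/business/markets-rally?utm_source=newsletter"
  }
}
"""


def page_url_from_bid_request(raw: str) -> str | None:
    """Return site.page, or None when it is absent (for example, app inventory)."""
    request = json.loads(raw)
    return request.get("site", {}).get("page")


def normalize_page_url(url: str) -> str:
    """Assumed normalization: drop the query string and fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


page_url = page_url_from_bid_request(raw_bid_request)
lookup_key = normalize_page_url(page_url) if page_url else None
print(lookup_key)
# https://example-news.test/business/markets-rally
```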

A DSP can then pass this page URL to a contextual data provider to scrape the content on that page using a web crawler. A web crawler can inspect words directly from the HTML of the website. The contextual vendor would then categorize the page into either standardized contextual segments or custom segments created by an advertiser.
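Here is a deliberately naive sketch of that categorization step, using only the Python standard library. It skips the network fetch and works on an HTML string. The segment taxonomy, the keyword lists, and the two-hit threshold are invented for illustration; commercial vendors rely on far more sophisticated language models than keyword counting.

```python
import re
from html.parser import HTMLParser

# Hypothetical taxonomy: segment name -> trigger keywords.
SEGMENT_KEYWORDS = {
    "golf": {"golf", "fairway", "birdie", "putt"},
    "finance": {"stocks", "markets", "investors", "earnings"},
    "crime": {"arrested", "robbery", "homicide"},
}


class VisibleTextParser(HTMLParser):
    """Collect text nodes while skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def categorize(html, min_hits=2):
    """Assign every segment whose keywords appear at least min_hits times."""
    parser = VisibleTextParser()
    parser.feed(html)
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    segments = []
    for segment, keywords in SEGMENT_KEYWORDS.items():
        hits = sum(1 for word in words if word in keywords)
        if hits >= min_hits:
            segments.append(segment)
    return segments


sample_html = """
<html><head><title>Markets rally as investors cheer earnings</title>
<script>var tracking = "golf golf golf";</script></head>
<body><p>Stocks climbed on Tuesday after strong earnings reports.</p></body></html>
"""
print(categorize(sample_html))
# ['finance']
```

Note that everything the classifier needs comes straight from the publisher's page, which is exactly what the publishers in this story object to.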

The contextual vendor provides these page URL-to-segment mappings to a DSP. So next time the DSP sees that URL, it can match it up to a targetable contextual segment (the first opportunity gets burned for contextual targeting purposes if the URL was not previously categorized).
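The "burned first opportunity" behavior falls out naturally from a lookup table with a crawl queue behind it. The sketch below is one hypothetical way to model it. `ContextualCache` and its methods are names I made up, and a production system would also need expiry, since page content changes.

```python
from collections import deque


class ContextualCache:
    """Hypothetical URL -> segments lookup with a crawl queue for unseen URLs."""

    def __init__(self):
        self._segments_by_url = {}
        self._pending = set()
        self.crawl_queue = deque()

    def lookup(self, url):
        """Return known segments, or None and queue the URL for crawling."""
        segments = self._segments_by_url.get(url)
        if segments is None and url not in self._pending:
            # First sighting: no contextual targeting is possible this time.
            self._pending.add(url)
            self.crawl_queue.append(url)
        return segments

    def store(self, url, segments):
        """Record the vendor's categorization for a crawled URL."""
        self._segments_by_url[url] = frozenset(segments)
        self._pending.discard(url)


cache = ContextualCache()
url = "https://example-news.test/business/markets-rally"

first = cache.lookup(url)    # None: the URL has not been categorized yet
cache.store(cache.crawl_queue.popleft(), {"finance"})  # vendor crawls and reports back
second = cache.lookup(url)   # frozenset({'finance'})
print(first, second)
```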

These segments can be categories like exercise, golf, technology, or brand safety segments to exclude pages about war, crime, sex, drugs, rock 'n' roll, or anything deemed unsavory from a brand perspective.
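On the buy side, targeting and brand safety come down to set logic over a page's segments. A minimal sketch follows, assuming exclusions always win over inclusions; that is a common convention, but it is my assumption here and not something the vendors document in this story. The `should_bid` function is hypothetical.

```python
def should_bid(page_segments, include, exclude):
    """Decide whether a campaign may bid on a page, given the page's segments."""
    if page_segments & exclude:
        # Any brand-safety hit blocks the bid, even if the page is on-target.
        return False
    # With no include list, the campaign runs anywhere that is not excluded.
    return not include or bool(page_segments & include)


on_target = should_bid({"golf"}, include={"golf"}, exclude={"crime"})
unsafe = should_bid({"golf", "crime"}, include={"golf"}, exclude={"crime"})
off_target = should_bid({"technology"}, include={"golf"}, exclude={"crime"})
print(on_target, unsafe, off_target)
# True False False
```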

Ad verification vendors could also piggyback off existing pixel calls from ads they run alongside or anywhere their tech is installed, rather than waiting on any signals from a DSP partner to go and scrape a page. If one of the ads they protect lands on an uncategorized page URL, they can use that as a signal to crawl it.

The Morning Brew article mentions that ad verification companies told publishers they cannot decouple verification from contextual scanning. This stance is purely a business decision and not a technical one.

Why are publishers concerned?

Publishers can create a significant value proposition if they prevent advertisers from applying contextual data outside any publisher direct deals.

What if the advertiser could only buy based on contextual signals through a direct deal with a publisher? Or what if the contextual data provider had to give a publisher a cut of the data fees they earn when advertisers target that publisher's website with contextual data?

Contextual data can significantly enhance the value of any advertising opportunity by allowing advertisers to find relevant audiences more receptive to their brand messaging.

Sometimes, advertisers can reasonably assume the interests of a user based on a root domain in an ad request. If an advertiser sees an ad request originate from adtechexplained.com, they can confidently believe the user is interested in advertising technology.

But general news sites like The Guardian or the BBC attract a wide swath of users with vastly different interests. Contextual data can uncover narrow interests by inspecting the content found at a specific page URL.

Maybe some of these users only read financial news, which would make them an attractive target for crypto apps or financial services. Contextual data providers can scan the words on the page and allow advertisers to programmatically buy ads alongside news stories about business & finance from the open market at severely depressed prices.

Why would an advertiser ever do a deal directly with a publisher if they can scoop up the same inventory at a lower price, precisely targeted to the context of their choosing? Publishers can benefit by preventing advertisers from enriching their programmatic campaigns with contextual targeting segments.

Publishers can supercharge their sales pitch by making contextual targeting a differentiator over less direct programmatic channels.

Is contextual metadata intellectual property?

That is the question at hand. If an article is openly available on the web, do contextual data providers have the right to scrape the page content, package it and sell it to marketers?

Publishers may argue that selling contextual information equates to selling the content itself. It would be like ripping the content from their site and selling it to someone as an original work. They could equate selling information about their intellectual property with intellectual property theft.

Ad verification companies doubling as contextual targeting solutions have leverage — advertisers require their services to combat fraud and ensure brand safety. Pure-play contextual data providers may have a less-defensible position given that they only sell contextual data and may not have a direct relationship with the buyer or seller.

It is worth mentioning that contextual data could increase the value of otherwise less desirable ad opportunities. Contextual data could enrich remnant inventory percolating down through programmatic channels, creating revenue for publishers that might not otherwise materialize.

Morning Brew reported that the Association of Online Publishers has raised its concerns with one of the world's largest advertising agency groups, WPP, and the UK’s Information Commissioner’s Office.

The group of publishers may try to place business pressure on WPP to sway them away from using contextual data from certain sources. But going to the UK’s Information Commissioner’s Office indicates that they are also possibly pursuing legal avenues.

These publishers have focused their ire on ad verification companies that also offer contextual targeting solutions, although pure-play contextual data providers may find themselves caught up in the ensuing wake.

If these publishers have their way, contextual targeting outside their direct sales channels could become a thing of the past (unless they get their piece of the pie).