Topics API - The Google FLoC replacement explained
After a bombardment of criticism in the wake of the Federated Learning of Cohorts proposal and associated origin trial, Google has released a new proposal to replace FLoC: The Topics API.
What is The Topics API?
The Topics API replaces FLoC and introduces a new way to facilitate interest-based advertising on the web without third-party cookies. The Topics API also offers enhanced privacy protection and more user control.
The clock is ticking on Google's promise to eliminate third-party cookies by mid to late 2023, and the Topics API may bring them one step closer to that lofty goal.
The proposal outlines a method for browsers to assign human-readable topic names to users. Advertisers could then access the list of topics assigned to a user to serve them more relevant ads based on their interests.
Google announced that they will soon launch a developer trial of the Topics API in their web browser, Chrome. As with the other Privacy Sandbox proposals, any web browser is free to adopt and follow the Topics API proposal, although it is unlikely that privacy-focused browsers like Firefox or Brave will do so.
How does The Topics API work?
The Topics API boils down to three key areas you should understand:
Assigning topics to websites
Assigning topics to a user
Using the Topics API to deliver relevant advertising
How are topics assigned to websites?
Humans will map an initial set of websites to topics, and Google will use the human-curated data set to create a machine learning model to automate the process.
For the initial test, Google will host the mapping of websites to topics. Google will map each website to a topic from a published, standardized taxonomy. (Taxonomy is a fancy way of saying an organized list of categories and subcategories.)
It is important to note that Google will only map hostnames to topics, not individual URLs. Only the hostname portion of a URL is used, so websites are categorized at the domain level, not the individual page.
Google would probably place my domain, adtechexplained.com, into this topic from the taxonomy:
/Business & Industrial/Advertising & Marketing
Google will treat subdomains separately, though. So if I decided to start an NFT investment advice side-hustle at:
nfts.adtechexplained.com
Then Google could classify that questionable venture separately under:
/Finance/Investing
Google wants to classify domains and not individual pages to avoid privacy concerns by keeping things general and not specific.
For example, someone may not mind sharing that they go to webmd.com to find medical information, but they might not necessarily want advertisers to know about their Irritable Bowel Syndrome.
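To make the hostname-only idea concrete, here is a minimal sketch of such a lookup. The `HOST_TOPICS` table, the `topicForUrl` function, and the example paths are all made up for illustration; the real mapping is hosted by Google and produced by its model.

```javascript
// Hypothetical hostname-to-topic table (assumption for illustration).
const HOST_TOPICS = new Map([
  ["adtechexplained.com", "/Business & Industrial/Advertising & Marketing"],
  ["nfts.adtechexplained.com", "/Finance/Investing"],
]);

// Reduce a full URL to its hostname, then look the hostname up.
// The path and query string never take part in the lookup.
function topicForUrl(url) {
  const { hostname } = new URL(url);
  return HOST_TOPICS.get(hostname) ?? null; // null when the host is unmapped
}

console.log(topicForUrl("https://adtechexplained.com/some/article?ref=1"));
console.log(topicForUrl("https://nfts.adtechexplained.com/advice"));
```

Two different pages on the same hostname always land on the same topic, while the subdomain gets its own entry.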
You can prevent Google from categorizing a website by sending the following HTTP response header:
Permissions-Policy: browsing-topics=()
Headers can communicate information to a browser or server loading a website.
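As a rough sketch, this is how a server-side handler might attach that header. The `optOutHandler` name and the `fakeRes` stand-in are assumptions for illustration; the stand-in only exists so the snippet runs without starting a server.

```javascript
// Handler with the (req, res) shape that Node's http.createServer expects.
function optOutHandler(req, res) {
  // Attach the opt-out header from the article to every response.
  res.setHeader("Permissions-Policy", "browsing-topics=()");
  res.end("Hello");
}

// Minimal stand-in for a response object (hypothetical, for illustration).
const fakeRes = {
  headers: {},
  body: null,
  setHeader(name, value) { this.headers[name] = value; },
  end(body) { this.body = body; },
};

optOutHandler({}, fakeRes);
console.log(fakeRes.headers);
```

In a real Node server you would pass `optOutHandler` to `http.createServer` instead of calling it with a stand-in.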
How are topics assigned to users?
Chrome will calculate and store the top five topics for a user weekly. Chrome determines the top five topics by analyzing the number of visits to each domain using browsing history.
Each page view counts as a visit, but Chrome will only use the hostname (domain) to increment the number of visits assigned to any one topic.
Chrome will not leverage any browsing history stored in the cloud (on a server, off-device), so topics can vary for the same user across their different devices.
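The weekly calculation described above can be sketched roughly as follows. This is not Chrome's implementation: the `HOST_TOPICS` table, the topic labels, the `topTopics` function, and the alphabetical tie-break are all assumptions made for illustration.

```javascript
// Hypothetical hostname-to-topic table (assumption for illustration).
const HOST_TOPICS = new Map([
  ["espn.com", "/Sports"],
  ["nfl.com", "/Sports"],
  ["adtechexplained.com", "/Business & Industrial/Advertising & Marketing"],
  ["nfts.adtechexplained.com", "/Finance/Investing"],
]);

// Count one visit per page view, keyed by the topic of the page's hostname,
// then keep the most-visited topics.
function topTopics(visitedHostnames, limit = 5) {
  const counts = new Map();
  for (const host of visitedHostnames) {
    const topic = HOST_TOPICS.get(host);
    if (!topic) continue; // unmapped hostnames contribute nothing
    counts.set(topic, (counts.get(topic) || 0) + 1);
  }
  return [...counts.entries()]
    // Most visits first; ties broken alphabetically (an assumption).
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([topic]) => topic);
}

// One entry per page view during the week, taken from local history only.
const weekOfVisits = [
  "espn.com", "nfl.com", "espn.com", "adtechexplained.com",
  "nfts.adtechexplained.com", "adtechexplained.com", "espn.com",
];
console.log(topTopics(weekOfVisits));
```

Because the input is only the history stored on one device, running the same calculation on a second device could easily produce a different list.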
Google believes that:
Users should be able to understand the API, recognize what is being said about them, know when it’s in use, and be able to enable or disable it.
The proposal notes that since the topics are human-readable, users will understand the interests Chrome broadcasts to advertisers. The human readability of the topics is in direct contrast with the ambiguity of FLoC cohort IDs.
Chrome will also inform users about the Topics API and let them know how to remove specific topics or disable the Topics API altogether.
Using the Topics API for advertising
Anybody allowed to execute code on a website can call the Topics API to receive the topics assigned to a user based on the websites they visit. Ad platforms can use the returned topics to decide which ads to display to the user.
Ad platforms need to call a JavaScript function on a page where they are allowed to execute code (like a site they are displaying ads on) to access the topics stored in the browser:
document.browsingTopics();
The function returns a promise that resolves to up to three topics in random order, one from each of the three preceding weeks. Chrome will store a set of the top five topics each week, plus a sixth random topic.
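A cautious caller would check that the function exists before using it, since other browsers may never ship it. Here is a minimal sketch: `readTopics` and `stubDocument` are hypothetical names, and the shape of the stub's return value is an assumption, not the documented format.

```javascript
// Resolve to the topics the browser shares, or to an empty list when the
// API is missing or the call is rejected.
async function readTopics(doc) {
  if (typeof doc.browsingTopics !== "function") return [];
  try {
    return await doc.browsingTopics();
  } catch {
    return [];
  }
}

// In a page you would pass the real `document`. This stub only lets the
// sketch run outside a browser; its return shape is an assumption.
const stubDocument = {
  browsingTopics: async () => [{ topic: "/Sports" }],
};

readTopics(stubDocument).then((topics) => console.log(topics));
readTopics({}).then((topics) => console.log(topics)); // no API available
```

Falling back to an empty list lets the ad code carry on with untargeted ads instead of failing.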
There is a 5% chance that the browser will return the sixth, random topic instead, which ensures that each topic has a minimum number of members.
Topics with few members could open the door to fingerprinting. The random topic also provides plausible deniability, in case you ever get grilled on your Google Topics, I suppose.
"Honey, I swear I was browsing Travel & Transportation - Honeymoons & Romantic Getaways for you!"
The browser returns the same topics to every caller on a particular domain. So any ad platform running on adtechexplained.com would receive the same topics for a given week. This restriction keeps API callers on the same site from pooling different topics to learn more about a user.
Once the browser returns a topic to callers on a website, that topic stays in the set of three for three weeks, guaranteeing that an ad platform would learn at most one new topic each week.
However, if the same ad platform ran ads on a different website, they could see an entirely different set of topics for the same user on the separate site. Google intentionally designed this feature to make it hard to match the same user by identifying the same topic set.
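One way to picture the selection rules above is a pick that depends only on the site and the week, so every caller on that site gets the same answer. The sketch below is not how Chrome does it: the FNV-1a hash, the `userSecret` value, the week ids, the topic labels, and the data shape are all assumptions for illustration.

```javascript
// FNV-1a string hash: a simple stand-in for whatever keyed function a
// browser would really use (assumption for illustration only).
function hash32(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// weeks: [{ id, top: [five topics], random: sixth topic }, ...]
function topicsForSite(site, weeks, userSecret) {
  const picked = weeks.slice(0, 3).map((week) => {
    // The same site and week always hash to the same value.
    const h = hash32(`${userSecret}|${site}|${week.id}`);
    // About 5 times in 100, answer with the random topic instead.
    if (h % 100 < 5) return week.random;
    return week.top[(h >>> 8) % week.top.length];
  });
  return [...new Set(picked)]; // drop duplicates across weeks
}

// Hypothetical stored data for the three preceding weeks.
const weeks = [
  { id: "week-1", top: ["/Sports", "/News", "/Finance/Investing", "/Food & Drink", "/Autos & Vehicles"], random: "/Pets & Animals" },
  { id: "week-2", top: ["/Sports", "/Games", "/News", "/Shopping", "/Food & Drink"], random: "/Hobbies & Leisure" },
  { id: "week-3", top: ["/News", "/Sports", "/Shopping", "/Games", "/Finance/Investing"], random: "/Home & Garden" },
];

console.log(topicsForSite("adtechexplained.com", weeks, "per-user-secret"));
console.log(topicsForSite("espn.com", weeks, "per-user-secret"));
```

Because the week id is part of the hash input, a topic picked for a site stays put until its week drops out of the three-week window. The real API also shuffles the order of the result, which this sketch skips.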
Google did not want the Topics API to provide any more information than the technology it is replacing (third-party cookies) does.
Consequently, an API caller (ad platform) is only eligible to receive a topic if it has called the Topics API on a website categorized as that specific topic.
Confused yet? Let me explain with an example.
Google Ads can only know you are interested in sports if you visited a sports website running Google Ads (and they called the Topics API). But Google Ads only needs to run on one website assigned the topic of sports, like espn.com or nfl.com. After that, every site with Google Ads will know you are interested in sports.
They did this to mimic the effort and relationships required to use cookies to accomplish the same goal.
With cookies, an ad platform can only learn you are interested in sports from visiting espn.com by dropping a cookie and putting you in a sports enthusiast audience segment. They have to have a direct or indirect business relationship with a website and run code on espn.com to drop a cookie.
If Chrome returned topics based on browsing history to any ad platform that asked, the Topics API would give the ad platform more for doing less. The ad platform would not have to work with the website or execute any code to reap the insights it provides.
The ad platform can sit back and collect topics gathered from websites that they do not work with in any capacity. Google probably realized that users and regulators would not take too kindly to pumping out more information than third-party cookies already do, so they implemented this restriction.
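The caller restriction can be sketched as a per-caller log of observed topics. The `ObservationLog` class, its method names, and the caller ids are all made up for illustration; the article does not describe how Chrome stores this.

```javascript
// Tracks, per API caller, which topics that caller has seen first-hand.
class ObservationLog {
  constructor() {
    this.seen = new Map(); // caller id -> Set of topics
  }

  // Record that `caller` called the API on a site categorized as `siteTopic`.
  recordCall(caller, siteTopic) {
    if (!this.seen.has(caller)) this.seen.set(caller, new Set());
    this.seen.get(caller).add(siteTopic);
  }

  // Release only the topics this caller has already observed.
  filterFor(caller, candidateTopics) {
    const observed = this.seen.get(caller) ?? new Set();
    return candidateTopics.filter((topic) => observed.has(topic));
  }
}

const log = new ObservationLog();
log.recordCall("google-ads", "/Sports"); // e.g. a call made on espn.com

const userTopics = ["/Sports", "/Finance/Investing"];
console.log(log.filterFor("google-ads", userTopics));
console.log(log.filterFor("other-platform", userTopics));
```

The first platform gets back only the sports topic it earned by running on a sports site, and the second platform, which has observed nothing, gets an empty list.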
What do advertisers do with Topics?
Once an ad platform collects topics, advertisers can use the topics to target users based on interest.
An ad platform would need to make the standardized topics list available as a targeting option, and advertisers can attach the topic values as inclusions or exclusions on their campaigns.
An ad server can call for topics when an ad unit loads and then pass the topic values to their platform for targeting. Simultaneously, SSPs can send bid requests containing the topic values to DSPs.
Any platform involved in an ad transaction can decide whether they want to return an ad based on the topics provided.
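Inclusion and exclusion targeting could look something like the sketch below. The campaign shape, the `campaignMatches` function, and the rule that an exclusion beats an inclusion are assumptions for illustration; each ad platform would define its own.

```javascript
// A campaign lists topics it wants (include) and topics it avoids (exclude).
function campaignMatches(campaign, userTopics) {
  const topics = new Set(userTopics);
  // Any excluded topic rules the campaign out (an assumption).
  if (campaign.exclude.some((t) => topics.has(t))) return false;
  // An empty include list means there is no interest requirement.
  if (campaign.include.length === 0) return true;
  return campaign.include.some((t) => topics.has(t));
}

// Hypothetical campaigns set up by advertisers.
const sportsGear = { include: ["/Sports"], exclude: [] };
const savingsAccount = { include: [], exclude: ["/Finance/Investing"] };

const topicsFromBrowser = ["/Sports", "/Finance/Investing"];
console.log(campaignMatches(sportsGear, topicsFromBrowser));
console.log(campaignMatches(savingsAccount, topicsFromBrowser));
```

An ad server, SSP, or DSP could each run a check like this against the topic values passed along with the ad request.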
Why did Google create the Topics API?
Google created the Topics API in response to criticisms of the initial proposal to enable interest-based advertising without third-party cookies, FLoC.
Many people and organizations, including the Electronic Frontier Foundation, pointed out that a cohort or FLoC ID can serve as an additional data point to identify individual users since only a thousand or so users are in any given cohort.
In contrast, each topic would have many more users assigned to it, making topics a poor fingerprinting vector.
Companies like Brave, a competing web browser, also noted that FLoC could expose sensitive categories to advertisers. With the Topics API, Google can publicly share the taxonomy and prove that sensitive categories like religion, gender, race, or sexual orientation do not appear in the list.
The Topics API also provides much greater user control compared to FLoC. Chrome users will be in complete command over how advertising platforms target them.
Users can view which human-readable topics Chrome assigns to them and choose to delete some topics or turn off the feature altogether. FLoC grouped users into generated FLoC IDs that had no meaning to humans.
Now that the Topics API proposal is out in the wild, everyone will have their chance to support or criticize the new initiative. It is a step in the right direction from a privacy standpoint, but some may view any effort supporting interest-based advertising as a step in the wrong direction.
Photo by lalo Hernandez on Unsplash