The demise last week of FLoC is not the end of the story for Google’s plans to prop up surveillance-based advertising once cookies are phased out.
As a replacement for third party tracking cookies, Google was – until last week when it was killed off – trialling a new system for delivering personally targeted ads called FLoC. FLoC’s objective was to hide individuals in a crowd, and keep a person’s web history ‘private’ on their browser. But it turned out that this initiative was not quite as ‘privacy first’ as Google wanted us to believe.
Nor will its touted replacement – ‘Topics’ – necessarily be much better at preventing privacy harms.
What is changing about the AdTech ecosystem
In his brief history of online advertising, Dan Stinton, the Managing Director of Guardian Australia and former Head of Digital at Yahoo7, explains that “most advertisers or their advertising agencies… purchase consumer data from third parties (credit card purchases, for example), aggregate this with their own first-party data (customer email addresses, for example), and then follow those consumers across the web by dropping cookies on their web browsers and serving targeted ads”.
For a couple of decades now, they have done so in the name of serving up ‘relevant’ ads, targeted to appropriate customer segments. However, Stinton writes that at some point “segmentation (became) consumer profiling, which is where the potential for harm really exists”, and that “relevant ads morphed to become industrial-scale behaviour modification”.
The Cambridge Analytica scandal opened the world’s eyes to the impact of surveillance-based advertising, and to the realisation that the AdTech ecosystem, initially developed to enable ‘free’ online platforms supported by advertising revenue, has resulted in harms well beyond being subject to unwanted ads.
Fast forward a few years, and community sentiment has shifted. Where the public goes, legislators and courts – and even big business – eventually follow. First we saw Apple and Mozilla block tracking cookies by default in their web browsers, and then when Apple blocked app-based tracking as well unless iPhone customers opted in, only very small numbers of people consented to let the tracking continue. Privacy-first providers of digital services which offer alternatives to the dominant Google and Facebook suite of surveillance-driven products, such as DuckDuckGo (search engine), Signal (messaging) and Protonmail (email), are also growing in market share.
In parallel, European courts have made findings against the use of tracking cookies without consent; and European privacy regulators have cracked down on the difficult-to-use opt-out mechanisms used by Facebook and Google. Meanwhile the European Parliament is considering a Digital Services Act to regulate online behavioural advertising, recent amendments to the California Consumer Privacy Act have jumped into the regulation of digital ‘dark patterns’, and a Bill to ban surveillance advertising has just been introduced into the US Congress.
Sensing the tide turning on community expectations around privacy and online tracking, and a new market for privacy-enhancing tech, Google announced it would address privacy concerns by also phasing out third party tracking cookies on its Chrome browser. Since Chrome is the dominant browser used globally, the final demise of the third party cookie is now scheduled to occur in 2023.
But the end of third party tracking cookies is not the full story when it comes to surveillance, profiling and personalised targeting, based on your online movements.
Inside the birdcage
Once third party tracking cookies are gone, more and more companies will require their customers to log in to their website or access services through an app. This means that the customer’s use of that company’s website or app can be tracked, without needing cookies. That tracking generates what’s called ‘first party data’. When customers use their email address to log in to a site (or download and use an app), their email address becomes a unique identifier, which can then be matched up with ‘first party data’ from other companies, for customers who use the same email address across different logins.
(Our recent blog about the ABC offered an example of even a publicly funded broadcaster succumbing to the drive to collect more ‘first party’ customer data, and then use hashed email addresses to enable ad re-targeting on third party platforms like Facebook and Google.)
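The matching step behind hashed email addresses can be sketched in a few lines. The idea is that a hash of a normalised email address acts as a shared pseudonymous key that two unrelated companies can compare without exchanging the raw address; SHA-256 is used here as a common choice, though the exact hashing scheme varies by platform, and the email address is of course invented.

```python
import hashlib

def email_to_match_key(email: str) -> str:
    """Normalise and hash an email address into a pseudonymous match key."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

# Two unrelated services hash the login email their customer gave them...
key_at_retailer = email_to_match_key("Jane.Citizen@example.com ")
key_at_publisher = email_to_match_key("jane.citizen@example.com")

# ...and arrive at the same key, so their 'first party data' about the
# same person can be joined without ever sharing her address in the clear.
print(key_at_retailer == key_at_publisher)  # True
```

Because the hash is deterministic, every company that holds the same email address derives the same key, which is exactly what makes cross-platform re-targeting possible without cookies.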
But how about outside the birdcage?
Flying, but still tagged
While plenty of companies will push their customers inside their own birdcages, that still leaves plenty of web activity happening when you are not logged into sites. But just because you’re not logged in doesn’t mean you are as free as a bird; you can still be tracked, profiled and targeted.
As part of its planned phase-out of third party cookies, in 2021 Google proposed FLoC – or Federated Learning of Cohorts – as a new browser standard. The objective of FLoC was to “allow sites to guess your interests without being able to uniquely identify you”.
Google started using machine learning to develop algorithms which reviewed each individual’s web search and online browsing activity to profile them, and place them into ‘cohorts’ of 1,000 or more people with similar demographics, qualities or interests, before allowing advertisers and others to target their ads or content to individuals in that cohort. While advertisers, in theory, were not supposed to learn the identity of anyone in the cohort, or their particular browsing history, they were still able to reach the precise individuals they wanted to target.
FLoC therefore still allowed individuated targeting or differential treatment of the individual by an advertiser, via Google as the ‘middle man’ who knows all your secrets, even as Google promised to prevent identification of the individual to the advertiser.
The result was highly privacy-invasive for Chrome browser users included in the FLoC trials (which included Australians): your intimacy and honesty turned against you, your hopes, fears, questions and plans extracted and exploited by Google to track, profile and segment you into multiple ‘cohorts’, so they can make a buck targeting you with personalised ads.
In fact, FLoC could also enable identification of individuals, or intrusive leaking of additional attribute data about them, to third parties. This is because the Chrome browser on an individual’s device would tell every website they visit what that individual’s ‘FLoC ID’ is. A FLoC ID tells the website operator that this particular individual is in the cohort ‘ABCDE’, which means they have a certain profile which reflects the habits, interests and potentially demographics of people in that cohort (e.g. young women interested in physical well-being, or middle-aged men interested in cricket), as determined from their recent online activity.
That way, a publisher (i.e. a website which hosts paid third party ads) can show the ‘right’ kind of ads to that person. Advertisers will have already told the publisher to show their ad to people with certain profiles; so a person profiled as interested in physical well-being might be shown an ad for yoga gear or diet pills, and a person profiled as interested in cricket might be shown an ad for cricket bats or sports betting. So when an individual with a FLoC ID of ‘ABCDE’ landed on a particular website, the publisher would know what kind of ad to display.
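The publisher-side lookup described above amounts to a simple table keyed by cohort ID. In this sketch the cohort IDs, ad categories and mappings are invented for illustration; real FLoC IDs were opaque numeric labels, not readable strings.

```python
# A hypothetical publisher-side mapping from cohort ID to ads to display.
ADS_BY_COHORT = {
    "ABCDE": ["yoga gear", "diet pills"],         # profiled: physical well-being
    "FGHIJ": ["cricket bats", "sports betting"],  # profiled: cricket
}

def pick_ads(floc_id: str) -> list[str]:
    """Choose ads for a visitor based only on the cohort their browser announces.

    The publisher never learns who the visitor is, only what
    'kind' of person Google has profiled them to be.
    """
    return ADS_BY_COHORT.get(floc_id, ["generic ad"])

print(pick_ads("ABCDE"))  # ['yoga gear', 'diet pills']
```

Note that the publisher needs no cookie and no login: the browser itself volunteers the cohort label on every visit.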
Being FLoC’d together does not guarantee privacy
There are two risks associated with this type of online behavioural surveillance and targeting, even if individuals are ‘hidden’ within cohorts, or allocated loose ‘Topics’.
First, websites or advertisers could potentially reverse-engineer from some cohorts the likelihood that certain individuals visited particular websites.
Second, a website operator may already know other information about a user, either because they are tracking the user’s IP address or because the individual has had to log in to the publisher’s birdcage – e.g. the individual subscribes to read the Sydney Morning Herald, or has a free account to watch SBS On Demand. In that case the publisher can combine their ‘first party data’ (i.e. what they learn about their customer from what the customer reads or watches within the confines of that site) with the new information inferred from the fact that the individual is now known to be in cohort ‘ABCDE’ – for example, that this person is likely to be a young woman interested in physical well-being.
Using ‘Topics’ instead of FLoC may be no better. Topics will apparently still use an individual’s recent browsing history to group them into up to 15 ‘baskets’ out of about 350 ‘interest’ categories, based on the IAB’s Audience Taxonomy instead of FLoC’s AI-built cohorts. As well as categories built around demographics (gender, age, marital status, income level, employment type etc), the IAB’s taxonomy has ‘interest’ categories such as #404: Healthy Living > Weight Loss; and #624: Sports > Cricket. Publishers will be shown three of the 15 baskets at random.
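The mechanics just described can be sketched roughly as follows. The four taxonomy entries are the ones quoted in this post; the selection logic is a simplified assumption for illustration, not the actual Chrome implementation (which, for instance, spreads the revealed topics across recent weeks).

```python
import random

# A tiny slice of an IAB-style taxonomy (IDs quoted in the post; real
# taxonomy has ~350 categories plus demographic ones).
TAXONOMY = {
    404: "Healthy Living > Weight Loss",
    624: "Sports > Cricket",
    503: "Music and Audio > Religious",
    521: "Music and Audio > World/International Music",
}

def topics_for_site(user_baskets: list[int], shown: int = 3) -> list[str]:
    """Reveal a random subset of the user's interest 'baskets' to a publisher."""
    chosen = random.sample(user_baskets, k=min(shown, len(user_baskets)))
    return [TAXONOMY[t] for t in chosen]

# A user whose recent browsing placed them in these baskets:
baskets = [404, 624, 503, 521]
print(topics_for_site(baskets))  # e.g. ['Sports > Cricket', ...]
```

Even this toy version shows the privacy problem: whatever subset is revealed, each revealed topic is a direct inference from that person’s recent browsing.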
However, FLoC was particularly egregious, because of the tendency of its algorithms to create ‘cohorts’ based around not only ‘interests’ but also particularly sensitive matters such as ethnicity, sexuality, religion, political leanings and health conditions. (The IAB taxonomy on which Topics will be based may not be entirely immune from allowing publishers to infer sensitive personal information from its ‘interest’ categories either; for example interest #503 is Music and Audio > Religious, while #521 is Music and Audio > World/International Music.)
Just for a moment consider the extent to which even public interest health information websites leak data to Google about who visits their sites: an investigation by The Markup found that even non-profits supposed to be protecting their clients, like Planned Parenthood in the US (which offers information on contraceptives and abortions), addiction treatment centres and mental health service providers, are leaking information about their web users to Google and Facebook.
Now think about combining that surveillance of online behaviour, with the power of inferences drawn from people’s Google search terms and click-throughs, and you can start to see how FLoC could enable highly intrusive profiling and personalised targeting at an individual level.
Even FLoC developers admitted that owners of walled sites (such as the Sydney Morning Herald or SBS in my example) “could record and reveal” each customer’s cohort, which means that “information about an individual’s interests may eventually become public”. The GitHub site for FLoC described this, in somewhat of an understatement, as “not ideal”.
For example, the Sydney Morning Herald could potentially find out which of its subscribers are interested in abortions, anti-depression medication, substance abuse, gambling or suicide; who is questioning their religion or exploring their sexuality; and how they are profiled by Google in terms of their age, gender, ethnicity, political leanings, income, employment status and education level. It could then add that to its own ‘first party’ customer data, and potentially share it with others. Because each user’s FLoC ID was continually updated to reflect their latest online activity, website operators could infer more and more about their subscribers over time.
While Google has said that ‘Topics’, at this stage, will shy away from any demographic categories, it is still proposed to be continuously updated to reflect each individual’s browsing history over time.
Publishers can then use this information to sell ad space to those offering arguably harmful messaging (e.g. promoting sports betting to gambling addicts, or promoting pro-anorexia content to teenage girls) as easily as they can target beneficial messaging (e.g. promoting Headspace centres to vulnerable young people). Individuated messaging and content can also as easily exclude people from opportunities as include them.
The risks are not confined to publishers selling ad space, because the FLoC ID was shared with all websites you visit, not just publishers hosting third party ads. So even government websites could have gleaned information about you from your FLoC ID. Are we comfortable with the ATO or Centrelink knowing that Google has profiled someone as interested in crypto-currency?
So there’s plenty to be concerned about from a privacy perspective. However perpetuating online privacy harms was not the only criticism of FLoC. The competition regulator in the UK, for example, raised concerns about the effect of FLoC and other initiatives in Google’s ‘Privacy Sandbox’. In the words of Cory Doctorow, writing for the Electronic Frontier Foundation, “the advertisers that rely on non-Google ad-targeting will have to move to Google, and pay for their services… Google’s version of protecting our privacy is appointing itself the gatekeeper who decides when we’re spied on while skimming from advertisers with nowhere else to go”.
Can we ever fly free?
FLoC may have been dumped for now, but whether it is ‘Topics’ or something else which Google ultimately uses to replace third party tracking cookies, there appears little appetite from Google to join its rivals in moving away from surveillance-based online behavioural targeting any time soon.
So what can you do?
First, choose your browser and devices wisely. Tracking cookies are already blocked in Apple’s Safari and Mozilla’s Firefox, and Apple devices are much better at blocking third party tracking, both inside apps and on the open web.
Second, if you are using Google’s Chrome as your browser, try to find out if you were included in the global FLoC trials, using EFF’s ‘Am I FLoCed?’ tool. Also keep an eye out for instructions on how to opt out of the trials of ‘Topics’, due to start later this month.
Third, if you are a website operator, declare that you do not want your site to be included in your users’ lists of sites for cohort (or ‘interest’ topic) calculations. Government, health, non-profit, human rights and other public interest organisations in particular should strongly consider blocking Topics, in order to protect their users from being subject to profiling and personalised ads or messaging based on any association with their website.
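For FLoC, the documented way for a site to make that declaration was an HTTP response header: `Permissions-Policy: interest-cohort=()`. A `browsing-topics` directive is the assumed equivalent for Topics, but operators should check the current Chrome documentation, since the syntax may change. A minimal sketch of adding the header to a site’s responses:

```python
def privacy_sandbox_opt_out(headers: dict[str, str]) -> dict[str, str]:
    """Return response headers declaring this site off-limits for cohort/topic calculation.

    'interest-cohort=()' was Google's documented FLoC opt-out;
    'browsing-topics=()' is an assumed equivalent directive for Topics.
    """
    headers = dict(headers)  # don't mutate the caller's dict
    headers["Permissions-Policy"] = "interest-cohort=(), browsing-topics=()"
    return headers

response_headers = privacy_sandbox_opt_out({"Content-Type": "text/html"})
print(response_headers["Permissions-Policy"])
```

In practice this header would usually be set once in the web server or CDN configuration rather than in application code.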
Finally, agitate for law reform. Perhaps it is no coincidence that the trials of FLoC were conducted in 10 countries including Australia, but not in the EU. The major AdTech players and digital platforms like Google and Facebook will keep exploiting our data unless the law stops it.
FLoC is a perfect example of why the law needs to change to keep up with tech: FLoC still allowed individuated targeting, if not identification of users to the advertiser. Topics will do the same. That’s why the current review of the Australian Privacy Act and the definition of ‘personal information’ is so important – we need a law which reflects the role of online identifiers in the AdTech ecosystem, and respects the wishes of the 89% of Australians who believe that the information used to track us online should be protected under the Privacy Act.
Otherwise, in the words of the ’80s band which I now think of as FLoC of Seagulls, we might find that while we can run, we just can’t get away from surveillance-based online targeting:
And I ran, I ran so far away
I just ran, I ran all night and day…
I couldn’t get away