Even in settings you might consider off-limits, like a visit to your cardiologist or gynaecologist, businesses you have probably never heard of are collecting, profiling, trading and monetising your data.
Welcome to the shady world of data brokering, an industry estimated to be worth US$240 billion globally.
Data brokering practices are opaque, which makes it near impossible for consumers or regulators to test for compliance with consumer protection or privacy laws. This is the story of what happened when I decided to help a friend trace how her personal information had been collected, collated into a profile, traded and used to send her unsolicited marketing about her health. In the process, we encountered stonewalling, misleading and contradictory claims, and some super-disturbing practices, now detailed in a complaint to the Office of the Australian Information Commissioner (OAIC), and a submission to the Australian Competition and Consumer Commission (ACCC).
It all started when Justine – not her real name – received a letter in the mail, containing unsolicited marketing, promoting a particular health-related service. This was not a public health program like cancer screening, but a for-profit biotechnology company promoting their own product. Justine had been targeted at her home, based on her gender and age. Our investigation to find out who sold or shared her data has so far taken us through three separate companies, none of whom Justine had ever heard of or had any relationship with. And it’s not over yet.
Along the way, we found one data broker who claims to have data on 85% of the Australian population. By linking together precise location data from our mobile phones, public registers like the electoral roll and our online behaviour (i.e. everything we search, like, share or click on), they can track us both online and offline, profile us, and directly reach us for targeted marketing or messaging.
Despite the supposed protection of our privacy laws, we have been sliced and diced into marketing categories down to a disturbing level of detail, from ‘Tradies’ (small businesses registered at a home address, matched with ‘devices seen at Bunnings’) to ‘Cardiologist visitors’ (people whose phones have been in a cardiologist’s office, matched with their online ‘content consumption’).
Justine’s case illustrates the disturbing amount of data laundering going on, just to deliver one paper letter. Imagine what is going on with data brokers and behavioural advertising online.
Will the ACCC take on the industry trading in your personal information?
In July this year, as part of their long-running Digital Platform Services Inquiry, the ACCC asked for submissions about the supply of data broker services.
However, the ACCC has limited its scope to what it calls third-party data brokers: “businesses that collect data about consumers from a range of third-party sources and sell or share that data with others”. Some of them you may have heard of: companies like Equifax, Oracle, Nielsen and LiveRamp (formerly known as Acxiom). Others you won’t know about.
Excluded from the ACCC’s review are the first-party data brokers: “businesses that collect data on their own consumers and sell or share that data with others”. In other words, the companies you directly interact with, which supply the data brokering ecosystem, pumping brokers with your personal information. Who is in on this lucrative market? Digital platforms like Alphabet (Google), social media companies like Meta, customer loyalty schemes, app developers, banks, retailers, telcos, media outlets, and more.
But the problem is that you cannot effectively regulate the data brokering industry without tackling the entire ecosystem. That ecosystem currently enables companies to sell their ‘first party’ customer data, and/or buy data or ‘insights’ from brokers to ‘enrich’ their customer data. (I use the word ‘enables’ here, rather than ‘allows’, deliberately. While privacy legislation would appear to prohibit the sharing of personal information between unrelated companies, non-compliance with the legislation is so widespread that the current regulatory environment effectively enables these practices to flourish.)
In any case, the line between ‘first party’ and ‘third party’ brokers, and between data sellers and data buyers, is increasingly fuzzy. (As I detailed in a previous blog, the big media companies in Australia are no longer really in the business of journalism; their profits come from trading in our data. Media companies collect, collate, share and use our data, across different brands, services, platforms and apps, in order to track, profile and target us, without our active agreement to do so. And they use a bunch of BS claims to justify it.)
The companies you might not know of, but who know all about you
Every day, Australian consumers entrust their personal information to organisations: to facilitate a particular type of transaction such as renting a new home, as a consequence of a major life event such as suffering a medical event, or simply in the course of conducting their day-to-day lives, both online and offline.
This trust should not be abused with the re-purposing of personal information for customer profiling and marketing purposes, at the targeted, individual level, by unrelated companies. Yet that is exactly what is happening.
By way of example, the ACCC’s Issues Paper describes CoreLogic as “a property data and analytics company that produces a range of products including housing affordability reports, construction reports, sales and auction result data and valuation models”.
While the sale of “property insights” sounds benign, when Justine and I started digging, our investigation revealed that a separate data broker known as SMRTR uses CoreLogic property listing and transaction data at the individual level to identify and enable direct marketing to ‘home movers’: sellers, buyers and renters. SMRTR promotes its ability to identify and reach individuals in this category as valuable to insurance companies, energy retailers, and companies selling everything from whitegoods to the local plumber. Why? Because ‘home movers’ are rich pickings: when targeted at the right time, not only might they need a new fridge, but they might be persuaded to switch their home and contents insurance provider, or energy retailer, as part of the moving process. And knowing the moving date means you can also predict annual policy renewal due dates – another time to target potential new customers with an offer.
CoreLogic is not the only source of data used by SMRTR, and ‘home movers’ is not their only category of consumers. In fact SMRTR claims to have “over 500 unique audiences, ready and available for activation in any environment”, which “cover 85% of the population”.
Other market categories include people who are ‘Ready to refinance’, ‘Fun-loving fifties’, business-owners ‘registered at a home address with between 11 and 50 employees’, ‘Toyota buyers’, ‘Tradies’ (drawn from “Devices seen at Bunnings”), ‘Cardiologist visitors’ (“Mobile phone geobehaviour and/or content consumption suggests they have visited a Cardiologist”), ‘Obstetricians And Gynaecologists Visitors’ (likewise based on mobile phone location and online content consumption), and ‘Personal Loan – Behavioural’ (which is drawn from “Devices seen at financial institutions focusing on personal finance, financial planning & tax planning”).
They claim that “All of our data is directly connected to their corresponding audiences via address, email, social, phone and MAID”. (MAID means Mobile Advertising ID, i.e. a mobile phone device identifier).
In other words, a company you have probably never heard of before now is packaging up and selling the ability to profile you, based on a combination of your mobile phone location data and other data collected about you from your online behaviour, and then to target you directly: via mail to your home, via email, or via the ads you see online.
While SMRTR doesn’t explain where their mobile phone location data is sourced from, their website provides an example of how they use such data to connect online and offline behaviours, and then deliver personalised advertising or other content accordingly: “While someone looking at pet equipment online may or may not own a dog, someone that visits a pet store in the real-world and dog parks is more than likely to own a dog. It is here that location data can start to help marketers when it comes to connecting the dots between the online and offline worlds”.
In order to connect multiple points of data about individuals, SMRTR uses an ‘Identity Graph’, which combines “a range of privacy-compliant IDs to create a connection between offline and online identities”.
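For readers wondering how an identity graph works mechanically, here is a minimal sketch in Python. To be clear, this is an illustration of the general technique only, using entirely made-up data, sources and field names; SMRTR does not publish its implementation. The core idea is simple: stitch together records from unrelated sources whenever they share any one identifier, whether an email address, street address, phone number or MAID.

```python
# A minimal sketch of how an identity graph can stitch together records
# from unrelated sources. All data, sources and field names here are
# hypothetical; this illustrates the general technique, not SMRTR's
# actual implementation.
from collections import defaultdict

# Records from different sources, each carrying whatever identifiers
# that source happens to hold (MAID = Mobile Advertising ID).
records = [
    {"source": "property_portal", "email": "j@example.com", "address": "1 Example St"},
    {"source": "location_sdk", "maid": "abc-123", "address": "1 Example St"},
    {"source": "competition_db", "email": "j@example.com", "phone": "0400 000 000"},
]

# Union-find: any two records sharing at least one identifier value
# are treated as belonging to the same person.
parent = {}

def find(x):
    parent.setdefault(x, x)
    if parent[x] != x:
        parent[x] = find(parent[x])
    return parent[x]

def union(a, b):
    parent[find(a)] = find(b)

seen = {}  # (identifier type, value) -> index of record first seen with it
for i, rec in enumerate(records):
    for key in ("email", "maid", "address", "phone"):
        value = rec.get(key)
        if value is None:
            continue
        if (key, value) in seen:
            union(i, seen[(key, value)])
        else:
            seen[(key, value)] = i

# Collapse each connected component into a single cross-channel profile.
profiles = defaultdict(dict)
for i, rec in enumerate(records):
    profiles[find(i)].update(rec)

for profile in profiles.values():
    print(profile)
# All three records collapse into one profile, reachable by email,
# address, phone and MAID -- i.e. one 'addressable' individual.
```

Once records collapse into a single profile like this, any attribute observed by any source, from a store visit to a property listing to a competition entry, attaches to one addressable individual.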
In an earlier version of their website since removed (but never fear, ACCC: we grabbed screenshots), SMRTR was more explicit about what they do, and the fact that their Identity Graph enables them to build profiles on individuals who can then be directly targeted, online or offline. They waxed lyrical about how “insights” drawn from unrelated sources can be “mapped back to individual customers in CRM systems”. They also claimed to “enable the targeting of known customers wherever and whenever they are online”, as well as finding “lookalikes for targeting” for ads.
I believe that Australians are highly unlikely to be aware, and would likely be horrified to learn, that intimate details about their life – such as that a woman has visited a gynaecologist – are being collected from our mobile phone location data, online browsing habits and other sources, collated into customer profiles, monetised, traded between companies, and used by companies for direct, personalised marketing or other messaging, influencing or decision-making purposes, without the participation – let alone consent – of either the woman or her gynaecologist.
And before anyone says “it’s just ads” … data brokerage services and the micro-targeting capabilities they support have been used by bad actors to perpetrate harms from harassing women seeking abortions, to doxing individuals, to electoral interference and disinformation campaigns. Data brokering practices harm consumers, both in terms of the risk posed by data breaches, and the harms which arise from the hyper-personalisation of content consumed in the digital environment.
Isn’t this in breach of the Privacy Act?
In our submission to the ACCC, we argued that the collection of personal information about identifiable and addressable individuals, as in the examples we found, is likely to breach APPs 3.5 (unfair means of collection) and 3.6 (indirect collection) in the Privacy Act. This is true of all categories, from ‘Tradies’ to ‘Home movers’. In addition, the data points and inferences found in categories relating to pharmaceuticals and healthcare squarely constitute ‘health information’, a type of ‘sensitive information’ under the Privacy Act, the collection of which also requires consent (APP 3.3).
SMRTR’s Privacy Policy claims that they don’t collect sensitive information like health information. But given the broad definition of ‘health information’ in the Privacy Act (it includes opinions about a health service provided to an individual), and the inferences such as ‘Cardiologist visitors’ upon which SMRTR trades, I find that hard to believe.
Indeed the Facebook page of SMRTR contains a number of complaints from Australian consumers about SMRTR’s collection and use of their personal information for marketing purposes, including the direct marketing campaigns of pharmaceutical and medical screening companies.
So how did we find out about SMRTR? It all started when Justine and I went digging.
Dig deeper, and you find … a deeper hole
In September 2022, Justine received a direct marketing letter to her home address from a biotechnology company promoting a medical scanning service. The letter arrived not long after Justine’s 70th birthday, and the medical scanning service related to a disease affecting older women. The letter noted that people over 70 can have the scan ‘bulk billed’.
Justine had never heard of the company advertising the medical scanning service, and wanted to find out how the company knew her address, age and gender, and was able to send her such a targeted marketing message about a health condition.
For the next six months, Justine tried in vain to find out how her personal information ended up with the biotechnology company. The trail started with the biotechnology company (Amgen), which pointed the finger at one data broker (SMRTR). SMRTR pointed the finger at another data broker (Eight Dragons). Eight Dragons pointed the finger back at Justine, claiming she had supplied the data with consent when she entered a ‘Win a Trip to London’ competition. When challenged on the veracity of this claim, Eight Dragons did not reply to her further enquiries.
(Spoiler alert: there was no competition entry. We have provided evidence to the ACCC that Eight Dragons’ claims that Justine provided her data in a competition entry were not only fanciful, but were contradicted by SMRTR’s records.)
Along the way, Justine discovered that these companies had various forms of information about her, including the number of people in her household, her ‘net worth’ category, and the likelihood of her embarking on a renovation. (BTW if you’re enjoying this yarn so far, there’s plenty more detail about Justine’s data profile at the end of this blog.)
This was the amount of data laundering we discovered, just following the trail of one paper letter, delivered to Justine offline. So imagine what is going on with data brokers and behavioural advertising online, using the same slippery tactics.
I can only hope that the ACCC inquiry will lift the lid and let some sunshine in, so we can understand, interrogate and regulate this industry better.
Meanwhile, the Privacy Act review grinds on. Will the Attorney General finally fix the problems there?
An industry built on widespread non-compliance, plus a gaping loophole
From the data sellers through the data brokers to the data buyers, I believe that many practices within the brokerage industry are non-compliant with the Privacy Act.
For example, APP 3.6 has been described by legal academic Dr Katharine Kemp as ‘the forgotten privacy principle’. APP 3.6 says that companies must collect personal information about an individual only from the individual, unless it is unreasonable or impracticable to do so. So if your bank, airline, insurance company or supermarket could ask you directly if you own a dog, have a heart condition, might be pregnant, or are in the market for a new car, the law says they must ask you directly, instead of snooping around behind your back.
Of course, the brokers tend not to have any direct relationship with individual consumers; they are, by definition, middlemen. Which means they could argue that APP 3.6 allows their practices, because it would be unreasonable or impracticable to expect data brokers to approach consumers directly to ask about every intimate detail of their lives.
This gaping loophole in the Privacy Act leads to an absurd result, in which your bank / airline / supermarket / insurance company is not allowed to spy on you … but all the other banks, airlines, etc can. And so can third party data brokers.
It is well beyond time that this loophole in the Privacy Act was fixed.
But as the law stands today, at least on the demand side of the industry, much of the ‘customer enrichment’ market, in which companies buy data from data brokers, or use their services, to collect more information about their existing customers, is likely built on non-compliance with APP 3.6.
Enforce APP 3.6, and you throttle demand.
Likewise, better enforcement of the Privacy Act should disrupt the supply chain for data brokers. I suspect that companies disclosing their customers’ personal information to a data broker may be in breach of APP 6, which limits the disclosure of personal information for unrelated secondary purposes, without customers’ consent.
In particular, I would argue that companies cannot rely on ‘consent’ as their purported ground for compliance with APP 6 unless that consent is truly voluntary, informed, specific and current, and was demonstrably granted, actively and willingly, by a person with the capacity to understand what they were agreeing to, in circumstances in which the consumer could choose between granting or refusing consent without being denied access to the original goods or services for which they were transacting. Our investigation into Justine’s case demonstrated the falsity behind just one example of purported ‘consent’-based data sharing. I suspect the entire industry is built upon a house of cards.
Enforce APP 6, and you choke supply.
Muddying the waters to confuse consumers
I have previously called BS on claims that the data being shared by media companies and others in the advertising industry is not ‘personal information’.
Similarly, claims about data ‘aggregation’, ‘de-identification’, ‘anonymisation’, ‘privacy-first’, ‘privacy-compliant IDs’, ‘privacy-preserving technology’ and the like are commonly used by participants in the data brokering industry to allay the privacy concerns of consumers and regulators. I believe consumers may be misled by such phrases.
For example, while SMRTR claims to consumers on its Facebook page that “Our data universe contains 16m Australians mapped to a variety of data assets and aggregated to protect privacy”, the website aimed at companies wanting to flog their products contradicts that privacy-protecting claim, stating that “All of our data is directly connected to their corresponding audiences via address, email, social, phone and MAID”.
The latest trend of using ‘data clean rooms’ to bring different companies’ datasets together using so-called ‘privacy-preserving technology’ provides fertile ground for more BS claims, which hide the reality: data clean rooms are laundering data.
The dirty little secret hiding in data clean rooms
The industry group Interactive Advertising Bureau (IAB) promotes data clean rooms as an example of the use of “privacy preserving technologies”. Their July 2023 guidance defines a data clean room as “a secure collaboration environment which allows two or more participants to leverage data assets for specific, mutually agreed upon uses, while guaranteeing enforcement of strict data access limitations”.
Similarly, the International Association of Privacy Professionals says that data clean rooms “allow companies to merge or match first-party data sets to create fresh data analytics segments, while withholding personally identifiable information from involved parties”.
Use cases for data clean rooms include “Addressability and activation of audiences by advertisers”, and “Consumer insights and data enrichment”, according to the IAB.
(Wondering what ‘addressability’ means? The IAB defines it as the ability “to uniquely identify an individual or a device between data sets of two or more parties in a given context e.g. targeting individuals with advertisements”. Audience ‘activation’ refers to the process by which an advertiser can find and target their chosen audience via digital advertising channels.)
The key selling point of a data clean room is that two unrelated companies can match their data, and learn new insights about their customers, without either party having access to the other’s raw customer data.
However, if Company A can learn new insights about their own customers at the addressable individual level, from some form of data matching process with data about the customers of Company B, according to the OAIC this will constitute a ‘collection by creation’, which must comply with APP 3.
As noted earlier, APP 3.6 prevents collecting information via third parties if a company could ask their customer directly. Furthermore, APP 3.3 prohibits collecting sensitive data like health information unless it is reasonably necessary to do so and the company has the customer’s consent. This poses significant challenges for any customer enrichment or targeting program.
Sure, the method used to garner that insight may have involved de-identification of the data in the middle of the data matching process. But pseudonyms such as hashed identifiers, and/or probabilistic matching techniques, exist to enable links to be drawn between unrelated datasets, such that with the required degree of confidence, the process can establish that customer 12345 from Company A, and customer 67890 from Company B, are the same person. And this way, Company A can learn from Company B that customer 12345 has a car loan, or likes to buy yoga mats.
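To make that concrete, here is a deliberately simplified sketch of deterministic matching via hashed identifiers, the simplest of the techniques just described. All of the data is made up, and real clean rooms layer salting, encryption and access controls on top (or use probabilistic matching instead), but the end result for Company A is the same: new attributes attached to identifiable customers.

```python
# A deliberately simplified sketch of hashed-identifier matching of the
# kind used inside data clean rooms. All data here is made up.
import hashlib

def pseudonymise(identifier: str) -> str:
    # Hashing pseudonymises the identifier, but the same input always
    # yields the same output -- which is exactly what makes matching work.
    return hashlib.sha256(identifier.strip().lower().encode()).hexdigest()

# Company A's CRM: customer ID -> email address.
company_a = {"12345": "justine@example.com", "12346": "someone@example.com"}

# Company B's dataset: email address -> attribute observed or inferred.
company_b = {"justine@example.com": "has_car_loan", "other@example.com": "buys_yoga_mats"}

# Inside the 'clean room', only hashes are compared; neither party
# ever sees the other's raw table.
b_hashed = {pseudonymise(email): attr for email, attr in company_b.items()}

enriched = {
    customer_id: b_hashed[pseudonymise(email)]
    for customer_id, email in company_a.items()
    if pseudonymise(email) in b_hashed
}

print(enriched)  # {'12345': 'has_car_loan'}
# Company A now 'knows' that its identifiable customer 12345 has a car
# loan: new personal information collected about her, despite the
# de-identification step in the middle.
```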
It doesn’t matter that Company A doesn’t ‘see’ Company B’s ‘raw’ customer data. The output – the ‘enriched’ customer data for Company A – is still ‘personal information’, which is regulated under the Privacy Act. Therefore neither Company A nor Company B should be collecting or disclosing their customer data via data clean rooms, except in compliance with APPs 3 and 6.
In other words, this practice is surely unlawful, unless every single individual has given their voluntary, informed and specific consent. (And the likelihood of that is roughly commensurate with a snowflake’s chance in hell. Which is apt, given one of the biggest data clean room providers is called Snowflake.)
That this is the case is hinted at in the IAB’s guidance on data clean rooms. It takes until page 25 of the document to find it, but the IAB does admit that “Data enrichment where net new insight or intelligence is appended directly to an underlying raw dataset … (may) violate the privacy and data governance principles of the … Data Contributors”. The IAB counsels companies to “consult with their legal counsel about specific steps they should undertake to ensure compliance”.
Yep, they sure buried the lede on that dirty little secret about data clean rooms: the whole exercise may be in breach of privacy laws.
In our submission to the ACCC, we argued that customer enrichment or audience activation, without consumers’ active participation or consent, is a practice potentially in breach of the Privacy Act. Using privacy-preserving techniques like secure collaboration workspaces, encryption or differential privacy may protect the security of data during the data flows, but they do not make those data flows lawful in the first place.
In other words, to quote marketing industry editor Andrew Birmingham, marketers justify their practices by pointing to the security techniques used by data clean rooms such as encryption, “as if owning a bludgeon that no one understands makes it ok to conk a customer on the head”.
There is another sector which matches data… but ethically
We drew the ACCC’s attention to the fact that the practices described by data brokers or data clean room providers as “privacy-preserving”, “privacy-compliant” or “privacy-first” are similar to the data matching and linkage processes used in the research sector, where linkage via pseudonymous keys and/or probabilistic matching is conducted by trusted middle parties within secure workspaces.
The key difference between the sectors is this: in our experience, organisations participating in the research sector, such as medical research institutes, universities and government agencies conducting research critical for policy areas from healthcare to education to disability services to child protection, would not dream of suggesting that their data matching and linkage practices alone are sufficient to ensure compliance with APPs 3 (collection) or 6 (use and disclosure) or State-equivalent privacy principles, let alone that they render data ‘de-identified’ to the point that privacy laws no longer apply.
Instead, when not grounded in express consent, data matching and linkage for research purposes typically relies on specific exceptions to the privacy principles governing collection, use and disclosure. Those research exceptions have been granted by the legislature on public interest grounds, and are subject to complex and nuanced ethical approval and oversight processes before they can be relied upon. No such exceptions exist for marketing or customer enrichment use cases.
The use of commercial data clean rooms to allow unrelated companies to learn new insights about individuals, in the absence of true consent from those individuals, makes a mockery of our privacy laws. It also makes a mockery of the research sector, and the months and years that researchers put into ensuring that their projects can comply with privacy requirements.
Phrases like ‘privacy-preserving technology’ might dress up data matching and targeting capabilities enough to fool consumers, but they should no longer fool regulators.
Because saying that you conduct customer data enrichment in a privacy-preserving way is as oxymoronic as saying you kill lambs and eat meat in a vegetarian way.
So what can be done?
As I said at the start of this blog, Australian data brokers have your data, and they are not afraid to use it.
That’s the heart of the problem, isn’t it? They are not afraid.
The data brokers, the companies which supply them, and the companies which buy from them – they are not afraid.
They are not afraid of the Privacy Act. And they are not afraid of the fact that 87% of Australians believe that trading in personal information is not ‘fair and reasonable’.
It is time to strengthen the law, and empower regulators like the ACCC and OAIC, so that the regulatory regime works to protect – and reflect the clear wishes of – the Australian people.
Maybe then the data brokers will be afraid.
Justine’s experience in more detail: the story so far
In September 2022, Justine received a direct marketing letter to her home address from a biotechnology company promoting a medical scanning service. The letter arrived not long after Justine’s 70th birthday, and the medical scanning service related to a disease affecting older women. Both the envelope and the letter noted that people over 70 can have the scan ‘bulk billed’.
Justine had never heard of the company advertising the medical scanning service, and wanted to find out how the company knew her address, age and gender, and was able to send her such a targeted marketing message about a health condition.
For the next six months, Justine tried in vain to find out how her personal information ended up with the biotechnology company. The trail led from the biotechnology company (Amgen) to one data brokering company (SMRTR), and then to another (Eight Dragons), before the trail stopped dead.
Along the way, Justine discovered that these companies had various forms of information about her. For example, SMRTR held personal information about Justine including:
- Title
- First name
- Surname
- Gender
- Date of birth
- Home street address
- Home phone number
- A personal email address (accurate and current), and
- A work email address (no longer in use).
She also discovered that SMRTR held two pages worth of ‘Modelled Data’ about her, such as:
- Working status (Retired)
- Owns her own home
- Number of people living in the home
- Low disposable income
- Has no children
- Has no investments
- Has no credit card
- Not ‘high affluence’ or ‘high net worth’
- Not ‘blue collar’
- Is likely to donate to charity
- Is of an average likelihood of renovating
Justine does not believe the explanation given for how SMRTR had all this data about her: that she had supplied these details to Eight Dragons (and consented to their use for direct marketing) when she entered a competition in 2019.
There are five reasons why Justine does not believe this explanation:
- Justine says she never enters competitions, and does not remember entering a ‘Win a Trip to London’ competition on 1 January 2019.
- The company making the claim (Eight Dragons) had no proof that she had done so, such as a copy of her competition entry form.
- The details allegedly collected directly from her via the competition entry in 2019 included a work email address that had not been active for many years prior to 2019.
- The data collected about her included inferences, such as ‘not blue collar’ and her likelihood of donating to charity or renovating, which are not details a competition entry form was likely to ask for.
- In the data held about Justine by SMRTR was a field labelled “SelfReported”, which was explained to mean “An indicator on whether the personal information has been sourced from a supplier which uses self-reported information, i.e. surveys or competition entries, 0=No, 1=Yes”. In Justine’s record, the field was “0”, meaning “No”.
In other words, SMRTR and Eight Dragons told Justine that her personal information had been collected with her consent when she entered a competition, but the information they held about her (supplied to Justine when she exercised her access rights under the Privacy Act) in fact stated the opposite.
So where did all this information come from? We have not been able to find out.
Amgen pointed the finger at SMRTR. SMRTR pointed the finger at Eight Dragons. Eight Dragons pointed the finger back at Justine, and did not reply to her further enquiries.
In May 2023, Justine lodged a complaint with the OAIC. More than three months later, it is still sitting in the queue, not yet even allocated to a case officer.
For redacted correspondence about Justine’s case, and plenty more dirt about data clean rooms and data brokering, see the Salinger Privacy submission to the ACCC.
Photograph © Reproductive Health Supplies Coalition