The Subprime Data Crisis
Marketers have a toxic relationship with data.
We love data, we fetishize data, we wax poetic about the glories of “data-driven marketing.” And don’t get me wrong, data is very important, especially in B2B. But not all data is created equal. We assume most data is good data. But it’s not. Most data is bad data.
And sooner or later, our over-reliance on big bad data is going to cause a crisis.
The entire digital marketing industry runs on data. That should come as no surprise. What did come as a surprise, at least for me, is that we spend almost $60 billion a year on programmatic display. And that entire ecosystem is powered by data, mostly 3rd party data.
But we rarely stop to ask ourselves an important question: how good is that data?
Well, here’s a recent academic paper from MIT, GroupM, and Melbourne Business School. The researchers set out to test the accuracy of programmatic data, focusing on the two most commonly used data points in B2C --- age and gender.
So, how accurate do you think gender targeting is in programmatic? It’s 50% accurate. In other words, marketers could literally get the same result by flipping a coin. And in a hilarious turn of events, it turns out that gender is actually the most accurate targeting facet. For age, the accuracy drops down to 25%. And as a general rule, the more niche you go, the less accurate the data becomes.
As far as I know, few attempts have been made to test the accuracy of 3rd party B2B data. But if we can’t get gender right more than 50% of the time, how good do you think we are at identifying IT Decision Makers, or Airplane Procurement Specialists? I would guess the accuracy is below 10%.
Now if you don’t believe MIT, GroupM, and the Melbourne Business School, you can go check this out for yourself. This is a secret hiding in plain sight. There are a bunch of sites where you can ask the internet who it thinks you are based on your browsing behavior. I went and checked mine just before writing this article. According to a leading data provider, I live in West Virginia and Coral Gables (wow, what a long commute!), I’m aged 35-39, I have a mortgage, and I’m both married and single (which does sound like an interesting arrangement).
Unfortunately, all that data is wrong. This isn’t big data. It’s bad data.
And according to the researchers, it adds up to $7 billion in wasted ad spend every year.
Now I’m a very cynical person (I’m from New York City), but even I found this hard to believe. So I decided to sit down with Dr. Augustine Fou. Dr. Fou is the world’s leading expert on ad fraud, an unsung hero in our industry. And I asked him a simple question: is the research right? Is it true that digital marketers are reaching the wrong people? And Dr. Fou looked at me like I was an idiot. “No,” he said, “it’s much worse than that. The problem isn’t that marketers are reaching the wrong people. The problem is that marketers aren’t reaching people at all.”
Marketers are reaching…robots.
When you buy programmatic display on a billion random sites, this is what happens. Some enterprising criminal sets up a fake site, like Babaganoush.com. Then he programs some bots to click on any ad that runs on the site --- not too much, because that would be suspicious, but just enough to beat the advertiser’s benchmarks. A day later, some media buyer sees that Babaganoush.com has a 5% click-through rate, and re-allocates all the budget to that channel.
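To make the Babaganoush.com scenario concrete, here is a minimal sketch of the kind of sanity check a media buyer could run before moving budget. Everything in it is an assumption for illustration: the `SiteStats` fields, the `flag_for_review` helper, the 1% click-through ceiling, the 1,000-impression floor, and the two `.example` domains are all made up, not industry standards. A careful bot that clicks "just enough" would slip under any fixed threshold, so treat this as a prompt for human review, not as fraud detection.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    """Hypothetical per-site delivery numbers from a campaign report."""
    domain: str
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        # Click-through rate; 0.0 when the site served no impressions.
        return self.clicks / self.impressions if self.impressions else 0.0


def flag_for_review(sites, max_plausible_ctr=0.01, min_impressions=1000):
    """Return domains whose CTR looks too good to be true.

    Both thresholds are illustrative assumptions, not benchmarks.
    Sites with too few impressions are skipped because their CTR is noisy.
    """
    return [
        s.domain
        for s in sites
        if s.impressions >= min_impressions and s.ctr > max_plausible_ctr
    ]


sites = [
    SiteStats("babaganoush.com", 20_000, 1_000),        # 5% CTR
    SiteStats("known-publisher.example", 50_000, 150),  # 0.3% CTR
    SiteStats("tiny-blog.example", 200, 10),            # 5% CTR, tiny sample
]

flagged = flag_for_review(sites)
print(flagged)  # ['babaganoush.com']
```

The design choice that matters is the direction of the check: an unusually high click-through rate on a site nobody has heard of is a reason to look closer, not a reason to re-allocate the budget.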
It’s hard to estimate the scale of this fraud, since it’s a criminal enterprise, but the most conservative estimate from the ANA is around $6 billion every year.
So we’re wasting $7 billion targeting the wrong people, and $6 billion targeting non-people. With all that wastage, you’d think marketers would be talking about this more often, and working hard to solve the problem. But I went to about 30 marketing conferences last year, and guess how many sessions I saw on the perils of fake data and fake websites? Zero!
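For a sense of scale, here is the back-of-the-envelope arithmetic using only the figures quoted in this article. One loud assumption: it treats the $7 billion and the $6 billion as separate, non-overlapping losses measured against the same roughly $60 billion in programmatic display spend. The sources may define their scopes differently, so read the result as a rough illustration and not as a measured statistic. The `wasted_share` helper is just a name made up for this sketch.

```python
# Figures quoted in this article, in billions of US dollars per year.
PROGRAMMATIC_SPEND = 60.0    # "almost $60 billion" on programmatic display
BAD_TARGETING_WASTE = 7.0    # researchers' estimate of mis-targeted spend
BOT_FRAUD_WASTE = 6.0        # ANA's most conservative ad fraud estimate


def wasted_share(spend, *losses):
    """Return the combined losses as a fraction of total spend."""
    return sum(losses) / spend


total_waste = BAD_TARGETING_WASTE + BOT_FRAUD_WASTE
share = wasted_share(PROGRAMMATIC_SPEND, BAD_TARGETING_WASTE, BOT_FRAUD_WASTE)

# Assumes the two losses do not overlap (see the note above).
print(f"Wasted: ${total_waste:.0f}B of ${PROGRAMMATIC_SPEND:.0f}B ({share:.0%})")
# Wasted: $13B of $60B (22%)
```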
Meanwhile, I sat through about a trillion sessions on the arrival of 1:1 personalization at scale. But tactics like personalization are based on a very naive assumption. An assumption that the data is all good. And the data is not all good. As we’ve seen, the data is mostly bad.
And the weirdest part is that we all know it’s bad. What percent of B2B marketers are confident in the accuracy of their own data? 12%, according to Forrester. And these are the same B2B marketers up on stage extolling the benefits of personalized creative messaging.
If I were a psychologist, I would call this “cognitive dissonance.”
Since I’m not a psychologist, I’m just going to call it “totally cray-cray.”
These days, whenever I hear an ad tech company talking about the power of their 3rd party B2B data, I start thinking about my favorite movie --- The Big Short. Did you see The Big Short? Do you remember that scene with the Jenga pieces, where Ryan Gosling explains the financial crisis? According to Ryan Gosling, one of the most gifted macro-economists on the Planet Earth, the 2008 crisis happened because someone took D-rated loans, bundled them together with Triple-A loans, and re-sold the whole package as Triple-A.
That is eerily similar to what’s happening in the marketing industry. We are taking fake data, bundling it together with real data, and selling it as big data. I’ve seen this movie before, and it doesn’t have a happy ending --- except for the contrarians, of course. Between GDPR, and Apple and Google wiping or deprecating cookies altogether, I think it’s hard to imagine that this story ends with a perfect, unified understanding of the customer and 1:1 personalization.
I don’t think this story ends with better 3rd party data.
It ends with no 3rd party data. It ends with a Subprime Data Crisis.
So how can you as a marketer limit your exposure to the Subprime Data Crisis? As I said, B2B marketing doesn’t work without data. We need data to reach the right B2B buyers. But we need to do a better job distinguishing between real data and fake data. How?
Well, here I think it’s instructive to study an entirely different discipline outside of marketing: supply chain management. Our summer intern Rachel made the connection here (thank you, Rachel). Rachel told us that decades ago, companies started paying more attention to their supply chains in search of efficiencies. And those companies discovered that the easiest solution was to focus on shortening their supply chains. The more complex the supply chain, the more opportunities there are for wastage. Big sums of money get lost to fraud or human error.
If marketers want better data, we need to work on shortening our supply chains.
Here are two steps you can take to shorten your supply chain and increase its efficiency:
Step 1) Choose direct data over indirect data. The more hands your data passes through, the more times one data set gets combined with another data set, the less accurate the data becomes. If your vendor can’t explain exactly where the data comes from in a single sentence, I’d consider that a real red flag. LinkedIn data, for example, comes from LinkedIn profiles, and you can go verify those profiles with your own eyes. It’s not a black box. It’s an open box.
Step 2) Choose direct payment over indirect payment. If you buy media on a billion sites, if you are paying someone who is paying someone who is paying someone, some enterprising criminal will find a way to take your money. Instead, buy directly from reputable publishers that you have heard of --- there are many, many such publishers out there.
Direct data, and direct payment. Two steps you can take to avert the Subprime Data Crisis.