Incrementality testing: more budget doesn’t guarantee more learnings
While testing has always been critical to a well-run paid media program, the nature of that testing has changed dramatically over the past few years. With the deprecation of third-party cookies and the limited reliability of traditional click-based tracking, the focus has shifted from testing specific ads or landing pages with simple statistical tests to testing entire initiatives and evaluating their causal impact with more sophisticated methods.
One of the more advanced methods we use at Blackbird PPC is location-based incrementality testing: we measure the impact of the ads we manage by comparing locations where we run ads against locations where we don't. The beauty of this test design is that it does not rely on click-based tracking and therefore does not suffer from its limitations. Instead, we look for a lift in activity in the test locations using backend data, typically from Salesforce, Shopify, HubSpot, or Looker.
This is great, but how do you determine which locations should be part of the test group, and how much budget should be allocated to such an experiment? And does more test budget guarantee stronger learnings?
Which locations should be part of the test group?
You should always select test locations that are statistically similar to the control locations going in, so that any effect you later detect can be attributed to your intervention rather than to pre-existing differences. The example below, using Google's CausalImpact R library, shows that recent data looks alike across groups (a minimal placebo-check sketch follows the figures).
Fig. 1: State-level split example
Fig. 2: No statistical difference across groups going in
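To make the idea concrete, here is a minimal sketch of that kind of pre-launch check using Google's CausalImpact R library. The daily revenue series are simulated here so the example is self-contained; in practice they would be pulled from your backend data, aggregated by date for each group. A placebo (A/A) run places a fake intervention date inside the historical window: if the test and control groups are truly similar going in, the reported effect should be indistinguishable from zero.

```r
library(CausalImpact)
library(zoo)

# Simulated daily revenue per group; in practice, aggregate these from your
# backend (Salesforce, Shopify, etc.) by date and location group.
set.seed(42)
n     <- 100
dates <- seq.Date(as.Date("2024-01-01"), by = "day", length.out = n)
control_revenue <- 100 + cumsum(rnorm(n))                    # control series
test_revenue    <- 0.95 * control_revenue + rnorm(n, sd = 2) # similar test series

data <- zoo(cbind(test = test_revenue, control = control_revenue), dates)

# Placebo (A/A) check: pretend an intervention happened on a date where
# nothing actually changed. A well-matched split should show no effect.
placebo_pre  <- as.Date(c("2024-01-01", "2024-03-10"))
placebo_post <- as.Date(c("2024-03-11", "2024-04-09"))

impact <- CausalImpact(data, placebo_pre, placebo_post)
summary(impact)  # expect a posterior interval on the effect that covers zero
plot(impact)
```

If this placebo run reports a significant effect, the split is not well matched and should be revised before any budget is spent.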
Does more test budget guarantee stronger learnings?
In general, yes: more test budget does correlate with more statistical power. Not always, though!
As you can see in the example below for a major US retailer, more test budget is generally associated with greater statistical power; however, some test groups achieve high statistical power while keeping test budget to a minimum. Looking at the $60,000-$70,000 range, you'll notice that statistical power ranges all the way from 69% up to 82%, depending on the selection of test locations (one way to simulate such power estimates is sketched after the figure).
Fig. 3: Statistical power vs. cost scatter plot example
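One way to produce a power-versus-cost curve like the one above is Monte Carlo simulation: for each candidate test group, inject the lift a given budget would be expected to buy into historical data, run the measurement model, and count how often the effect is detected. The sketch below illustrates the idea for a single candidate group; the function name, the assumed incremental ROAS, and the noise model are all hypothetical choices for illustration, not the exact method behind the retailer analysis above.

```r
library(CausalImpact)

# Hypothetical Monte Carlo power estimate for one candidate test group.
# budget is converted into an expected daily lift via an assumed iROAS;
# power is the share of simulations in which CausalImpact detects the lift.
estimate_power <- function(test_series, control_series, budget,
                           assumed_iroas = 2, pre_days = 70, post_days = 30,
                           n_sims = 50, alpha = 0.05) {
  lift_per_day <- budget * assumed_iroas / post_days
  post_idx     <- (pre_days + 1):(pre_days + post_days)
  detections   <- 0
  for (i in seq_len(n_sims)) {
    y <- test_series
    # Inject the assumed incremental revenue (with noise) into the post period
    y[post_idx] <- y[post_idx] + rnorm(post_days, lift_per_day,
                                       0.25 * lift_per_day)
    impact <- CausalImpact(cbind(y = y, x = control_series),
                           pre.period  = c(1, pre_days),
                           post.period = c(pre_days + 1, pre_days + post_days),
                           alpha = alpha)
    ci <- impact$summary["Cumulative", c("AbsEffect.lower", "AbsEffect.upper")]
    # Count a detection when the credible interval excludes zero
    if (ci[[1]] > 0 || ci[[2]] < 0) detections <- detections + 1
  }
  detections / n_sims
}

# Illustrative call on simulated series (as in the placebo-check sketch above).
# Repeating this across candidate groups and budgets yields a scatter plot
# like Fig. 3.
set.seed(42)
n <- 100
control_revenue <- 100 + cumsum(rnorm(n))
test_revenue    <- 0.95 * control_revenue + rnorm(n, sd = 2)
estimate_power(test_revenue, control_revenue, budget = 600)
```

Running this across many candidate splits is what surfaces the well-matched, low-budget groups that punch above their cost.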
This means that strong statistical learnings are not simply a function of test budget. Strong learnings, that is, results that can be trusted and built upon, require solid test design.
At Blackbird PPC, we run a lot of location-based incrementality experiments, so feel free to reach out if you’re interested in learning more.