Statistical Significance: Landing Page Testing and Ad Testing

I was browsing search marketing news articles today, and after reading about several recent studies, I thought about how easy it is to get lost in the data. As search marketers, we are (and if you are not, you should be) continuously testing. We test ad copy, landing pages, keywords, etc. With so many tests running, it is easy to get lost in the numbers and to make decisions based on incomplete data. There are two fundamental mistakes to look out for when evaluating your experiment data.

The first is mistaking correlation for causation. For example, it is very easy to make an ad change, or target new keywords, and then make a decision based on the results that you see. However, it is important to remember that many external factors can affect performance (e.g. seasonality, site changes or updates, other advertising initiatives running at the same time, etc.). This is why it is important to use standard scientific testing practice, where you have a control group. AdWords makes this very easy to do.

AdWords Experiments Beta
This particular experiment didn’t have a positive result.

AdWords allows you to easily set up a control group, so that external factors affect both groups alike and are far less likely to bias the comparison.
The second mistake is not waiting for experiment results to be statistically significant. It’s easy to launch a new ad and, if its click-through rate (CTR) is higher than the original ad's after a few days, pause the original. However, were there enough impressions and enough clicks to show that the CTR increase was statistically significant? Again, in the above image, the blue down arrow next to the CTR tells us that the decrease was statistically significant. Another scenario where it is easy to make a decision too early is landing page testing.
Website Optimizer Test
If the decision had been made in May, the conversion rate improvement would likely have been overstated.
Had the decision been made too early, we might have made the wrong one, because in the first week the conversion rate improvement was huge. However, as you can see, over the long term there is no statistically significant winner. In a test case like this, you will need to make a judgement call, even without a final result, because it is so close. If the margin of error were much higher, it would be best to pause the experiment and run a follow-up to verify the hypothesis.

In conclusion, make sure you are properly evaluating your test data, so that you avoid making the wrong decisions.
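
If you want to run the significance check yourself, here is a minimal sketch in Python of a two-proportion z-test, a standard way to compare two rates such as CTRs or conversion rates. It is not necessarily the method AdWords or Website Optimizer uses internally. The function names, the 0.05 threshold, and all of the sample numbers are assumptions made up for illustration.

```python
import math


def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two-sided p-value) for the difference between two rates.

    "Successes" are clicks or conversions; "trials" are impressions or visits.
    Uses the pooled normal approximation, which is reasonable only when each
    group has a fair number of successes and failures.
    """
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    if std_err == 0:
        # Both groups have identical all-or-nothing rates: no evidence of a difference.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def is_significant(successes_a, trials_a, successes_b, trials_b, alpha=0.05):
    """True if the difference is significant at the (assumed) alpha level."""
    _, p_value = two_proportion_z_test(successes_a, trials_a, successes_b, trials_b)
    return p_value < alpha


# Hypothetical first week: the variation converts at 10% vs. 5% for the
# original, but on only 200 visits each.
early = is_significant(10, 200, 20, 200)

# Hypothetical long-term totals: 5.0% vs. 5.3% on 5,000 visits each.
late = is_significant(250, 5000, 265, 5000)

print("Early result significant?", early)
print("Long-term result significant?", late)
```

With these made-up numbers, the early result looks like a doubling of the conversion rate, yet the sample is too small for the test to call it significant at the 0.05 level. That is the trap described above: a large early lift on little traffic is weak evidence.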



