Thomas Green here with Ethical Marketing Service. Today I would like to go through split testing with you.
Previously I have talked about what to write in your ads, and the formula for the copy in your landing pages. But if you write two ads or create two landing pages, how do you know which one to use? Ideally you would want to remove any bias, guesswork or opinion from the equation.
The answer is you would split test them and see which one performs best. This is also known as an A/B test, or if you had 3 ads you wanted to test, an A/B/C test (and so on).
In a split test, you would typically change just one variable. This is called incremental testing. An example might be ad copy which says “Call Now” vs ad copy which says “Call Now For Your Free Audit”. When you test this ad copy, you are determining which of the two gets the higher response, based on the data that is generated.
But what happens if you test more than one thing at a time?
There is a feature in Google Ads called a responsive display ad, where you can test many images, videos, headlines and descriptions at once. This is referred to as a multivariate test.
Because you are not testing increments like in my previous example, you may not know why this particular combination works best. But you do know that it outperformed the other combinations tested. Based on the results from your multivariate test, you now have a new control (or winner) that you can experiment against.
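To give a feel for why a multivariate test makes it hard to say which element did the work, here is a rough sketch in Python that counts the possible combinations. The headlines, descriptions and image names are made-up placeholders, and this is not how Google Ads picks combinations internally. It only illustrates how quickly the number of combinations grows.

```python
from itertools import product

# Hypothetical assets, for illustration only
headlines = ["Call Now", "Call Now For Your Free Audit"]
descriptions = ["Description A", "Description B", "Description C"]
images = ["image_1", "image_2"]

# Every possible pairing of one headline, one description and one image
combinations = list(product(headlines, descriptions, images))

print(len(combinations))  # 2 * 3 * 2 = 12 combinations
```

With only a handful of assets there are already 12 variations, and each winning combination differs from the others in more than one way at once.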
In order for you to determine if you have a “winner”, you would need the results to be statistically significant. Google’s definition of this is:
“The probability that the differences observed in ad performance during an ad copy test, are not due to random variance. The ads in the test truly differ in quality.”
Because it is Google, they mention ads, which is accurate, but testing extends to many areas of business. Whenever you are testing, if you don’t have a big enough sample size, the chances of drawing meaningful conclusions are significantly reduced. You often see this when people quote a particular study as proof of something, but when you look at the numbers within the study, only 11 people took part. There is no way you could use a sample that small to draw any conclusions from the data. So be sure your sample size is big enough if you are going to use it to make any changes in your account.
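As a rough illustration of why sample size matters, here is a minimal sketch of one common way to check significance, a two-proportion z-test. This is an assumption on my part, not the calculation Google uses, and the function name and the conversion numbers are invented for the example. Both tests below compare a 4% response rate against a 6% response rate; only the sample size changes.

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the conversion rate if both ads performed the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Made-up numbers: 4% vs 6% with 50 users per ad
small_p = two_proportion_p_value(2, 50, 3, 50)
# Same rates with 5,000 users per ad
large_p = two_proportion_p_value(200, 5000, 300, 5000)

print(f"small sample p-value: {small_p:.3f}")
print(f"large sample p-value: {large_p:.6f}")
```

A p-value below 0.05 is a commonly used threshold for calling a result significant. With the small sample the difference could easily be random variance; with the large sample the same difference is very unlikely to be chance. Bear in mind this approximation is itself unreliable when the counts are very small.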
So what would happen if you ran a test and you weren’t 100% convinced of the results you got?
In this instance, you would attempt to reproduce your results. If you ran the test again and got near enough the same result, it would be much more conclusive that the first result was an accurate one. This is also a principle used in science: as soon as a conclusion is reached with a test, it needs to be reproduced many more times before it can be considered factual. I have spoken previously about natural fluctuation in your data, and if you want to know more about that, I will link that video in the description.
Typically Google’s experiments are set to 30 days, but timeframe is less important than sample size. The more users who are exposed to an experiment, the more reliable it is. So if you want to be sure about the results of a test, run it for longer than the typical timeframe so you can include more people.
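If you want to estimate how long a test needs to run to reach a given sample size, a back-of-the-envelope sketch might look like this. The function name, the target of 5,000 users per variant and the 200 users a day are all hypothetical, and it assumes traffic is split evenly between the variants.

```python
import math

def days_needed(required_users_per_variant, daily_users, variants=2):
    """Days to run a test so each variant reaches the required sample size.

    Assumes daily traffic is shared evenly between the variants.
    """
    return math.ceil(required_users_per_variant * variants / daily_users)

# Hypothetical account: 200 users a day, aiming for 5,000 users per ad
print(days_needed(5000, 200))  # 50 days, longer than a 30-day experiment
```

In this made-up case a 30-day experiment would stop well short of the target, which is the situation where extending the test makes sense.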
A very useful activity when you have finished your test is calculating its yearly impact. Say your finished test has resulted in an extra £1,000 revenue over the month it ran. That means the yearly impact is projected at £12,000 additional revenue from your one test. These numbers start looking a lot higher with bigger businesses. And the more you calculate the yearly impact, the more clearly you can see how worthwhile your work is.
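The yearly impact arithmetic can be sketched in a couple of lines. This assumes the extra revenue was earned over a test lasting about a month and that the uplift simply continues at the same rate, which is a projection rather than a guarantee. The function name is just for illustration.

```python
def yearly_impact(extra_revenue, test_length_months=1):
    """Project extra revenue from a test over 12 months (simple linear projection)."""
    return extra_revenue / test_length_months * 12

# The example from above: £1,000 extra from a one-month test
print(f"£{yearly_impact(1000):,.0f}")  # £12,000
```

If your test ran for two months instead, you would pass test_length_months=2 so the projection is based on the monthly uplift rather than the total.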