How to Determine Sample Size in A/B Testing
Marketing experimentation (also known as A/B testing or split-testing) should be a cornerstone of your digital marketing efforts today. Determining the right sample size is crucial to ensure you are making decisions based on statistically valid results.
Effective A/B testing requires that you present your marketing experiments to enough people (a.k.a. your sample size) to validate your results. So how do you go about calculating sample size in order to get statistically significant results? It depends on four key factors we’ve outlined below.
There Is No Generic Answer to Determine Sample Size
Before we get into exactly how you can determine sample size for A/B testing, it’s important to clarify one thing. There is no generic answer to what you’re looking for. We can’t just give you a number and tell you to stick to it whatever your experiment is. Even if we do give you a fixed number, there’s a pretty high chance that the data you gain from it will hold no value whatsoever.
In order to determine sample size, you need to get your hands dirty and work with some math. There is a technical formula that allows you to figure out the exact size of the population necessary to validate your testing. The formula used to calculate this size depends on four factors.
Conversion Rate
You can calculate your conversion rate by dividing the number of people who make a purchase by the number of people to whom you presented your product or service. It's one of the most important metrics in digital marketing and the baseline for measuring your marketing funnel performance.
Conversion rates can be calculated to give you a sense of your overall visitor-to-customer ratio. For example, if 2 out of every 100 website visitors make a purchase, your conversion rate is 2%. Conversion rates can also be broken down into smaller bite-sized actions within your marketing funnel. For example, if 50 out of 100 website visitors click the Add to Cart button, that is a 50% conversion. In this case you'd want to focus your A/B experiments on these in-between parts of the funnel. In other words, test your Cart page to see if you can increase the number of people who complete a full checkout after adding to cart; improving conversion at these in-between steps increases your overall conversion rate.
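The two examples above can be sketched in a few lines of Python (the function name and numbers are just for illustration):

```python
def conversion_rate(conversions, visitors):
    """Share of visitors who completed the action, as a percentage."""
    return 100 * conversions / visitors

# Overall funnel: 2 purchases out of 100 visitors -> 2.0%
overall = conversion_rate(2, 100)

# In-between step: 50 Add-to-Cart clicks out of 100 visitors -> 50.0%
add_to_cart = conversion_rate(50, 100)
```

The same function works for any step of the funnel; only the numerator and denominator change.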
Minimum Detectable Effect
The Minimum Detectable Effect, or simply MDE, is the smallest improvement over your baseline conversion rate that you want your test to be able to detect. Every organization expects some type of improvement from the A/B testing it's about to run, and the MDE expresses that expectation as a percentage. If you're looking to improve your conversion rate by 20% from where it is right now, 20% is your MDE.
Statistical Significance
Statistical significance is a statistical measure of how likely it is that the difference you observe between variations is real rather than the product of random chance. It basically helps you understand whether you can depend on the results of an A/B test.
For instance, if a test reaches a significance level of 90%, the variation you select after analyzing the results has a 90% probability of reflecting a real difference rather than random noise. In other words, there is still a 10% chance that the choice you're making is the wrong one. Statistical significance is effectively a measure of reliability, and it can make or break the importance of your A/B testing results (read more about statistical significance in our blog post here).
Statistical Power
Statistical power is another term frequently used in the world of statistics. It is the probability that your test will detect a real effect when one exists. Statistical power doesn't tell you whether a result is right or wrong; it tells you how likely the test is to catch a genuine difference between the two variations in your sample pool rather than miss it.
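As a sketch, the four factors above combine in the standard two-proportion sample-size approximation. This is the textbook formula, not necessarily what any particular calculator uses, and the function name and defaults are our own:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, mde_relative,
                            significance=0.95, power=0.80):
    """Approximate visitors needed per variant (two-sided test).

    baseline_rate: current conversion rate, e.g. 0.02 for 2%
    mde_relative:  relative lift to detect, e.g. 0.20 for +20%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + mde_relative)
    delta = p2 - p1                 # absolute effect size
    p_bar = (p1 + p2) / 2           # average of the two rates
    # z-scores for the chosen significance level and power
    z_alpha = NormalDist().inv_cdf(1 - (1 - significance) / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * p_bar * (1 - p_bar) * (z_alpha + z_beta) ** 2 / delta ** 2
    return ceil(n)

# A 2% baseline with a 20% MDE needs roughly 21,000 visitors per variant
n = sample_size_per_variant(0.02, 0.20)
```

Notice how the sample size shrinks as the MDE grows: large improvements are easier to detect, so small sites should test bold changes rather than tiny tweaks.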
Duration of Your A/B Testing
Similar to your sample size, there's no solid answer to how long your experiment should run. It depends on many of the same factors as the sample size formula we talked about above, plus a few extra values: the average number of daily visitors to your site and the percentage of visitors that are going to be a part of the test. To calculate this number, you can use our duration calculator to conveniently come up with a time span for your A/B testing.
Best Sample Size Calculators
If you’re looking for sample size calculators to take care of the math for you, there are more than a few decent options. Here are a couple of sample size calculators that we like to use.
AB Tasty: AB Tasty is probably the most popular A/B testing calculator on the internet. It not only provides the resources to calculate the right sample size, but also offers a few more neat tools to assist your experimental ventures. You can calculate the MDE for a landing page by using the number of visitors and the conversion rate of your site. Apart from that, it also gives an insight into how long your test should run. All you need to provide is an estimate of your daily visitors and the number of variations. Check out AB Tasty's sample size calculator here.
Optimizely: Optimizely's A/B Test Sample Size Calculator uses a two-tailed sequential likelihood ratio test to calculate and reveal the statistical significance of the test. Unlike AB Tasty, it doesn't just tell you the results; it also tells you how long the test should run to reach a higher level of statistical significance. Check out Optimizely's sample size calculator here.
Because small businesses have less traffic, a failed test has less of an impact, so they can experiment more freely, learning through trial and error what works and what doesn't. Large conglomerates don't have that luxury: either they make the right choice, or they lose millions of dollars.
Your marketing isn't something you can afford to put at risk. You have to make sure that whatever decision you make is the right one. If you want the help of a professional to boost your marketing, Digital Dames is here to support you. We're a conversion rate optimization agency focused on improving revenue through experimentation.