A/B Testing, also known as split testing, is a scientific method of optimizing a website or app's performance by comparing two versions against each other to determine which one produces better results. The results measured are often cost metrics such as cost per click, cost per video view, cost per message sent, and cost per mille.
What Is A/B Testing?
At its core, A/B Testing is a scientific method used to compare two different versions of something, in this case a website, ad campaign, app, or business process, to determine which one produces the higher rate of success.
A/B Testing is essentially a research methodology – a randomized controlled trial, in scientific parlance – used to quantify the effect of an intervention on an experience. While many people think of A/B Testing as something you mainly do to landing pages, it can be used almost anywhere in a business to optimize existing experiences.
The goal is to find out which version performs better (version “A” or “B”) based on specific goals and metrics.
Why Is A/B Testing Important?
A/B Testing is an invaluable tool for any business that hopes to make smarter decisions based on user behavior. It helps marketers, product managers, and analysts identify areas that need improvement that will have the biggest effect on their bottom line.
Every decision you make includes some degree of uncertainty. In some cases, the right action is obvious. For example, if web developers fix a bug on their website, it's unlikely to harm the user experience. But for the majority of actions and ideas, there's no certainty about the outcome. Instead of relying on instinct, which is notoriously unreliable even among experts, marketers can run a simple, inexpensive A/B test to quantify which option will be most successful.
When A/B Testing is embedded in an organization’s DNA, companies tend to experience greater innovation since the risk of shipping bad ideas decreases. They also experience higher ROI, as negative ideas can be scrapped before being sent to 100% of their audience.
Finally, companies that experiment tend to have better working relationships and collaboration, since data, not political battles or opinions, is the deciding factor in shipping new experiences.
A/B Testing doesn't reduce or eliminate the need for human creativity; rather, it allows teams to focus on customer research, innovation, and ideation instead of spending effort persuading stakeholders that an idea is worth implementing.
With A/B Testing, teams can:
- Uncover insights that inform decisions and optimize current experiences.
- Improve user engagement, ad creative, website performance, and overall customer experience.
- Identify areas of improvement that will have the biggest impact on the bottom line.
- Test hypotheses quickly and cost-effectively.
- Increase ROI by finding what resonates most with customers.
- Reduce risk of shipping bad ideas by running experiments and trials.
The 4 Common Types of A/B Testing
There are four main types of A/B Testing:
1. A/A testing
2. A/B Testing
3. A/B/n testing
4. Multivariate testing
On a technical note, there are many different types of experiments beyond the four most commonly used.
For example, machine learning algorithms such as bandit algorithms and evolutionary algorithms adaptively learn which variant is winning and allocate traffic in real-time to the winner. Quasi-experiments and design of experiments allow you to test elements when you can’t completely isolate your variables or control your sample allocation perfectly.
A/A testing is used to measure the performance baseline, or “control” version, of a website or app. By splitting traffic 50/50 with an identical variant for both groups, marketers and product managers can better understand user behavior on the control version before making any changes. This process also helps identify variations in website traffic, as well as any randomization errors in experimentation software.
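One practical output of an A/A test is a check on the randomizer itself. As a minimal sketch (the visitor counts below are hypothetical), a chi-squared goodness-of-fit test can flag whether a supposedly 50/50 split is actually dividing traffic evenly:

```python
from scipy.stats import chisquare

# Observed visitors in each identical A/A group (hypothetical counts)
observed = [50_210, 49_790]
expected_total = sum(observed)
expected = [expected_total / 2, expected_total / 2]  # a 50/50 split is expected

stat, p_value = chisquare(observed, f_exp=expected)

# A very small p-value would suggest the tool is not splitting traffic
# 50/50 (a "sample ratio mismatch"), i.e. a randomization bug.
if p_value < 0.01:
    print("Possible randomization error: investigate the split.")
else:
    print("Split looks consistent with 50/50.")
```

Because both groups see the identical experience, any imbalance flagged here points to the tooling rather than to user behavior.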
A/B Testing compares two versions of a website, app, or other experience – A (the "original" or "control") and B (the "variant" or "challenger"). This allows marketers to measure user behavior on both versions and make decisions based on the results using statistics (typically a t-test or chi-squared test).
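As a minimal sketch of that analysis (the conversion counts below are hypothetical), a chi-squared test on a 2x2 contingency table compares the conversion rates of control and variant:

```python
from scipy.stats import chi2_contingency

# Hypothetical results: [converted, did not convert] per version
control = [320, 9_680]    # version A: 10,000 visitors, 3.2% conversion
variant = [410, 9_590]    # version B: 10,000 visitors, 4.1% conversion

stat, p_value, dof, expected = chi2_contingency([control, variant])

print(f"chi-squared = {stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Difference is statistically significant at the 5% level.")
```

With these made-up numbers the p-value falls well under 0.05, so the variant's lift would be treated as a real effect rather than noise.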
A/B/n tests split traffic equally among three or more variants. In such experiments, marketers can test more than one variant at once – for example, testing a landing page's CTA color across different hues. Instead of the standard A/B test with two variants, an A/B/n test could feature several, such as a comparison of blue vs. orange vs. green vs. black. This is an effective way to maximize testing potential and identify optimal conditions.
Multivariate tests, in comparison to A/B tests, test multiple variables of an experience all at once. For example, if a marketer wanted to assess a headline, CTA copy, and CTA color, they could construct individual A/B tests for each variable or design one experiment that compares the original version to variants of each element.
Multivariate experiments can be used to quantify interaction effects between elements. For instance, combining headline A with CTA color B may beat headline B in isolation, helping to craft an optimal user experience where no single element acts alone.
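To see why multivariate tests grow quickly, note that a full-factorial design enumerates every combination of the elements under test. A small sketch (the headline, copy, and color options are hypothetical):

```python
from itertools import product

# Hypothetical elements to combine in a multivariate test
headlines = ["Start your free trial", "See it in action"]
cta_copy = ["Sign up", "Get started"]
cta_colors = ["blue", "orange"]

# A full-factorial multivariate test runs every combination,
# which is what lets you measure interaction effects.
variants = list(product(headlines, cta_copy, cta_colors))
print(f"{len(variants)} combinations to test")  # 2 x 2 x 2 = 8
for headline, copy, color in variants:
    print(headline, "|", copy, "|", color)
```

Each added element multiplies the number of combinations, which is why multivariate tests need substantially more traffic than a simple A/B test.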
A/B Testing is commonly used to improve metrics, but it also serves other purposes - such as ensuring an implementation doesn't have a negative effect or verifying an element's relevance for customer experience.
How Do You Get Started With A/B Testing?
Prior to engaging in A/B Testing, website or product managers should determine their desired key performance indicator (KPI) and assess if their current traffic is sufficient. In many cases, websites don’t get enough traffic to run a valid and statistically significant experiment.
Because A/B Testing follows a scientific approach, it is subject to statistical constraints - for example, the law of large numbers. Using an A/B Testing calculator can help identify the effect size (percentage improvement) needed to reach statistical significance (generally determined by a p-value under 0.05).
Because large improvements are infrequent, teams will often need more traffic to detect the smaller gains that are typical. The more traffic a site receives, the easier it is to detect small wins, and thus the easier it is to run A/B tests.
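As an illustration, the standard two-proportion sample-size formula shows how detecting a small lift demands far more traffic than detecting a large one. A minimal sketch (the baseline rate, lifts, and function name are hypothetical, not from a specific calculator):

```python
from scipy.stats import norm

def sample_size_per_group(baseline, mde, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect a relative
    lift `mde` over conversion rate `baseline` (two-sided test)."""
    p1 = baseline
    p2 = baseline * (1 + mde)
    z_alpha = norm.ppf(1 - alpha / 2)   # significance threshold
    z_beta = norm.ppf(power)            # desired statistical power
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
          + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
         / (p2 - p1) ** 2)
    return int(n) + 1

# Detecting a 10% relative lift on a 3% baseline takes far more
# traffic per group than detecting a 50% lift.
print(sample_size_per_group(0.03, 0.10))
print(sample_size_per_group(0.03, 0.50))
```

Running this shows the 10% lift requires tens of thousands of visitors per group, while the 50% lift needs only a few thousand.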
If there is sufficient traffic, marketers need two things to start running informed A/B tests: technology and people.
To run an A/B test, it's necessary to have the right measurement and experimentation technology. Most ad tools and content management systems (CMS) have analytics that track users and conversions, while A/B Testing tools randomize traffic and help determine significance.
Many ad tools have built-in A/B Testing functionality. This is easy to do with the LinkedIn Campaign Manager, for example.
When testing website elements, a suitable CMS or A/B Testing tool is required - with many options available at varying price points.
Don’t underestimate the resources required to run an A/B test. Even basic A/B tests might need:
- A product manager / marketer to manage the test
- An analyst to measure the results
- A designer to design the creative
- A copywriter to write new copy
- An engineer to implement the variant in your A/B Testing tool
Simple changes like button color tests require fewer resources, but more complex tests, such as those involving product onboarding, may also require cross-functional alignment and additional engineering support.
Steps to Implement an A/B Test
Step 1 - Identifying an opportunity area
Marketers and marketing leaders should assess the KPI they're trying to improve.
In some cases, this is obvious; in others it requires customer research to identify user experience bottlenecks.
Quantitative analysis of funnels can help pinpoint any sub-optimal steps. For example: a SaaS marketer may notice many visitors convert to a free trial, but few go on to become paying customers.
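A funnel analysis like this can be sketched in a few lines. The step counts below are hypothetical, but the weakest step stands out immediately:

```python
# Hypothetical funnel counts for the SaaS example above
funnel = {
    "visited site": 100_000,
    "started free trial": 20_000,
    "became paying customer": 400,
}

steps = list(funnel.items())
rates = []
for (step, count), (next_step, next_count) in zip(steps, steps[1:]):
    rate = next_count / count
    rates.append(rate)
    print(f"{step} -> {next_step}: {rate:.1%} convert")
# Here the trial-to-paying step converts worst, so it is the
# most promising place to test improvements.
```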
Alternatively, qualitative research such as surveys can illuminate the specific issues that need addressing. For example: research may reveal free trial users didn't understand the product well enough because the onboarding sequence didn't answer their questions.
These insights may be used to form hypotheses for variants that could solve the problem.
Step 2 - Establishing a hypothesis
It is best practice to write a hypothesis before running an experiment. This helps define the variable to be changed, outlines a rationale for the treatment and predicts potential obstacles.
Include quantitative and qualitative data that backs up the reasoning for running your experiment.
A suitable format might be:
We believe X because Y.
For example, a hypothesis from the scenario where free trials aren't converting to paying customers might be: "We believe free trials aren't converting to customers because the product doesn't demonstrate how to use it early on."
Step 3 - Designing the experiment
The scope of the test will determine what resources you’ll need to design it. This might involve:
- Writing new copy
- Designing new elements
- Developing new functionality
Multiple team members may be involved for the test to be run successfully.
Processes will vary depending on the idea being proposed, but the first step is often creating a wireframe – a rough mockup of the new experience. Once the design is approved, it's coded into the testing tool.
Step 4 - Implementing the test
After designing an experiment and completing quality assurance, it’s time to launch the test; 50% of the traffic should go to the original and 50% to the variant, aiming for 95% statistical confidence.
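The 50/50 split itself is usually handled by the testing tool, and a common technique is deterministic hash-based bucketing, so the same user always sees the same version on every visit. A minimal sketch (the function, user ID, and experiment names are illustrative):

```python
import hashlib

def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically bucket a user 50/50 by hashing their ID,
    so the same user always gets the same assignment."""
    key = f"{experiment}:{user_id}".encode()
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
    return "control" if bucket < 50 else "variant"

print(assign_variant("user-123", "cta-color-test"))
```

Keying the hash on both the experiment name and the user ID means each new experiment reshuffles users independently of previous tests.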
A good analyst can help adjust these thresholds according to the risk the business can tolerate. Within the first few hours or days after launch, monitor the test for tracking bugs or other issues.
Step 5 - Analyzing the results
Analyzing the results of an experiment depends on its design.
Most A/B testing tools include a statistical engine to help with analysis.
If the variant wins, ship it live to the full audience. If it loses or the result is neutral, go back and consider a different design or treatment.