Introduction

Statistics and A/B testing are among the most fundamental topics covered in data science and analytics interviews. Today we’ll go into how to approach, solve, and prepare for different statistics and A/B testing interview questions.

Why are statistics and A/B testing important in data science interviews?

If you’re applying for a data scientist position at almost any company, A/B testing and statistics questions are bound to come up in the interview process at least once.

Most companies hiring for data scientist roles use the internet to reach at least part of their customer base. In some cases, such as social media companies, the internet is the primary or sole point of contact with the customer.

Since A/B testing is frequently involved in the decision-making process for any design changes in an online interface, companies want to make sure that the data scientist they’re hiring can act as a key decision-maker throughout the A/B testing process.

Many companies expect familiarity with the core concepts of A/B testing to be accompanied by a complementary background in statistics. They want to know that you have the statistical chops to handle things like multivariate testing, or comparing two population samples of unequal size, or any number of issues that can crop up as the basic A/B testing process becomes complicated by the real world.

What is A/B Testing?

A/B testing, also known as “bucket testing” or “split testing,” comes from a user experience research methodology. A/B testing consists of a randomized experiment with two variants named (as you might expect) A and B. The two variants are usually similar except for one feature difference that allows us to test user behavior.

The goal of A/B testing is to figure out whether that variation has a significant impact on user behavior and how significant the impact is.

Modern A/B testing is a relatively new phenomenon. Although the discipline of statistics has a long history, it simply wasn’t possible to A/B test on a large scale with accurate data collection until the Internet provided a structure for the rapid (and mostly unseen) collection of data. That means that A/B testing is still a field in its infancy. Over time we will test the limits of what’s possible in terms of data collection and testing methodology.

How much statistics do I need to know?

Well, this is a bit of a trickier question. The quick and easy answer is: it depends.

Our data shows that statistics and A/B testing interview questions came up frequently in interviews at Facebook, Google, Amazon, Microsoft, Capital One, and IBM.

Let's go into which roles have statistics and A/B testing questions showing up in the interview process.

Data Scientist

At Interview Query, we broke down all the data scientist interviews into different question types.

Facebook Data Science Statistics / AB Testing Stats
Facebook Data Scientist Overview

As you can see, Facebook tests its data scientist candidates on statistics less often than the average company does. However, we know that at least one round of the onsite includes either a statistics or a probability interview question.

On the other hand, look at the interview breakdown for the data scientist position at Google:

Google Data Scientist Statistics / AB Testing Emphasis
Google Data Scientist Overview

That’s a lot of emphasis on A/B testing and statistics. And if we look at the priorities for the average company, we find that A/B testing is emphasized more often than not.

As for statistics, well, companies want to know that the data scientists they hire are well versed in statistical concepts so that when there’s a hiccup in the A/B testing process, their hire has a sizable toolkit with which to troubleshoot the issue. Was there a large enough sample size to A/B test successfully? If not, by how much did we miss the mark? Are the results of our A/B test statistically significant? How significant are they?

You can see how much overlap there is between the domains of A/B testing and statistics. Employers just want to make sure that the data scientist they hire is the complete package, not someone over-optimized in one direction. They want to feel like they’re in good hands when it comes to the complex, nitty-gritty parts of the A/B test where a layperson may find themselves completely out of their depth.

Data Science Interview Skillset

That might be why our 2021 Data Science Interview Report found that statistics ranked in the top three most frequently asked categories of questions in the data science interview, making it a must-have in your data science repertoire.

Types of A/B Testing / Statistics Interview Questions

In general, the sorts of questions you’ll encounter on the subjects of A/B Testing and Statistics can be separated into three categories:

The A/B Testing Case Study Question

This type of A/B testing interview question revolves around a hypothetical A/B testing scenario. The goal is to evaluate candidates based on their practical working knowledge of how to design a functional A/B test. You may be given a specific set of two (or more) features that a business wants to compare in an A/B test, then asked how you would go about setting up the test, account for confounding variables, and measure the significance of your results.

Example Question:

A team wants to A/B test multiple different changes through a sign-up funnel.
For example, on a page, a button is currently red and at the top of the page. They want to see if changing a button from red to blue and/or from the top of the page to the bottom of the page will increase click-through.
How would you set up this test?

Here’s a hint:

The prompt gives two examples for possible changes to the button: changing the color and/or moving its location.
Given that this is no longer a question about singular changes, what decisions do you need to make about the type of test you’re running?
Think about the fact that you’re testing two independent variables– what do you need to account for between them in your test?
Try this interview question on your own on Interview Query.

We want to approach case study questions like this with a clear understanding of the process of A/B test experiment design. We want to take into account all of the different factors we should consider when designing an A/B test. The more that we can demonstrate a thorough understanding of the scope of the problem, the more attractive we seem as a candidate. For example, we might consider:

  • What does our sample population look like?
  • Have we taken steps to ensure that our control and test groups are truly randomized?
  • Is this a multivariate A/B test? If so, how does that affect the significance of our results?

In general, we can follow the process of A/B test experiment design:

  • Setting Metrics
  • Constructing Thresholds
  • Sample Size and Experiment Length
  • Randomization and Assignment

Let’s walk through the process.

Setting Metrics

The first question you’ll ask when designing an A/B test is: what am I trying to measure?

A good metric is simple, directly related to the goal at hand, and quantifiable. Every experiment should have one key metric. That is, there should be one thing that you’re measuring that determines whether the experiment was a success or not.

Constructing Thresholds

In this step of experiment design, you determine how much your key metric must change for the experiment to be considered successful, often called the minimum detectable effect. Typically, the significance level of an experiment is 0.05 and the power is 0.8, but these values may change depending on how much confidence you need before implementing the design change, which can depend on external factors such as the time needed to implement the change once the decision has been made.

For instance, if a change that you’re making on a website would take four months for engineers to implement, you may choose a stricter (lower) significance level, such as 0.01, or require a larger effect to justify making the change, since so much time and labor is involved in the process.

Sample Size and Experiment Length

This step in the process concerns two things: how large of a group are we going to test on and for how long?

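You’ll set a unit of randomization, most commonly users (though it could also be sessions or page views), and then calculate a sample size that is large enough both to represent the larger community in which the eventual change will take place (or not) and to detect the change you care about at your chosen significance level and power.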
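To make the sample size step concrete, here is a minimal sketch of one common approximation: the normal-approximation formula for a two-sided test comparing two proportions. The function name, the 10% baseline rate, and the 2-point minimum detectable effect are illustrative assumptions, not figures from any real experiment.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_effect,
                          alpha=0.05, power=0.8):
    """Approximate users needed in EACH group for a two-sided
    two-proportion z-test (normal approximation)."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    # Critical values for the chosen significance level and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the two groups.
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical scenario: 10% baseline click-through, and we care about
# detecting a lift to 12%.
n_default = sample_size_per_group(0.10, 0.02)
n_strict = sample_size_per_group(0.10, 0.02, alpha=0.01)
print(n_default, n_strict)
```

Notice that tightening the significance level from 0.05 to 0.01 raises the required sample size, and that halving the minimum detectable effect roughly quadruples it. Both are worth mentioning out loud in an interview.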

Finally, you’ll set your experiment length, which is a function of sample size, since you’ll need enough time to run your experiment on X users per day until you reach your total sample size. However, time introduces variance into an A/B test; there may be factors present one week that aren’t present in another, like holidays, or weekdays vs. weekends.

The rule of thumb is to run your experiment for about two weeks, provided you can reach your required sample size in that time.
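As a rough sketch of how sample size and traffic translate into a run time, the helper below divides the total required sample by daily eligible traffic, enforces a two-week minimum, and rounds up to whole weeks so that each day of the week is covered equally. The function name and the traffic numbers are hypothetical, and rounding to whole weeks is our own assumption about how to handle the weekday vs. weekend effect.

```python
import math


def experiment_length_days(required_per_group, n_groups,
                           eligible_users_per_day, min_days=14):
    """Days needed to reach the total sample, rounded up to full weeks."""
    total_needed = required_per_group * n_groups
    days_for_sample = math.ceil(total_needed / eligible_users_per_day)
    # Never run shorter than the rule-of-thumb minimum.
    days = max(days_for_sample, min_days)
    # Round up to whole weeks so every weekday is represented equally.
    return math.ceil(days / 7) * 7


# Hypothetical numbers: 4,000 users per group, two groups.
print(experiment_length_days(4000, 2, eligible_users_per_day=1000))
print(experiment_length_days(4000, 2, eligible_users_per_day=400))
```

With plenty of traffic, the two-week floor is what determines the length. With less traffic, the sample size requirement takes over and pushes the experiment to three weeks.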

Randomization and Assignment

Finally, we’re going to answer the last questions we’ll face when implementing an A/B test (aside from the logistics of implementation): who gets which version of the test and when?

In any A/B test, we need at least one control group and one variant group. As the number of variants we’re testing increases, the number of groups that we need increases, too, which is something to keep in mind for multivariate testing.
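To see how quickly the number of groups grows, here is a small sketch that enumerates every cell of a full factorial design. The factor names come from the button example earlier in this article. Treating the first level of each factor as the current (control) experience is an assumption made for illustration.

```python
from itertools import product

# Each factor lists its levels, with the current experience first.
factors = {
    "color": ["red", "blue"],
    "position": ["top", "bottom"],
}


def factorial_cells(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = factorial_cells(factors)
for index, cell in enumerate(cells):
    label = "control" if index == 0 else f"variant {index}"
    print(label, cell)
```

Two factors with two levels each already require four groups, and the count multiplies with every additional factor. Since each group needs its own adequately sized sample, this is the main practical cost of multivariate testing.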

We need to make sure that users are assigned to groups at random, so that each group contains a similar, representative mix of user attributes. If we don’t randomize sufficiently, we may find ourselves faced with confounding variables further down the line.
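One common way to randomize in practice is to hash a stable user identifier, which keeps each user in the same group on every visit without storing an assignment table. The sketch below is an illustration under our own assumptions: the function name, the experiment name, and the choice of SHA-256 are not taken from any particular company’s system.

```python
import hashlib
from collections import Counter


def assign_variant(user_id, experiment_name,
                   variants=("control", "treatment")):
    """Deterministically map a user to a variant via hashing."""
    # Salting with the experiment name keeps assignments independent
    # across different experiments.
    key = f"{experiment_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Sanity check on hypothetical user IDs: the split should be close to even.
counts = Counter(
    assign_variant(user_id, "button_color_test") for user_id in range(10_000)
)
print(counts)
```

Because the assignment depends only on the user ID and the experiment name, it is unrelated to user attributes such as tenure or device, which is what protects the comparison from confounding.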

It also matters exactly when we’re giving our users an A/B test.

For instance, are we giving every new user an A/B test? How will that affect our assessment of existing users? Conversely, if we’re assigning an A/B test to all users, and some of those users signed up for the website this week, and others have been around for much longer, are we sure that the ratio of new users to existing users is representative of the larger population of the site?

Finally, we ideally want our control group and our variant group to be of roughly equal size, which maximizes statistical power for a given total sample and keeps the comparison at the end of the test simple.

Okay, so, now you know the basic form of the A/B testing case study, but what now? It can be exceptionally difficult to study for this type of question because there are almost infinite variations of the basic problem that a company can throw at you. You could refresh the fundamentals continuously, but will that prepare you for the sorts of edge cases that companies love to throw at their prospective hires?

On Interview Query, we offer you the opportunity to practice real interview questions from real companies on subjects like A/B testing and statistics. You know that what you practice on Interview Query is relevant experience for your next interview because every single question on the site has been sourced from an actual interview at companies like Facebook, Google, Amazon, and more.

Statistical Concepts Interview Questions

The next type of question that you’re likely to encounter during the Statistics and A/B testing interview process is the statistical concepts interview question. This type of question is testing two things:

  • Your conceptual knowledge of statistics
  • Your ability to communicate statistical information to a layperson

As a data scientist, you need to be able both to perform complex analyses on large amounts of data and to communicate your findings to a number of external stakeholders. Questions like these weed out candidates who don’t have a firm technical grasp of statistics, as well as candidates who grasp statistics but struggle to communicate their findings to others.

You might be asked to describe the difference between Type I and Type II errors (the first is a false positive; the second is a false negative) and how you go about detecting them, or describe what a result with a significance level of 0.05 actually means.

Example Question:

How would you explain what a p-value is to someone who is not technical?

Here’s a hint:

P-values and confidence intervals are both concepts that come from statistics.
First, why does this kind of question matter?
What an interviewer is looking for here is whether you can answer in a way that both conveys your understanding of statistics and makes sense to a non-technical colleague who doesn’t understand why a p-value might matter.
For example, if you were a data scientist and explained to a PM that the ad campaign test has a p-value of 0.08, why should the PM care about this number?
Try this interview question for yourself on Interview Query.

In general, if you’re familiar with the statistical concepts that are likely to be relevant to your job as a data scientist, this part of the interview should be pretty straightforward. Remember to keep your explanations simple, direct, and to-the-point.
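It can help to have a concrete calculation in mind when explaining a p-value. The sketch below computes a two-sided p-value for the difference between two conversion rates using a pooled two-proportion z-test, which is one standard choice among several. The function name and the conversion counts are made up for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test.

    Group sizes n_a and n_b do not need to be equal.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both groups share one conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical test: 200 of 2,000 control users convert vs. 240 of 2,000.
z, p = two_proportion_p_value(200, 2000, 240, 2000)
print(f"z = {z:.2f}, p-value = {p:.3f}")
```

The plain-language reading is: if the change truly made no difference, we would see a gap at least this large only about that fraction of the time by chance alone. It is not the probability that the change works.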

If you need to practice this type of problem, there’s no need to crack open your old statistics textbook. Instead, you can head to Interview Query, where we have hundreds of interview questions, ranging in difficulty from Easy to Hard, on subjects like statistics, probability, and more. Every interview question on Interview Query has already appeared in an interview somewhere, so you know that you’re not wasting your time covering material that won’t appear during any interview at any company.

Statistics and Probability Problems

The last type of Statistics & A/B Testing question asks you to solve statistics problems ranging in complexity from very simple to very difficult.

These questions are simply meant to assess your core capabilities as a statistician and tend to be pretty straightforward with a definite right or wrong answer. You might be asked to compute mean and variance in a non-normal distribution, or to compute the relationship between a sample size and its margin of error. There’s no limit to the number of questions or their variety.
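As an example of the sample size and margin of error relationship, here is a minimal sketch for a sample proportion, using the usual normal approximation. The function name and the inputs are illustrative assumptions.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval
    for a sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


moe_1000 = margin_of_error(0.5, 1000)
moe_4000 = margin_of_error(0.5, 4000)
print(f"n=1000: +/-{moe_1000:.3f}, n=4000: +/-{moe_4000:.3f}")
```

The margin of error shrinks with the square root of the sample size, so quadrupling the sample only halves the margin. That inverse square root relationship is usually the point the interviewer wants to hear.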

Example Question:

Given two uniformly distributed random variables X and Y (assume they are independent), each with mean 0 and standard deviation 1, what’s the probability that 2X > Y?

Here’s a hint:

Given that X and Y both have a mean of 0 and a standard deviation of 1, what does that indicate for the distributions of X and Y?
What are some of the scenarios we can imagine when we randomly sample from each distribution?
Write out each of the possibilities where X > Y and X < Y, as well as possible values of X and Y in each.
Try this interview question for yourself on Interview Query.

When answering this type of question, make sure that you present not only a solution to the problem at hand, but also walk the interviewer through the thought process you used when arriving at your answer. Cite any relevant statistical concepts that you’re employing in your solution and, in general, make your response as intelligible as possible for the layperson.
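A quick simulation is a handy way to check your reasoning on a problem like the example above. The sketch below assumes X and Y are independent; a uniform distribution with mean 0 and standard deviation 1 must then span from -√3 to √3, since a uniform distribution of width w has variance w²/12. The function name, trial count, and seed are arbitrary choices.

```python
import math
import random


def estimate_probability(n_trials=200_000, seed=0):
    """Monte Carlo estimate of P(2X > Y) for independent uniforms."""
    rng = random.Random(seed)
    # Uniform(-sqrt(3), sqrt(3)) has mean 0 and variance (2*sqrt(3))**2 / 12 = 1.
    half_width = math.sqrt(3)
    hits = 0
    for _ in range(n_trials):
        x = rng.uniform(-half_width, half_width)
        y = rng.uniform(-half_width, half_width)
        if 2 * x > y:
            hits += 1
    return hits / n_trials


estimate = estimate_probability()
print(f"Estimated P(2X > Y) = {estimate:.3f}")
```

The estimate lands near 0.5, which matches a symmetry argument: both distributions are symmetric around 0, so flipping the signs of X and Y leaves their joint distribution unchanged while turning 2X > Y into 2X < Y. The two events are therefore equally likely, and ties have probability zero.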

If you’re looking for resources to help you practice statistics problems, look no further than Interview Query, where we have hundreds of real interview questions from real companies, waiting for you to practice. We’ve broken our questions down by role, so that you can practice only those questions that are relevant to your current job search. You can practice statistics problems on Interview Query today.

Statistics & A/B Testing Concepts Review

What are the primacy and novelty effects?

The primacy effect involves users being resistant to change, while the novelty effect involves users becoming temporarily excited by new things.

What is a null hypothesis?

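The null hypothesis is the default assumption that there is no real difference between the populations being compared, so that any difference observed in the samples can be explained by chance or sampling error.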

What are Type I and Type II errors?

A Type I error is a ‘false positive,’ or the rejection of a true null hypothesis. A Type II error is a ‘false negative,’ or the failure to reject a false null hypothesis.
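To build intuition for false positives, the sketch below simulates many A/A tests, where both groups share the same true conversion rate, and counts how often a pooled two-proportion z-test still comes out significant. All names and parameters are hypothetical, and the z-test is one reasonable choice of test rather than the only one.

```python
import math
import random
from statistics import NormalDist


def p_value(conv_a, conv_b, n):
    """Two-sided pooled z-test p-value for two groups of equal size n."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 1.0  # No variation at all, so no evidence of a difference.
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = (conv_b - conv_a) / n / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=1000, n_per_group=500,
                        rate=0.1, alpha=0.05, seed=1):
    """Share of A/A experiments that are wrongly declared significant."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_experiments):
        # Both groups convert at the same true rate: the null is true.
        conv_a = sum(rng.random() < rate for _ in range(n_per_group))
        conv_b = sum(rng.random() < rate for _ in range(n_per_group))
        if p_value(conv_a, conv_b, n_per_group) < alpha:
            false_positives += 1
    return false_positives / n_experiments


observed_rate = false_positive_rate()
print(f"Share of A/A tests flagged as significant: {observed_rate:.3f}")
```

The share of flagged tests should come out close to the significance level, which is exactly what a significance level of 0.05 promises: when nothing has changed, about one test in twenty will still look like a win.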

What is a holdback experiment?

A holdback experiment rolls out a feature to a high proportion of users, but holds back the remaining percent of users to monitor their behavior over a longer period of time. This allows analysts to quantify the ‘lift’ of a change over a longer amount of time.

What is the Central Limit Theorem?

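The central limit theorem, or CLT, says that if we take a sufficiently large sample of independent, identically distributed random variables with finite variance, the distribution of their sample mean (or sum) will be approximately normal, regardless of the shape of the underlying distribution.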
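One way to see the theorem in action is to draw repeated samples from a clearly non-normal population and look at how the sample means behave. The sketch below uses an exponential distribution with mean 1 and standard deviation 1; the sample sizes, number of repetitions, and seed are arbitrary choices for illustration.

```python
import random
import statistics


def sample_means(sample_size, n_samples=2000, seed=0):
    """Means of repeated samples from a skewed Exponential(1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


def skewness(values):
    """Sample skewness; a symmetric (e.g. normal) distribution has 0."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


small = sample_means(sample_size=2)
large = sample_means(sample_size=50)
print(f"skewness of means, n=2:  {skewness(small):.2f}")
print(f"skewness of means, n=50: {skewness(large):.2f}")
print(f"spread of means, n=50:   {statistics.stdev(large):.3f}")
```

As the sample size grows, the skewness of the sample means moves toward 0 and their spread approaches the population standard deviation divided by the square root of the sample size (here 1/√50, about 0.14). This is why z-tests on averages and conversion rates work even when the raw data is far from normal.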

What is a normal distribution?

Normal Statistical Distribution
Image from Pixabay

Most people probably recognize this as the ordinary “bell curve” distribution, in which values are distributed symmetrically around the mean, with most observations falling close to the mean and progressively fewer further away.

Conclusion

Above, we’ve given you a bit of a primer for the challenges of the Statistics & A/B testing interview process. However, if you’re looking for even more resources to prepare yourself, we have a full Statistics & A/B Testing course available on Interview Query, along with a bank of real interview questions asked by real companies, ordered by recency and difficulty. Take the next step on your journey to becoming a full-fledged data scientist and join with your free account today!