In order to reduce the risk and uncertainty of new ideas, it’s important to test them by running experiments. However, those experiments must be well designed so they can produce strong evidence. The starting point for designing strong experiments is being explicit about what we are trying to learn: a clear, precise, and testable hypothesis is at the heart of everything. With a clear hypothesis in hand, we can then design experiments that provide strong evidence for what we are trying to learn.
So how do we make sure that we are designing experiments that provide strong evidence? In this post we highlight two key factors: selecting the right participants and designing a well-crafted artefact.
One challenge we have noticed is that innovation teams often test their ideas with vaguely defined customer segments. Such a broad sample of participants will produce weak evidence. Ask yourself: which customer segment are you really targeting with your business idea? This might seem obvious, but if your business is targeting young people over 18 who are still at university, then this is who you should focus on. You have to choose a sample of test subjects that is relevant to the success of the business idea you are testing.
Making the customer segment clear and explicit ensures that everybody on the team runs the experiment with the same test subjects. Teams often assume they are talking about the same customer segment, when in fact each member has different people in mind. That is why it is important to make sure the team has a shared understanding of who the experiment will be run with.
We also need to make sure that our participants are representative of the customer segment we are targeting. In the example above, you would correctly target the right participants by testing your idea with young people over 18 who are still at university. However, if your sample is made up of just one gender (e.g. only women) or people of the same age (e.g. only 19-year-olds), it may not be representative of the customer segment your business is targeting.
Finally, the size of the sample we use is also important. Small sample sizes generally produce unreliable evidence. Within the social sciences, a general rule of thumb is to use samples of more than 30 participants. However, if we are working on an idea that sells to businesses rather than consumers, we may only have a small group of people we can talk to. So if you are serving an industry with 25 companies in total, testing your idea with 50% of your potential customers would be a great achievement.
Our general advice is to use a large enough and representative sample of the right participants. Once you know who you want to target, you can find these customers by asking qualifying screener questions before you begin testing your business idea.
Well-crafted artefact
The second factor to consider when designing strong experiments is a well-crafted artefact. It is important to recognize that an experiment and the artefact that is used to run that experiment are not the same thing. If we are running an A/B test, the two versions of the landing page we create are the artefact. In a customer interview, the artefact is the interview guide we will use and the questions we plan to ask. The quality of the design of these artefacts will impact the strength of the evidence.
When designing artefacts, teams often think they just need to build a smaller version of the product they intend to build. This is not the correct approach. It is important to focus on what we are trying to learn (i.e. the hypothesis) and then design the artefact with the single goal of getting to that learning. We need to view the artefact as a device that allows us to test our hypothesis and learn.
If we just build a smaller version of the product we intend to build, it is highly likely that we will introduce too many variables into our experiment. If there are too many variables in one experiment, it becomes difficult to figure out exactly what is driving customer behaviour and what we’re actually learning. For example, when running an A/B test, it is important to keep everything on the two landing pages the same, except for the one variable we are interested in testing. If we change too many parameters, we will not be able to tell which change had an impact on customer behaviour.
Design for strong evidence
When we choose the experiments to run, we need to consider the fact that different types of experiments produce evidence of different strength. For example, evidence from customer interviews can be considered weaker than evidence from A/B tests. This is because customer interviews measure what people say, whereas A/B tests measure what people do.
However, it is possible to take an experiment that produces strong evidence, such as an A/B test, and design it poorly, in a way that weakens the evidence. It is also possible to take an experiment that produces weak evidence, such as customer interviews, and design it in a way that strengthens the evidence.
When it comes to running customer interviews, asking for facts rather than opinions produces stronger evidence. So rather than asking people what they think they might do in the future, it is better to ask about their past behaviour. It is also important to avoid questions that may lead customers to give you positive feedback about your idea (e.g. “Do you like our design for the app?”). Instead, focus on customers’ jobs-to-be-done and the problems they are trying to solve in their lives (e.g. “What are the things you struggle with?”).
Finally, we can strengthen our evidence by running more than one type of experiment to test our hypothesis. In the early stages of an innovation project, we can run cheaper experiments that produce relatively weak evidence. As we get better signals, we can then run more expensive experiments that produce stronger evidence. The goal here is to run the right experiment at the right time.