Blogger: Larry Cannell
How do you know if a change to a website will work as intended? This paper from Microsoft provides a number of interesting examples of how online experiments - which measure the impact of various (often subtle) changes to a website - can prove, or disprove, assumptions made by designers. In addition to the formal paper there is also a set of PowerPoint slides available that includes a quiz. See how your instincts compare to the results of their experiments (which changes made a difference and which did not):
“Over the last three years, we built an experimentation platform system (ExP) at Microsoft, capable of running and analyzing controlled experiments on web sites and services. Experiments ran on 18 Microsoft properties, including MSN® home pages in several countries (e.g., US, UK, Brazil), MSN Money, MSN Real Estate, www.microsoft.com, store.microsoft.com, support.microsoft.com, Office Online, www.xbox.com, several marketing sites, and Windows Genuine Advantage. In terms of scale, the larger experiments run with tens of millions of users, one ran with over 100M users when we need a lot of statistical power to detect a very small but critical effect.”
…
“In this paper we share several interesting examples that show the power of controlled experiments to improve sites, establish best practices, and resolve debates with data rather than deferring to the HIghest-Paid-Person’s Opinion (HiPPO) or to the loudest voice.”
…
“Now that we have run many experiments, we can report that Microsoft is no different. Evaluating well-designed and executed experiments that were designed to improve a key metric, only about one-third were successful at improving the key metric!”
Only about a third of their proposed changes produced a measurable improvement! In many cases, an enterprise would have spent money implementing the other two-thirds (based on instincts, best guesses, or because the HiPPO said to) and would have seen no benefit, or worse.
“The humbling results we shared in Section 5 bring to question whether a-priori prioritization is as good as most people believe it is. We hope this will help readers initiate similar changes in their respective organizations so that data-driven decision making will be the norm, especially in software development for online web sites and services.”
Online Experimentation at Microsoft
Lessons learned:
- Test early, test often
- Most experiments fail, so “experiment often”
- A failed experiment is not a mistake
- “Try radical ideas”
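
The paper's remark about needing "a lot of statistical power to detect a very small but critical effect" can be made concrete with a rough sample-size estimate. The sketch below is not taken from the paper or from ExP. It assumes a simple two-variant test on a conversion rate, analyzed with a two-sided two-proportion z-test, and the function name, parameters, and defaults (5% significance, 80% power) are all illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def users_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed in EACH variant of an A/B test.

    Assumes a two-sided two-proportion z-test (normal approximation).
    baseline_rate: conversion rate of the control, e.g. 0.05 for 5%
    relative_lift: smallest relative change worth detecting, e.g. 0.01 for 1%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)

    # Critical values of the standard normal distribution
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Sum of the per-user variances of the two conversion rates
    variance = p1 * (1 - p1) + p2 * (1 - p2)

    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: a 5% baseline conversion rate
    for lift in (0.10, 0.01):
        n = users_per_variant(0.05, lift)
        print(f"relative lift {lift:.0%}: about {n:,} users per variant")
```

Under these assumptions, shrinking the detectable lift by a factor of ten raises the required sample by roughly a factor of one hundred, which is why an experiment hunting for a very small effect needs so many more users than one looking for a large change.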

