Ronny Kohavi of Microsoft on controlled experiments

Regular readers know how much I care about experimentation in marketing. Seth Godin is starting to call this “layering” and it’s at the heart of my book, Do It Wrong Quickly. But no good name has emerged. I was contacted recently by Ronny Kohavi, Microsoft’s General Manager of Experimentation Platform, who uses the market research term “controlled experiments” for what they do. Ronny came to Microsoft from Amazon, where he served as Director of Data Mining and Personalization, and he’s written four of the top papers on machine learning according to CiteSeer. He was gracious enough to answer several questions for you to explain how experimental marketing works in real life.

Me: Tell us more about your job—do you work on the Live site or somewhere else in Microsoft? How do you end up working on so many experiments and how did you start doing that at Microsoft?
RK: I joined Microsoft in 2005 to work on something different, but after the first few months I realized that there were a lot of opinions on what to build, but little data. There was no easy way to try things out, so I felt that I could make a greater contribution to Microsoft by building a system for running controlled experiments. I started a small incubation team, and managed to attract some of the best people I had worked with in the past.
The team built the first version of the Experimentation Platform in a year and we went live with two experiments on two properties: the MSN US home page, and Windows Marketplace. The team is doing two things at Microsoft:

We are building the software and the service, i.e., the Experimentation Platform.
We are educating Microsoft teams about this alternative to developing software, i.e., quicker cycles where experiments are run to help guide the features and to help evaluate ideas. This very much aligns with the “Do It Wrong Quickly” theme.

Me: What was the purpose of the solitaire vs. poker test for the Game Downloads page in Windows Marketplace?
[Images: Microsoft Solitaire Download; Microsoft Poker Download]
RK: One of the simplest ways to improve clickthrough rates is to change images. It’s usually trivial to change images, and when someone is designing an image, they typically have several alternatives. Instead of betting on one, try the top three (and encourage diversity of ideas). What is surprising to people about this example is that many predict the winning image incorrectly and, more annoyingly to some, that the difference is so large (61%). It’s very common for the technical audience to prefer the Poker image and confuse themselves with the target audience.
Me: What was the result of this test in terms of action?
RK: Use the winning image (although the site has changed since then). More important than the specific test, which in this case was their first experiment with us, was the learning that “Just do the right thing” is harder than it seems. When the team was asked which image would win, their vote went for Poker. When running experiments, it’s always fun to take bets in advance about the delta between the two versions. It’s humbling to see how many times we’re wrong, not just about the magnitude, but even about the direction. In experiments I was involved with at Amazon, many features that we thought would be strong were simply flat: no statistically significant difference. Knowing that something does not work, however, has value: it eliminates one idea so we can move on to others. If you have 10 keys to open a lock, every one you try that doesn’t work provides additional data. Moving fast isn’t enough; you want to move effectively in the right direction, and that’s what experiments allow you to confirm.
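For readers who want to see what "statistically significant" means in practice, here is a minimal sketch of one common way to compare the clickthrough rates of two page variants, a two-proportion z-test. This is not necessarily the method Ronny's team uses. The function name and all click and view counts are hypothetical, chosen only to illustrate a large lift next to a flat result.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare variant B against variant A.

    Returns (relative_lift, z_score, two_sided_p_value).
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled clickthrough rate under the assumption that A and B do not differ.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = (rate_b - rate_a) / rate_a
    return lift, z, p_value


# Hypothetical counts (NOT the real Windows Marketplace data).
big_lift = two_proportion_z_test(100, 10_000, 161, 10_000)
flat = two_proportion_z_test(100, 10_000, 103, 10_000)

for name, (lift, z, p) in [("big lift", big_lift), ("flat", flat)]:
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: lift={lift:.1%}, z={z:.2f}, p={p:.4f} ({verdict} at 5%)")
```

With these made-up numbers, the first comparison clears the conventional 5% threshold and the second does not, which is the "simply flat" outcome described above.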
Me: What do you see as the big cultural issues in getting testing into an organization?
RK: The main reason to avoid something new is inertia. Microsoft has developed software for years and many believe they have the art perfected. With services and websites, there is extra information that is readily available: customer interactions. The ability to prototype ideas and see how customers react to them is something that was not available 15 years ago.
Here are some reasons why people avoid experimenting:

Some believe it threatens their job as decision makers.
Program managers select the next set of features to develop. Proposing several alternatives and admitting you don’t know which is best is hard.
Editors and designers get paid to select a great design.
Failures of ideas may hurt image and professional standing. It’s easier to declare success when the feature launches.
We’ve heard: “We know what to do. It’s in our DNA,” and “Why don’t we just do the right thing?”

Me: What are the best ways to overcome cultural resistance?
RK: I’m not sure I know the “best” way, but what we are doing today is based on several efforts:

Raising awareness and educating people. We give internal talks and run classes. We know the message is resonating because we initially had a hard time filling the classes, and now they’re booked with waiting lists.
Showing successes. Running experiments with those that are on board and showing successes will help others see the opportunity. As with every population, we have the early adopters and the skeptics.
Highlighting the limitations. Experimentation is not a panacea, so we should recognize when it is appropriate and highlight the limitations so it’s not misused.

Me: Can you explain the advantages of controlled experiments?
RK: When I was director of data mining and personalization at Amazon, the two most successful innovations by my team were not on any road map the year before, and were initially ranked so low by myself and the team that one was given as a ramp-up project to a new employee, and the other given to an intern. The projects generated hundreds of millions of dollars in incremental revenue. Such an observation is very humbling and highlights the biggest advantage of controlled experiments: we can try a lot of things quickly, and let users guide us.
Me: Do you have any one-to-two page case studies of projects you’ve worked on that you would like to share?
RK: We have several talks and papers.
Me: If you had to explain your ACM article to the average marketer, what would be the key insights that marketers need to know but most don’t have today?
RK: Controlled experiments have existed for decades, and the math is well understood. With the web we have an unprecedented opportunity to try things quickly and make data-driven decisions. Don’t fall into the trap of believing you know what is best for the customer, and don’t let the HiPPO (Highest Paid Person’s Opinion) guide you.
Me: What is the most important thing you’d like to say to marketers that I haven’t asked you about?
RK: Marketers love a good catch phrase that’s memorable. I would tell them to avoid HDD (HiPPO Driven Design).
Some of us tend to think of experimental marketing as something only born-on-the-Web companies know how to do. Thanks to Ronny for helping us see how a big company can do it, too.

Mike Moran
