The state of machine learning in web experimentation and testing

Machine learning is one of those buzzwords that can get almost anyone to either raise their eyebrows — or roll their eyes. The hype around machine learning (sometimes inaccurately referred to as “AI”) has been so overblown for so long that many have become grizzled cynics, believing that machine learning represents not a panacea but a pointless money-hole for business.

The reality is actually somewhere in between. Machine learning is indeed a powerful tool for all kinds of applications, including experimentation and web experience design, but it needs to be directed by the same sound strategy as any other initiative. The core lie of the machine learning hype isn’t that it can solve your problems, but that it can solve your problems on its own.

To come to a better understanding, we have to dive into just what machine learning is, and how it is used in experimentation and experience design.

What is machine learning?

Put simply, machine learning is the automation of software design. One program observes the activity of another, usually a so-called “neural network,” and makes small, iterative adjustments to that network’s internal parameters. Most of these adjustments de-emphasize aspects of the neural net that are associated with bad outcomes and reinforce those associated with good ones.

This simple idea has had an incredible impact over the past decade or so, allowing the creation of programs that could never have been written directly. Data scientists spent decades trying in vain to teach computers to quickly transcribe spoken language before finally passing the problem to machine learning. Each of those scientists had a brain that could, of course, understand spoken words, but their lack of awareness of how that happened left them unable to teach the process to a computer.

Machine learning, on the other hand, was so successful in tackling this problem that we now take language-capable smartphones for granted. The neural nets doing that work were shaped (and continue to be reshaped as new data comes in) by machine learning algorithms. Many in the business world have been told that machine learning can help them create equally revolutionary pieces of software, solving hard problems and overcoming seemingly impossible complexity.

Sounds great, so how does it actually work?

If we understand how machine learning works, however, we know that it only really applies in specific types of situations. The biggest hurdle in most cases is the need to create a dataset of so-called “training data” to be analyzed; if we want to teach a computer to find pictures of cats, the dataset is a collection of pictures that either do or do not contain a cat. This dataset must also be meticulously annotated with whatever metadata the machine learning algorithm requires. In the case of cat pictures, this means a tag showing whether there really is or is not a cat in each picture. In web design, the dataset is often composed of user journeys, each paired with a means of determining whether that journey ended positively or not.
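To make the idea of labeled training data a little more concrete, here is a minimal Python sketch of what a set of user journeys with outcome tags might look like. Everything in it is an assumption made for illustration: the `Journey` record, the page names, the `converted` label and the `conversion_rate_by_page` helper are hypothetical, and the per-page tally is a plain count, not a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Journey:
    """One recorded visit: the pages seen, plus the outcome label."""
    pages: tuple       # page names in the order they were visited
    converted: bool    # the label: did the journey end positively?


# Hypothetical, hand-written records, purely for illustration.
TRAINING_DATA = [
    Journey(("home", "category", "product", "checkout"), True),
    Journey(("home", "search", "product"), False),
    Journey(("home", "category", "product", "checkout"), True),
    Journey(("home", "search"), False),
    Journey(("landing", "product", "checkout"), True),
    Journey(("landing", "product"), False),
]


def conversion_rate_by_page(journeys):
    """For each page, the share of journeys containing it that converted."""
    seen = {}
    converted = {}
    for journey in journeys:
        # Use a set so a page visited twice in one journey counts once.
        for page in set(journey.pages):
            seen[page] = seen.get(page, 0) + 1
            if journey.converted:
                converted[page] = converted.get(page, 0) + 1
    return {page: converted.get(page, 0) / seen[page] for page in seen}


rates = conversion_rate_by_page(TRAINING_DATA)
for page, rate in sorted(rates.items()):
    print(f"{page}: {rate:.2f}")
```

The point of the sketch is the shape of the data rather than the arithmetic: without the `converted` tag on every record, there would be nothing for an algorithm to learn from.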

So, you can’t simply tell a machine learning algorithm to become better at “making me money”; you have to be able to express your needs clearly, knowing in advance the exact process to be improved, exactly what a better or worse outcome will look like, and specifically what endpoint you’d like the process to achieve. It often requires human supervision or maintenance at some step in the process, which, if you’re keeping score, seriously undermines the whole “automated” argument for machine learning. ML can be expensive and time-consuming, requiring heavy investment before you can even get the process started.

It should be obvious that while machine learning is powerful, it is also poorly suited to many applications; it’s far from a silver bullet for digital experiences and growth. Neural nets only know what they’ve seen, and tend not to be able to deal with deviations from the content of their training dataset. This means that generalizing machine learning solutions to similar tasks in different projects can be difficult, and that they are often incapable of dealing with drastic change in the environment — say, a global pandemic and the mass changes in human behavior that come with it.

Machine learning x Experimentation

In experimentation on web experiences, there are three major uses we see most often. These tactics represent the easiest ways to incorporate machine learning into your business.

  1. Traffic allocation

    This approach uses so-called “multi-armed bandit” algorithms to determine the best distribution of traffic to maximize conversions, traffic, or whatever metric it is told to optimize. Training data is generally gathered on an ongoing basis as site traffic occurs during the test period. The algorithm can correlate many attributes of different user journeys, and Frankenstein together the strongest overall journey or journeys, based on its programmed goals.

    This article offers a good primer on how multi-armed bandits function, and how a different mathematical approach to the solution can lead to drastically different outcomes.

  2. Dynamic experience design

    Rather than targeting small user groups, here a machine learning system takes aggregate traffic patterns and determines the ideal experience for the group, overall. This can hone the core version of the site toward a baseline experience that should provide solid results for the vast majority of users. With such a design as the default, more user-specific journeys can be created.

    For example, ML could identify specific product categories that are most correlated with conversion, and reorder product categories to display these more prominently.

  3. Predictive user scoring

    Here, training data is gathered from visitor engagement with the site, judged against specific engagement metrics. The computer derives patterns from this data and applies them to predict engagement metrics for users. These scores can be used to shunt traffic between multiple journeys tailored to user type.

    For example, a company might want to use predictive user scoring to guess how susceptible each user is to becoming annoyed with bugs on a site — leading to so-called “rage clicks” and a high bounce rate. Users with a high rage score can be shunted to a simpler, less advanced journey that minimizes the risk of bugs, while users with low scores can be more safely shunted to test versions of the site.
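To give a feel for the first of these applications, traffic allocation, here is a minimal Python sketch of one common multi-armed bandit strategy, Thompson sampling. It is an illustration under stated assumptions, not a description of how any particular testing tool works: the `ThompsonSamplingBandit` class, the `simulate` helper, the variant names and the "true" conversion rates are all hypothetical, and the visitors are simulated with a random number generator.

```python
import random


class ThompsonSamplingBandit:
    """Allocates traffic by sampling each variant's plausible conversion rate."""

    def __init__(self, variants, seed=None):
        self.rng = random.Random(seed)
        # Start every variant with a uniform Beta(1, 1) prior.
        self.successes = {v: 1 for v in variants}
        self.failures = {v: 1 for v in variants}

    def choose(self):
        """Draw one sample per variant and send the visitor to the best draw."""
        draws = {
            v: self.rng.betavariate(self.successes[v], self.failures[v])
            for v in self.successes
        }
        return max(draws, key=draws.get)

    def record(self, variant, converted):
        """Update the variant's tally with the observed outcome."""
        if converted:
            self.successes[variant] += 1
        else:
            self.failures[variant] += 1


def simulate(true_rates, visitors, seed=0):
    """Run simulated visitors through the bandit; return traffic per variant."""
    rng = random.Random(seed)
    bandit = ThompsonSamplingBandit(list(true_rates), seed=seed + 1)
    traffic = {v: 0 for v in true_rates}
    for _ in range(visitors):
        variant = bandit.choose()
        traffic[variant] += 1
        # Hypothetical visitor: converts with the variant's assumed true rate.
        bandit.record(variant, rng.random() < true_rates[variant])
    return traffic


# Assumed conversion rates, for illustration only.
traffic = simulate({"A": 0.05, "B": 0.15}, visitors=5000)
print(traffic)
```

Early on, both variants receive a share of visitors; as evidence accumulates, more and more traffic drifts toward the variant that appears to convert better. Note that the goal still had to be spelled out by a human: the bandit only knows "conversion" because `record` was told what counts as one.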

Starting with just these three applications, any business can start considering strategies that are enabled, or enhanced, by machine learning. But if you’re going to be your company’s evangelist for these advanced techniques, be sure to make it clear that you actually understand the process. In a recent post, Google’s Cassie Kozyrkov went into more detail about the many situations in which machine learning isn’t the answer.

Clearly explain what machine learning can do to improve the company’s bottom line, and you’ll have even the cynics singing your praises.