The state of machine learning in web experimentation and testing

Machine learning is one of those buzzwords that can get almost anyone to either raise their eyebrows — or roll their eyes. The hype around machine learning (sometimes inaccurately referred to as “AI”) has been so overblown for so long that many have become grizzled cynics, believing that machine learning represents not a panacea but a pointless money-hole for business.

The reality is actually somewhere in between. Machine learning is indeed a powerful tool for all kinds of applications, including experimentation and web experience design, but it needs to be directed by the same sound strategy as any other initiative. The core lie of the machine learning hype isn’t that it can solve your problems, but that it can solve your problems on its own.

To come to a better understanding, we have to dive into just what machine learning is, and how it is used in experimentation and experience design.

What is machine learning?

Put simply, machine learning is the automation of software design. One program observes the activity of another, usually a so-called “neural network,” and makes small, iterative edits to that neural network’s design. Most of these edits are about deemphasizing aspects of the neural net that are associated with bad outcomes, and reinforcing those associated with good ones.
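
The update loop described above can be sketched in a few lines. This is a toy perceptron-style update, not any particular framework; the data, learning rate, and function name are invented for illustration:

```python
def train(examples, epochs=100, lr=0.1):
    """Toy illustration of the core idea: nudge weights toward features
    seen in good outcomes and away from features seen in bad ones."""
    n = len(examples[0][0])
    weights = [0.0] * n
    for _ in range(epochs):
        for features, good_outcome in examples:
            score = sum(w * x for w, x in zip(weights, features))
            predicted_good = score > 0
            if predicted_good != good_outcome:
                # Reinforce features tied to good outcomes, deemphasize
                # features tied to bad ones.
                sign = 1 if good_outcome else -1
                weights = [w + sign * lr * x
                           for w, x in zip(weights, features)]
    return weights

# Two binary features; the outcome is good exactly when the first is present.
data = [([1, 0], True), ([1, 1], True), ([0, 1], False), ([0, 0], False)]
w = train(data)
print(w)  # the useful first feature ends up with the larger weight
```

Real neural-network training replaces this single layer of weights with millions of them and the simple sign-based nudge with gradient descent, but the shape of the loop is the same.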

This simple idea has had an incredible impact over the past decade or so, allowing the creation of programs that could never have been written directly. Data scientists spent decades trying in vain to teach computers to quickly transcribe spoken language before finally giving up and passing the problem to machine learning; while each of those scientists had a brain that could of course transcribe spoken words, they had no conscious access to how that happened, and so could not teach the process to a computer.

Machine learning, on the other hand, was so successful in tackling this problem that we now take language-capable smartphones for granted. The neural nets doing that work were architected (and continue to be architected as new data comes in) by machine learning algorithms. Many in the business world have been told that machine learning can help them create equally revolutionary pieces of software, solving hard problems and overcoming seemingly impossible complexity.

Sounds great, so how does it actually work?

If we understand how machine learning works, however, we know that it only really applies in specific types of situations. The biggest hurdle in most cases is the need to create a dataset of so-called “training data” to be analyzed; if we want to teach a computer to find pictures of cats, the dataset is a collection of pictures that either do or do not contain a cat. This dataset must also be meticulously appended with metadata containing whatever information the machine learning algorithm requires — in the case of cat-pictures, this means a tag showing whether there really is or is not a cat in each picture. In web design, the dataset is often composed of user journeys, and a means of determining whether this journey ended positively or not.
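
A labeled dataset in this sense is just examples paired with outcomes. For the web case, a minimal sketch might look like the following (all field names are hypothetical, for illustration only):

```python
# Each training example pairs raw observations with the label the
# learning algorithm needs -- here, user journeys tagged with whether
# they ended in a conversion.
training_data = [
    {"pages": ["home", "pricing", "checkout"], "device": "mobile",  "converted": True},
    {"pages": ["home", "blog"],                "device": "desktop", "converted": False},
    {"pages": ["search", "product", "cart"],   "device": "mobile",  "converted": False},
]

# The labels are what make the dataset usable: without the "converted"
# tag, the algorithm has no notion of which journeys were good.
positives = [j for j in training_data if j["converted"]]
print(len(positives), "of", len(training_data), "journeys converted")
```

The "meticulous" part of the work is ensuring every example carries a correct label; an algorithm trained on mislabeled data faithfully learns the labeling mistakes.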

So, you can’t simply tell a machine learning algorithm to become better at “making me money;” you have to be able to express your needs clearly, knowing in advance the exact process to be improved, exactly what a better or worse outcome will look like, and specifically what endpoint you’d like the process to achieve. It often requires human supervision or maintenance at some step in the process — which, if you’re keeping score, seriously undermines the whole “automated” argument for machine learning. “ML” can be expensive and time-consuming, requiring heavy investment before you can even get the process started.

It should be obvious that while machine learning is powerful, it is also poorly suited to many applications; it’s far from a silver bullet for digital experiences and growth. Neural nets only know what they’ve seen, and tend not to be able to deal with deviations from the content of their training dataset. This means that generalizing machine learning solutions to similar tasks in different projects can be difficult, and that they are often incapable of dealing with drastic change in the environment — say, a global pandemic and the mass changes in human behavior that come with it.

Machine learning x Experimentation

In experimentation on web experiences, there are three major uses we see most often. These tactics represent the easiest ways to incorporate machine learning into your business.

  1. Traffic allocation

    Uses so-called “multi-armed bandit” algorithms to determine the best distribution of traffic to maximize conversions, engagement, or whatever metric it is told to optimize. Training data is generally gathered on an ongoing basis as site traffic occurs during the test period. The algorithm can correlate many attributes of different user journeys and Frankenstein together the strongest overall journey or journeys, based on its programmed goals.

    This article offers a good primer on how multi-armed bandits function, and how a different mathematical approach to the solution can lead to drastically different outcomes.

  2. Dynamic experience design

    Rather than targeting small user groups, here a machine learning system takes aggregate traffic patterns and determines the ideal experience for the group, overall. This can hone the core version of the site toward a baseline experience that should provide solid results for the vast majority of users. With such a design as the default, more user-specific journeys can be created.

    For example, ML could identify specific product categories that are most correlated with conversion, and reorder product categories to display these more prominently.

  3. Predictive user scoring

    Here, training data is gathered by observing visitor engagement with the site, judged against specific engagement metrics. The computer derives patterns from this data and uses them to predictively score new users on those metrics. These scores can be used to shunt traffic between multiple journeys tailored to user type.

    For example, a company might want to use predictive user scoring to guess how susceptible each user is to becoming annoyed with bugs on a site — leading to so-called “rage clicks” and a high bounce rate. Users with a high rage score can be shunted to a simpler, less advanced journey that minimizes the risk of bugs, while users with low scores can be more safely shunted to test versions of the site.
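To make the first of these tactics concrete, here is a minimal sketch of epsilon-greedy allocation, one of the simplest multi-armed bandit strategies. The conversion rates, traffic volume, and function name are invented for illustration; production platforms typically use more sophisticated variants such as Thompson sampling:

```python
import random

def epsilon_greedy(conversion_rates, visitors=10000, epsilon=0.1, seed=0):
    """Epsilon-greedy traffic allocation: usually route each visitor to
    the best-performing variant so far, but send a random 10% elsewhere
    so weaker-looking variants still get a chance to prove out."""
    rng = random.Random(seed)
    shows = [0] * len(conversion_rates)   # visitors sent to each variant
    wins = [0] * len(conversion_rates)    # conversions per variant
    for _ in range(visitors):
        if rng.random() < epsilon:
            arm = rng.randrange(len(conversion_rates))       # explore
        else:
            rates = [w / s if s else 0.0 for w, s in zip(wins, shows)]
            arm = rates.index(max(rates))                    # exploit
        shows[arm] += 1
        if rng.random() < conversion_rates[arm]:  # simulate this visitor
            wins[arm] += 1
    return shows

# Simulated "true" conversion rates for three page variants (made up);
# the bandit is never told these, it only observes individual outcomes.
traffic = epsilon_greedy([0.02, 0.05, 0.03])
print(traffic)
```

The tension the bandit manages is exploration versus exploitation: pure exploitation can lock onto an early leader, while the occasional random pull keeps gathering evidence about the other variants.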

Starting with just these three applications, any business can start considering strategies that are enabled, or enhanced, by machine learning. But if you’re going to be your company’s evangelist for these advanced techniques, be sure to make it clear that you actually understand the process. In a recent post, Google’s Cassie Kozyrkov went into more detail about the many situations in which machine learning isn’t the answer.

Clearly explain what machine learning can do to improve the company’s bottom line, and you’ll have even the cynics singing your praises.

Third Edition of Designing for Older Adults

The third edition of the definitive source of information on designing for older adults has been published:

This new edition provides easily accessible and usable guidelines for practitioners in the design community for older adults. It includes an updated overview of the demographic characteristics of older adult populations and the scientific knowledge base of the aging process relevant to design. New chapters include Existing and Emerging Technologies, Work and Volunteering, Social Engagement, and Leisure Activities. Also included is basic information on user-centered design and specific recommendations for conducting research with older adults. 

A 20% discount is available by using code ‘A004’ at checkout from CRC Press.

The group of authors (the Center for Research and Education on Technology Enhancement) is also running a workshop:

The focus of this workshop is to bring together representatives from companies, organizations, and universities, large and small, who are involved in industry, product development, or research and who have an interest in meeting the needs of older adults. Additionally, members of the CREATE team will present guidelines and best practices for designing for older adults. Topics include: Existing & Emerging Technologies, Usability Protocols, Interface & Instructional Design, Technology in Social Engagement, Living Environments, Healthcare, Transportation, Leisure, and Work. Each participant will receive a complimentary copy of our book Designing for Older Adults.

If you would like a registration form or any further information on the conference accommodations, please contact Adrienne Jaret at: adj2012@med.cornell.edu or by phone at (646) 962-7153.

Listening to the End User: NHL/NHLPA Collaboration with Hockey Goalies

Today we present a guest post by Ragan Wilson, PhD student in Human Factors and Applied Cognitive Psychology at NC State University.

Saying that goalies in professional ice hockey see the puck a lot is an understatement. They are the last line of defense for their team against scoring, putting their bodies in the way of the puck to block shots in ways that sometimes do not seem human. To do that, they rely on their skills as well as their protective equipment, including chest protectors. As written by In Goal Magazine’s Kevin Woodley and Greg Balloch, at the professional level this and other equipment is being re-examined by the National Hockey League (NHL) and the National Hockey League Players’ Association (NHLPA).

For the 2018-2019 NHL season, there has been a change in goaltending equipment rules involving chest protectors, according to NHL columnist Nicholas J. Cotsonika. This rule, Rule 11.3, states that “The chest and arm protector worn by each goalkeeper must be anatomically proportional and size-specific based on the individual physical characteristics of that goalkeeper”. In practical terms, the rule means a goaltender’s chest protection must be proportional to the goaltender wearing it, so that, for instance, a 185-pound goalie looks like a 185-pound goalie rather than a 200- to 210-pound one. The reasoning for the rule change was to make saves depend more on ability than on extra padding, and potentially to increase scoring in the league. Overall, this continues a mission by both the NHL and NHLPA to make goalie equipment slimmer, which was kick-started by changes to goalie pants and leg pads. The difference between previously approved chest protectors and the approved models is shown below, thanks to the website Goalie Coaches, which labeled images from Brian’s Custom Sports’ Instagram page.



To a non-hockey player, the visual differences between non-NHL-approved and NHL-approved pads look minuscule. However, according to In Goal Magazine, implementing these changes has been an interesting challenge for the NHL as well as for hockey gear companies such as Brian’s and CCM. Whereas changing the pants rule was more straightforward, the dimensions of chest protectors are more complicated and personal to goalies (NHL). This challenge could be seen earlier in the season in mixed feedback about the new gear. Some current NHL goalies, such as the Vegas Golden Knights’ Marc-Andre Fleury (In Goal Magazine) and the Winnipeg Jets’ Connor Hellebuyck (Sports Illustrated), noted more pain from blocking pucks in the upper body region. On the other hand, the Toronto Maple Leafs’ Frederik Andersen and Garrett Sparks have not had problems with the changes (Sports Illustrated).

What always makes me happy as a student of human factors psychology is when end users are made an active part of the discussion around changes. Thankfully, that appears to be happening with this rule change: at the beginning of the season, the NHL and NHLPA seemed actively interested in feedback from current NHL goaltenders about what could make them more comfortable with the new equipment standards (In Goal Magazine). Hopefully that continues into next season, with all the rigorous, real-life testing that a season’s worth of regular and playoff games can provide. Considering there are already some interesting, individualized adjustments to the new equipment rules, such as changing companies (Washington Capitals’ Braden Holtby) or adding another layer of protection such as a padded undershirt (Marc-Andre Fleury) (USA Today), it will be interesting to see where this equipment stands come the next off-season, especially in terms of innovation from the companies that produce this gear at a professional level.


Ragan Wilson is a first-year human factors and applied cognitive psychology doctoral student at NC State University. She is mainly interested in the ways that human factors and all areas of sports can be interlinked, from player safety to consumer experiences of live action games.

What are your website visitors doing?

Chances are that you’re tracking your website visitors en masse. You’re probably tracking acquisition sites, tallying up conversions and working to optimize your pages for the best success. But with all of that quantitative research, do you know about each individual user’s journey, and where they are struggling on your site? If not, you should check out one of our partners: SessionCam.

Jonathan Hildebrand, Brooks Bell’s Sr. Director of UX & Design, spoke at SessionCam’s user conference last week in Chicago. If you’re unfamiliar with SessionCam, the company began with a mission of building the best session replay solution on the market. Over time the solution has grown into a fully fledged behavioral analytics solution including heatmaps, conversion funnels, form analytics and more.

We’ve been blown away by the machine learning algorithms that identify signs of customer struggle and frustration on a website. We sat down with Jonathan to ask him for a couple of takeaways from the event.

As a UX expert, what do you appreciate most about SessionCam?

Where SessionCam really shines is in the qualitative data it provides, which can uncover major hurdles on your site in ways that quantitative data could never reveal. SessionCam’s recordings allow customers to watch a complete play-by-play of a visitor’s experience on the site, whether it’s through a mobile device or desktop.

What about specific to testing?

From a testing perspective, SessionCam can be great for post-test analysis since it allows you to watch videos from the live test experiences. The Customer Struggle Score is also a great way to understand where problems are occurring.

Any interesting case studies?

Definitely. One that comes to mind is a retailer that has a buy online, pick up in store (BOPUS) program. They were using SessionCam to uncover the source of order mistakes. When there was an error at pickup, they would go back and watch that customer’s online session to see if a problem occurred during the online order process and determine if there were any improvements they could make.

And you only need to check out their website to see the kind of value that SessionCam has added to many of the world’s leading brands.

If you’re interested in finding out more about SessionCam, give us a shout.

The post What are your website visitors doing? appeared first on Brooks Bell.

Human-Robot/AI Relationships: Interview with Dr. Julie Carpenter

Over at https://HumanAutonomy.com, we had a chance to interview Dr. Julie Carpenter about her research on human-robot/AI relationships.

As the first post in a series, we interview one of the pioneers in the study of human-AI relationships, Dr. Julie Carpenter. She has over 15 years of experience in human-centered design and human-AI interaction research, teaching, and writing. Her principal research is about how culture influences human perception of AI and robotic systems, and the associated human factors such as user trust and decision-making in human-robot cooperative interactions in natural use-case environments.