Hold-Out Groups: Gold Standard for Testing—or False Idol?

The post Hold-Out Groups: Gold Standard for Testing—or False Idol? appeared first on CXL.

You run an A/B test on the call-to-action for a pop-up. You have a process, implement it correctly, find a statistically significant winner, and roll out the winning copy sitewide.

Your test answered every question except one: Is the winning version better than no pop-up at all?

A hold-out group can deliver the answer, but, like everything, it comes at a cost.

What are hold-out groups?

A hold-out group is a form of cross-validation that extracts, or “holds out,” one set of users from testing. You can run holdouts for A/B tests and other marketing efforts, like drip email campaigns in which a percentage of users receives no email at all.

After completion of a test and implementation of the winning version, the hold-out group remains for weeks, months, or, in rare cases, years. In doing so, the holdout attempts to quantify “lift”—the increase in revenue compared to doing nothing.

For example, a “10% off” coupon (delivered through a pop-up or email campaign) may generate 15% more sales than a current “$10 off a $100 purchase” coupon. However, without a holdout, you don’t know how many consumers would’ve bought without any coupon at all—a winning test may still reduce profits.
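The coupon arithmetic can be made concrete with a quick sketch. All figures below are hypothetical, chosen to show how a test winner can still lose to doing nothing:

```python
# Hypothetical coupon economics: every number below is an assumption
# for illustration, not data from any real test.

visitors = 10_000
avg_order = 100.0        # average order value, in dollars

# Assumed conversion rates for each experience:
cr_holdout = 0.0340      # no coupon at all (many would have bought anyway)
cr_new     = 0.0368      # "10% off" winner: 15% more sales than the old coupon

def profit(conversion_rate, discount_per_order):
    """Revenue net of discounts for one cohort of visitors."""
    orders = visitors * conversion_rate
    return orders * (avg_order - discount_per_order)

p_holdout = profit(cr_holdout, 0.0)   # baseline: no discount given
p_new     = profit(cr_new, 10.0)      # 10% off a $100 order

print(round(p_holdout, 2))  # 34000.0
print(round(p_new, 2))      # 33120.0 -- the "winner" earns less than no coupon
```

With these assumed rates, the coupon that won the A/B test still nets less profit than showing no coupon at all, which only the holdout can reveal.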

Most often, however, holdouts are used not to measure lift from a single test but lift for an entire experimentation program. Because holdouts require siphoning off a statistically relevant portion of an audience, they make sense only for sites with massive amounts of traffic.
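To see why holdouts demand massive traffic, a standard two-proportion sample-size estimate (normal approximation) can be sketched. The baseline rate and target lift below are assumptions for illustration:

```python
# Rough two-proportion sample-size estimate (normal approximation).
# The baseline conversion rate and target lift are illustrative assumptions.
from statistics import NormalDist

def sample_size_per_group(p_base, lift, alpha=0.05, power=0.80):
    """Visitors needed in EACH group to detect a relative lift."""
    p_test = p_base * (1 + lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_test * (1 - p_test)
    return ((z_alpha + z_beta) ** 2 * variance) / (p_base - p_test) ** 2

# Detecting a 5% relative lift on a 3% baseline conversion rate:
n = sample_size_per_group(0.03, 0.05)
print(round(n))  # on the order of 200,000 visitors per group

# If the holdout is only 5% of traffic, the site needs roughly 20x that
# much total traffic just to fill the holdout segment.
```

The exact threshold depends on your baseline rate and the smallest lift you care to detect, but the order of magnitude explains why holdouts are mostly a big-site tool.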

The difference between a hold-out and a control group

Imagine you want to test a headline on a product page. The version on the left (Control) is the current version, while the experimental version (Variation A) is on the right:

[Image: A/B test of two headlines on a product page for Apple cables]

Assume, by some miracle, that Variation A performs better, and you implement it for all visitors. That’s the standard process for an A/B split test—50% see each version during the test, and 100% see the winning version after the test completes.

However, if you continue to show some visitors the control version, that control group becomes the holdout. In other tests, the control may not “transition” from control to holdout. Instead, it can be a separate segment omitted from the start—like the email campaign in which a percentage of subscribers receive nothing.

Because a holdout can estimate the value of a marketing effort beyond a relative improvement between two versions, some consider it “the gold standard” in testing.

Why hold-out groups are “the gold standard”

For many, holdouts are a gold standard for testing because they measure the value not just of a test but of a testing program.

And while the value of testing may be apparent to those involved in it, individual test results do not aggregate into ROI calculations made in the C-Suite. There, the considerations extend beyond website KPIs:

  • Does it make sense to employ a team of data scientists or email marketers?
  • If we fired the entire team tomorrow, what would happen?

Holdouts also have the potential to assess experimentation’s impact on customer lifetime value. While a short-term split test may record an increase in clicks, form fills, or sales, it doesn’t capture the long-term effects:

  • Do pop-ups and sticky bars increase email leads but, over time, reduce return visitors?
  • Does a coupon program ultimately decrease purchases of non-discounted items?

Some effects may take months or years to materialize, accumulating confounding factors daily. Thus, when it comes to measuring the long-term impact of tests, how long is long enough?

Defining the scope for hold-out groups

How long should you maintain a hold-out group? Without a defined window, you could make ludicrous comparisons, like decades-long holdouts to measure your current site against its hand-coded version from the late 1990s.

The decisions in the extreme are laughable, but as the gap narrows—five years, three years, one year, six months—they get harder.

Look-back windows and baselines for holdouts

How much time should pass before you update the “baseline” version of your site for a hold-out group? “It depends on your goals,” CXL Founder Peep Laja explained. “You could leave it unchanged for three years, but if you want to measure the annual ROI, then you’d do yearly cycles.”

What about the degree of site change? “When it’s functionality, there’s a sense of permanence,” Cory Underwood, a Senior Programmer Analyst at L.L. Bean, told me. “When it’s messaging, you get into how effective and for how long it will be effective.”

Underwood continued:

There are times when you would want to get a longer read. You can see this in personalization. You target some segment with a completely different experience on the range of “never” to “always.” Say it won and you flip it to always. Six months later, is it still driving the return?

A hold-out group offers an answer. (So, too, Laja noted, could re-running your A/B test.) But you wouldn’t get an apples-to-apples comparison unless you accounted for seasonality between the two time periods.

In that way, a hold-out group is uniquely rewarding and challenging: It may mitigate seasonality in a completed A/B test but reintroduce it when comparing the hold-out group to the winner.

Omnichannel retailers like L.L. Bean manage further complexity: Demonstrating that website changes have a long-term positive impact on on-site behavior and offline activity. The added variables can extend the timeline for holdouts. Underwood has run hold-out groups for as long as two years (an anomaly, he conceded).

For test types and timelines that merit a hold-out group, implementation has its own considerations.

Implementing hold-out groups for tests

The implementation of holdouts isn’t formulaic. Superficially, it involves dividing your audience into one additional segment. (Hold-out segments often range from 1 to 10% of the total audience.) For example:

  • Control: Audience 1 (47.5%)
  • Variation A: Audience 2 (47.5%)
  • Hold-out: Audience 3 (5%)

Many A/B testing tools allow users to adjust weights to serve (or not serve) versions of a test to an audience. But not every test can take advantage of segmentation via testing platforms.
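The three-way split above can be sketched with deterministic hash-based bucketing, a common approach in testing tools. The scheme below is an illustrative assumption, not any specific platform's implementation:

```python
# Sketch of deterministic audience splitting with a hold-out segment.
# Segment names and weights mirror the example above; the hashing scheme
# is an assumption, not any particular testing tool's implementation.
import hashlib

SEGMENTS = [("control", 47.5), ("variation_a", 47.5), ("holdout", 5.0)]

def assign_segment(user_id: str, experiment: str = "cta_popup") -> str:
    # Hash user + experiment so assignment is stable across visits
    # and independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 1000 / 10  # uniform value in [0, 100)
    cumulative = 0.0
    for name, weight in SEGMENTS:
        cumulative += weight
        if bucket < cumulative:
            return name
    return SEGMENTS[-1][0]

# The same user always lands in the same segment:
print(assign_segment("user-42") == assign_segment("user-42"))  # True
```

Stability matters here: because the assignment is a pure function of the user ID, the holdout keeps seeing the old experience for months without the tool having to store per-user state.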

As Underwood explained, a decision to roll out tests on the client side (using a testing tool) versus server side (through a CDN) hinges on two considerations:

  1. The scale of change. Large-scale DOM manipulations deployed via client-side rollouts risk a slow and glitchy user experience. The greater the difference between versions of the site involved in a test (like a hold-out that preserves an entirely different homepage design), the more that server-side delivery makes sense.
  2. The specificity of targeting. Testing tools connect user data with CRM data for more granular targeting; server-side segmentation is limited to broader attributes of anonymous users, such as location and device type, making it difficult to test changes for a narrowly targeted audience.

At a certain scale—say, for Pinterest’s quarter-billion monthly users—building a custom platform can expedite testing and integrate more effectively with in-house tools.

Pinterest built its own A/B testing platform to support more than 1,000 simultaneous tests. (Image source)

Perhaps most importantly, profitable implementation depends on knowing when a hold-out group improves a website—and when it’s a costly veneer to hide mistrust in the testing process.

When holdouts work

1. For large-scale changes

To the site. The more expensive a change will be to implement, the greater the justification to use a hold-out group before implementation.

After-the-fact holdouts for a non-reversible change make little sense. But advance testing to validate the long-term effect does. “As the risk goes up, the likelihood [of a holdout] also goes up,” summarized Underwood.

Often, Underwood said, marketing teams request holdouts to validate proposals for extensive site changes. A holdout that confirms the long-term value of their plans is persuasive to those who sign off on the investment.

To team priorities. John Egan, the Head of Growth Traffic Engineering at Pinterest, agrees with Underwood—a test that implicates larger changes deserves greater (or, at least, longer) scrutiny, which a holdout delivers.

But site development costs aren’t the only costs to consider. As Egan explained, holdouts also make sense when “there is an experiment that was a massive win and, as a result, will potentially cause a shift in the team’s strategy to really double down on that area.”

In those circumstances, according to Egan, a holdout typically lasts three to six months. That length is “enough time for us to be confident that this new strategy or tactic does indeed drive long-term results and doesn’t drive a short-term spike but long-term is net-negative.”

2. To measure the untrackable

Egan acknowledged that, while holdouts are standard at Pinterest, “we only run holdout tests for a small percentage of experiments.”

For Pinterest, the primary use case is to:

measure the impact of something that is difficult to fully measure just through tracking. For instance, we will run periodic holdouts where we turn off emails/notifications to a small number of users for a week or a month to see how much engagement emails/notifications drive and their impact on users’ long-term retention.

Egan detailed such an instance on Medium. His team wanted to test the impact of adding a badge number to push notifications. Their initial A/B test revealed that a badge number increased daily active users by 7% and boosted key engagement metrics.

Badge numbers drove a near-term lift, but would that lift endure? Egan’s team used a hold-out group to find out. (Image source)

Still, Egan wondered, “Is badging effective long-term, or does user fatigue eventually set in and make users immune to it?” To find out, Pinterest created a 1% hold-out group while rolling out the change to the other 99% of users.

The result? The initial 7% lift faded to 2.5% over the course of a year—still positive but less dramatic than the short-term results had forecast. (A subsequent change to the platform elevated the lift back to 4%.)

The badging group continued to outperform the hold-out group after more than a year, albeit less dramatically than initial test results showed. (Image source)
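Measured against a holdout, relative lift is a simple ratio. The rates below are hypothetical stand-ins that loosely echo the badge-number example:

```python
# Minimal sketch of computing relative lift against a hold-out group.
# The daily-active-user rates below are assumed for illustration only.

def relative_lift(treated_rate, holdout_rate):
    """Relative lift of the rolled-out experience over the holdout."""
    return treated_rate / holdout_rate - 1

# Hypothetical rates at launch and a year later:
launch = relative_lift(0.214, 0.200)   # ~7% lift right after rollout
year_1 = relative_lift(0.205, 0.200)   # fades to ~2.5% over a year

print(f"{launch:.1%}, {year_1:.1%}")  # 7.0%, 2.5%
```

The holdout denominator is what the short-term A/B test cannot supply once the winner is rolled out: without it, there is no "doing nothing" rate to divide by.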

The takeaway for Egan was clear: “In general, holdout groups should be used anytime there is a question about the long-term impact of a feature.”

3. To feed machine learning algorithms

Today, a Google search on “hold-out groups” is more likely to yield information for training machine learning algorithms than validating A/B tests. The two topics are not mutually exclusive.

As Egan explained, holdouts for machine learning algorithms “gather unbiased training data for the algorithm and ensure the machine learning algorithm is continuing to perform as expected.”

In this case, a hold-out is an outlier regarding look-back windows: “The holdouts for machine learning algorithms run forever.”

These use cases make sense, but all come with costs, which can quickly multiply:

  • Teams spend time identifying a hold-out segment.
  • Teams spend time maintaining the hold-out version of the website.
  • A portion of the audience doesn’t see a site change that tested better.

In some cases, the justification for a hold-out group derives not from a commitment to rigorous testing but from methodological mistrust.

When holdouts skirt the larger issue

Tim Stewart, a consultant for SiteSpect and trsdigital, is usually “setting up testing programs or rescuing them.” The latter, he noted, is more common.

As a consultant, he often meets directly with the C-Suite, a privilege many in-house optimization teams don’t enjoy. That access has made him a skeptic of using holdouts: “With holdouts, the answer to ‘Why?’ seems to be ‘We don’t trust our tests.’”

Stewart isn’t a full-blown contrarian. As he told me, he recognizes the benefits of hold-out groups to identify drop-offs from the novelty effect, monitor the cumulative effect of testing, and other rationales detailed previously.

But too often, Stewart continued, holdouts support statistically what teams fail to support relationally—the legitimacy of their process:

I understand what [CEOs] want. But testing does not give you an answer. It gives you a probability that the decision you make is in the right direction. Each one individually is only so useful. But if you structure a set of questions, the nth cumulative effect of learning and avoiding risk is worthwhile. That’s the faith-based part of it.

In other words, a valid testing process diminishes the need for holdouts. Running those tests, Stewart said, is:

a lot of money and effort and caveats [that] defers any kind of responsibility of explaining it to the business. For proving the business value, you should be proving it in other ways.

That’s especially true given the opportunity costs.

The opportunity costs of holdouts

Testing resources are limited, and using resources for holdouts slows the rate of testing. As Amazon’s Jeff Bezos declared, “Our success at Amazon is a function of how many experiments we do per year, per month, per week, per day.”

Opportunity costs can rise exponentially due to the complexity of managing hold-out groups, which businesses often underestimate.

Stewart has an analogy: Imagine a pond. Toss a large paving stone into the pond. How hard would it be to measure the size and effect of the ripples? Not too hard.

Now imagine throwing handfuls of pebbles into the ocean. What effect does each pebble have on the waves that pass by? What about at high tide or low? Or during a hurricane? Pebbles in the ocean, Stewart suggested, are a more accurate metaphor for holdouts, which must factor in everything from offline marketing campaigns to macroeconomic changes.

Can a hold-out group still provide an answer? Yes. But at what cost? As Stewart asked: What’s the ROI of statistical certainty measured to three decimal places instead of two if your control isn’t much of a control?

At a certain point, too, you need to include yet another variable: The impact on ROI from using holdouts to measure ROI. And, still, all this assumes that creating a hold-out group is feasible.

The illusion of feasibility

“There is no true hold-out,” Stewart contended. “Even on a control, there are some people who come in on different devices.” (Not to mention, Edgar Špongolts, our Director of Optimization at CXL, added, users with VPNs and Incognito browsers.)

Holdouts exacerbate the challenges of multi-device measurement: The longer a test runs, the more likely it is that someone deletes a cookie and ends up crossing from a “no test” to “test” segment. And every effort to limit sample pollution increases the costs—which slows the rollout of other tests.

Say you want to go down the rabbit hole to determine the ROI of a testing program—cost is no factor. As Stewart outlined, you’d need to do more than just hold out a segment of visitors from an updated site.

You’d need to withhold all test results from a parallel marketing team and, since websites are never static, allow them to make changes to the hold-out version based on gut instinct. Stewart has presented executives with that very scenario:

What we actually have to have is a holdout that includes all of our bad ideas and our good ideas. It’s not holding an audience—it’s running a site without the people who are making the changes seeing any of the test results. Why would we do that?! My point exactly.

Stewart doesn’t make his argument to eschew all use of holdouts. Instead, he aims to expose the misguided motivations that often call for it. Every test result offers probability, not certainty, and using hold-out groups under the false pretense that they’re immune to the ambiguities that plague other tests is naive—and wasteful.

A holdout doesn’t free analysts from dialogue with management, nor should management use a hold-out result to “catch out” teams or agencies when, from time to time, a test result fails to deliver on its initial promise.

“It’s not really about the math,” Stewart concluded. “It’s about the people.”


“Can you do it easily, cheaply, and with enough of your audience?” asked Stewart. Underwood and Egan have done it, but not because of testing efficiency alone.

Both have earned the autonomy to deploy holdouts sparingly. Their body of work—test after test whose results, months and years down the road, continue to fall within the bounds of their initial projections—built company-wide faith in their process.

Top-down trust in the testing process focuses the use of holdouts on their proper tasks:

  • Unearthing the easily reversible false positives that short-term tests periodically bury.
  • Confirming the long-term value of a high-cost change before investing the resources.

How “False Expertise” Can Damage Your Business—and How to Protect It

The post How “False Expertise” Can Damage Your Business—and How to Protect It appeared first on CXL.

When I last checked, there were 987,119 “thought leaders” on LinkedIn. Soon, there’ll be more than a million. How many of those do you trust?

“False expertise” is misidentified competence: We perceive expertise where there is none or evaluate expertise based on irrelevant factors.

False experts include legions of self-appointed “gurus” and “visionaries” who saturate social media with bad advice. But they’re not the only sources.

Our brains are hardwired to take shortcuts that bias our identification of expertise, helping charlatans thrive and warping our decision-making.

Why are we so bad at this?

Why we fall for false expertise

1. We’re bad at making rational decisions. Thinking, Fast and Slow, the seminal book by Nobel Prize–winning economist Daniel Kahneman, makes a clear case for human vulnerability in decision-making.

We like to think that we make rational, “slow” decisions. Most of the time, however, we’re using our much faster, less rational system to choose. It’s one reason we continue to fall for false expertise.

2. We want to validate our own perspective. Khalil Smith, the Practice Lead for Diversity and Inclusion at the NeuroLeadership Institute, summarizes the biases that lead us toward false expertise:

  • Similarity. “People like me are better than people who aren’t like me.”
  • Experience. “My perceptions of the world must be accurate.”
  • Expedience. “If it feels right, it must be true.”

Because of those biases, explains University of Utah Professor Bryan Bonner, we focus on “proxies of expertise” rather than expertise itself.

Those proxies can be anything from height (we tend to elect the taller political candidate) to gregariousness in a meeting or the extracurricular activities on a resume.

3. We fall victim to the Halo Effect. Even if we initially judge someone based on real expertise, we often overextend that evaluation—a cognitive bias known as the Halo Effect.

E. L. Thorndike first demonstrated the Halo Effect in the military by showing the high correlation among soldiers’ ratings for physique, intelligence, leadership, and character.

The Halo Effect falsely extends our perception of someone’s expertise to areas beyond it.

In the modern office, if someone has great creative ideas, the Halo Effect makes us more likely to admire that same person’s copywriting or management skills.

4. We overestimate our knowledge. Some people knowingly claim expertise they don’t have. Others aren’t aware of their deficiencies. The Dunning-Kruger effect highlights how those with the least knowledge are also the least capable of recognizing their ignorance, but subtler aspects of self-assessment affect perceived expertise, too.

How much do you know about Philadelphia, Pennsylvania? What about Acadia National Park? Or Monroe, Montana? If you’re familiar with all three, you’re not alone. But you’re also mistaken: There is no Monroe, Montana.

Researchers have shown that a higher self-assessment of topic knowledge leads to a greater likelihood that we’ll claim false expertise (like knowing about a city that doesn’t exist).

Not every motivation is malevolent—our brains may work harder to find any connection for topics we know well. But it’s nonetheless a cautionary tale about the “illusion of knowledge” or “overclaiming,” a condition to which experts are particularly vulnerable.

5. Traditional barriers to expertise have diminished. It costs less than $100 per year to run a website, and—unlike the print publishing era—no reputable editor or printing costs stand in the way of immediate, uncensored, worldwide distribution.

A conspiracy theorist may have a better-looking website (or larger Twitter following) than a renowned academic, and it’s left to the consumer to push aside those proxies.

Twitter verification adds credibility to blatant falsehoods.

The digital era also tempts us, David C. Baker writes, to engage in “expertise of convenience.” It takes only a few minutes to create a new webpage that targets a subset of your market, even if that market is outside of your wheelhouse.

Marketing campaigns, Baker argues, are now chalkboard specials: readily changeable menus of expertise that require no long-term commitment—unlike the 20-foot neon sign above a restaurant.

The result of these vulnerabilities is that we hire the wrong candidates, listen to the wrong people, and fail to differentiate our businesses.

How false expertise leads us to hire the wrong people

Resumes are noisy:

  • The name at the top may suggest gender and ethnicity.
  • A college choice may betray class status.
  • Extracurriculars may create a bond—or distance—between you and a candidate.

And none of those elements has anything to do with how well that person can do the job.

Research confirms our focus on “looking glass merit,” which results from interviewers—most of whom have minimal training—judging a candidate based on similarity. In short, we seek to validate our own characteristics: Hiring someone who’s like us reinforces our own value.

Familiarity, in particular, has dangers. One of our cognitive system’s favorite shortcuts is “familiar = safe.” The “mere exposure” effect helps explain why we continue to push similar songs to the top of the Billboard charts and why high-level brand awareness can ultimately lead to a sale.

In hiring, it means that our strongest biases exist where we least expect them—with characteristics so familiar as to seem standard. Just as our brain’s craving for familiarity swings wide the door for underqualified candidates, it also blocks pathways for those from different backgrounds.

(If individual hiring biases remind you of those demonstrated by prospects during a proposal process, they should.)

Mitigating the impact of false expertise on hiring

“Blind auditions” reduced the noise in orchestra auditions.

Reduce the noise. The implementation of “blind auditions” for leading orchestras—where applicants were heard but not seen—increased female membership from just 6% to 21% in a little over two decades.

Microsoft rolled out an “Inclusive Hiring” effort that’s tailored to those who would never make it through a traditional hiring process, like exceptional coders on the autism spectrum who may not have linear employment histories or engaging interview skills.

Other companies, like Blendoor, have developed technology platforms that automatically strip out some of the “noisiest” parts of resumes.

Strengthen the signal. Reducing the noise also increases the signal. And honing that signal, Smith explains, requires advance preparation. In the case of hiring, it means asking, “What is it that we’re actually hiring for?”

Smith concedes that no single process can help organizations answer that question, but he outlined a loose order of operations that, simple as it may be, is too often ignored:

  1. First, define the skills and characteristics that are essential for job performance.
  2. Then identify the questions that you need to ask potential employees.

Even with the right people in an organization, false expertise remains a daily threat.

How false expertise elevates bad opinions and strategies

Managers and employees encounter false expertise in three ways:

  1. Promotions
  2. Meetings
  3. Strategy

The issues associated with promotions parallel those for hiring—an unstructured promotion process risks greater deference to false expertise.

The other sources of false expertise, however, bring their own challenges.

Meetings: Being the expert doesn’t mean you’ll be heard

Groups are superior to individuals in recognizing an answer as correct when it comes up. But when everybody in a group is susceptible to similar biases, groups are inferior to individuals, because groups tend to be more extreme than individuals. – Daniel Kahneman

As Kahneman explained in an interview, groups are overly optimistic and suppress individual dissent. (He cites the U.S. invasion of Iraq in 2003 as a classic example.) That can lead to a “risky shift”—a polarized consensus based on false expertise, or, as it’s known more commonly, groupthink.

What makes opinions stand out in a meeting? The research is divided. One study demonstrated that even when meeting attendees recognize expertise (i.e. the group knows who the most knowledgeable person is), groups take the expert-recommended path just 62% of the time.

In other words, the problem is not simply recognizing expertise but also deferring to it. How often (or loudly or persuasively) someone speaks may serve as the proxy for expertise. As Smith cautioned: “Volubility is not trust.”

In contrast, a study that reviewed audio recordings from NASA meetings found that the amount of “air time” affected the perceived influence more than the actual influence, which usually deferred to “real” expertise.

Ways to focus meetings on real expertise

The push and pull between research on System 1 (fast) and System 2 (slow) thinking won’t resolve soon. Nonetheless, there are ways to combat the impact of a particularly charismatic meeting attendee:

  • Set up “if-then” plans. According to Smith, if you’re agreeing with a dominant personality (or a beautiful slide deck), then get a quieter person to paraphrase the same message after the meeting (or review the argument in plain-text notes). Is it still as persuasive?

There are other strategies, too, such as one put forth by Utah’s Bonner:

  • Anonymize ideation. Write out ideas on index cards or in shared documents, then review them anonymously. You’ll judge them only on the strength of the idea, not the person pitching it.

In some ways, reducing false expertise in hiring and meeting management is easier—these are decisions and processes that happen over and over again.

It’s far harder to limit the impact of false expertise on one-time strategic decisions.

Strategy: Adding accountability to decision-making

Even if you could neutralize false expertise entirely, strategic decisions would still be wrong from time to time. Business climates change rapidly. Unexpected events occur.

More likely, you’ll fail to notice a bias, or recognize one but not know how to remove it. Strategic decisions are some of the largest decisions your organization makes, but because each one is unique, it’s more difficult to defend them against false expertise.

Ways to focus strategic decisions on real expertise

At the very least, Smith offers, document your current decision-making to make it easier to review mistakes in the future:

  1. Detail your process. Write out an explicit thought process: “We decided X, which led us to conclude Y, which is why we’re going with strategy Z.”
  2. Incentivize awareness. Celebrate moments when team members identify flawed thinking or decision-making—encourage people to identify bias.
  3. Slow it down. Take a short break before making a big decision. It increases the chances of making a “slow,” System 2 decision.
  4. Host a “pre-mortem.” Assume your planned decision was wrong and work backward to understand why. You may uncover current biases.

The right people and right processes are critical components to take on the most challenging work—differentiating your business from those who are all too happy to claim unwarranted authority.

How to differentiate yourself in an ocean of false experts

Hiring and management are internal challenges. But the most frustrating aspect of false expertise may be its elevation of undeserving people and businesses to the top of the industry.

Differentiating your business depends on understanding how the hucksters got there and how to fight them off.

The big business of “thought leadership”

We’ve poisoned the well. Political scientist Daniel Drezner argues that our society traded skeptical, analytical “public intellectuals” for simplistic, rah-rah “thought leaders.”

In the words of Matthew Stewart, author of The Management Myth: Why the Experts Keep Getting It Wrong, we’ve fallen for “corporate mysticism.” Why have thought leaders run amok? Because being a “thought leader” has become incredibly lucrative.

Certainly, in digital marketing, it’s a shortcut to success. Consumers don’t know a good agency from a bad one, so social media presence or a major speaking gig becomes an easy proxy.

PR professionals and marketers have recognized the potential value of that proxy and shoved their leaders into the spotlight. Some CEOs join Twitter, in other words, not because they engage honestly and regularly but because it generates leads or boosts stock prices.

When marketing goals are the primary motivator—not real expertise or a desire to share it—Minimum Viable Expertise proliferates.

How an obsession with “growth” blunts expertise

In Baker’s The Business of Expertise, he outlines a path for the development of “hard-won, noninterchangeable expertise.” Baker sees a continuum: on the left is specialization; on the right is general knowledge.

Expertise grows as you move to the left, but the number of potential clients increases as you move to the right. For example, a social media marketing agency for credit unions (Baker’s example) has tremendous expertise but limited market appeal. A generic “digital marketing agency,” on the other hand, can pitch any client but has no niche.

The biggest mistake that many make, he contends, is drawing a massive circle around all experience so that no opportunities fall outside. As Baker argues, you must have the courage to specialize to differentiate yourself and justify a price premium.

That means declining bad-fit opportunities for client work, which, in the near term, slows business growth. The challenge, of course, is that industry publications continue to laud the fastest growing companies—they hand the biggest microphone to those with the least expertise.

Here’s how to keep that from happening.

How to establish expertise for your business


1. Credentials will not rescue you.

In his book, Matthew Stewart highlights the proliferation of MBAs and laments the shallowness of course offerings. (McKinsey admitted that its MBA-less employees “are at least as successful” as those with credentials.) As Stewart argues, an MBA is training; experience is education.

Smith carried the point forward: Credentials can become part of the false expertise marketplace—a degree doesn’t guarantee job performance and, in some cases (like an obsession with Ivy League grads), may be a distraction.

For marketing, in particular, credentials are lacking. Credentials successfully separated doctors from snake-oil salesmen, but, as Smith noted, that process took decades, and the stakes for marketing likely will never justify such a rigorous framework.

2. Focus on process, not just results.

Does your website highlight thinking or implementation? Baker asks the same question in his book. After all, in the marketing world, what website doesn’t have case studies with triple-digit growth or a stack of impressive client logos below the fold?

“The past is not always a great predictor,” noted Smith. “You have to show your work.” That commitment to detailing process, not just results, is key to separating real experts from false ones.

“There’s a world of difference between experienced UX designers and people who read a blog post about it,” explained DePalma Studios’ Zach Watson. “One of the most important is having a proven research process.”

“However,” Watson continued, “most of our target market doesn’t understand this, so we’ve made it a huge part of our content strategy. By educating our audience on the critical nature of user research, we’re putting distance between us and other agencies that use graphic designers to do UX work.”

3. Challenge false expertise.

“You have an opportunity to draw a distinction between bad operators and what you do,” Smith explained. How? Educate and undermine. It’s not enough to share best practices—you must also call out false expertise.

As Smith envisioned, “Whether you buy from us or not, let’s educate you with what ‘good’ looks like. I may not get your business right now, but you’ll understand that I’m doing this for your benefit and my benefit. It’s mutually beneficial.”

It’s also the long play: To push your industry’s real experts forward and bury its false prophets, you must change how consumers think about the entire field.

4. Establish authority first.

Widespread false expertise can also undermine preferred marketing strategies. Dr. Nicole Prause, a neuroscience researcher, has continually battled an array of pseudo-science that plagues her field of sexual physiology.

For her company, Liberos, she abandoned a preferred, casual tone (such as using her first name) to keep signals of expertise at the forefront. Prause also focuses on outreach to credible media sources and cites media interviews on her website—a typically unnecessary approach for a neuroscience laboratory engaged in academic research.


Rooting out false expertise can improve the hiring process by reducing the “noise” of resumes. In meeting management, anonymous ideation or post-meeting fact-checking can diminish the influence of a persuasive presentation. And a pre-mortem on strategic decisions can spot biases you’ve so far ignored.

Improvements in each of those areas make it easier for you to hone your real expertise and differentiate your business from the hundreds (or thousands) of “thought leaders” who believe that haphazard growth conveys expertise.

When I spoke with Smith, he was waiting on a flight at Dulles. He relayed an example unfolding before him. In this instance, the “experts” were potential security threats: “TSA is constantly trying to read behavior. But are they looking for a certain demographic or style? The way someone looks or sounds? Are they doing due diligence?”

Combating false expertise—in your head or others’—isn’t easy. “You can’t do this kind of rigorous decision-making for everything,” conceded Smith. If you’re hungry, he implored, just pick a restaurant.

“But do the hard work and avoid lazy decisions.”

The post How “False Expertise” Can Damage Your Business—and How to Protect It appeared first on CXL.