It takes one wrong word to put your foot in your mouth. We’ve all done it and, in the process, squandered an opportunity to impress someone (or some crowd).
Copy is a bridge between your product and your customers. Design matters, too, but it’s context for the message, not the message itself. That’s why copy is twice as important.
But how do you improve it? How do you know which word or phrase might tank a sale? Or what missing detail preserves just enough uncertainty to keep someone from clicking “Add to cart”?
While copy testing has been a decades-long standard for brand spots, it wasn’t built for the modern age. And A/B testing, another way to pit phrase against phrase, doesn’t tell you why a version won. That leaves you with a lot of uncertainty, which HiPPOs (the highest-paid person’s opinion) are all too happy to resolve.
A modern copy testing methodology, by contrast, delivers fast, affordable bursts of quantitative and qualitative feedback for direct-response copy.
Pre-testing: great for singular brand campaigns—and little else
Copy testing isn’t new. It came from “pre-testing,” and it made more sense when companies ran singular brand campaigns.
If you needed to find the pitch with the highest “day-after recall” because you were running the same TV ad for weeks (or months), pre-testing helped protect you from a total flop.
You gathered a “consumer jury,” exposed them to variations of an ad, then measured the likeability, persuasion, and recall of ads with quantitative (e.g., “On a scale of 1 to 100…”) and qualitative (i.e., open-ended questions) methods.
Modern versions show ads to consumers who watch them online, from their homes. Even today, pre-testing is slow and expensive. Even contemporary, streamlined methods of pre-testing have price tags between $2,000 and $6,500 per ad.
That’s no problem for months-in-development campaigns and seven-figure ad budgets. It’s ludicrous for a startup and our rapid-fire environment of rotating social media ads and landing page tweaks.
As Frito-Lay’s CMO Ram Krishnan concedes, “It’s very tough to test just because of the volume of content we are putting out.” Even if you’re spending millions on Google and Facebook, the traditional methodology has pitfalls.
What copy testing will (and won’t) achieve
If you ask a focus group to choose a color for your brand, you won’t end up with fuchsia.
The ad that performed the best on average, especially for quantitative metrics, delivers your typically inoffensive McDonald’s spot—the safety scissors of the ad world. Sacrifice reward; avoid risk.
Some marketers take the opposite approach by avoiding copy testing. Neither Allstate’s “Mayhem” campaign nor Old Spice’s “The Man Your Man Could Smell Like” campaign went through copy testing. (There was “a lot of pressure to kill” the former, according to Allstate’s former VP of marketing.)
Being weird was the point. As Oscar Mayer’s former ad lead Tom Bick recalls, “We literally used what I fondly called the F-me test. Is it bold, will it possibly ruffle feathers internally, will consumers say, ‘I can’t believe they did that’?”
Testing can lead to false confidence, cautions Bick:
It gives you the illusion that you are being a disciplined marketer and it gives you a sense of confidence, be it false, that you are doing the right thing.
Advertising is about building trust and a feeling about a brand that predisposes people to liking you [. . .] that then allows more rational messaging maybe to come through the filter. And most copy tests don’t reward you for that.
A/B testing copy has limitations, too.
Why A/B testing isn’t a replacement
A/B testing can tell you which version of copy generated more leads or sales. It tells you nothing about why a given variation won.
For ad campaigns, A/B testing also risks spending a lot of cash on a losing variation, one whose shortcomings you could’ve sussed out in advance.
A/B testing further assumes that you have enough traffic to test to begin with, which becomes increasingly less likely as users move down the funnel. (A blog post earns more traffic than a product page, which earns more traffic than a checkout page…)
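To get a feel for how much traffic “enough” is, here is a minimal sketch of a sample-size estimate. It assumes the common normal-approximation formula for comparing two conversion rates; the function name, the 3% baseline, and the 10% relative lift are illustrative assumptions, not figures from any real test.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variation(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation to detect a conversion lift.

    Normal-approximation formula for a two-sided, two-proportion test.
    The default alpha and power are conventional choices, assumed here.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical page: 3% baseline conversion, hoping to detect a 10% relative lift.
needed = visitors_per_variation(baseline_rate=0.03, relative_lift=0.10)
print(f"About {needed:,} visitors per variation")
```

Under these assumptions, the estimate lands at roughly 53,000 visitors per variation, which is a tall order for a page deep in the funnel.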
Even then, do you really want to test your wild ideas on purchase-ready buyers? As Unilever’s Elliot Roazen notes, that’s an expensive, haphazard experimentation process:
Creative and product teams will work to put together sales pages and then launch the pages with paid media behind them, tweaking the page’s copy and design based on performance metrics.
The problem is that these assumptions, more often than not, are merely hunches, and paid traffic isn’t exactly cheap.
There’s also a risk of lost context. If you’re testing a brand new value proposition on your homepage, what happens if the product page alludes to the value prop of your control? Or your drip campaign touts unrelated benefits? A/B testing your messaging carelessly can turn your marketing copy into a patchwork quilt.
If the copy in a variation is ignored or contradicted elsewhere in the funnel, how will you know the impact of copy changes to one page? You won’t.
A modern approach to copy testing
Direct-response copy is the driving force of modern marketing. Compared to pre-testing of TV campaigns, it has different needs. Recall is less important—attention (e.g., on a landing page) is already won and doesn’t need to be maintained for long.
A modern, data-driven approach blends quantitative and qualitative data to tell you:
- What is or isn’t working (quantitative). How easy is it to understand your message? How much do people care about that pitch? How badly do they want to keep reading?
- Why it is or isn’t working (qualitative). Which words and phrases make a difference? Which are missing? What turns people off?
Peep Laja, founder of CXL Institute, explains what that looks like for us:
CXL Institute has 100+ landing pages—one for each course and training program, and a number of PPC landing pages. These pages are copy-heavy, with hundreds of words because CXL Institute is a complicated, expensive product.
The way to increase the conversion rate on those pages is to improve the copy. But web analytics or heat maps can’t tell you anything about the quality of your words.
Most get by with opinions from their colleagues—because they’re easy to source. Of course, the constituent whose opinion matters most is the customer.
The process for getting answers for any use case breaks down into four steps.
A four-step copy testing methodology
Copy testing is a research methodology, not a set-in-stone process. There’s flexibility based on who you are and the questions you need to answer.
You can tailor these broad steps to your needs.
Step 1: Develop research goals and questions.
Make a list of things you want to learn from copy testing: What is it that you want to know? Typically, you want to focus on uncovering friction and copy blind spots.
You might have research questions about the overall copy (“What doubts and hesitations do you have about this?”) or a specific section of the page (“On a scale of 1 to 5, how interesting is this?”).
“There’s no limit to what questions you might ask,” says Laja. “You start with research goals. Then, you formulate the question accordingly. Few do copy testing after a page is live, although this is low-hanging fruit.”
As Roazen has found, copy testing can also help refine product messaging prior to a launch:
Our mandate has recently switched to the creation of new brands, which (roughly speaking) follows a workflow of ideation, validation, launch, and optimization. Between each of these stages, we sense check our communications with feedback from target consumers.
For some concepts, the feedback from these tests results in a serious pivot. You really have only a few seconds to communicate the “what” you are selling, the value that this product provides, and how much you’re selling it for. In rounds of copy testing, consumers have said our product pages do not clearly articulate one of these key communication points, meaning we have to figure out a change that makes this clearer.
By rigorously copy testing your sales page, you ensure that you are getting verbatim, qualitative feedback to refine your copy further. This gives you a head start when you finally do launch. Essentially, you’re starting on second or third base.
Step 2: Recruit panelists.
You need folks to be part of your study. This is qualitative research, so as few as five people will add value, but the optimal range is around 15 people.
Find people who are interested in your offer (i.e., your target audience) but aren’t customers yet (so they’re unbiased).
For consumer products, interest-based Facebook groups are a good place to find people. For specific B2B folks (targeting by title + industry), LinkedIn is a good bet.
You need to compensate the panelists for their time (e.g., gift cards, real money). The more niche or hard-to-get people are, the more you need to pay.
Step 3: Facilitate research sessions.
Run individual sessions with each panelist. Any video conferencing tool with screen-sharing functionality works. As the panelist reads the copy, ask the research questions you’ve prepared.
Step 4: Gather all the research data in one place.
The simplest way to analyze the data is with a spreadsheet. Gather all the questions and answers you got from the panelists.
Like any research, it takes time and effort. (The way around it is to use a tool like Copytesting, which automates all of that for you.)
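If you’re compiling the data by hand, the spreadsheet can be as simple as one row per panelist per question. The sketch below is one possible layout, not a prescribed format: the column names and helper functions are assumptions, and the answers are invented placeholders.

```python
import csv
import io
from collections import defaultdict

DOUBTS = "What doubts and hesitations do you have about this?"
INTEREST = "On a scale of 1 to 5, how interesting is this?"

# Hypothetical session notes. The answers are placeholders, not real data.
responses = [
    {"panelist": "P1", "question": DOUBTS, "answer": "Placeholder: unsure what the price includes."},
    {"panelist": "P2", "question": DOUBTS, "answer": "Placeholder: not clear who this is for."},
    {"panelist": "P1", "question": INTEREST, "answer": "4"},
]


def group_by_question(rows):
    """Collect every (panelist, answer) pair under the question it answers."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["question"]].append((row["panelist"], row["answer"]))
    return dict(grouped)


def to_csv(rows):
    """Serialize the rows as CSV text that any spreadsheet tool can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["panelist", "question", "answer"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


for question, answers in group_by_question(responses).items():
    print(question)
    for panelist, answer in answers:
        print(f"  {panelist}: {answer}")
```

Grouping by question, rather than by panelist, makes it easier to spot the same doubt coming up across sessions.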
If you’re wondering what you’ll learn, here are some examples, broken down by quantitative and qualitative results.
Quantitative and qualitative results from copy testing
Examples of quantitative feedback
Quantitative feedback from copy tests tells you:
- How clear a message is (e.g., via a Likert scale). Do users understand your headline? Your value proposition? Is jargon or awkward phrasing getting in the way? Clarity beats persuasion.
- How much people care. Are you talking about the things that people care about? A clear pitch for the wrong benefit isn’t persuasive.
- How much people want to keep reading. If the goal of a headline is to interest people in reading what comes after, are you doing a good job?
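As a rough illustration of how ratings like these could be rolled up, here is a minimal sketch that averages 1-to-5 scores per dimension and reports the share of panelists who gave a 4 or 5. The scores are made up, the metric names are assumptions, and this is not how Copytesting calculates its scores.

```python
# Hypothetical 1-to-5 ratings from five panelists (invented for illustration).
ratings = {
    "clarity": [5, 4, 4, 5, 4],
    "care": [3, 2, 4, 3, 3],
    "keep_reading": [4, 3, 4, 3, 4],
}


def summarize_ratings(ratings, top_box=4):
    """Return the mean score and the share of ratings at or above top_box."""
    summary = {}
    for metric, scores in ratings.items():
        summary[metric] = {
            "mean": round(sum(scores) / len(scores), 1),
            "top_box_share": sum(s >= top_box for s in scores) / len(scores),
        }
    return summary


for metric, stats in summarize_ratings(ratings).items():
    print(f"{metric}: mean {stats['mean']}, {stats['top_box_share']:.0%} rated 4 or higher")
```

With a panel this small, treat the numbers as a signal of where to dig into the qualitative feedback, not as a statistically reliable score.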
For example, in a test on Copytesting, Kion Flex scored well (4.3 out of 5) for clarity. The product describes the problem it solves: “mild, temporary joint discomfort.”
But while clear, the messaging is general. Is it best as a daily supplement? For injury recovery? Aging joints?
Readers cared less (in Copytesting parlance, the “CareScore” was lower) about the points made. A generic use-case robs readers of the “this is exactly what I’m looking for” moment. A supplement for any joint discomfort doesn’t sound like the one I need for my issue.
Compare that to Lambda School, which scored exceptionally well on CareScore.
The headline certainly helps—they’re “the school that invests in you.” But they back that up by addressing a primary anxiety in higher education and a barrier for many: “pay no tuition until you’re hired.”
These interpretations, of course, would be speculative without the qualitative feedback to support them.
Examples of qualitative feedback
When it comes to copy, the problem often isn’t the wrong words but the missing ones—the specifics of which you won’t uncover without qualitative feedback.
That lack of information costs you sales, found Nielsen Norman Group:
In our e-commerce studies, we found that 20% of the overall task failures in the study—times when users failed to successfully complete a purchase when asked to do so—could be attributed to incomplete or unclear product information.
Supply sells a $75 single-blade razor. But its copy promises the same benefits as every other razor—less irritation, nicks, and bumps.
Why this razor—at this price point? Consumers are unsure:
- “I’d ultimately like to ask what makes this better than other similar products on the market?”
- “I’d like to know more about the design of the handle and why it looks the way it does. Is it made to be disposable, or how long will it last?”
- “Do people who bought it think it’s worth $75? How much are extra blades?”
The feedback is a laundry list of questions that crave specifics on exactly how it works, the materials of its construction, and performance differences between a single-blade razor and the ubiquitous three-blade varieties.
Despite highlighting real, ROI-focused outcomes, testers were skeptical. “This sounds great,” Rost heard again and again, “but we don’t believe it.”
Rost and his team realized they needed to embed details about who achieved those results (e.g., testimonials from real people at real companies) and explain how they did it—the “meat and potatoes” of the process.
In other instances, the words that are present cause problems.
But how far is too far?
Quantitative feedback alone about whether the above copy was persuasive (or, in the context of an A/B test, whether that variation converted better) wouldn’t reveal which phrases resonated or put people off.
Here’s what people had to say about the paragraphs above:
- “To me that sounds really militant.”
- “It sounds rather elitist.”
- “This section turned me off. It comes across as haughty and unnecessarily arrogant.”
- “I don’t like that it says too hard for most, because that sounds a bit snobby.”
If we were going for elitist or arrogant, we nailed it. But we weren’t.
Some respondents were “intrigued” by the pitch of the courses as “challenging,” but, overall, the aggressiveness of the copy made us seem like jerks. So we dialed it back:
A new group of testers validated those changes:
- “The tone of the rest of the text does a good job of implying the type of commitment that’s needed to learn something of value via their course and website.”
- “It’s refreshing to see a program that discloses that effort is required.”
- “I think this program can be trusted since it says that people should only take this course if they are serious about their careers.”
Some people still thought it was a bit much, but then again, CXL Institute isn’t for everyone :)
At an even more granular level, we identified trigger words that really turned people off. “Badass,” apparently, is one of those words:
- “I think this part of the segment is unnecessary: ‘It takes real badass.’ I honestly think this is pretty cheesy and takes away from the professionalism of the program.”
- “I don’t like the swear word. It sounds like it takes effort to do the program but it could have been more professionally said.”
- “The whole ‘badass’ phrasing is so dodgy it doesn’t feel like the big names that train with the company can be legitimate.”
For SwipeGuide, Rost catalogs such keywords, good and bad:
We’ve gotten great insight into what kind of language turns people on, sparks interest, or makes them skeptical.
We now have a list of keywords that B2B-minded people are looking for in a benefits page. When I go back to implement revisions, I can target keywords that are unclear.
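A keyword catalog can start as a simple tally of how many comments mention each suspect word. The sketch below reuses a few of the tester comments quoted in this article; the list of candidate words and the function name are assumptions for illustration.

```python
import re
from collections import Counter

# Tester comments quoted earlier in this article.
comments = [
    "It sounds rather elitist.",
    "I don’t like that it says too hard for most, because that sounds a bit snobby.",
    "I think this part of the segment is unnecessary: ‘It takes real badass.’ "
    "I honestly think this is pretty cheesy and takes away from the "
    "professionalism of the program.",
    "I don’t like the swear word. It sounds like it takes effort to do the "
    "program but it could have been more professionally said.",
    "The whole ‘badass’ phrasing is so dodgy it doesn’t feel like the big "
    "names that train with the company can be legitimate.",
]

# Hypothetical watch list; in practice it grows as you read the feedback.
candidate_words = ["badass", "elitist", "snobby"]


def count_mentions(comments, words):
    """Count how many comments mention each word at least once."""
    counts = Counter()
    for comment in comments:
        tokens = set(re.findall(r"[a-z]+", comment.lower()))
        for word in words:
            if word in tokens:
                counts[word] += 1
    return counts


for word, total in count_mentions(comments, candidate_words).most_common():
    print(f"{word}: mentioned in {total} comment(s)")
```

Note what the tally misses: the “swear word” comment is about the same phrase but never names it, so a count like this supplements reading the feedback rather than replacing it.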
If you’re writing copy—or accountable for its success—you want to know this stuff. Otherwise, you risk two things:
- Running an expensive ad campaign with copy that doesn’t work.
- Throwing out a bunch of good copy because you don’t realize that one word is poisonous.
Because qualitative feedback helps you understand exactly where you’ve gone too far, you can take risks—rather than staying in the safe center.
As Chef Sean Brock writes, “Overseason something with salt and acid just so you know what is too much. Then ride the line, and you’ll find your balance.” Without qualitative feedback, you’re throwing out the whole plate of food, still none the wiser.
Modern copy testing delivers data to support—or challenge—choices for direct-response copy. It also gives marketers the qualitative feedback to know what needs to be changed, be it a single word or whole paragraphs.
You may be happy that a percentage of reviewers think your copy is weird. Success, as with pre-testing, may not be about maxing out your quantitative scores. But, armed with information on why reviewers think what they think, you’ll know the risks and rewards you’re choosing.