Every once in a while, someone will ask us: why is there no way to sort landing page templates in Thrive Architect by conversion rate?
We get this question because one of our competitors advertises sorting landing pages by conversion rate as a great feature and advantage of their tool. To clarify: we're not talking about ranking the highest converting landing pages among those you've used on your own site. We're talking about sorting all landing page templates based on the average conversion rates across all users.
At first glance, this might seem like a good idea. I mean, who wouldn't want to be able to pick the highest converting template to start with?
In this post, I'll explain why sorting templates by conversion rate is wrong on, like, SO MANY levels...
How to Pick the Fastest Running Shoe
Let's forget about landing pages for a moment. Let's say you're in the market for a new running shoe. And you come across a store that features a list of all of their running shoes, sorted by speed.
In a running shoe, faster is better, so this is great, right? You can pick among the fastest shoes, using this list.
But wait a minute - how is this data collected? How does this list come about?
You ask a store clerk and she tells you that the data comes from all the customers' real world use of the shoes. The store has aggregate data that shows how fast people are going in each of the shoes and can average out all those values to sort the shoes by speed.
Here's the problem: the only thing we know is the shoe model and the average speed among thousands of people using the shoe.
What we don't know includes:
- The age of the runners.
- The distance the shoes are used for (100m dash and marathon data are all lumped together).
- The terrain the shoes are used on.
- The athleticism of the wearers (data from professional athletes and hobby runners are lumped together).
- And many more factors...
As you can see, the factors we don't know make a much bigger difference than the factors we do know.
Sure, a better shoe model might help you run a bit faster, but the difference in speed between a senior citizen on a leisurely Sunday run and a professional 100m sprinter during a record attempt is greater than the difference between individual shoe models, by many orders of magnitude.
The difference in the speeds of different shoe models is therefore much more determined by factors we don't know and can't control for than by factors we do know.
And that, unfortunately, makes our list of shoe models sorted by speed completely useless.
What Shoes and Page Templates Have in Common
Okay, why am I talking about running shoes?
All the problems mentioned in the shoe example apply to landing page templates as well. It's just a bit more abstract and more difficult to visualize than with shoes.
Here's what we could know and measure about how landing page templates perform:
- Number of visitors and conversions.
- The average conversion rate.
- How many people use each template.
- The broad category of the templates (e.g. sales pages and opt-in pages).
What we can't know and control for includes:
- The copy used on the page.
- Images, elements added, elements removed, changes made to the template (we could know them but not quantify them).
- The type and quality of the offer being presented.
- The price of products being sold.
- The "temperature" of traffic being sent to the page (think: traffic from an email list of fans vs. cold traffic from PPC ads).
- And many more factors...
We have the same problem as with the hypothetical shoe example above. The difference that these unknown factors make is vastly greater than the difference a template makes.
Example Comparison
A shoe worn by a professional athlete in a race will go much faster than a shoe worn by a senior citizen going for a stroll, regardless of the shoe model. A landing page selling a $5 product to an audience of long-time fans will have a vastly higher conversion rate than a landing page selling a $2,000 product to cold traffic, regardless of the template used.
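We can put rough numbers on this with a small simulation (the effect sizes here are purely illustrative assumptions, not real data from any tool): suppose unmeasured factors like the offer and traffic quality can swing a page's base conversion rate anywhere from 0.1% to 20%, while the template itself contributes at most a 20% relative lift. Even when one template is genuinely better, the observed averages frequently rank the two the wrong way around:

```python
import random

random.seed(1)

# Illustrative assumptions, not real data: unmeasured factors (offer,
# price, traffic temperature) set the base conversion rate anywhere
# from 0.1% to 20%, while the template adds at most a 20% relative lift.
TEMPLATE_LIFT = {"A": 1.0, "B": 1.2}  # Template B is "truly" 20% better


def page_cr(template):
    # Each page draws an unknown offer/traffic quality, then the
    # template applies its (comparatively tiny) multiplier.
    base = random.choice([0.001, 0.01, 0.05, 0.20])
    return base * TEMPLATE_LIFT[template]


def avg_cr(template, n_pages):
    return sum(page_cr(template) for _ in range(n_pages)) / n_pages


# How often does the truly worse Template A *look* better when each
# template is averaged over 10 real-world pages?
flips = sum(avg_cr("A", 10) > avg_cr("B", 10) for _ in range(1000))
print(f"Template A ranked above B in {flips} of 1000 simulated comparisons")
```

In this toy setup the "wrong" template wins the comparison a large fraction of the time, because the unmeasured factors dominate the template's contribution to the variance.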
Can't We Categorize?
An opt-in page for a free offer will always have a higher conversion rate than a sales page. So, what if we split up the templates by categories? We can compare sales pages to sales pages, opt-in pages to opt-in pages and so on. Doesn't this remove some of the randomness and lead to a useful apples-to-apples comparison?
No. Here are some examples to illustrate why:
Example 1
| | Offer 1 | Offer 2 |
| --- | --- | --- |
| Type | Sales page | Sales page |
| Product | Simple, universally appealing | Complex, professional, niche |
| Price | $1 | $4,000 |
| CR | 6.9% | 0.72% |
| Design | Template A | Template B |
Template A is used on a page with almost 10x the conversion rate (CR) of Template B. But does that really mean Template A is better?
Example 2
| | Offer 3 | Offer 4 |
| --- | --- | --- |
| Type | Opt-in page | Opt-in page |
| Offer | "Subscribe to our newsletter" | Valuable, relevant free course |
| CR | 1.2% | 15.8% |
| Design | Template C | Template D |
Template D performs much better, but how much does that have to do with the template and how much with the difference between the two offers?
Example 3
| | Offer 5 | Offer 6 |
| --- | --- | --- |
| Type | Webinar registration | Webinar registration |
| Copy | Excellent copy, crafted by a pro | Boring, generic, typos everywhere |
| CR | 19.2% | 1.4% |
| Design | Template E | Template F |
Template F seems to be a lot worse, but what would happen if the copywriting were elevated to a much higher level? Would the difference between Templates E and F still exist?
Copywriting, pricing, the quality and value of an offer presented on the page: these are all factors that can't be controlled for and that make a huge difference to the performance of a page.
What if You Have a Ton of Data?
If we gather data from tens of thousands, hundreds of thousands or even millions of users, doesn't it all average out, leaving us with a useful list?
No. More data of this kind doesn't average out in any meaningful way. This is a classic big data problem: when we have a lot of data and we run models on it, it always seems like something relevant is happening. But you can measure what is effectively just noise and still find patterns in it. That doesn't mean you're measuring anything relevant or getting any real information out of the data.
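Here's a minimal sketch of that "patterns in noise" problem (the numbers are made up for illustration): suppose 100 templates are all literally identical in effect, so every page converts at exactly 2%, and the only differences in the data are sampling noise. Ranking them by observed conversion rate still produces an impressive-looking league table:

```python
import random

random.seed(0)

TRUE_CR = 0.02  # every template converts at exactly the same true rate
observed = []
for _ in range(100):  # 100 identical templates, 2,000 visitors each
    conversions = sum(random.random() < TRUE_CR for _ in range(2000))
    observed.append(conversions / 2000)

observed.sort(reverse=True)
print(f"'Best' template: {observed[0]:.2%}, 'worst' template: {observed[-1]:.2%}")
# The "best" template appears to convert far better than the "worst"
# one, yet every difference in this data is pure sampling noise.
```

A ranking built from this data looks precise and decisive, but it carries zero information about which template to pick.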
You can think of it like this: measurements of what matters most to the conversion rate of a page are either imprecise or non-existent. You can't add up many imprecise measurements and average them out to reach a precise measurement.
Imagine that you have a scale made for weighing people. It's made to measure many kilograms/pounds and it's precise down to about 100 grams (0.22 lbs). If you try to weigh a needle, which weighs less than 1 gram, this scale won't give you any useful data.
And what if you weigh the needle 1,000 times and calculate the average? It makes no difference at all. You still won't be any closer to knowing the true weight of the needle.
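As a quick sanity check (a toy model of the scale, not a statistics proof): a scale that rounds to the nearest 100 grams reports 0 for a 0.8 gram needle on every single attempt, so the average of 1,000 readings is still 0:

```python
# Toy model: a scale made for weighing people, rounding to the nearest 100 g.
def scale_reading(true_grams, precision_grams=100):
    return round(true_grams / precision_grams) * precision_grams


NEEDLE = 0.8  # grams; far below the scale's resolution
readings = [scale_reading(NEEDLE) for _ in range(1000)]
print(sum(readings) / len(readings))  # prints 0.0 — averaging adds nothing
```

Averaging repeated measurements only helps when the individual readings scatter around the true value; a systematically coarse measurement just repeats the same wrong answer a thousand times.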
What if We Filtered Data by Account?
What if we compared the conversion rates of different templates only within individual accounts and then ranked the pages overall, based on their performance within accounts?
In other words, this would look at the relative performance of templates used by the same user. This would eliminate some of the randomness. We can assume that the quality of copywriting will be roughly the same across the board, on pages made by the same user. We can also assume that the same user is likely to have similar offers in the same market, used across all pages.
It would present us with a new problem, though: small sample size. Most users will never use enough different templates and send enough traffic to all of them to provide a meaningful ranking.
Plus, it doesn't eliminate most of the other problems, such as comparing templates that have both been customized to the point where they have nothing in common with the original template anymore.
Think of this: the same user can load the same template on two pages and make changes to them, to run an A/B test. Presumably, one version will win the A/B test. So, the template is at the same time worse and better than itself. How do we translate that into useful data for a "rank by conversion rate" list?
This just illustrates, once again, that factors we can't control make a greater difference than which template was chosen.
Where Our Focus Lies
By now, I hope I've convinced you that any attempt to rank templates by conversion rate is hopeless. That's why we don't do it. And frankly, measuring this data and advertising the results as a useful feature would be dishonest.
Instead, there are two important factors we focus on, for our templates:
1) Built-In Conversion Best Practices
There are some things - not many, but some - that reliably lead to higher conversions. These are the kinds of things that almost all high converting pages have in common. For example:
- A large, attention-grabbing headline at the top.
- Clearly visible, high contrast buttons/calls to action.
- A strong focus on a single call to action or a very small number of calls to action.
- A clear visual hierarchy.
- High contrast text in a crisp, readable font.
- Not having a big, slow, animated slider.
All of our templates come with these factors built in, so that if you do nothing but change the text and tweak the design, you have a great page. The conversion basics are taken care of.
2) Rapid Implementation
If you want optimal conversion rates, you need to test. And we try to make creating and testing pages easier by making our templates rapidly customizable.
The more time you have to spend on making your page look right, the less time you'll have to create A/B test variations and generally focus on the business side of your business. This is why rapid implementation is extremely important to us.
What's Your Take?
What's your take on this topic? Were you hoping for a "sort by conversion rate" feature in our pages? Did I change your mind?
Let me know by leaving a comment!
Comments

Thanks for saying it like it is…
The other issue I have is that you get a well-worn path effect with this sorting. Everyone uses the same template and a possible better template doesn’t get a chance to shine.
Thank you, Rob. True, the worn path problem can also arise. This would be relevant for landing pages sorted by user voting, for example.
This is SOOOOO true.. Great article Shane
Thank you, Sherwood!
As always, very insightful – you guys rock!
Thank you, Dennis!
Makes sense. But I would have loved to consume this post in form of a video (by Shane of course) and with some example pages which really drive home the point.
Great info nonetheless and very useful in explaining why conversion optimization is such a tricky thing to get right.
Thank you, Pullkit! Hey man, I thought the rare non-video post from me would be a welcome change. 😀
Nah man … It's always good to see another bald man doing great internet marketing (I'm bald too). It's a great club to be in, too, with Andre Chaperon, Neil Patel and Darren Rowse, just to name a few 🙂
Haha, I’ve never thought of this as an exclusive club of any kind. But I’ll take it. 😀
And I thought I was the only one who considered this bald thing… 😉
So you mean we still have to test for OURSELVES? Like people with a brain? Pfuuuuuuuhh … 😉
I know, pretty annoying right? 😀
Oh god don’t get me started on this. What, every template is going to have the same results regardless of the offer? That’s just so dumb on the face of it I can’t believe some people buy into it.
Haha, thank you, David. Glad to see that I’m not the only one who finds the very idea exasperating.
Nailed it Shane, and explained very well.
Thank you, Nick!
Always appreciate your honest and direct approach, Shane. It makes total sense.
Thank you, Kevin!
I come from a science background. I’m trained to detect BS. In IM, all I have to do is take a shallow breath, and the first thing I can smell is BS.
Yes, sorting landing pages by conversion rate across all users is a gimmick born of ignorance.
I guess if there's any benefit, it might help indecisive people make a decision and actually launch a landing page … even though it's under a false assumption.
Haha, yeah, there’s no shortage of BS in this space…
testing the new thrive comments feature here 🙂
How do you like it?
Shane, that bugged me for the longest time while I was a LP customer. Thanks for calling it out.
It bugged me as well, yes. And I was surprised that no one seemed bothered by something so blatantly misleading.
Very insightful. Thanks for taking the time to break this down!
My pleasure, Bryce. 🙂
Another fine illustration given by Shane and the Thrive Themes team explaining nuances business leaders exercise at the decision making level. The analogy used here is clear… stay focused on your site keeping attraction from becoming distraction. I’ll retell the story as a thought process to a client developing a content theme for a podcast.
Thank you for your comment, Larry!
Great insights, Shane. I appreciate thoughtful commentary and analysis based on data instead of hype or dogma.
Thank you, Mark!
Whereas I don’t “disagree” with the article, I would like to offer a different perspective.
As with most things in marketing, the answer is always going to be: test, test, test. Regardless if you have the “best converting landing page” or not, you should always be testing to improve.
As such, having a good baseline understanding of what IS working for others can help speed up the process of testing by having a control…… but, specifically, a control that WORKS .
To use your shoe example: if I knew that tennis shoes functioned better than riding boots, then I would avoid having to start out testing riding boots, only to find out that they are horrible for running. Someone, somewhere, has tried to run a race in riding boots, and I'm sure they failed miserably. So, if that were documented, one could at least say "riding boots aren't good for running". They are good for riding.
Knowing that a landing page gets SOME conversions is better than starting out with a completely unproven landing page that I have no data on.
Again, I get the point you're making and I don't completely disagree with it, but having conversion data is something that has value.
(Please excuse any grammatical errors, using voice to text)
I see the point you’re getting at, but as in your analogy, this would only work if a template is really complete garbage and has such a fundamental flaw in the template that it cannot be saved by good copy, good offer etc.
So, here’s the concession we can make: if we sort templates by average conversion rate and some of those templates have no call to action – no way for a visitor to convert and no way for the user to add a way for the visitor to convert – then indeed, we would see those templates sort to the bottom of the list.
I submit that this doesn’t make the list more useful, because it only separates out templates that are SO BAD that we could tell at a glance not to use them, even with no understanding of CRO technicalities and no design sense. It also only works if the template customization is very minimal. If you had such a flawed template in Thrive Architect, you could still just add a button or opt-in form to it and fix the problem. And so, someone could take the “broken” template, add a call to action, write great copy for a great offer and beat the pants off the average conversion rate of the “best” template of the bunch.
Don't fall for the data fallacy. This is the real problem here. We see data, we see numbers and averages, and we think there MUST be meaning behind those numbers. Especially when those numbers seem precise (like: 7.283%) and when we know those seemingly precise numbers come from a large data set.
Surely, there must be some use to be gleaned from all this data and precision, our brains tell us.
But the answer is simple: no, there is not. The information content of the data is exactly 0.
An aggregation of data for Product A, Product B, Product n… is meaningless and invalid statistical information… Spend your time getting the important things right…
Excellent breakdown, Shane.
Seems obvious, but I can also see why promotion/hype would get people excited about the “feature”.
Those examples you illustrated really drive the point home. Love that you answered all the “But what if…” Qs that your reader might be wondering.
Thank you, Jay!
Good info, but really just want to try the new Comments tool.
What do you think? 🙂
Agreed. You could try to analyze all the data and try to segment. You could even run a D.O.E. or a multiple regression to try to get an equation that would explain and maximize the model based on all the different variables. But before all that, you should run a hypothesis test to determine if there actually is some statistical difference between the groups that would allow you to sort the pages by conversion rate. Most likely, the p value you'll get will be higher than 0.05, failing to reject the null hypothesis (H0), which would lead to the conclusion that there is no significant difference between groups and that, therefore, sorting by conversion rate is not really statistically valid.
Yes, exactly. I really can’t imagine that you could run any model on this data that would give you any respectable p value.
Totally agree. For high conversion tactics we should probably better resort to case studies. The folks at Sumo.com have their KPI fixed on recurring monthly revenue, I would say that that should be starting point whether you’re selling products, memberships or services.
Recurring revenue is the en vogue growth factor to focus on, yes. It makes sense for most businesses, too.
You give so much clarity always. I love you guys and I am so glad to have found you and being a member.
That’s great to hear. Thank you, Silje!
Spot on… I get a similar question from affiliates: what’s your conversion rate? And it all depends on what running shoes the affiliate is wearing 🙂
Ah yes, the good old “what’s a good conversion rate?” question. 🙂
Great analysis Shane…and great analogies that hopefully **everyone** can understand! Your (& Thrive Themes’) transparency & marketing honesty are what set you apart from ALL the (pseudo)-competition out there!
LP turned me off **completely** months ago with some very misleading advertising! If a product is quality, there is no need to resort to such ridiculous tactics!
Always proud to tell people I’m a delighted Thrive Member, and point them your way!
Great article! 🙂
Thank you for your comment, Karen!
Makes total sense! When I first heard about this concept, in reference to LandingPages, my first intuition was a resistance. I was calling BS somewhere in my head and you explained why it didn't make any sense. The ONLY way to test out and find the best performing template is to fix – make constant – all the other variables on the page, such as copywriting style and content, the product being sold, the target audience, many, many such variables. And the only way to truly make these variables fixed is by testing two versions of the exact same copy side-by-side. So only the person or the company selling their products can carry out such tests reliably. Which brings me to A/B testing. So glad you guys are working on it… (Or is it already released?)
Yes, that’s exactly right. And even trying to fix all the other variables on a page isn’t always possible. Two different templates may have different amounts of text, may have completely different layouts and sections and so on, so that you can’t fit the exact same content from Template A into Template B, even if you wanted to.
But yes, A/B testing is where it’s at and I’m looking forward to releasing this for our landing pages.
I agree with your thoughts on this Shane. At the same time, there are also exceptions.
For example, I do tend to consider conversion rates when choosing a webinar registration page design. Obviously, the conversion rate is to some degree dependent on the topic, the headline, and target audience. However, the design also has a big role to play and has a sizable influence on conversion.
Webinar registration pages all have a similar function and so your ‘shoe sale’ example doesn’t hold water here. At the very least, a webinar registration page design that has a very low rating will make me think carefully before implementing it.
Thanks for your comment, Mary.
Seems I have not succeeded in fully explaining this yet. So, let me bust that webinar registration page bubble for you:
Imagine 2 templates, both for webinar registrations. Template A is the "bad" template: poorly designed, confusing layout. Template B is the "good" template: excellently designed and all that.
Now, imagine someone uses Template A (the bad one) and creates a webinar which features a superstar guest – let’s say Oprah Winfrey – and it’s for a one-time-only event. And they come in and customize the template to improve it significantly and they write the best copy of all times. And this page is sent out to a red-hot list of members who all know that this one webinar event is an exclusive opportunity that will never present itself again.
We get an 80%+ conversion rate using the “bad” template.
Now, we use Template B (the good one) and we create a generic webinar offer. Nothing awful, but just nothing really interesting. The copy is vague, the hosts are unknown. And the traffic source is from PPC ads.
We get a 2% conversion rate using the “good” template.
If you agree that this scenario is possible – that it’s possible to get a great conversion rate on a poorly designed template and a bad conversion rate on a well designed one – then you must also agree that everything in between is equally possible.
And we can’t control for the factors that matter. We have no algorithm (yet) that can say “if the guest is Oprah Winfrey, give this conversion result a different weight”.
So, if everything from awful to great is about equally possible on good and bad templates, then the data we collect is all noise and no information. And picking a “better” template based on average conversion rate is, in fact, not making any difference at all.
Kaboom! Not sure why this isn’t clear after all the a/b testing goobering..
It's funny: in the on- and offline fitness field, this is exactly the thinking of fit pros. Just find the highest, fastest cash-in-the-pocket approach, with or without an ounce of fulfillment…
This is why the entire industry is behind the fitness delivery curve. A massive amount of opportunity is being left on the table.
Buyer's remorse and massive consumer confidence blind spots around every fitness-to-end-user corner.
Thanks for the very awesome post.
Instructor Savant Professorjohn
Yeah, the fitness industry is another area rampant with the practice of selling people what they want to hear, no matter how far from the truth it is.
Great, thank you!
You’re welcome, Thomas.
The wisdom of the crowd is a trailing indicator – meaning that it will always be based on averages of the accepted wisdom of previous times.
I do not want to know what ” everyone else” is doing, I want to know what I should be doing.
Thanks for this great, rational, and usable discussion of internet wisdom.
Check out “Story Brand” by Donald Miller. What he teaches overlaps with how you set up websites. You are both sharing leading edge marketing concepts. Thank you.
Thank you for your comment!
Yes, that’s a good point as well. I saw a talk by Donald Miller and I liked what he had to say. I haven’t read the book, but I liked the concept he laid out in his talk.
Very well said, Shane. Marketing (especially the online arena) is full of claims and theories that people would like to believe. As the song goes, It Ain’t Necessarily So…
Yes, indeed. There’s great temptation to tell people (or sell to people) what they want to hear, rather than what’s true.
…but you didn’t answer the question, Shane!
Which shoes ARE the fastest????
Hahaha… asking the real questions, here. 🙂
It is good to know Thrive has such a deep understanding of data and how to process it as it gives me confidence when I read information from Thrive.
I have used data throughout my career, which started when running a small pathology lab in toxicological research where colossal sums were spent on experiments. Processing the data had many similarities to what you, Shane, have explained for understanding templates, what goes into them and what comes out of them.
Raw data is just data. Processing it converts it to meaningful new knowledge provided the research was soundly and knowledgably undertaken.
Thank you, Shane.
Thank you for your input on this post, Peter!
The hype about conversion rates 🙂
A guy told me: "I have a 65% conversion rate!" And I asked: "And how much money do you make with this 65%?" He told me: 4,000 bucks.
I have a "bad" conversion rate of 8%. But I make 20,000 bucks.
So, it all comes down to how much money is in your pocket and how much money you have spent. All those % figures are nice – but it's finally all about the money.
Yeah, that’s a good point as well. This is something I’ve seen as well, from the perspective of a vendor with thousands of affiliates. The most powerful and highest earning affiliates are rarely the ones with the largest audiences and largest numbers. There’s an aspect of lead quality that is much harder to quantify than just a conversion rate on a landing page.
Superb article – but you probably could have just written this line:
“Presumably, one version will win the A/B test. So, the template is at the same time worse and better than itself.”
…and dropped the mic 😀
Haha, fair point. 🙂
Well thought out and presented. I really like the comparison to running shoes and found it very meaningful.
Thank you for your comment, Dane! Glad to know you found this insightful.
Thanks, Shane,
Your story applies to so many areas of business and persuasion.
Great work.
Thank you, Dale!
I love your work when it comes to marketing Shane, but I don’t love it so much when it comes to statistics. Unless you believe that there’s some systematic effect where all the bad copywriters choose template A and all the good copywriters choose template B, the effect of copy quality really will average out across enough measurements. This is just as if you were running a drug trial: some people are healthier, some have better genes, every person is different in huge ways, and it does average out. If all the healthy people or good copywriters were in group A you’d have a problem, but they’re not, it’s random.
You also don’t have to take my word for it, every statistical test looks at whether between-group variance is sufficiently greater than within-group variance. Someone a few posts above suggested this, although they seem to think it won’t come up significant whereas I’d guess that it probably will. But why are we guessing and handwaving? Get data on a per-subscriber, per-template level, do an ANOVA to look for effect of template and then post-hoc tests to see which templates actually perform differently from each other. Using Bonferroni corrections here would avoid the “classic big data problem” you mention.
I’ve seen how LP do it (previous user) and I don’t think they do it right, but it can be done.
Hello Matthew,
First of all: thank you for disagreeing with me! It would be easy to just dismiss the content and go elsewhere, but by voicing your criticism, you give me an opportunity to learn and grow and I appreciate that very much.
So, I agree that using the right models, you can glean real differences, even if there are many variables you can’t control. And I agree that this is something I didn’t, but perhaps should have addressed in the post as well.
From my understanding, this doesn’t solve the core problem, though.
Let me try and use your analogy to explain why: let’s say we’re doing a medical test for some pill. And we know that indeed, in our test group there are younger and older people, healthier and less healthy people.
I think the problem we have with the landing pages is analogous to trying to find out whether blue, white or red pills will have the most positive effect on mortality in our medical study. Yes, there’s going to be some placebo effect based on the color of the pill, but for a large outcome like mortality, the effect of the pill’s color is expected to be so weak and the effects of dozens of other factors so strong that it’s unlikely we’ll arrive at any useful insight, no matter the size of the data set.
And another issue that we have particularly with Thrive Architect, is that all templates are fully editable. You can take any template and transform it into something completely different. This would also average out, but it influences the signal:noise ratio in the data significantly. It’s the equivalent of some people taking completely different pills in the medical study, but you don’t know who or what pills.
Never really thought about this sorting thing. However, I love your focused and high-professional way of bringing things straight to the point. No fluffs and fillers here at Thrive Themes. So outstanding.
Thank you for your comment, Chris!
Great stuff, Shane. Love everything you guys do.
Thank you, Bryan!
Since Thrive doesn't have an A/B split test (it has one just for the headline), what split testing solution do you recommend? (I know you're creating a solution for before the end of the year, but in the meantime?)
Thanks for another in depth and informative article.
Yup! Couldn't agree more. I had a sales page that was selling nothing. Absolutely nothing. I quickly sold out by increasing the price by 50% and adding 1 line of text. I did nothing to the template, design or anything else.
As much as I like things to look nice, design simply isn’t that important (within reason) if someone is interested in what you’re saying.
Thanks for the example, Debs!
As a salesman and marketer for several decades, your explanation is excellent and makes sense. I love your running shoe and weighing a needle examples. They do make an abstract concept more understandable. There are so many factors affecting the conversion rates, even, shockingly 😉 , a few you didn’t name, like load times for the different sites, color schemes used relevant to the target audiences, and how well the products appeal to the selected audience.
Thank you for your comment, John!
Indeed, the list of uncontrolled variables that can affect these outcomes is practically endless.
Thanks for the great post!
Your feeling was definitely correct on this one. 🙂