This is an interesting post. If you’re into advanced SEO, you’re really going to get a kick out of this. At the very least, it’ll get your mind working.
On the internet these days, there is a big push to make each and every page of your website unique. The push comes from those who have managed to become respected through the years in the area of search engine optimization. I’ve long had a theory that many of these folks are either reading the same thing as you and me, or they are deliberately spouting out false information. If they are reading the same content as us, they are merely parrots repeating what has already been repeated, which is many times worse than playing the telephone game. Much of the information on the internet today can be categorized as horrible.
What I want to talk about today is a specific area of ecommerce websites that has to do with the canonicalization of web pages. If you aren’t already familiar with the term canonicalization, it simply means that when a search engine crawls two pages that are similar enough to be combined as one, it does so. This helps all parties involved in many ways. For the search engines, it reduces storage space. Sections of pages like this can be combined and kept in a different format than other, more unique types of pages. For the website owner, the similar pages are combined and one may rank well. If they were treated as two separate, unique pages, neither may rank at all. Both could rank, but each would need some authority of its own to do so. In my experience, two similar pages that have been combined have a much better chance at ranking.
Now, it puzzles me that this push to keep every page of every website unique is coming from the people we refer to as professionals, because I have found something that is quite the opposite to be much more beneficial. And I have the stats to back it up.
I’m going to give you a short example first and then a bit of theory. Let’s say you have a classifieds website that has many members and many advertisements. Let’s say that for each member who registers at your site and places an ad, a few different web pages are created. First, the ad page pops up out of nowhere. After that, a new member profile page that holds all ads placed by that member is created as well as tag pages for any tags the member applied to their ad. The ad is also listed on the appropriate category and subcategory pages as well as search results.
Boy, those sure are a lot of areas affected by just one ad.
To keep things simple, let’s talk about just the member profile page and a tag page. Remember, each member only has one profile page (it may paginate if there are enough ads) that all their ads are listed on, but each and every tag gets its own page with all listings that have the same tag applied to it. If one ad has five tags that are unique to that ad, five tag pages are created.
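To make the page math concrete, here is a minimal Python sketch of how one ad fans out into a profile page and tag pages. The URL patterns, the dictionary keys and the sample ad are all made up for illustration; they are not taken from the software I use.

```python
def pages_touched(ads):
    """Return the set of profile and tag page URLs created by a list of ads.

    Each ad is a dict with the hypothetical keys "member" and "tags".
    The "/member/..." and "/tag/..." URL patterns are assumptions.
    """
    pages = set()
    for ad in ads:
        # One profile page per member, no matter how many ads they place.
        pages.add(f"/member/{ad['member']}")
        # One tag page per distinct tag.
        for tag in ad["tags"]:
            pages.add(f"/tag/{tag}")
    return pages


# One invented ad with five tags that no other ad uses.
ads = [{"member": "alice", "tags": ["bike", "red", "vintage", "steel", "road"]}]
pages = pages_touched(ads)

profile_pages = sorted(p for p in pages if p.startswith("/member/"))
tag_pages = sorted(p for p in pages if p.startswith("/tag/"))
print(len(profile_pages), "profile page,", len(tag_pages), "tag pages")
```

A second ad from the same member with the same tags would add nothing new here, which is why the profile page count follows members and the tag page count follows distinct tags.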
The content of the pages may look something like this:
All other things being equal, those are the content areas of the pages I mentioned earlier.
Now, I want you to pay particular attention to the words in the green bar in each image. If a website has ten million members, there will be at least ten million profile pages and if there are ten million listing tags, there will be at least ten million tag pages. The green bar is a fancy CSS styled H1 tag. Here’s where things get interesting.
And just to let you know, the two pages above came from one of my own websites.
Back a few years ago, the developers of the software I use were fairly simple people. They offered software with basic templates and no heading tags. The only heading tags I used on the sites I operate were the ones I placed there myself. Those were primarily used on the category pages. About two years ago, after they probably had heard enough whining from their customer base, the developers decided to release a new template set. Elements of this new template set were incorporated into the base code as well. Unbeknownst to me, there were H1 and H2 tags stuffed in around every corner. You want to know something? Ever since then, things haven’t been the same.
I have played around with templates and page titles since 2004 and massive traffic fluctuations have followed each change. Before the heading tags were used in the templates, it seemed as though the changes would create more dramatic swings than after. It always seemed that when I would change a page title to separate and make unique an area of the website, traffic would drop, but then when I would make a generic title to cover all pages, traffic would increase. I’ll give an example below.
Let’s talk about member profile pages. Back in the day, when I had all member profile pages titled, “Seller Ads,” traffic seemed to do well, but when I would change the titles to something unique, such as, “Member A Ads” and “Member B Ads,” traffic would tank. I never knew why. And after years of reading terrible advice online about how each and every page is supposed to be unique, I made things worse. The same thing happened with different sections of the site.
In the case I just mentioned above, here’s what I think was happening: when the search engines crawled the individual pages with the duplicate titles, I think they consolidated all the pages into similar clusters, or canonicalized them. When I made the titles unique, I broke the relationships and each page was treated as a weaker individual. I’ll show you some stats below.
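Here is a toy Python model of that hunch. To be clear, this is only an illustration of my theory, not a description of how any search engine actually works; the function name, the URLs and the idea of grouping on the title alone are assumptions I made up for the sketch.

```python
from collections import defaultdict


def cluster_by_title(pages):
    """Group page URLs by their normalized title.

    `pages` is a list of (url, title) pairs. Grouping on the title alone
    is a deliberate oversimplification used to illustrate the theory.
    """
    clusters = defaultdict(list)
    for url, title in pages:
        clusters[title.strip().lower()].append(url)
    return dict(clusters)


# Every profile page shares one generic title.
generic = [("/member/a", "Seller Ads"), ("/member/b", "Seller Ads")]
# Every profile page gets its own title.
unique = [("/member/a", "Member A Ads"), ("/member/b", "Member B Ads")]

generic_clusters = cluster_by_title(generic)
unique_clusters = cluster_by_title(unique)
print(len(generic_clusters), "cluster with generic titles")
print(len(unique_clusters), "clusters with unique titles")
```

With the generic title the two pages land in one group; with unique titles each page stands alone, which is the "weaker individual" situation I described.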
The big upswing you see in July of 2009 is right after I merged many areas of the website at hand under one generic page and title. No pages were noindexed and all had the same exact title. After altering the titles and templates, traffic dropped.
I mentioned above that the swings were less severe after the templates were updated by the developers. This is even after I would change the titles the same way as I had earlier. I always wondered why, until today.
It’s my hunch, based on my experience, that H1 and H2, or any heading tags for that matter, affect page canonicalization in much the same way page titles do. This would explain why the traffic increases and drops were less dramatic after the heading tags were added. If the unique headings were already keeping the pages apart, a title change alone had less left to break or repair. Things were getting worse and worse because of the breaking of relationships. I’ll give you another example below.
Let’s talk about the tag pages now. If I had ten million tag pages, all unique because of the page title and H1 tag (in the green bar you can see above), and many fairly thin because of the small number of ads with the same tag, wouldn’t many of them rank poorly because of the duplicate and poor content on each page? Especially after the Google Panda update? Wouldn’t it behoove me to remove the H1 tag and duplicate the tag page title across all tag pages so Google and the other search engines can merge the duplicate and thin pages that need to be merged?
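If you want to see how many of your own tag pages are thin, a rough count is easy to put together. This is a minimal Python sketch; the "tags" key, the sample ads and the cutoff of three ads per tag are assumptions for illustration, not numbers from my sites or from any search engine.

```python
from collections import Counter


def thin_tags(ads, min_ads=3):
    """Return the tags whose tag page would list fewer than `min_ads` ads.

    `min_ads` is an arbitrary, hypothetical cutoff for what counts as thin.
    """
    # Count each tag once per ad, even if the ad repeats it.
    counts = Counter(tag for ad in ads for tag in set(ad["tags"]))
    return sorted(tag for tag, n in counts.items() if n < min_ads)


# Invented sample data: "bike" appears on three ads, the others on one each.
ads = [
    {"tags": ["bike", "vintage"]},
    {"tags": ["bike", "steel"]},
    {"tags": ["bike"]},
]
thin = thin_tags(ads)
print(thin)
```

Run against a real ad table, a list like this shows how much of the tag section is made up of pages with only one or two listings on them.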
Perhaps this will help things make sense.
Look at the two pieces of page content I posted above. The first is from the member’s profile page and the other is from the one and only tag this person used in their ad. Both pieces of content are almost identical. Shouldn’t those two pages be canonicalized into one more powerful page? Perhaps Google would even merge each of these pages into a category page. They were back a few years ago, but ever since I changed the titles around and the developers added the H1 tags to each page, they are treated as separate, weaker pages. Actually, neither ranks for anything. Multiply that by hundreds of thousands and I’d say there is a problem.
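One way to put a number on "almost identical" is to compare overlapping word sequences from each page. The Python sketch below uses Jaccard similarity over three-word shingles, which is a common textbook technique for spotting near-duplicates. I am not claiming this is what Google does, and the two sample strings are invented stand-ins for the page content.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) found in `text`."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Invented content: the headings differ, the listing itself is the same.
profile_page = "Ads by member A red vintage road bike for sale in Hartford"
tag_page = "Ads tagged vintage red vintage road bike for sale in Hartford"

score = jaccard(profile_page, tag_page)
print(f"similarity: {score:.2f}")
```

The closer the score is to 1.0, the more the two pages overlap. In this toy example only the heading text differs, yet that alone is enough to pull the score well below 1.0.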
Conventional wisdom tells me to keep these pages unique, regarding page title and heading tag, but experience tells me that I should change things back to a more generic setup like I had earlier. Have one generic page title for all of these similar areas of the website and let the search engines merge pages where they see fit. I’m fairly certain the search engines are more familiar with what they want than I am.
Thoughts on this? I’m curious to read some feedback, especially from those who have already dealt with page title, heading tag and canonicalization issues.