The Power Of The Canonical Tag

If you're reading this, you're probably a Search Engine Optimiser (SEO) or a web designer who wants the site you're building to rank well on the Search Engine Results Page (SERP).

For today's topic, we're going to touch on the difference between rel=canonical and the 301 redirect. Below is a video by Matt Cutts.

You'll notice that the video above discusses the link juice lost through a 301 redirect and why that loss is a necessity. But the statement that most closely matches what we're asking here comes in the last few seconds, when he compares the strength of the canonical tag and the 301 redirect and states, “... but as far as the amount of PageRank that gets passed, there's not a lot of difference.” When the man says it, you can really believe it.

That one video gives us two pieces of information about how much weight the rel=canonical tag can pass, and combined they lead to only one conclusion. The two pieces of information are:

1) There is very little strength loss on a 301 redirect.
2) The amount of strength passed via a 301 is virtually the same as the amount passed via the rel=canonical tag.

The conclusion, then, is that an exploit that inserts the rel=canonical tag onto a page can be a very effective strategy, on par with 301ing the page itself but even “better” in that it likely won't be detected by the site owner.

Is This An Issue?

The next question we need to ask ourselves is, “Is this an issue now or just a warning?” The answer is that it is an issue right now. WebmasterWorld user goodroi claims to have seen evidence of this and I have no reason to doubt him – he knows his stuff; but even if we want to take that claim with a grain of salt, Matt Cutts sent out the following Tweet on May 13th, “A recent spam trend is hacking websites to insert rel=canonical pointing to hacker's site. If U suspect hacking, check 4 it.”

With that, let's assume it's an issue, a known issue, and now discuss who's at risk and how to contend with it.

The Hack

Sadly, there is no one hack when we're dealing with things like this. Every environment has its own weaknesses, some more than others.

A WordPress blog, for example, has different weaknesses from a custom CMS, which in turn differs from a static site. To be sure, all are vulnerable, and where there are monetary incentives, there are people who will look to exploit the situation.

The hardest part to contend with is that the offending element isn't visible, nor will it generate warnings about your site in the SERPs as malware will. It'll just sit there, quiet in the header, passing your strength to another domain.
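Because the tag never shows up on the rendered page, the practical way to spot it is to read the markup itself. Below is a minimal Python sketch, assuming you already have the page's HTML as a string. The helper names (`CanonicalFinder`, `offsite_canonicals`) and the example URLs are made up for illustration, and the "different host" rule is only one possible heuristic.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens and is case-insensitive
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def offsite_canonicals(page_url, html):
    """Return canonical URLs whose host differs from the page's own host."""
    finder = CanonicalFinder()
    finder.feed(html)
    page_host = urlparse(page_url).hostname
    # Resolve relative hrefs against the page URL before comparing hosts
    resolved = [urljoin(page_url, href) for href in finder.hrefs]
    return [url for url in resolved if urlparse(url).hostname != page_host]


# Hypothetical example: a post whose header was tampered with
hacked = '<head><link rel="canonical" href="https://attacker.example/page"></head>'
print(offsite_canonicals("https://www.example.com/post", hacked))
```

Note that this comparison treats `example.com` and `www.example.com` as different hosts, so a legitimate cross-host canonical will also be listed. Treat the result as a list of tags to review, not as proof of a hack.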

I haven't heard any tales yet of a cloaked hack, but the question of whether it’s possible was asked in the forum thread. I'm familiar enough with cloaking techniques to confirm that it wouldn't be that difficult to cloak the tag, so that when you view your source it's not there but it appears when Googlebot drops by.
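One rough way to look for that kind of cloaking is to request the same page twice, once with a browser-like User-Agent and once with Googlebot's, and compare the canonical tags in the two responses. The sketch below assumes the cloak keys on the User-Agent header; a cloak that keys on Googlebot's IP addresses would not be caught this way. The function names and the browser User-Agent string are illustrative choices, not part of any tool mentioned above.

```python
import urllib.request
from html.parser import HTMLParser

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Arbitrary browser-like string, chosen only for this example
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"


class _CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonical_set(html):
    """Return the set of canonical hrefs found in an HTML string."""
    finder = _CanonicalFinder()
    finder.feed(html)
    return set(finder.hrefs)


def fetch_as(url, user_agent):
    """Download a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def canonical_differences(browser_html, bot_html):
    """Canonical hrefs present in the crawler's copy but not the browser's."""
    return canonical_set(bot_html) - canonical_set(browser_html)
```

In use, something like `canonical_differences(fetch_as(url, BROWSER_UA), fetch_as(url, GOOGLEBOT_UA))` returns an empty set when both copies agree. Any URL it does return is a tag served only to the crawler-like request and deserves a closer look.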

The only security you have is your own site security and hosting environment. That means ensuring that your CMS is fully up to date (so stop ignoring that WordPress update notice) and that your hosting environment is secure (have you changed your password since the last time you gave it to a third party?).

These are all best practices for defending against any exploit. The current situation is simply a reminder of one more way your potential vulnerabilities can be used.

This isn't a new issue and as Matt Cutts puts it: “On the ‘bright’ side, if a hacker can control your website enough to insert a rel=canonical tag, they usually do far more malicious things like insert malware, hidden or malicious links/text, etc.”

It's not new that they'll be there – it's just the nature of what they're doing that is different. You may not get a malware warning, you'll “just” notice that all the power of your page is gone.