Hearing the words ‘duplicate content penalty’ strikes fear into the hearts of most marketers. However, understand that it’s generally people with no SEO experience who use this phrase with any frequency. Most have never read Google’s guidelines on duplicate content, and they somehow conclude that there’s going to be heck to pay if something appears twice online.
Here at 4GoodHosting, part of what makes us a good Canadian web hosting provider is the way we’re frank with our customers about exactly how things are in the world of digital marketing. Is publishing duplicate content advisable? No, it’s certainly not. Is it going to be catastrophic for your visibility online as someone with a real interest in successful digital marketing? Very unlikely, and that goes against what many of you have likely heard.
Let’s bust some duplicate content myths today.
Myth #1: Non-Original Content on a Site Will Mean Lower Rankings Across Your Domain
There has yet to be any evidence that non-original content hurts a site’s ranking, except in one truly extreme and rare instance. The same day a new website went live, a very lazy PR firm copied the home page text and pasted it into a press release. By putting it on various wire services, they immediately created hundreds of versions of the same homepage content plastered all over the web. Google took note, and not in a good way, and the domain was manually blacklisted.
Why was this such a problem, when similar instances - albeit on a lesser scale - occur every day? For starters, consider volume: there were hundreds of instances of the same text. Next, timing: all the content appeared at the same time. Finally, context: it was identical homepage copy on a brand-new domain.
A lot can be tolerated, but laziness isn’t one of those things. However, this isn’t what people are talking about when they use the phrase ‘duplicate content.’ It takes more than the same word-for-word copy from one well-known site appearing on another, lesser-known one to make red lights go off at Google.
It’s a fact that many sites - including some of the most popular blogs on the internet - frequently repost articles that first appeared somewhere else. There’s no expectation that this content will rank, but they also know it won’t make their domain less credible.
Myth #2: Scrapers Will Hurt Your Site
Experts familiar with Google Webmaster Tools know that when a scraper site copies a post, any links back to the original site come along with that copy. And if you’ve ever seen the analytics for a big blog, you’ll know that some sites get scraped ten times before the clock even reaches 8am. Trackback reports bear this out, and no, those blogs do NOT have a full-time team watching GWT and disavowing links all day. Scrapers and duplicate content are quite simply NOT a priority for them.
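For readers who haven’t seen the process, “disavowing” a link just means uploading a plain text file through Google’s Disavow Links tool. The format below follows Google’s documented syntax; the domains and URLs are placeholders:

# Lines starting with # are comments and are ignored
# Disavow every link from an entire scraper domain
domain:scraper-site.example.com
# Or disavow one specific linking page
http://spam.example.com/copied-post.html

Imagine maintaining a file like that for a blog scraped ten times a morning, and you can see why big publishers don’t bother.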
Scrapers don’t help or hurt you, primarily because the sites they’re serving aren’t relevant or visible in the first place, and because the scrapers usually take the article verbatim, links and all. Those links pass little or no authority, and the occasional referral visit isn’t going to get those sites very far 9 times out of 10.
On the very rare occasion that Google does get confused and the copied version of your content is outranking your original, Google will want to know about it. Tell them using the Scraper Report Tool.
Google Authorship is also highly recommended. It’s a way of signing your name to a piece of content, permanently associating you as its author. With Authorship, each piece of content is connected to one and only one author, along with the blogs listed as ‘contributor to’ sites on that author’s profile. No matter how many times it gets scraped, this remains the case.
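As a rough sketch of how Authorship is typically claimed (the profile URL below is a placeholder), the page links to the author’s Google+ profile, and the profile’s ‘contributor to’ section links back to the site:

<!-- in the <head> of the page; the Google+ profile URL is a placeholder -->
<link rel="author" href="https://plus.google.com/110000000000000000000/" />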
Keep in mind as well that there is a big difference between scraped content and copyright infringement. Sometimes a company will copy your content (or even your entire site) and claim credit for its creation. Most of you will know what plagiarism means, but for those who don’t, it is the practice of taking someone else’s work and passing it off as your own. Scrapers aren’t plagiarizing within the scope of what they do. Anyone who signs their name to your work, however, is plagiarizing it. It’s a BIG no-no.
Myth #3: Republishing Guest Posts on Your Own Site Will Do Harm
Many contributors are guest bloggers, and it’s unlikely that their usual audience sees all their guest posts. For this reason it may be tempting to republish these guest posts on one’s own blog. It’s NOT a hard and fast rule, but content on your own site should be strictly original - not for fear of a penalty, but because original content offers value, and that’s good for your web presence in a much more holistic (and rewarding) way.
Some bloggers are actually encouraged to republish their guest posts on their own sites after a few weeks go by. Often this is done by adding a specific HTML tag to the post:
rel="canonical"
Canonical is simply an uncommon word that means 'official version.’ If you ever republish an article that first appeared elsewhere, using a canonical tag to tell search engines where the original version appeared is wise. Add the tag and republish as you see fit.
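To make that concrete - the URL here is only a placeholder - the full tag goes in the <head> of the republished copy and points back at the original:

<!-- placed on the republished version of the post -->
<link rel="canonical" href="https://original-site.example/first-appearance-of-post/" />

Search engines then treat the original as the official version, so the republished copy never competes with it.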
If the original was a “how to” post, hold it up to a mirror and write the “how not to” post. Base it on the same concept and research, but use different examples and add more value. This “evil twin” post will be similar, but still original.
Googlebot visits most sites on a daily basis. If it finds a copied version of something a week later on another site, it will identify where the original appeared and move on without making a fuss. Dinging a domain because unoriginal text was found isn’t nearly the priority for Google that others make it out to be.
Fact is, a huge percentage of the internet is duplicate content, and Google is very much aware of it. They’ve been separating originals from copies since 1997, a darn long time before the phrase ‘duplicate content’ became a buzzword around 2005.