What is duplicate content and how to avoid it?
Duplicate content generally refers to content that appears on multiple web pages, websites, or domains rather than being confined to a single web page. Some duplicate pages are created intentionally, for example by people who build blog networks that pull data from other blogs’ feeds.
Unintentional causes include blog platforms whose scripts serve separate desktop, mobile, and print versions of the same page. Duplicate content also arises when both the www and non-www versions of a website exist, since some search engines index both versions.

Duplicate content not only lowers a website's ranking in search engines; it may also get the website penalized, with the affected pages treated as supplemental pages of no value. Here is how search engines like Google determine which of several pages with the same content is the original one -
- Google identifies the various pages with duplicate content
- It picks an original from the list by considering factors such as where Google first saw the content, which page is linked to the most, and how trusted the domains are.
- Once the original is determined, the other pages with duplicate content land in the list of supplemental pages.
But this is not always the case. Even if you post the original content on your website, other websites may copy it, get indexed by search engines, and be treated as the original because of their authority. In that case, you need to discourage others from copying content from your website. This is how -
Let your visitors know: Display a banner stating that the content is original and that, if it is copied, you can file a DMCA infringement request with Google, Yahoo!, and MSN. This threat does not always work, so a friendly request to the owner of the copying site, or contacting that site's web hosting company, may also help.
If your own website is causing the duplicate content issues -
- Be consistent in linking, and use either the www or the non-www version of your website. Don’t use both versions, as indexing both may lead to the creation of supplemental pages.
- Remove references to index.htm on your website. Search engines can treat www.examplesite.com and www.examplesite.com/index.htm as different pages with exactly the same content.
- Try to use top-level domains rather than sub-domains for different language versions of your website.
- Avoid posting stubs. Stub pages have no real content but carry the same metadata as other pages, which raises the problem of duplicate content.
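The first two points in the list above amount to mapping every variant of a URL to one canonical form. Here is a minimal Python sketch of that idea, assuming the www version is the one you chose to keep. The names `PREFERRED_HOST`, `BARE_HOST`, `INDEX_NAMES`, and `canonical_url` are made up for illustration and are not part of any tool mentioned in this article.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site owner picked the www version as the preferred host.
PREFERRED_HOST = "www.examplesite.com"
BARE_HOST = "examplesite.com"
# Assumption: these are the index file names the server also answers to.
INDEX_NAMES = ("index.htm", "index.html")


def canonical_url(url: str) -> str:
    """Map www/non-www and index.htm variants of a URL to a single form."""
    parts = urlsplit(url)
    netloc = parts.netloc
    host = (parts.hostname or "").lower()
    if host in (BARE_HOST, PREFERRED_HOST):
        netloc = PREFERRED_HOST

    path = parts.path or "/"
    # Drop a trailing index file so "/" and "/index.htm" collapse together.
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment.lower() in INDEX_NAMES:
        path = path[: len(path) - len(last_segment)]

    # The fragment is discarded because it never reaches the server.
    return urlunsplit((parts.scheme, netloc, path, parts.query, ""))


if __name__ == "__main__":
    for variant in (
        "http://examplesite.com",
        "http://www.examplesite.com/index.htm",
    ):
        print(variant, "->", canonical_url(variant))
```

A function like this could be used to check your internal links for inconsistent variants. On a live site, the same mapping is usually enforced with a permanent redirect from each variant to the canonical address.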
If you are using content from other sites and still don’t want to land in trouble: Put the tag <meta name="robots" content="noindex, follow"> in the head section of the web page, so that search engines do not index the page but still follow the links on it.
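To confirm that a page really carries the robots meta tag, you can parse its HTML and read the directives back. The following is a rough sketch using only Python's standard library. The class and function names (`RobotsMetaParser`, `robots_directives`) are hypothetical, and the sample page is invented for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for item in attr.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<meta name="robots" content="noindex, follow">'
        "</head><body>Borrowed article text</body></html>"
    )
    print(robots_directives(page))
```

If "noindex" is missing from the result for a page that reuses someone else's content, the tag was probably left out or misspelled.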
To check whether anyone is copying your original content, there are sites like Copyscape, which check the indexed web for copies of your content.
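If you already suspect a particular page, you can also compare the two texts yourself. The sketch below measures how many five-word sequences (often called shingles) two texts share. This is only one simple, common way to estimate overlap; it is an assumption for illustration, not a description of how Copyscape or any search engine works. The function names and the shingle size of 5 are arbitrary choices.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < size:
        # Very short texts are treated as one single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    mine = "Duplicate content is content that appears on more than one page"
    theirs = "Duplicate content is content that appears on several other sites"
    print(round(similarity(mine, theirs), 2))
```

A score close to 1.0 suggests the other page is largely a copy, while a score near 0.0 suggests the texts only share a topic. What threshold counts as "copied" is a judgment call that depends on the kind of content.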








































