Aug 6 2008

robots.txt tips for WordPress to prevent Duplicate Content

When you use the self-hosted WordPress.org blog script, it generates a lot of duplicate content: the same post can be reached through its permalink as well as through the category, tag, archive, paginated, and feed pages that WordPress creates. If you’ve been reading about SEO, you probably know Google doesn’t like duplicate content, so your blog may be penalized for it.

Thanks to a post by Jeremy at ShoeMoney.com and some reading about the different user-agents, I was able to create a robots.txt file that helps keep this duplicate content out of Google’s index.

Add the following to your robots.txt file to prevent Google from indexing duplicate content on your WordPress blog:

User-agent: Googlebot
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /feed/
Disallow: /archives/
Disallow: /index.php
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: */feed/
Disallow: /feed
Disallow: */trackback/
Disallow: /page/
Disallow: /pages/
Disallow: /tag/
Disallow: /category/
Disallow: /*?

User-agent: Googlebot-Image
Disallow: /wp-includes/

User-agent: Mediapartners-Google
Disallow:

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /

The rules above should help keep duplicate content on your WordPress blog out of Google’s index.
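If you want to sanity-check rules like these before uploading the file, one option is a quick script using Python’s standard-library urllib.robotparser. Two caveats: that parser follows the original robots.txt convention and does not understand Google’s wildcard extensions (the rules above with * and $), so this sketch only covers the plain prefix rules, and the example.com URLs are hypothetical stand-ins for your own blog’s paths:

from urllib.robotparser import RobotFileParser

# The plain prefix rules from the Googlebot record above.
# (urllib.robotparser predates Google's * and $ wildcard
# extensions, so those rules are left out of this check.)
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /feed/
Disallow: /archives/
Disallow: /index.php
Disallow: /page/
Disallow: /tag/
Disallow: /category/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs -- substitute paths from your own blog.
urls = [
    "http://example.com/my-first-post/",  # permalink: should stay crawlable
    "http://example.com/category/seo/",   # category page: duplicate, blocked
    "http://example.com/page/2/",         # paginated archive: blocked
    "http://example.com/feed/",           # feed: blocked
]

for url in urls:
    status = "allow" if parser.can_fetch("Googlebot", url) else "BLOCK"
    print(status, url)

Running this prints a quick allow/block report; the permalink staying on allow is the part worth checking, since the whole point is to cut off only the duplicate routes to a post, not the post itself.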

7 Comments on this post

  1. Tony said:

    Your site regarding robots.txt tips for WordPress to prevent Duplicate Content looks very interesting to me. I found it doing a search for money tips.

    August 18th, 2008 at 11:55 pm
  2. Money Talk said:

    There is more reason to comment than ever before! Great post! I searched for a while to find the right answer to my questions!

    August 20th, 2008 at 5:32 pm
  3. Egor said:

Hello webmaster, been surfing the net for SEO tips and found your blog regarding robots.txt tips for WordPress to prevent Duplicate Content. You really know your stuff! I’d like to see more posts here. Will definitely bookmark this one and come back.

    August 22nd, 2008 at 5:52 am
  4. Squeaky said:

    What is the difference between these two statements in the robots.txt file in WordPress?

    Disallow: /page/
    Disallow: /pages/

    April 25th, 2009 at 5:42 pm
  5. MoneyM8 said:

    Hi Squeaky,

Robots interpret /page/ and /pages/ as different paths in the robots.txt file because of that one little “s”: /page/ matches WordPress’s paginated archive URLs such as /page/2/, while /pages/ only matches URLs that actually begin with /pages/.

    June 21st, 2009 at 2:36 pm
  6. Ricky said:

Great post. This is new to me and I appreciate it.

    Thanks!

    May 21st, 2010 at 4:16 am
  7. Ecommerce Leeds said:

I didn’t initially include a robots.txt on my blog and never had any issues with duplicate content. It wasn’t until just recently that I decided to add one, more for experimental purposes. So far, search engine traffic hasn’t improved or declined either way. WordPress out of the box isn’t great for SEO purposes, but with minor tweaks, I find that a robots.txt isn’t really necessary.

    July 6th, 2011 at 1:17 am
