How to protect your blog from content thieves

Thanks to tools like WordPress, publishing content online has never been easier. But how many times have you seen your own posts republished on other websites without your consent? Content theft is definitely a big problem for bloggers.

In this article, I’ll show you how to fight back against content thieves as effectively as you can.

Content scrapers: Why they are (quite) harmless

Content thieves fall into two categories. The first are content scrapers. What they do is quite simple: they create a blog, steal posts from other people, and republish them on their own site as if they had written them.

Some people do it manually, but most use tools such as the AutoBlogged WordPress plugin, which lets you “auto-blog”, i.e. scrape content from RSS feeds and insert it into your blog.

Content scrapers are essentially after money: they steal content, display a large number of ads, and hope that Google will index their posts so they get visits and, hopefully, clicks. In my opinion, content scrapers are mostly harmless, because they’re unlikely to steal your visitors or your search engine rankings, thanks to duplicate content detection.

Still, content scrapers are annoying. The following two tips will discourage most of them from stealing your content.

Put links on your post titles

As most scrapers use automatic tools, they’ll scrape all of your content, including the post title. A good way to discourage scrapers is to put a link on your post titles, so each stolen post automatically links back to your original post.

To do so in WordPress, simply open your single.php file and locate where the title is displayed. Then, replace that code with the following:

<h1>
  <a href="<?php the_permalink(); ?>"><?php the_title(); ?></a>
</h1>
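
Keep in mind that single.php only changes the pages of your own site, while scrapers that read your RSS feed never see that template. Here is a minimal sketch of a complementary tweak for your theme's functions.php, which appends a credit link to each feed item. The function name cwc_append_feed_credit and the wording of the credit line are made up for this example. It relies on the WordPress the_content_feed and the_excerpt_rss filters, and it uses plain htmlspecialchars() so that the helper also runs outside WordPress (inside a theme you would probably prefer esc_url() and esc_html()).

```php
<?php
// Hypothetical helper (not part of WordPress): appends a credit line
// linking back to the original post.
function cwc_append_feed_credit($content, $permalink, $title, $site_name)
{
    // The last argument (false) avoids double-encoding entities that
    // WordPress may already have placed in the title.
    $credit = sprintf(
        '<p>This post, <a href="%s">%s</a>, was originally published on %s.</p>',
        htmlspecialchars($permalink, ENT_QUOTES, 'UTF-8', false),
        htmlspecialchars($title, ENT_QUOTES, 'UTF-8', false),
        htmlspecialchars($site_name, ENT_QUOTES, 'UTF-8', false)
    );

    return $content . "\n" . $credit;
}

// Register the filters only when WordPress is loaded (e.g. from functions.php).
if (function_exists('add_filter')) {
    $cwc_add_credit = function ($content) {
        return cwc_append_feed_credit(
            $content,
            get_permalink(),
            get_the_title(),
            get_bloginfo('name')
        );
    };

    add_filter('the_content_feed', $cwc_add_credit); // full-text feeds
    add_filter('the_excerpt_rss', $cwc_add_credit);  // excerpt-only feeds
}
```

A scraper that republishes the feed as-is will then carry a link back to your original post at the end of every stolen article.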

Prevent hotlinking

A very negative aspect of content scrapers is the effect they can have on your server. Most of the time, scrapers display images from your site on theirs, which means that they use your bandwidth. This practice is called hotlinking.

If you’re on a shared host such as WpWebHost, that’s not a big problem because they offer unlimited bandwidth, but many hosts, like vps.net (which hosts CatsWhoCode.com), only offer limited bandwidth. In that case, the bandwidth used by content thieves can quickly become a financial problem.

To prevent hotlinking, append this code to your .htaccess file. In WordPress, this file is located at the root of your installation. Make sure you have a backup before modifying the .htaccess file!

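RewriteEngine On
# Replace mysite\.com with your blog domain (keep the backslash before each dot)
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mysite\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
# Never rewrite the "don't hotlink" image itself
RewriteCond %{REQUEST_URI} !^/images/nohotlink\.jpg$
# Replace /images/nohotlink.jpg (here and above) with your "don't hotlink" image url
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]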

Once the rules are in place, any site that hotlinks one of your images displays your “don’t hotlink” image instead.
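
Adapting the domain by hand is the step where it is easiest to slip, for example by forgetting to escape a dot. As an illustration only, here is a small sketch that builds the rules for a given domain. The function name cwc_hotlink_rules and its parameters are hypothetical. The generated rules accept both http and https referrers and leave the placeholder image itself alone.

```php
<?php
// Hypothetical helper: builds hotlink-protection rules for one domain.
// $domain      e.g. 'mysite.com' (no scheme, no trailing slash)
// $placeholder site-absolute path of the "don't hotlink" image
function cwc_hotlink_rules($domain, $placeholder = '/images/nohotlink.jpg')
{
    // preg_quote() escapes regex metacharacters such as the dots in the domain.
    $host  = preg_quote($domain);
    $image = preg_quote($placeholder);

    $lines = array(
        'RewriteEngine On',
        // Allow requests whose referrer is your own domain or a subdomain...
        'RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?' . $host . '/ [NC]',
        // ...and requests that send no referrer at all.
        'RewriteCond %{HTTP_REFERER} !^$',
        // Do not rewrite the placeholder image itself.
        'RewriteCond %{REQUEST_URI} !^' . $image . '$',
        // Everything else that asks for an image gets the placeholder.
        'RewriteRule .*\.(jpe?g|gif|bmp|png)$ ' . $placeholder . ' [L]',
    );

    return implode("\n", $lines) . "\n";
}
```

Calling echo cwc_hotlink_rules('mysite.com'); prints a block that you can paste into your .htaccess file after checking it by eye.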

Use them to your advantage with affiliate links

As I said previously, content scrapers steal all of your post content, which means they’ll also scrape your affiliate links. And this is great, because if someone clicks on one and buys the related product, the affiliate commission will be paid to you, not the scraper.

There’s no particular technique to implement: just think about using affiliate links whenever you can.

Plagiarists: Fighting digital thieves

If content scrapers are the first category of content thieves, plagiarists are the second. What is a plagiarist? It is someone who steals part of your content and reuses it without giving you credit.

If you follow my CatsWhoCode account on Twitter, you probably know that a guy with a WordPress-related blog has plagiarized my work several times. I’m not going to give names or details here because I do not want to give publicity to someone who doesn’t deserve it, but believe me, it’s annoying.

Unlike with content scrapers, there isn’t a lot you can do to fight plagiarists. Happily, bloggers hate plagiarism, and simply tweeting about the theft can be enough to stop the plagiarist from doing it again.

Unfortunately, some people are more stupid than others: despite having their plagiarism exposed, they’ll keep denying the theft and, worse, they’ll do it again. In that case, it’s time to put the law at your service.

The USA passed a law in 1998 named the DMCA (Digital Millennium Copyright Act). I don’t know much about this law because I’m not American, but it allows you to send a notice to the thief and to his partners, such as his web host or advertising networks like BSA or AdSense.

I haven’t sent any DMCA notices yet, but from what I’ve heard, 99% of content thieves will stop plagiarizing you after you send them one.

Here is a standard DMCA notice of copyright infringement that you can send to the thief’s hosting company and/or advertising partners such as Google AdSense.

Subject: Notice of Copyright Infringement

The copyrighted work at issue is the text that appears on www.yoursite.com/page1.html.
The URLs where our copyrighted material is located include www.theft.com/index.html and www.theft.com/about.html.

You can reach me at you@example.com for further information or clarification. My phone number is +1-000-383-8764 and my mailing address is John Doe, 45, East 49th Street, New York 10001 NY.
The email address of the website owner, who has reprinted our content illegally, is syed@theft.com.

I have a good faith belief that use of the copyrighted materials described above as allegedly infringing is not authorized by the copyright owner, its agent, or the law.

I swear, under penalty of perjury, that the information in the notification is accurate and that I am the copyright owner or am authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.

John Doe
March 10, 2010
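
If you have to send several notices, filling in the template by hand gets tedious. Here is a rough sketch of a helper that fills in the notice above from an array of details. The function name cwc_dmca_notice and the array keys are invented for this example, and the wording is simply copied from the template, so adapt it to your own situation (phone number, mailing address, and so on).

```php
<?php
// Hypothetical helper: fills the notice template with your own details.
function cwc_dmca_notice(array $d)
{
    // Refuse to build a notice with missing details.
    $required = array('original_url', 'infringing_urls', 'name', 'email', 'date');
    foreach ($required as $key) {
        if (empty($d[$key])) {
            throw new InvalidArgumentException("Missing field: $key");
        }
    }

    // Accept either a single URL or a list of URLs.
    $urls = implode(' and ', (array) $d['infringing_urls']);

    $lines = array(
        'Subject: Notice of Copyright Infringement',
        '',
        "The copyrighted work at issue is the text that appears on {$d['original_url']}.",
        "The URLs where our copyrighted material is located include $urls.",
        '',
        "You can reach me at {$d['email']} for further information or clarification.",
        '',
        'I have a good faith belief that use of the copyrighted materials described above as allegedly infringing is not authorized by the copyright owner, its agent, or the law.',
        '',
        'I swear, under penalty of perjury, that the information in the notification is accurate and that I am the copyright owner or am authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.',
        '',
        $d['name'],
        $d['date'],
    );

    return implode("\n", $lines) . "\n";
}
```

Pass the details as an array, for example array('original_url' => 'www.yoursite.com/page1.html', 'infringing_urls' => array('www.theft.com/index.html'), 'name' => 'John Doe', 'email' => 'you@example.com', 'date' => 'March 10, 2010'), and paste the returned text into your email.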

Good luck fighting content thieves. Any more tips or advice? Leave me a comment below!

38 Comments

  1. Posted March 11, 2010 at 5:58 pm | Permalink

    Thanks so much for these tips. I’m happy to try everything I can. Sometimes I’m flattered that my stuff is worth stealing; other times it’s just plain annoying, especially when I find my ‘religious based’ information broken up simply to accommodate porn words or some back door medications.
    Keep up the great work.

  2. Posted March 11, 2010 at 5:59 pm | Permalink

    Great article!
    Didn’t know that a DMCA notice works like this. I guess it’s because most hosting providers are US companies. I sent similar reports to Google AdSense (and search) before but didn’t notice a change. Last weekend I asked some people from Google how Google indexes those duplicates. They told me the best way would be a link to the original. I don’t think that the title tweak will work; a custom link in the text body should work much better.

  3. Posted March 11, 2010 at 6:12 pm | Permalink

    @Olaf Lederer: The title tweak works in the vast majority of cases. Still, a link in the post body is a good idea as well :)

  4. Posted March 11, 2010 at 6:18 pm | Permalink

    Excellent article!! You have given some very good tips :)

    Another way to protect your work is by copyrighting it at Myows. You can protect your work there under a Creative Commons license or any other license. In case of a rip-off or copyright infringement, you can simply open a case against the thief. The administration at Myows will then take care of the rest. :)

  5. Posted March 11, 2010 at 6:18 pm | Permalink

    Never thought about it :D thanks for the Tips Jean :D

  6. Tim
    Posted March 11, 2010 at 9:21 pm | Permalink

    Hej there,

    To my mind, the biggest problem is finding those stolen articles/blog entries. You really have to focus on searching for them, so personally I don’t think that it is worth the work. But for sure, the “big” blogs or magazines especially have to care about this, and it won’t hurt me to implement those features in my theme in case there is a “thief” one day :)
    Nice work!

  7. Posted March 11, 2010 at 9:25 pm | Permalink

    Hey cat,
    very useful tips, thanks for enlightening us.

    I love the part with the image inserted into the blog that has stolen some content. That’s exactly my German humor.

  8. Posted March 11, 2010 at 9:33 pm | Permalink

    Wonderful! You sent that-blog-we-know-about a hit right into the… Well you get it.

    Now on the post itself: Content scrapers use RSS feeds to copy the content, so changing anything in single.php wouldn’t affect them. You’ve got to add a filter to the_title_rss, and wrapping it into reformat to a link. I think.

  9. Posted March 11, 2010 at 9:34 pm | Permalink

    Oh. WordPress ate something: that was “wrapping it into <![CDATA[ ]]>”

  10. Posted March 11, 2010 at 9:42 pm | Permalink

    The link tip can also be used in the body and is especially effective if you put references to your own blog or past articles. I agree with Adit, Myows is a great service; you can also try copyscape.com. About DMCA, Google says you need to send them the report by mail or fax (not e-mail) – http://www.google.com/dmca.html

  11. Posted March 11, 2010 at 10:03 pm | Permalink

    @Tim,
    you’re right, it’s hard to find them, and all the time you spend is actually too much.
    You can use Google Alerts if you use some unique key phrases in your content. This way you get the copies reported, and also other websites about the subject (maybe you’d like to comment on other blogs this way). I’m really thinking about doing that…
    @Jean-Baptiste Jung,
    you’re right, it works for 100% copies but not for a WP auto blog; they use the title for their own internal links/titles

  12. Posted March 11, 2010 at 11:29 pm | Permalink

    @Handrus Nogueira: Interesting, I didn’t know that. Thanks!
    @Olaf: Ok. Thanks for the info!

  13. Posted March 12, 2010 at 1:05 am | Permalink

    Great — i have my weapon to fight content thieves now. Thanks to you. :D

  14. Posted March 12, 2010 at 2:19 am | Permalink

    Thanks so much for these tips.
    Thankfully I have not been a victim yet. Damn Thieves!

  15. Posted March 12, 2010 at 1:39 pm | Permalink

    I’ve been suing content thieves and plagiarisers. I have a friendly solicitor who adds his fee to the recovered amount. Most times it gets settled without going to court, which in the end would just increase the amount the defendant has to pay out.

    It can actually be quite lucrative and means I can now afford to spend some time locating these content thieves.

    One tip is to hit these people where it hurts: send a DMCA notice to their advertisers. Showing ads on copied/stolen content is against most providers’ rules and they will usually lose their account. If they can’t earn money they won’t keep on scraping.

    Also, it helps to insert deep links into your article and mention your site in the third person where you can. If someone copies it, then it becomes blatantly obvious to readers and is quickly brought to your attention.

    BEWARE using cheap online article writing services, many of these are little more than plagiarisers and copycats and you will be the one that gets into trouble for posting the content on your website.

  16. Posted March 12, 2010 at 1:46 pm | Permalink

    @Andy, how do you sue people from countries like Russia, India or China?

  17. Posted March 12, 2010 at 2:20 pm | Permalink

    Wow! These tips rock!
    Thanks for sharing. =)

  18. Posted March 12, 2010 at 2:44 pm | Permalink

    I was actually looking for something like this because my site (although it is only a launching-soon page for now) was copied and republished elsewhere. Thanks for the tip! :)

  19. Posted March 12, 2010 at 7:45 pm | Permalink

    Don’t you mean “content scrapers“?

  20. Posted March 12, 2010 at 8:33 pm | Permalink

    @Jason: You’re right, I just updated the post.

  21. Posted March 13, 2010 at 1:20 am | Permalink

    Great article

    thanks for tips

  22. Posted March 13, 2010 at 1:34 am | Permalink

    It’s important to note that there are many AutoBlogged users who have legit sites that aggregate content on a particular subject. For example, some companies use it to monitor blog mentions of their name (which is how I ran across this article).

    As far as AutoBlogged goes, by default it only takes a small excerpt, gives credit, and links back to the original article. This is actually a valuable link for the blogger because it often comes from a site rich in keywords on the subject. Search engines love that and that just drives more traffic to your articles.

    Having said that, yes there are people who just blatantly steal content and don’t give credit and that’s not cool.

    AutoBlogged has a feature that lets the site owner block any articles with certain keywords or from certain domains. Often, you can send a friendly email asking the site admin to exclude your site using AutoBlogged’s URL blacklist feature. It’s always better to politely ask to be excluded rather than firing off legal threats.

    Even better, if the autobloggers are playing fair, only using excerpts, and giving you credit, you could just enjoy the improved search rank and additional traffic you get from them!

  23. Posted March 13, 2010 at 8:41 am | Permalink

    @Autoblogged: Thanks for the input. Of course, I know that Autoblogged wasn’t created to allow users to scrape other people’s sites, even if some people are using it to steal content.
    It’s not your fault however, and anyway, this post is here to help bloggers fight plagiarism ;)

  24. Posted March 13, 2010 at 9:48 am | Permalink

    Scripts like Autoblogged make it possible for people with NO knowledge about websites to create copies of other blogs; in my opinion those companies are responsible because they provide the tools for a content thief.

  25. Posted March 13, 2010 at 3:41 pm | Permalink

    A company (which I will not name due to their proximity to my own web firm) has been plagiarizing content for quite some time now and calling it “content writing”. In a friendly meeting we have tried to explain how “content writing” is NOT taking bits and pieces of text from other websites and slapping it into another!!!

  26. Posted March 13, 2010 at 7:23 pm | Permalink

    I absolutely hate content scrapers. I’ve tried a few tricks here and there to limit how much they can scrape (even tried a plugin designed to stop it once) but they still manage to do it.

    On the topic of hotlinking, just be careful because many times we can drive traffic to our sites because of things like Google Images and stuff. If the hotlinking is disallowed, we would miss out on that.

    Also, many times other bloggers link to your site and want to include your banner.

    Nothing’s set in stone. It’s whatever makes you comfortable.

  27. Posted March 14, 2010 at 8:49 am | Permalink

    I use the Anti feed Scraper, which puts a link in the footer; as for hotlinking, I can do that easily from cPanel. I am only starting to learn about editing PHP, and these little tricks are much better than having to download a plugin to do this for you. Thanks for the info.

  28. Posted March 15, 2010 at 2:43 pm | Permalink

    ahahahaha @ the hotlinking image… brilliant.. good man

  29. Posted March 15, 2010 at 8:00 pm | Permalink

    Joost De Valk developed a plugin called RSS Footer. It allows you to customize your rss footer with a custom backlink/anchor text. At least that way, there’s a slight benefit if your content is being scraped

  30. Posted March 15, 2010 at 9:52 pm | Permalink

    Yoast has so many nice plugins ;)
    thanks for sharing this info, in bbpress it’s possible to have your own RSS template. I guess this is possible in WP as well.

  31. Posted March 19, 2010 at 5:29 am | Permalink

    Jean-Baptiste Jung,

    Regarding the hotlinking code in the .htaccess file: I have nohotlink.jpg in an “images” folder in the root directory. Copied your code exactly and only changed my blog URL. When I test this on another site to see if it works, no image shows up.

    Any idea on why nohotlink.jpg isn’t showing up?

  32. Posted March 19, 2010 at 1:57 pm | Permalink

    @Olaf if you have a name and an address then they can be sued wherever they are (China is a bit of a problem, but I’ve blocked China via a server hack).

    My solicitor will find a firm of solicitors local to the offender and an initial letter of intent will come via them. (Solicitors around the world all speak English in the main!)

    Most will pay up your request for damages on this initial letter. If they don’t, most countries award costs and allow you to add your legal fees, so in the end they will just end up paying a lot more.

  33. Posted March 23, 2010 at 11:51 am | Permalink

    Interesting post. I write posts about Gmail tips. And sometimes I do use tips from other blogs, but I give the original blog credit; I know I have to ask them for permission first.

  34. Posted June 16, 2010 at 10:05 am | Permalink

    For plagiarism, you can also whois the website that plagiarizes your stuff and try to find information about its host. Then send the hosting company a DMCA notice so that it will either shut down the website or send a notice to the website owner. You can google some sample DMCA letter forms for different cases.

  35. Posted June 28, 2010 at 3:22 am | Permalink

    We had scores of our pages plagiarized a year or so ago. I found out because they stole and pasted everything, including my name! I have a Google Alert set up for my name so I was notified when my name got indexed. Sadly, the website owner who was a nice man, hired a dishonest website developer. The developer copied my content and charged the owner for the content he plagiarized. The owner took the content down as soon as we confronted him, but I’ve always wondered what happened to the developer.

  36. Posted July 7, 2010 at 8:50 pm | Permalink

    Hi Jean! Well, I tried to follow the tweaks you posted at Smashing Magazine, but the tweak for scrapers/content thieves didn’t work well. When I try to use it on a website with WP 3.0, it just blocks every image inside the page from loading, so basically everything is wiped out. Do you know if this is a bug in WP 3.0? Maybe I did something wrong?

    Thanks for the great tips!

    • Posted July 7, 2010 at 10:31 pm | Permalink

      Hi Nicolas,

      Weird, the tip should work well. Did you replace “mywebsite.com” on line 3 with your site url?

      • Posted July 8, 2010 at 3:27 am | Permalink

        Hi Jean! Thanks for the fast reply. Yep! I added my website this way:
        RewriteEngine On
        RewriteCond %{HTTP_REFERER} !^http://(.+\.)thismywebsite.com/ [NC]
        RewriteCond %{HTTP_REFERER} !^$
        RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]

        Maybe I did something wrong in the code?
        Thanks again for your help ;)
