Thanks to tools like WordPress, publishing content online has never been easier. But how many times have you seen your own posts republished on other websites without your consent? Content theft is definitely a big problem for bloggers.
In this article, I’ll show you how to fight back against content thieves as best you can.
Content scrapers: Why they are (quite) harmless
Content thieves fall into two categories. The first are content scrapers. What they do is quite simple: they create a blog, steal posts from other people, and republish them on their site as if they were their own. Some people do it manually, but most use tools such as the AutoBlogged WordPress plugin, which allows you to “auto-blog”, i.e. scrape content from RSS feeds and insert it into your blog.
Content scrapers are essentially after money: they steal content, display a large number of ads, and hope that Google will index their posts so they get visits and, hopefully, clicks. In my opinion, content scrapers are mostly harmless, because they’re not going to steal your visitors or your search engine rankings, thanks to duplicate content detection.
Still, content scrapers are annoying. The following two tips will discourage most of them from stealing your content.
Put links on your post titles
As most scrapers use automatic scraping tools, they’ll scrape all of your content, including the post title. A good way to discourage scrapers is to automatically put a link on your post titles, so each stolen post will automatically link back to your original post.
To do so in WordPress, simply open your single.php file and locate where the title is displayed. Then, replace that code with the following:
<h1> <a href="<?php the_permalink(); ?>"><?php the_title(); ?></a> </h1>
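Scraping tools that read your RSS feed never see the single.php template, so the same idea can also be applied to the feed itself. The following is a minimal sketch rather than a tested plugin, and it is a different technique from the title link above: it appends a link to the post body in the feed. The function name cwc_append_source_link and the wording of the link are hypothetical; the sketch assumes WordPress's the_content_feed and the_excerpt_rss filters and the get_permalink() and get_the_title() functions, and would typically live in your theme's functions.php file.

```php
<?php
// Hypothetical helper: appends an "originally published at" link to a post body.
function cwc_append_source_link(string $content, string $permalink, string $title): string
{
    $link = sprintf(
        '<p>Originally published at <a href="%s">%s</a>.</p>',
        htmlspecialchars($permalink, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($title, ENT_QUOTES, 'UTF-8')
    );
    return $content . "\n" . $link;
}

// Register the hooks only when the code runs inside WordPress.
if (function_exists('add_filter')) {
    $cwc_feed_filter = function ($content) {
        // Inside the feed loop, these return the current post's URL and title.
        return cwc_append_source_link($content, get_permalink(), get_the_title());
    };
    add_filter('the_content_feed', $cwc_feed_filter); // full-text feeds
    add_filter('the_excerpt_rss', $cwc_feed_filter);  // excerpt-only feeds
}
```

Because the link is part of the feed content, a scraper that republishes the feed item as-is will also republish the link back to your post. A scraper that strips links will of course defeat it.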
Prevent hotlinking
A very negative aspect of content scrapers is the effect they can have on your server. Most of the time, scrapers will display images from your site on theirs, which means that they’ll use your bandwidth. This practice is called hotlinking.
If you’re on a shared host such as WpWebHost, that’s not a big problem because they offer unlimited bandwidth, but many hosts like vps.net (which hosts CatsWhoCode.com) only offer limited bandwidth. In that case, the bandwidth used by content thieves can quickly become a financial problem.
To prevent hotlinking, append this code to your .htaccess file. In WordPress, this file is located at the root of your installation. Make sure you have a backup before modifying the .htaccess file!
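RewriteEngine On
# Replace mysite\.com with your blog's domain (keep the backslash before the dot)
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mysite\.com/ [NC]
# Let through requests that send no referer at all
RewriteCond %{HTTP_REFERER} !^$
# Never rewrite the replacement image itself
RewriteCond %{REQUEST_URI} !/images/nohotlink\.jpg$
# Replace /images/nohotlink.jpg (here and above) with your "don't hotlink" image URL
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]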
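To make the referer checks in these rules easier to follow, here is a small PHP sketch of the same decision. It is only an illustration of the logic, not something to install in place of the .htaccess rules. The function name cwc_is_hotlink is hypothetical, and the sketch accepts both http and https referers, which is my own assumption.

```php
<?php
// Hypothetical helper mirroring the referer checks of the rewrite rules:
// returns true when an image request should be treated as hotlinking.
function cwc_is_hotlink(string $referer, string $domain): bool
{
    // An empty referer (direct visit, some proxies and privacy tools)
    // is never treated as hotlinking.
    if ($referer === '') {
        return false;
    }
    // The referer must NOT be the blog itself or one of its subdomains.
    // The "i" modifier makes the match case-insensitive, like the [NC] flag.
    $pattern = '#^https?://(.+\.)?' . preg_quote($domain, '#') . '/#i';
    return preg_match($pattern, $referer) !== 1;
}
```

For example, cwc_is_hotlink('http://scraper.example/stolen-post/', 'mysite.com') returns true, while a referer of http://www.mysite.com/some-post/ or an empty referer returns false.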
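Once the .htaccess rules are in place, any site that hotlinks your images will display your “don’t hotlink” image instead of the original.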

Use them to your advantage with affiliate links
As I said earlier, content scrapers steal all of your post content, which means they’ll also scrape your affiliate links. And this is great, because if someone clicks one and buys the related product, the affiliate commission will be paid to you, not the scraper.
There’s no particular technique to implement: just remember to use affiliate links whenever you can.
Plagiarists: Fighting digital thieves
If content scrapers are the first category of content thieves, plagiarists are the second. What is a plagiarist? Someone who steals part of your content and reuses it without giving you credit.
If you follow my CatsWhoCode account on Twitter, you probably know that a boy with a WordPress-related blog has plagiarized my work several times. I’m not going to give names or details here because I don’t want to give publicity to someone who doesn’t deserve it, but believe me, it’s annoying.
Unlike with content scrapers, there isn’t much you can do to fight plagiarists. Happily, bloggers hate plagiarism, and simply tweeting about the theft can be enough to stop the plagiarist from doing it again.
Unfortunately, some people are more stupid than others: despite having their plagiarism exposed, they’ll continue to deny the theft and, worse, they’ll do it again. In that case, it’s time to put the law to work for you.
The USA passed a law in 1998 called the DMCA (Digital Millennium Copyright Act). I don’t know much about this law because I’m not American, but it allows you to send a notice to the thief and to his partners, such as his web host or advertising networks like BSA or AdSense.
I haven’t sent any DMCA notices yet, but from what I’ve heard, 99% of content thieves will stop plagiarizing you after you send them one.
Here is a standard DMCA notice of copyright infringement that you can send to the thief’s hosting company and/or advertising partners such as Google AdSense.
Subject: Notice of Copyright Infringement
The copyrighted work at issue is the text that appears on www.yoursite.com/page1.html. The URLs where our copyrighted material is located include www.theft.com/index.html and www.theft.com/about.html.
You can reach me at you@example.com for further information or clarification. My phone number is +1-000-383-8764 and my mailing address is John Doe, 45, East 49th Street, New York 10001 NY.
The email address of the website owner, who has reprinted our content illegally, is syed@theft.com.
I have a good faith belief that use of the copyrighted materials described above as allegedly infringing is not authorized by the copyright owner, its agent, or the law.
I swear, under penalty of perjury, that the information in the notification is accurate and that I am the copyright owner or am authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.
John Doe
March 10, 2010
Good luck fighting content thieves. Any more tips or advice? Leave me a comment below!



38 Comments
Thanks so much for these tips. I’m happy to try everything I can. Sometimes I’m flattered that my stuff is worth stealing; other times it’s just plain annoying, especially when I find my ‘religious based’ information broken up simply to accommodate porn words or some back door medications.
Keep up the great work.
Great article!
Didn’t know that a DMCA works like this. I guess it’s because most hosting providers are US companies. I posted similar reports to Google AdSense (and search) before but didn’t notice a change. Last weekend I asked some people from Google how Google will index those duplicates. They told me the best way would be a link to the original. I don’t think that the title tweak will work; a custom link in the text body should work much better.
@Olaf Lederer: The title tweak works in the vast majority of cases. Though a link in the post body is a good idea as well.
Excellent article!! You have given some very good tips
Another way to protect your work is by copyrighting it at Myows. You can protect your work there under a Creative Commons license or any other license. In case of a rip-off or copyright infringement, you can simply open a case against the thief. The administration at Myows will then take care of the rest.
Never thought about it
thanks for the Tips Jean
Hej there,
to my mind the biggest problem is to find those stolen articles/blog-entries. You have to really focus on searching for them so personally I don’t think that it is worth the work. But for sure, especially the “big” blogs or magazines have to care about this and it won’t hurt me to implement those features in my theme in case that there will be a “thief” one day
Nice work!
Hey cat,
very useful tips, thanks for enlightening us.
I love the part with the image inserted into the blog that has stolen some content. That’s exactly my German humor.
Wonderful! You sent that-blog-we-know-about a hit right into the… Well you get it.
Now on the post itself: content scrapers use RSS feeds to copy the content, so changing anything in single.php wouldn’t affect them. You’ve got to add a filter to the_title_rss, and wrapping it into reformat to a link. I think.
Oh. WordPress ate something: that was “wrapping it into <![CDATA[ ]]>”
The link tip can also be used in the body and is especially effective if you put in references to your own blog or past articles. I agree with Adit, Myows is a great service; you can also try copyscape.com. About the DMCA, Google says you need to report to them by mail or fax (not e-mail) – http://www.google.com/dmca.html
@Tim,
you’re right it’s hard to find them and all time you spend is actually too much.
You can use Google Alerts if you use some unique key phrases in your content. This way you get the copies reported, and also other websites about the subject (maybe you’d like to comment on other blogs this way). I’m really thinking about doing that…
@Jean-Baptiste Jung,
you’re right, it works for 100% copies but not for a WP auto blog; they use the title for their own internal links/titles
@Handrus Nogueira: Interesting, I didn’t know that. Thanks!
@Olaf: Ok. Thanks for the info!
Great — i have my weapon to fight content thieves now. Thanks to you.
Thanks so much for these tips.
Thankfully I have not been a victim yet. Damn Thieves!
I’ve been suing content thieves and plagiarisers. I have a friendly solicitor who adds his fee into the recovered amount. Most times it gets settled without going to court, which in the end would just increase the amount the defendant has to pay out.
It can actually be quite lucrative and means I can now afford to spend some time locating these content thieves.
One tip is to hit these people where it hurts, send a DMCA to their advertisers. Showing ads on copied/stolen content is against most providers rules and they will usually lose their account. If they can’t earn money they won’t keep on scraping.
Also, it helps to insert deeplinks into your article and mention your site in the third person where you can. If someone copies it, then it becomes blatantly obvious to readers and is quickly brought to your attention.
BEWARE using cheap online article writing services, many of these are little more than plagiarisers and copycats and you will be the one that gets into trouble for posting the content on your website.
@Andy, how do you sue people from countries like Russia, India or China?
Wow! These tips rock!
Thanks for sharing. =)
I was actually looking for something like this because my site (although it is just a launching-soon page for now) was copied and republished elsewhere. Thanks for the tip!
Don’t you mean “content scrapers“?
@Jason: You’re right, I just updated the post.
Great article
thanks for tips
It’s important to note that there are many AutoBlogged users who have legit sites that aggregate content on a particular subject. For example, some companies use it to monitor blog mentions of their name (which is how I ran across this article).
As far as AutoBlogged goes, by default it only takes a small excerpt, gives credit, and links back to the original article. This is actually a valuable link for the blogger because it often comes from a site rich in keywords on the subject. Search engines love that and that just drives more traffic to your articles.
Having said that, yes there are people who just blatantly steal content and don’t give credit and that’s not cool.
AutoBlogged has a feature that lets the site owner block any articles with certain keywords or from certain domains. Often, you can send a friendly email asking the site admin to exclude your site using AutoBlogged’s URL blacklist feature. It’s always better to politely ask to be excluded rather than firing off legal threats.
Even better, if the autobloggers are playing fair, only using excerpts, and giving you credit, you could just enjoy the improved search rank and additional traffic you get from them!
@Autoblogged: Thanks for the input. Of course, I know that AutoBlogged wasn’t created to let users scrape other people’s sites, even if some people are using it to steal content.
It’s not your fault, however, and anyway, this post is here to help bloggers fight plagiarism.
Scripts like AutoBlogged make it possible for people with NO knowledge about websites to create copies of other blogs. In my opinion, those companies are responsible because they provide the tools for a content thief.
A company (which I will not name due to their proximity to my own web firm) has been plagiarizing content for quite some time now and calling it “content writing”. In a friendly meeting we have tried to explain how “content writing” is NOT taking bits and pieces of text from other websites and slapping it into another!!!
I absolutely hate content scrapers. I’ve tried a few tricks here and there to limit how much they can scrape (even tried a plugin designed to stop it once) but they still manage to do it.
On the topic of hotlinking, just be careful because many times we can drive traffic to our sites because of things like Google Images and stuff. If the hotlinking is disallowed, we would miss out on that.
Also, many times other bloggers link to your site and want to include your banner.
Nothing’s set in stone. It’s whatever makes you comfortable.
I use the Anti Feed Scraper that puts a link in the footer; as for hotlinking, I can do that easily from cPanel. I am only starting to learn about editing PHP, and these little tricks are much better than having to download a plugin to do this for you. Thanks for the info.
ahahahaha @ the hotlinking image… brilliant.. good man
Joost De Valk developed a plugin called RSS Footer. It allows you to customize your rss footer with a custom backlink/anchor text. At least that way, there’s a slight benefit if your content is being scraped
Yoast has so many nice plugins
thanks for sharing this info, in bbpress it’s possible to have your own RSS template. I guess this is possible in WP as well.
Jean-Baptiste Jung,
Regarding the hotlinking code in the .htaccess file: I have nohotlink.jpg in an “images” folder in the root directory. Copied your code exactly and only changed my blog URL. When I test this on another site to see if it works, no image shows up.
Any idea on why nohotlink.jpg isn’t showing up?
@Olaf if you have a name and an address then they can be sued wherever they are (China is a bit of a problem – but I’ve blocked China via a server hack).
My solicitor will find a firm of solicitors local to the offender and an initial letter of intent will come via them. (Solicitors around the world all speak English in the main!)
Most will pay up your request for damages on this initial letter; if they don’t, most countries award costs and allow you to add your legal fees, so in the end they will just end up paying a lot more.
Interesting post. I write posts about Gmail tips. And sometimes I do use tips from other blogs, but I give the original blog credit. I know I have to ask them for permission first.
For plagiarism, you can also run a whois lookup on the website that plagiarizes your stuff and try to find information about its host. Then send the hosting company a DMCA notice, so either it will shut down the website or it will send a notice to the website owner. You can google some DMCA sample letter forms for different cases.
We had scores of our pages plagiarized a year or so ago. I found out because they stole and pasted everything, including my name! I have a Google Alert set up for my name so I was notified when my name got indexed. Sadly, the website owner who was a nice man, hired a dishonest website developer. The developer copied my content and charged the owner for the content he plagiarized. The owner took the content down as soon as we confronted him, but I’ve always wondered what happened to the developer.
Hi Jean! Well, I tried to follow the tweaks you posted at Smashing Magazine, but the tweak for Scrapers/Content Thieves didn’t work well. When I try to use it on a website with WP3.0, it just blocks all images inside the page from loading, so basically everything is cleaned. Do you know if this is a bug in WP3.0? Maybe I did something wrong?
Thanks for the great tips!
Hi Nicolas,
Weird, the tip should work well. Did you replace “mywebsite.com” on line 3 with your site URL?
Hi Jean! Thanks for the fast reply. Yep! I added my website this way:
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(.+\.)thismywebsite.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]
Maybe I did something wrong in the code?
Thanks again for your help