
Block black hat SEO referrers from linking to your site

This post has been revised and updated to prevent unnecessary additions to your web server’s .htaccess file.
The .htaccess example is just a short sample now because real content is very site-specific and not all black hat techniques create real traffic to your web site.

You create a web site with great content and traffic starts to surge as people find links to your stuff on Google and Bing (maybe even Yandex or Baidu). This also causes a big problem that has no permanent, one-click solution: bad guys will certainly take note of your site’s popularity and start using black hat SEO (Search Engine Optimization) techniques to link to your content, attempting to get their scam, malware, and PUP (Potentially Unwanted Program) sites higher in major search engines’ result pages.

These connections to known black hat SEO and malware pages will sooner or later be noticed by major search engines and your great site could start performing worse and worse in search results. This leads to a reduction in the number of visitors and ad income (if you have ads on your site), maybe even sales.

To keep this from happening, check the list of incoming links every month, at least in Google Search Console. Reviewing the Referral Traffic report in Google Analytics on a monthly basis is also strongly recommended.

Update 2019-06-12
It is a widespread misconception that linking to high-ranking sites improves your site’s rating. It does not.
John Mueller has confirmed that Google knows and tracks bad referrals well and that you rarely need to take manual actions to disavow backlinks. Just having those backlinks in Search Console does not mean that your site will be penalized.

Where and how to check for black hat SEO referrals/backlinks

For security purposes, make sure your device is running up-to-date anti-virus/anti-malware and web filter software, because you will be visiting malware and exploit sites.
To make this even safer, use the free VirtualBox to run a free, lightweight Linux distro (for example, Puppy Linux) in Live CD mode and check links from there.

You need to check the readability of pages that have backlinks to your site: strange paragraphs of random text and content that changes with every refresh/reload (F5) are dead giveaways of a low-quality site.
To make finding backlinks easier, open the page source code (most browsers have keyboard shortcut Ctrl+U for this) and then use the search tool (Ctrl+F) to see if the link really exists.
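The manual Ctrl+U and Ctrl+F check can also be sketched in a few lines of Python for cases where the page source has already been saved or copied. This is only a minimal sketch: the sample page source, the `example.com` host, and the names `BacklinkFinder` and `find_backlinks` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class BacklinkFinder(HTMLParser):
    """Collects href values of <a> tags that point at a given host."""

    def __init__(self, target_host):
        super().__init__()
        self.target_host = target_host.lower()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        # Match the host itself and any of its subdomains.
        if host == self.target_host or host.endswith("." + self.target_host):
            self.found.append(href)


def find_backlinks(page_source, target_host):
    """Return all link targets in page_source that point at target_host."""
    finder = BacklinkFinder(target_host)
    finder.feed(page_source)
    return finder.found


# Hypothetical page source, as copied from the browser's Ctrl+U view.
sample_source = """
<html><body>
<p>Random text <a href="https://www.example.com/great-post/">click</a></p>
<p><a href="https://unrelated.test/">other</a></p>
</body></html>
"""

backlinks = find_backlinks(sample_source, "example.com")
print(backlinks)
```

An empty result for a page that Search Console lists as linking to you suggests the link is shown to search bots only.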

You will certainly find a pattern among several black hat sites: their domain names, link structure, and design are often very similar.

As a side effect, you’ll also detect content scrapers that have blatantly copied your content word for word. Make sure to include these sites in your disavow file.

Bing Webmaster Tools is not good for this kind of research as it lists only your pages and the number of links to them, and it does not group the backlinks by external domains.

Google Search Console

In Google Search Console, open your site and choose Links from the menu on the left. Then click Top Linking Sites in the Who links the most section. This lists the top 1000 sites linking to your content. Yes, you will need to go through these one by one (except for some well-known domain names, such as google.com or microsoft.com) and verify that the links are actually good. Many black hat SEO sites show the listed referrals to search spiders/bots only.

Take special note of unknown sites that have tens of thousands of backlinks to your site: these are most probably black hat SEO links.
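To decide which sites to look at first, the list can be pre-sorted before the manual review. The sketch below assumes the Top Linking Sites list has already been reduced to (site, number of links) pairs. The sample rows, the 10,000 threshold standing in for "tens of thousands", and the name `review_queue` are illustrative assumptions, not anything Search Console provides.

```python
# Well-known domains that can be skipped, as mentioned above.
KNOWN_GOOD = {"google.com", "microsoft.com"}
# Assumed cut-off for "tens of thousands of backlinks".
THRESHOLD = 10_000


def review_queue(rows, known_good=KNOWN_GOOD, threshold=THRESHOLD):
    """Return (site, links, flagged) tuples to check by hand, flagged first."""
    queue = [
        (site, links, links >= threshold)
        for site, links in rows
        if site not in known_good
    ]
    # Flagged sites first, then by descending number of links.
    queue.sort(key=lambda item: (not item[2], -item[1]))
    return queue


# Hypothetical rows copied from the Top Linking Sites list.
rows = [
    ("google.com", 1200),
    ("mass-links.test", 48_000),
    ("small-blog.test", 12),
]

queue = review_queue(rows)
for site, links, flagged in queue:
    print(site, links, "CHECK FIRST" if flagged else "")
```

The flag only sets the order of the review. Every unknown site still needs to be opened and checked.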

Google Search Console, Top linking sites

Here’s an example of such a domain being used for malvertising.

Many black hat SEO sites use malvertising

Please note that many referrals do not actually send real traffic to your site, so there is no need to add such sites to the .htaccess file on your web server. You should disavow these links as discussed later in this tutorial.

Google Analytics

In Google Analytics, navigate to Acquisition, All Traffic, Referrals to see the list of incoming traffic from other domains. This report does not include unneeded traffic, such as from Google’s own search.

You can change the date range for this report from the top right.

Unlike in Search Console, these are real visits to your site.

There are two quality indicators to verify for referral traffic: Bounce Rate shows the percentage of visitors who leave without doing a single thing on your site (not even scrolling), and Avg. Session Duration shows how long visitors stayed on your site. A 100% Bounce Rate also means 0 seconds of Session Duration. This is usually suspicious traffic, especially when the number of Sessions is higher than 1.

Google Analytics, Referrals report

If you see a large number of sessions from the same source with a session duration of 0, include this domain in your web server’s .htaccess block list. While this black hat technique is not as common as it was several years ago, it still happens sometimes.
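The rule of thumb above (more than one session, 100% Bounce Rate, 0 seconds of Session Duration) can be written as a small filter. This is a sketch under assumptions: the `ReferralRow` fields mirror the report columns but are not an official export format, and the sample rows are invented.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str
    sessions: int
    bounce_rate: float          # percent, 0-100
    avg_session_seconds: float  # Avg. Session Duration in seconds


def is_suspicious(row):
    """More than one session, all of them bounces with zero duration."""
    return (
        row.sessions > 1
        and row.bounce_rate >= 100.0
        and row.avg_session_seconds == 0
    )


# Hypothetical rows typed in from the Referrals report.
rows = [
    ReferralRow("spam-referrer.test", 240, 100.0, 0.0),
    ReferralRow("friendly-blog.test", 35, 42.5, 95.0),
    ReferralRow("one-off.test", 1, 100.0, 0.0),
]

suspects = [row.source for row in rows if is_suspicious(row)]
print(suspects)
```

A single bounced session, like the third row, is left alone because one visitor leaving immediately is normal.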

Yandex.Webmaster

In Yandex.Webmaster, open your site and go to Links, External Links from the menu on the left. Then turn on the Group by site option. You might want to sort the list by the number of Links.

Yandex.Webmaster – list of external links

While there are probably no huge numbers here, Yandex has a pretty good indicator called SQI – Site Quality Index. If a site with many backlinks has no SQI number, or the number is below 20, you might want to check that site.

Disavowing black hat links in Google Search Console

To get rid of the bad referrals, create a text file with the list as instructed on this Google Search Console help page and submit it using the Disavow links tool page. This step is a must in getting your site higher in search results again.

Please do note that your site may experience a temporary drop in the number of visitors after submitting the disavow file.

Sadly, site traffic recovery might take months. This is why you need to update the disavow file monthly.

Here’s an example of a disavow file’s contents. Do not use it for your web site, because such a list is purely site-specific.

# Black hat SEO sites that do not respond
domain:audiogames.net
domain:flattr.ir
domain:freevisit.ir
domain:lilipot.ir
domain:pilisok.ir
domain:softwareexpres.com
domain:webgardbox.ir
domain:webgardii.ir
# Black hat SEO sites connected to malvertising and do not respond
domain:bizttaglarr.ml
domain:closed-section.xyz
domain:culesofkerala.com
domain:fabreeza.com.pk
domain:giami.media
domain:gitantu.com
domain:info-focus.info
domain:jamericorpllc.com
domain:tuuliajolla.info
domain:ulaznice.info
domain:xsl.pt
# Black hat SEO sites connected to ReImage Plus / ReImage Repair scareware, and do not respond
domain:computercontractor.net
domain:corewatch.net
domain:hungariancc.org
domain:icubenetwork.com
domain:integrare.net
domain:logipam.org
domain:svcd2dvdmpg.com
domain:web-syndicate.com
domain:webmasterpaste.com
domain:videocasterapp.net
domain:xwings.net
# Content scrapers that do not respond
domain:weblebhost.com
# Dead sites
domain:mwvana.com
domain:sicsic.net
domain:windowssearch-exp.win
# Hacked sites that have black hat SEO subpages, and do not respond
domain:massagroup.co
domain:mozambiquetourismonline.com
domain:rosturplast.com
domain:sankaraa.com
# Low-quality sites that do not respond
domain:imgcop.com
domain:picswe.com
# Normal sites that contain large amounts of redirection links to malware downloads, illegal stuff (cracks, fake product keys)
# and do not have means to report these bad links
domain:lepszypoznan.pl
domain:ru.net
domain:stanito.com
domain:scoop.it
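Because the file has to be updated monthly, it can help to keep the domains grouped in one place and generate the text from that. The following is a minimal sketch that writes only the two line types used in the example above (comment lines and `domain:` lines). The group names, the `.test` domains, and the function name `build_disavow` are placeholders.

```python
def build_disavow(groups):
    """Render {comment: [domains]} as disavow-file text."""
    lines = []
    for comment, domains in groups.items():
        lines.append("# " + comment)
        # Normalize, de-duplicate, and sort so monthly changes are easy to spot.
        cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
        lines.extend("domain:" + d for d in cleaned)
    return "\n".join(lines) + "\n"


# Hypothetical research results, grouped by the reason for disavowing.
groups = {
    "Dead sites": ["dead-site.test", "Gone.test ", "dead-site.test"],
    "Content scrapers that do not respond": ["scraper.test"],
}

text = build_disavow(groups)
print(text, end="")
```

The resulting text can then be saved to a plain text file and submitted as described earlier.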

.htaccess example for blocking black hat SEO referrers

This Apache code is just an example; there is no point in using it unchanged on your site.

You need to do your own backlink research and update it monthly because referrals differ by the content of web sites.

First, you might want to create an informative page (for example, badreferrer.html) and redirect visitors with suspect referrals there. After you’ve thoroughly tested your list, you can replace the redirection rule with a deny (error 403) rule.
Please remember to add the <meta content="noindex, nofollow, noarchive" name="robots"> line to the head section of your informative page to prevent Google, Bing, and other search robots from indexing it. You might also want to add analytics code to the page if you use such services.

This example list has a rule for each blacklisted item for better readability. After testing, you might want to merge this list into one or more longer rules and remove most comments.
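Merging a tested list into one longer rule can be done by hand, but a short script avoids escaping mistakes. The sketch below joins hostnames into a single RewriteCond with a regex alternation, which replaces a chain of [OR] conditions. The `.test` hostnames and the name `merged_condition` are hypothetical, and the output should still be tested on a non-production copy of the site.

```python
import re

# Accept only plain hostnames: letters, digits, hyphens, and dots.
HOSTNAME = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def merged_condition(hosts):
    """Build one RewriteCond line that matches any of the given referrer hosts."""
    cleaned = sorted({h.strip().lower() for h in hosts})
    for host in cleaned:
        if not HOSTNAME.match(host):
            raise ValueError("not a plain hostname: " + repr(host))
    # In a plain hostname only the dot is a regex metacharacter, so escape just that.
    alternatives = "|".join(h.replace(".", r"\.") for h in cleaned)
    return "RewriteCond %{HTTP_REFERER} (" + alternatives + ")"


line = merged_condition(["spam-one.test", "forum.spam-two.test"])
print(line)
# RewriteCond %{HTTP_REFERER} (forum\.spam-two\.test|spam-one\.test)
```

Very long alternations are hard to read, so splitting the list into a few merged rules by category may be the better trade-off.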

Create a backup of your .htaccess file before making any modifications!

## START of referrer block list

# Obvious black hat SEO pages at bandcamp.com
RewriteCond %{REQUEST_URI} !badreferrer\.html
RewriteCond %{HTTP_REFERER} cherschartugutors\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} croslawnconpagelt\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} fadineedeadkonn\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} faichickphepinnie\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} flexevderhelpti\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} glaversilercart\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} gluccalbehelmdo\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} imtenreamili\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} lumsevalchala\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} maljustletotip\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} nesspostfinddomzu\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} netrephilzugsca\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} newssuradicgist\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} omigmitipur\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} othpasensyre\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} paiheemslylankto\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} plasaluxgrapid\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} raytrocalunin\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} repenkeynape\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} riasnooggangketpe\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} sanytininla\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} serhelpnoprapa\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} signchartzicocu\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} stucentorsoiflip\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} tarnamarmicon\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} thosubzootumi\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} usocalharnipp\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} wigvichararo\.bandcamp\.com [OR]
RewriteCond %{HTTP_REFERER} zoivesoutodow\.bandcamp\.com
RewriteRule .* /badreferrer.html? [R,L]

# Obvious (hacked?) black hat SEO pages and forums
# that promise cracks and other illegal stuff
RewriteCond %{REQUEST_URI} !badreferrer\.html
RewriteCond %{HTTP_REFERER} animation-paradise\.clicforum\.com [OR]
RewriteCond %{HTTP_REFERER} applepie\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} asgard\.xooit\.org [OR]
RewriteCond %{HTTP_REFERER} autofixinfo\.com [OR]
RewriteCond %{HTTP_REFERER} band-of-brothers\.ze-forum\.com [OR]
RewriteCond %{HTTP_REFERER} bcc\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} chevaliersdedeilhen\.ebboard\.com [OR]
RewriteCond %{HTTP_REFERER} clouddownloading79\.fo\.ru [OR]
RewriteCond %{HTTP_REFERER} construccionesmejiamarin\.com [OR]
RewriteCond %{HTTP_REFERER} darkfire\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} dhs\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} dornenvoegel\.bplaced\.net [OR]
RewriteCond %{HTTP_REFERER} efinitho25\.soup\.io [OR]
RewriteCond %{HTTP_REFERER} eg\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} emf\.xooit\.com [OR]
RewriteCond %{HTTP_REFERER} fenwaybarkonline\.com [OR]
RewriteCond %{HTTP_REFERER} forum\.udruga-lisac-ravno\.com [OR]
RewriteCond %{HTTP_REFERER} gamerz-of-generation\.leforum\.eu [OR]
RewriteCond %{HTTP_REFERER} genocide-animal\.topbboard\.com [OR]
RewriteCond %{HTTP_REFERER} gsli\.bplaced\.net [OR]
RewriteCond %{HTTP_REFERER} guerres-mondiales\.xooit\.com [OR]
RewriteCond %{HTTP_REFERER} icell-uae\.com [OR]
RewriteCond %{HTTP_REFERER} jaycalimis27\.soup\.io [OR]
RewriteCond %{HTTP_REFERER} km\.doh\.go\.th [OR]
RewriteCond %{HTTP_REFERER} kursyjezykoweonline\.pl [OR]
RewriteCond %{HTTP_REFERER} lafontaine2012-2013\.lolforum\.com [OR]
RewriteCond %{HTTP_REFERER} lalliancedusang\.soforums\.com [OR]
RewriteCond %{HTTP_REFERER} leschevaliersdelatyrie\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} les\.chars\.de\.guerre\.xoo\.it [OR]
RewriteCond %{HTTP_REFERER} les-faucon-noir\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} lettres-de-vie\.xooit\.com [OR]
RewriteCond %{HTTP_REFERER} logitramites\.com\.co [OR]
RewriteCond %{HTTP_REFERER} m4rs-team\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} mancraftfr\.vraiforum\.com [OR]
RewriteCond %{HTTP_REFERER} missulino\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} muscleandfitness\.hu [OR]
RewriteCond %{HTTP_REFERER} nicopura\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} opiumteam\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} ordredasmodae\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} paragonvanguard\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} persona-france\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} priveachgeso\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} prollonzete\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} pyoufowarpai\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} racismonline\.com [OR]
RewriteCond %{HTTP_REFERER} resistancefrancaisebf3\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} s4fairytail\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} sebastianfur\.it [OR]
RewriteCond %{HTTP_REFERER} sevenfamily\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} sinkorswim\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} skyblockforums\.com [OR]
RewriteCond %{HTTP_REFERER} snoredcare\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} talar\.mahanteymouri\.ir [OR]
RewriteCond %{HTTP_REFERER} team\.fury-nocturne\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} team13legion\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} teamxsak\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} theextremesniper\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} thesims\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} thomasetginie\.xooit\.fr [OR]
RewriteCond %{HTTP_REFERER} tropadconwatch\.guildwork\.com [OR]
RewriteCond %{HTTP_REFERER} tsg\.mypclifeguard\.com [OR]
RewriteCond %{HTTP_REFERER} watchknitting\.org [OR]
RewriteCond %{HTTP_REFERER} wellnesskalkar\.de [OR]
RewriteCond %{HTTP_REFERER} voe-victis\.xoo\.it [OR]
RewriteCond %{HTTP_REFERER} vyraxzys\.gq [OR]
RewriteCond %{HTTP_REFERER} yaquinabayyachtclub\.org [OR]
RewriteCond %{HTTP_REFERER} zenallstar\.xooit\.com
RewriteRule .* - [R=403,L]

# END of referrer block list

Black hat SEO rules should be added to the <IfModule mod_rewrite.c> section of your server’s .htaccess file.

Keep scanning access logs to spot problems with the block list, and do remember that backlink cleanup is a continuous process.
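Scanning the logs can be partly automated. The sketch below counts response status codes per referrer host, assuming the server writes Apache's "combined" log format. The sample log lines, hostnames, and the name `referrer_statuses` are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumes the "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def referrer_statuses(lines):
    """Count (referrer host, status code) pairs in access log lines."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines in another format
        host = urlparse(match.group("referer")).hostname
        if host:  # "-" means no referrer was sent
            counts[(host, match.group("status"))] += 1
    return counts


# Hypothetical log lines.
sample = [
    '203.0.113.5 - - [12/Jun/2019:10:00:00 +0000] "GET /post/ HTTP/1.1" 403 199 "http://spam-one.test/x" "Mozilla/5.0"',
    '203.0.113.6 - - [12/Jun/2019:10:00:05 +0000] "GET /post/ HTTP/1.1" 200 5120 "https://friendly-blog.test/links" "Mozilla/5.0"',
    '203.0.113.7 - - [12/Jun/2019:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

counts = referrer_statuses(sample)
for (host, status), n in sorted(counts.items()):
    print(host, status, n)
```

A blocked host that still shows status 200 points to a rule that does not match, and a legitimate host that shows 403 points to a rule that matches too much.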