Apache : une liste de bots à bloquer

Apache : une liste de bots à bloquerPar le plus grand des hasards, je suis tombé sur la blacklist de bots du site Hackrepair.com, réalisé à partir du User Agent. Vous pouvez l’ajouter à votre fichier .htaccess situé à la racine de votre site WordPress, si vous êtes en hébergement mutualisé. Je vous recommande de la mettre dans un des fichiers de configuration de votre serveur Apache si vous êtes en hébergement dédié. Sous CentOS, j’ai créé pour ma part le fichier /etc/httpd/conf.d/05-hackrepair.conf dans lequel j’ai ajouté cette liste  dont l’objet est de provoquer une erreur 403 pour tous ces agents :

RewriteCond %{HTTP_USER_AGENT} ^$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Acunetix [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^binlar [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bolt\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot\@yahoo\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BOT\ for\ JCE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^casper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^checkprivacy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^clshttp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^cmsworldmap [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^comodo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Default\ Browser\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^diavol [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^dotbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^extract [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^feedfinder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FHscan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^flicky [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^g00g1e [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^grab [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^harvest [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^jakarta [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Java [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^kanagawa [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^kmccrew [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^libwww [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Maxthon$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^microsoft\.url [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^miner [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*Indy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*NEWT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^nutch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PeoplePal [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^planetwork [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^purebot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pycurl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Rippers\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SeaMonkey$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^skygrid [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Toata\ dragostea\ mea\ pentru\ diavola [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^turnit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^vikspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WPScan [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WWW-Mechanize [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Yandex [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^zmeu [NC,OR]
RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CazoodleBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} discobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ecxi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GT::WWW [NC,OR]
RewriteCond %{HTTP_USER_AGENT} heritrix [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTP::Lite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ia_archiver [NC,OR]
RewriteCond %{HTTP_USER_AGENT} id-search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} id-search\.org [NC,OR]
RewriteCond %{HTTP_USER_AGENT} IDBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} IRLbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ISC\ Systems\ iRc\ Search\ 2\.1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LinksManager.com_bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} linkwalker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lwp-trivial [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MFC_Tear_Sample [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Microsoft\ URL\ Control [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua\ Locator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} panscient.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PECL::HTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PHPCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PleaseCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SBIder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Snoopy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Steeler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} URI::Fetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} urllib [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webalta [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCollage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wells\ Search\ II [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WEP\ Search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zermelo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ZyBorg [NC]
# Abuse HTTP Referrer Blocking end
RewriteRule ^.* - [F,L]

Apache  / .htaccess Apache Blacklist Blacklists Bot Bots Formateur Apache Formateur Sécurité informatique Formateur WordPress Sécurité informatique User Agent Wordpress 

Commentaires

Très bonne initiative ce billet avec cette liste de bot ;) Par contre, attention si on a déjà un .htaccess en place, il ne faut pas qu’il soit trop long car cela pourrait ralentir drastiquement le site.

@Sébastien

Faut pas exagérer : ça ne va ralentir à ce point !!! Tout est relatif, tout de même.

J’ai rajouté cette liste dans mon .htaccess sur un serveur mutualisé.
Et mon site devient inaccessible :
Error 403 – Forbidden
L’accès au fichier requiert une autorisation.

J’ai mis cette liste dans mon .htaccess en dessous de cette liste :

RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ – [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

Une idée pourquoi ça devient inaccessible ?

@Jérôme

T’as essayé avec différents navigateurs ?

Laisser un commentaire

(requis)

(requis)