Thread: .htaccess ban list

  1. #1
    Join Date
    Feb 2006
    Location
    Somewhere where I don't know where I am
    Posts
    2,155

    Default .htaccess ban list

    I got tired of stupid bots crawling my site, so I created a .htaccess ban list. If you have any to add, please do so!

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^NPBot [NC,OR]
    RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [NC,OR]
    RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [NC,OR]
    RewriteCond %{REMOTE_ADDR} ^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtractor [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailReaper [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailMagnet [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebPix [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2\.0\ \(compatible;\ NEWT\ ActiveX;\ Win32\) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCollector [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^PICgrabber [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^mister\ pix [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMiner [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSnake [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webdup [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebBandit [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^teleport [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSucker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteCopy [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ninja [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^JoBo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Extreme\ Picture\ Finder [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [NC,OR]
    RewriteCond %{REMOTE_ADDR} ^64\.140\.49\.6[6-9]$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ClariaBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Diamond [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^[a-z]+$ [NC]

    RewriteRule ^.* - [F,L]
    Last edited by areidmtm; 10-13-2006 at 08:59 AM.

  2. #2
    Join Date
    Feb 2006
    Location
    Somewhere where I don't know where I am
    Posts
    2,155

    Default

    Adds need to go between ^Zeus [NC,OR] and R^[a-z]+$ [NC]

    ADDED

    RewriteCond %{HTTP_USER_AGENT} ^Gigabot/2\.0 [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Gigabot/2\.0/gigablast\.com/spider\.html [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^SurveyBot/2\.3 [NC,OR]
    # Don't use the next line if you want Alexa to rate your site.
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [NC,OR]
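
    To illustrate the ordering, here is a minimal sketch of the tail of the list with one addition. "ExampleBot" is a made-up name used only for illustration, not a bot from the list. Each [OR] flag joins a condition to the one after it, so every added line ends in [NC,OR] and the final condition before the RewriteRule carries only [NC].

    ```apache
    RewriteCond %{HTTP_USER_AGENT} ^Zeus [NC,OR]
    # New entries go here, each ending in [NC,OR].
    # "ExampleBot" is a hypothetical placeholder name.
    RewriteCond %{HTTP_USER_AGENT} ^ExampleBot [NC,OR]
    # Final condition of the chain: [NC] only, no OR flag.
    RewriteCond %{HTTP_USER_AGENT} ^[a-z]+$ [NC]

    RewriteRule ^.* - [F,L]
    ```

    Comments have to sit on their own lines, as above. Apache does not accept a comment at the end of a directive line.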
    Last edited by areidmtm; 05-16-2006 at 09:33 AM.

  3. #3
    Join Date
    Feb 2006
    Posts
    8

    Default

    Could you please explain how this works a little more, and indicate which .htaccess files this needs to go into and specifically where within the file itself?

    Appreciated...

    Deacon

  4. #4
    Join Date
    Feb 2006
    Location
    Somewhere where I don't know where I am
    Posts
    2,155

    Default

    Quote Originally Posted by Deacon
    Could you please explain how this works a little more, and indicate which .htaccess files this needs to go into and specifically where within the file itself?
    The code above needs to go in a file called '.htaccess' in your root directory (public_html). If there isn't one there already, create one using any text editor. Make sure that you have the '.' before 'htaccess'. You can place the code anywhere within that file.

    The list blocks spam bots by matching their USER_AGENT (plus a few IP address ranges) and answers them with a 403 Forbidden error instead of the page.
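
    For anyone who wants to see the mechanism in isolation, here is a cut-down sketch of the same pattern using three user agents taken from the list above. It assumes mod_rewrite is enabled and that the host allows rewrite directives in .htaccess (AllowOverride FileInfo or All); if either is missing, expect the rules to be ignored or a 500 error.

    ```apache
    # Minimal sketch: three conditions instead of the full list.
    RewriteEngine On
    # [NC] = case-insensitive match; [OR] = join with the next condition using OR.
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR]
    # Last condition in the chain, so no OR flag.
    RewriteCond %{HTTP_USER_AGENT} ^Wget [NC]
    # "^" matches every request path, "-" leaves the URL unchanged,
    # and [F] answers with 403 Forbidden.
    RewriteRule ^ - [F]
    ```

    A quick way to check it is to request a page while sending one of the blocked agent strings (for example with curl's -A option) and confirm that the response is 403.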

  5. #5
    Join Date
    Feb 2006
    Posts
    8

    Default

    Gotcha. Thanks. One thing, though:

    "Adds need to go between ^Zeus [NC,OR] and R^[a-z]+$ [NC]"

    You indicate an R before the ^[a-z]+$ [NC]

    You don't have that in your initial list. Should it be there?


    Deacon
    Last edited by Deacon; 05-14-2006 at 06:23 PM.

  6. #6
    Join Date
    Feb 2006
    Location
    Somewhere where I don't know where I am
    Posts
    2,155

    Default

    Quote Originally Posted by Deacon
    You indicate an R before the ^[a-z]+$ [NC]

    You don't have that in your initial list. Should it be there?
    Oops, no. That's a typo...

  7. #7
    Join Date
    Feb 2006
    Location
    NZ
    Posts
    24

    Default

    I've placed this in another thread here, but I thought I'd paste here too (seems relevant)

    <a href="http://english-51469580358.spampoison.com"><img src="http://pics3.inxhost.com/images/sticker.gif" border="0" width="80" height="15"/></a>


    http://www.spampoison.com/ check it out a truly swee-e-e-e-et concept!

    and it'll get bots that your list might miss
    Cád é an Scéal?

  8. #8
    Join Date
    Feb 2006
    Posts
    598

    Default

    Uh, did you try the link? As far as I can tell, it's just a trick to have people put the link on their pages to increase that guy's reference count and boost his site ranking in search engines.

    I thought the email generating link was hidden in the background of his page but I couldn't find it.

  9. #9
    Join Date
    Feb 2006
    Location
    Somewhere where I don't know where I am
    Posts
    2,155

    Default

    Yeah, I don't see how that would help to get rid of other spam bots.

  10. #10
    Join Date
    Feb 2006
    Location
    Australia
    Posts
    24

    Default

    Spampoison, as I recall, was an OK idea when it came out (2004): you post a link to it on your blog or whatever, and the bot crawls spampoison and gets a large list of known spammers' emails.

    These days it's fairly redundant, I'm thinking, though.

    btw siguie, if you're unsure about a site's safety, check to see if it's got an entry on siteadvisor.
