OpenBSD Journal

Anti-Spambot configuration changes

Contributed by Dengue on from the readme dept.

Well, dear readers, after growing frustration with spambots, I have implemented some anti-spam controls. If you are using an automated process to grab summary.rss, or any other site content, and that process does not pass an HTTP User-Agent string, or you are using an anonymizer to strip out the User-Agent, you might find access denied. I will be blocking certain user agents known to be spambots, and all access that does not present a User-Agent string. Of course, this gives free rein to the creativity of anonymous proxy users to create their own User-Agent string.

Right now, I'm only blocking two specific user agents, and will be monitoring logs for problems. I plan to tighten the controls gradually until I block most of the spammers. If you are being denied access (that's a 403 error, btw), drop me a line, and we'll try to work it out. I do not plan on blocking user agents such as lwp-trivial and wget, so if you are using common tools to grab headlines and such, you won't have any problems. Of course, if you rolled your own tool, and have it passing User-Agent: ViciousSpamBag/1.0, you're on your own.
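
The policy described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual configuration: the function name and the blocklist contents are assumptions. The post does not name the two blocked agents, so the list below uses EmailSiphon (mentioned later in the comments) and the joke ViciousSpamBag string as stand-ins.

```python
# Hypothetical sketch of the blocking policy; not the site's real setup.

# Assumed blocklist. The post does not say which two agents are blocked,
# so these entries are placeholders taken from elsewhere on the page.
BLOCKED_AGENT_SUBSTRINGS = ("emailsiphon", "viciousspambag")


def status_for_user_agent(user_agent):
    """Return the HTTP status this policy would give a request."""
    # A missing or blank User-Agent header is denied outright.
    if user_agent is None or not user_agent.strip():
        return 403
    lowered = user_agent.lower()
    # Deny any agent whose string contains a blocked name.
    if any(name in lowered for name in BLOCKED_AGENT_SUBSTRINGS):
        return 403
    # Everything else, including wget and lwp-trivial, is served normally.
    return 200
```

A request from wget or lwp-trivial gets 200 here, while a request with no User-Agent header at all gets 403, which matches the behaviour described in the post.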

jim

(Comments are closed)


Comments
  1. By xm () xm@while1.org on http://while1.org/~xm/

    How useful is this really? The spammer will just work around this crap. Perhaps you should add some invisible spambait links that ban hosts that click on them for 2 hours or something (or feed them crap for X time).

    Comments
    1. By webmaster () on

      To a certain extent, yes, they *might*. On the other hand, spamming is all about low-hanging fruit, low effort and high yield. Sure, they can get around it, but why expend the effort when I'm just one of a huge number of sites? Your assertion assumes a spammer concerned with specifically targeting deadly.org. Granted, I likely couldn't, or wouldn't, defend the site against that. But what I can do is stop the 30 or so bots that crawl this site weekly, harvesting email addresses and ignoring robots.txt. Sure, they can change the User-Agent string, and get away with that for a while. But when I notice that a user agent failed to honor robots.txt, I'll just add it to the list. The point isn't necessarily to block all spammers (that would be hard to do), but to block the majority of them (which is relatively easy to do). As for the 'click=ban' idea, I like it, but it's the kind of thing that can bite you pretty quickly by opening yourself up to DoS attacks.
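
For readers curious what the 'click=ban' idea from this thread might look like, here is a minimal Python sketch of a ban list with a two-hour expiry. The class name, the bait path, and the in-memory dictionary are all assumptions made for illustration; nothing like this is described as running on the site.

```python
import time

BAN_SECONDS = 2 * 60 * 60  # the two-hour ban suggested in the comment
BAIT_PATH = "/spambait.html"  # hypothetical hidden link, for illustration only


class BaitBanList:
    """Track hosts that requested the bait link and when each ban ends."""

    def __init__(self, ban_seconds=BAN_SECONDS):
        self.ban_seconds = ban_seconds
        self._expires = {}  # host -> timestamp at which the ban ends

    def record_request(self, host, path, now=None):
        """Ban the host if it fetched the bait path; return True if banned."""
        now = time.time() if now is None else now
        if path == BAIT_PATH:
            self._expires[host] = now + self.ban_seconds
        return self.is_banned(host, now)

    def is_banned(self, host, now=None):
        """Return True while the host's ban has not yet expired."""
        now = time.time() if now is None else now
        expiry = self._expires.get(host)
        if expiry is None:
            return False
        if now >= expiry:
            # The ban has lapsed; drop the entry so the table stays small.
            del self._expires[host]
            return False
        return True
```

The DoS concern is visible in the sketch: a ban is keyed on the host alone, so anyone who can cause a request for the bait path to arrive from a shared address, such as a proxy, would lock out every user behind it for the full ban period.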

  2. By James () e5z8652@mail.com on mailto:e5z8652@mail.com

    Hmm. When I browse from my OpenBSD box with Konqueror there is no user agent string by default. Any particular one you would like me to set for this page?

    Comments
    1. By webmaster () on

      Konqueror sends the following User-agent string:
      Mozilla/5.0 (compatible; Konqueror/2.1.2; X11).
      It can also send other User-Agent strings to websites that don't recognize Konqueror for things such as JavaScript support. You won't have any problems (I preview a lot of stuff in Konqueror, as well as Netscape and *cough* IE).

  3. By Jim Bob () jim@nsa.gov on http://www.whitehouse.gov/

    Check out http://scoop.kuro5hin.org/

    It's the engine for http://www.kuro5hin.org/

    It has some good spambot protection built in that is a bit more advanced than user-agent crap!

    Also it allows people to log in, post their own stories, and have the user community moderate them, which may promote more discussion around here!!

  4. By Anonymous Coward () on

    How is it that spam bots are using your .rss for spam? There are no email addresses in it. What's the deal, yo?

    Comments
    1. By webmaster () on

      I'm not trying to block access to summary.rss, I'm blocking access from things such as: [26/Aug/2001:02:41:26 +0000] "GET / HTTP/1.0" 200 28403 "-" "EmailSiphon". This kind of access has been going on for a long time. Now I'm blocking it. If your legitimate access is being blocked, don't use EmailSiphon to access my site. I can't think of a single reason to allow User-agents such as EmailSiphon access to OpenBSD Journal.
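
The log entry quoted above suggests how offending agents can be spotted: the User-Agent is the last quoted field on the line. The following Python sketch pulls that field out and tallies it across a log. The function names are hypothetical, and the sketch assumes the agent string contains no embedded double quotes.

```python
import re
from collections import Counter

# Assumption: the User-Agent is the final double-quoted field on the line,
# as in the log entry quoted in the comment above.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def user_agent_of(log_line):
    """Return the last quoted field of a log line, or None if there is none."""
    match = _LAST_QUOTED.search(log_line)
    return match.group(1) if match else None


def count_user_agents(log_lines):
    """Tally how many requests each User-Agent string made."""
    counts = Counter()
    for line in log_lines:
        agent = user_agent_of(line)
        if agent is not None:
            counts[agent] += 1
    return counts
```

Sorting the resulting counts with `most_common()` gives a quick view of which agents hit the site most often, which is one way to decide what belongs on a blocklist.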

Credits

Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]