The TeleBlog’s overzealous anti-spam Dobermans are at it again. We’ve just recovered much-appreciated comments on paying writers, e-book prices and other topics from Bill Waldron (the shareware pay issue), Bob Russell of MobileRead (long, thoughtful essay on book prices), Dan Carey (self-publishing vs. the traditional kind) and Blaine Higgy (general-purpose computers vs. dedicated e-book devices). Click on their names to see the comments.

Always write us if your comments don’t appear within a day. Normally they’ll show up instantly after you’ve established a track record as a commenter. We’re getting thousands and thousands of comment spams, and sometimes the good stuff gets lost in the dreck. We’re at the mercy of Akismet, our anti-spam service, but usually we can rescue lost comments. If you’d like, just to be sure, send along copies when you write us.

11 COMMENTS

  1. Just out of curiosity, who tends the barbarians at the gate? I’d estimate that they’ve flushed one out of every six or eight of my posts, especially the ones that contain too much pricing information and/or too many links to commercial sites (which I provide solely for the convenience of other readers who might want to chase them.)

    Who “owns” the TeleBlog filter? Can the TeleBlog webmaster configure it, or is some service provider providing a helpful (sic) service over which the webmaster has no control? The better filters are pretty flexible and can be configured to prevent them from relegating too much stuff to the bit bucket.

  2. First, if your comment does not appear, feel free to contact me at idiotprogrammer at fastmailbox.net. Sometimes Akismet is “unsure,” so it leaves the comment in the moderation queue, and David and I usually approve those quickly. The problem is the number of comments which are thrown into the Akismet queue. 99.9% of them deserve to be there. The problem is how to find the false positives. That’s why we ask commenters to bug us if something doesn’t show up quickly.

    At the risk of boring 90% of the people here, WordPress uses a distributed spam detection method called Akismet, run by the WordPress guys. By default we set comments by new commenters to automatically go into moderation. That’s a good default to use. The rationale behind Akismet is that blogs can report spam to Akismet as well as false positives. But the Turing test is becoming harder (and I doubt CAPTCHA will make that much of a difference). Long term I expect whitelisting + OpenID to improve things, but for now we’re stuck with an imperfect solution.

    The problem is that spam commenters are mimicking legitimate posts. So Todd, here’s a kind of comment of which we receive hundreds if not thousands of identical copies:

    Daniel | k.daniel@msn.com | mymortgagedeals.net | IP: 64.22.110.2
    I couldn’t understand some parts of this article : Infoseek founder Todd Jonz speaks out on copyright, DRM lockups and more | TeleRead: Bring the E-Books Home, but I guess I just need to check some more resources regarding this, because it sounds interesting.

    Also, in the last week or so, we’ve started to receive comments like this one:

    Daniel | k.daniel@msn.com | mymortgagedeals.net | IP: 64.22.110.2
    May 14th, 2008 at 5:44 pm

    Just out of curiosity, who tends the barbarians at the gate? I’d estimate that they’ve flushed one out of every six or eight of my posts, especially the ones that contain too much pricing information and/or too many links to commercial sites (which I provide solely for the convenience of other readers who might want to chase them.)

    Who “owns” the TeleBlog filter? Can the TeleBlog webmaster configure it, or is some service provider providing a helpful (sic) service over which the webmaster has no control? The better filters are pretty flexible and can be configured to prevent them from relegating too much stuff to the bit bucket.

    How can we weed out a comment like that? It looks legitimate; the only difference is that it cloned a legitimate comment while changing the URL. Often we can catch this, but TeleRead has 7600+ posts and 12000 legitimate comments. How can we keep track of comments which duplicate another one?

    WordPress/Akismet had a feature where you can search for keywords. So normally we could search for terms like ebook/kindle/publish to identify false positives, but now that spammers are cloning legitimate comments, even that is failing.

    One other thing. Apparently “ebooks” is a very popular spam word on the spam blogs, so I think that is another reason for our susceptibility.

    I know David appreciates my willingness to go through the spam to weed out false positives. It’s time-consuming and extremely tedious. Both of us hate it. Also, I should add that in the majority of cases comments are approved without problem. But when a comment is marked incorrectly as spam, chances are it happens over and over to the same person, causing further aggravation. I generally try to contact commenters whose comments were incorrectly marked as spam to let them know about the mistake.

    Akismet works fine for my medium-sized personal blog, but the magnitude of this blog just creates 7000+ targets for spam commenters to attack.

  3. Robert,

    Thanks for the geekly background info. I know that heuristic filtering systems can be a pain to maintain, and I agree that in the long term OpenID may provide a better solution.

    In the interim, does the bag of techno-tricks you use to hold this place together include any other sort of registration-and-cookie facilities? They seem to be pretty effective at keeping the spam out on other well-trafficked sites I frequent. Registration could be optional, and unregistered users’ posts would still be fed to the dobies.

    FWIW, even assuming such facilities are available, I completely understand (and sympathize!) if you have your reasons for not wanting to support them. It was just a thought.

  4. I would definitely consider requiring registration for all commenters first. I fear that it may not solve the problem though (and I would want to try it out on a lesser blog first).

    I’m assuming that better solutions are evolving. It’s just that I have a hundred more pressing matters to attend to. For example, do you know that roaches are living in my laptop?

  5. I know that a number of my comments have been flagged as spam. I’m not sure why they were caught, but I speculate it was because they contained an embedded URL link (using <a>). When I can, I try to avoid including links, but when I do include one and the comment is flagged, I log in and “unspam” the comment.

  6. Robert writes:

    I have a hundred more pressing matters to attend to. For example, do you know that roaches are living in my laptop?

    I wonder how they’d get on with the mice that took up residency in my (former) “sat3llite TV rec3iver?” (There’s another phrase that apparently won’t make it past the Dobies without a h4x0r disguise!)

  7. A substantial fraction of my comments are marked as spam by Akismet. I remove each of the misidentified comments from the spam filter in the manner Jon Noring mentions above. The registration and use of the “Garson O’Toole” name did not satisfy Akismet. The spam filter still unhappily confuses me with the promoters of herbal Viagra and ringtones.

    Often the blocked comments contain multiple links but not always. My most recent comment about fake author portraits contains no links and was spiked by the spam filter. Anyone or anything that is assigned the Sisyphean task of distinguishing the spam from the non-spam has my sympathy. It is becoming trickier than a Turing test.

  8. I’m surprised that Akismet cannot be tweaked to reliably approve comments based on the submitter’s name and/or email address and/or IP address. Has any of you contacted the Akismet people with this problem?
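
The cloned-comment problem Robert describes in comment 2 (a spam comment that copies a legitimate comment's text and swaps in a different URL) can be sketched as a fingerprint lookup. What follows is a minimal Python illustration only, not anything TeleRead, WordPress, or Akismet is known to run. The function names, the normalization rules, the dictionary layout, and the sample URLs are all assumptions made up for the example.

```python
import hashlib
import re


def fingerprint(body: str) -> str:
    """Hash the comment text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(approved):
    """Map each fingerprint to the URL of the first approved comment with that text."""
    index = {}
    for comment in approved:
        index.setdefault(fingerprint(comment["body"]), comment["url"])
    return index


def is_clone(candidate, index) -> bool:
    """True when the text matches an approved comment but the URL differs."""
    original_url = index.get(fingerprint(candidate["body"]))
    return original_url is not None and original_url != candidate["url"]


# Hypothetical data: one approved comment and a copy with a swapped URL.
approved = [
    {
        "body": "Just out of curiosity, who tends the barbarians at the gate?",
        "url": "https://example.org/todd",
    }
]
index = build_index(approved)

clone = {
    "body": "Just  out of curiosity, who tends the\nbarbarians at the gate?",
    "url": "https://spam.example.net",
}
print(is_clone(clone, index))
```

A dictionary keyed by hash keeps each lookup cheap, and an archive on the order of the 12000 legitimate comments mentioned above fits comfortably in memory. The sketch only catches text that is identical after normalization; a clone that also rewords the comment would slip past it.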
