Confessions of a spammer

Just recently there has been a lot of talk about spam, more specifically comment spam, on this site. This came about because I asked my readers for advice on curbing the high volumes of comment spam I have been receiving of late. I received a lot of advice and have taken various steps which seem to be having a positive effect on the amount of spam received.

Just what did I do to combat the problem?

I have always had the WordPress plugin Akismet installed. When I took over the blog there were not many spam comments and Akismet caught them all. Just lately the volume increased to around 50 to 60 per day, but Akismet was still happily catching them all. Akismet definitely works well, so detection was not the problem. The problem was that I now had all these comments to check manually every day, as I was concerned that some REAL comments could be amongst those marked as spam.

I first installed another plugin called Bad Behavior alongside Akismet. This plugin stopped 90% of the spam comments before they even got to Akismet, which blocked the few that managed to get through. I noticed that all the comments stopped by Akismet (not many) were from bots and not humans. My main problem was now solved, however, as I had a manageable number of comments to check in Akismet.

I then installed another plugin called Challenge to assist the other two. All these plugins seem to work quite well together. Challenge adds a simple maths question to the comment box, which has to be answered before the comment is allowed through. The idea here was of course to get rid of the comments still coming in via bots. Everything worked as planned and for the last few hours I have received absolutely no spam. I can only hope this lasts (touch wood).
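
For the curious, the idea behind a maths-question check can be sketched in a few lines of Python. This is only my own illustration and not how the Challenge plugin is actually written; the function names `make_challenge` and `check_answer` are made up for the example, and a real plugin would also have to remember the expected answer between showing the form and receiving the comment.

```python
import random


def make_challenge(rng=random):
    """Build a small addition question and return (question_text, expected_answer)."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(submitted, expected):
    """Return True only if the submitted text is exactly the expected number."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Anything that is not a whole number counts as a failed challenge.
        return False


if __name__ == "__main__":
    question, expected = make_challenge()
    print(question)
    print("Correct answer accepted:", check_answer(str(expected), expected))
```

With single-digit numbers the answer never needs more than two characters, which keeps the inconvenience to commenters small.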

I know this now means my commenters have to deal with this type of ‘captcha’ but this seems to be one of those things we all have to learn to live with.

All of this has meant that I have been doing quite a lot of thinking about spam over the last few days. While I was researching all this I found something called Project Honeypot. I had heard about these guys before but I did not really know what they do, so I investigated further.

Project Honeypot is an attempt to fight back against the spammers out there. If you join them (completely free) you can assist them in a couple of ways:

  • You can install a honeypot on your server.
  • You can install some quicklinks on your site.
  • You can donate an MX record.

Most of this could sound like Greek to you (as it did to me) but all the details are available on Project Honeypot’s site.

Project Honeypot basically gives the little guy the chance to help fight the scourge of spamming.

I installed one of these honeypots and when I returned to their site to check if I had done everything correctly I found a page on which you can check any IP address to find out if it belongs to a spammer or not. I ran my own IP through their checker and guess what? Project Honeypot has the following to say about my site:

… detected behavior from the IP address consistent with that of a spam harvester and comment spammer.

Now that the truth is out I can at least greet my friends before I hand myself over to the police. All jokes aside, this is rather disconcerting. Project Honeypot does also say:

Please note: being listed on these pages does not necessarily mean an IP address, domain name, or any other information is owned by a spammer. For example, it may have been hijacked from its true owner and used by a spammer.

Does anyone know what I can do about this? Sometime in the past this IP address has obviously been hijacked and used for all sorts of spammy things. How can I detect where this has happened? This all might have nothing to do with the site as such; it might have originated from the email sent or received on this IP. What can I do to prevent something like this from happening again?

If I am to be considered a spammer I just hope someone is going to give me some of the money these fools make.

Till next time.

Update - 25 September

Over the last 24 hours these were the spam stats on this blog:

  • Bad Behavior caught 313 attempts.
  • Akismet caught nothing.

Judging by the figures from the previous day, it seems that Bad Behavior is still catching most of the muck, although a few attempts by bots could be getting through. The ‘captcha’ plugin seems to be working as well: it is stopping those bots, and the result is that there is nothing left for Akismet to worry about. I will give the situation a few days and then remove the ‘captcha’ just to check what happens.

Update - 27 September

You will notice that the ‘captchas’ have been removed from the comments. So far so good, the spam comments are still under control without the commenters having any inconvenience imposed upon them.

This entry was posted on Wednesday, September 24th, 2008 at 6:16 pm and is filed under Plugins, WordPress. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

30 Responses to “Confessions of a spammer”

  1. Thomas says:

    Hello Lyndi,

    without any “honeypot” I discovered some time ago that my website’s URL was considered a source of spam. I discovered this because e-mails I had sent from an address at that website’s domain bounced: it was blacklisted with major e-mail services. Explanation: I was (and am) on a shared server like almost everybody is (a dedicated server is much more expensive) and somebody else on that server was a spammer. Same server, same IP address. Gone with him (albeit unknowingly), hung with him.

    My webhosting supplier then moved my website to another server and that fixed the problem. Maybe this is an option for you, too.

    Good luck
    Thomas

  2. Rarst says:

    Reducing spam is nice, but also consider by what percentage you have cut your readers’ comfort with the math question.

    I strongly believe that solving the spam problem at the reader’s expense (questions, captchas, etc.) is wrong. Spam is the blogger’s problem; we shouldn’t say “Let’s just make readers work on it” and dump the problem on them.

    The same goes for paranoid plugins. On blogs with Akismet I can often predict that my comment is going to be slayed even before I submit it.

    • Lyndi says:

      I have to do something. I thought hard and long about the whole ‘captcha’ issue. This one at least has at most two characters that have to be typed and it is not one of those difficult to read things.

      It does seem that you do not enjoy Akismet. This blog has had it installed from day one. Since I have taken over, it has never stopped a legitimate comment but I do suppose it could happen.

      If this Honeypot idea works it could mean that some of the other measures could be dropped. Only time will tell.

      • Rarst says:

        Problems with Akismet depend on the topic of the blog. You can chat about WordPress all you want without any problems. But try to comment on a post about putting together a PC on a tight budget and it is going to kill half the comments for money-related content.

        PS I had already received “failed” on the math question once and I am very confident I had added 1 to 4 correctly.

        PPS lately I am thinking out a post on spam vs spam filters… It is going to be controversial, which is good for my comment count. :)

        • Lyndi says:

          I now see what you mean about Akismet. I have never thought about it from this angle. Interesting, to say the least. The mind boggles as to what could happen if you write about Joomla.

          You are better at maths than me, I still struggle with 1 + 4 :-) Hopefully I will soon be able to remove the captcha.

  3. Jim Sefton says:

    hehe, an interesting debate once again… I must say the maths question is much better than the “how many letters with cats are there”. Sometimes these things make me dizzy!

    There is a possibility some subnets have been blacklisted, i.e. if your neighbour (in relation to IP) is a spammer you can get blacklisted too. Sometimes whole ISPs get blacklisted, depending on who they are.

    I am fortunate enough to have my own server and bank of IPs so I can do my best to defend against it, but anti-spam measures are only as good as the people who design and run them.

    Here we go… 2+1 is 3… cross your fingers… here we go….!

  4. Margaret says:

    I know that you inherited this site from Sailor and that he had some issues at one time with people who hijacked it. He got that stopped, but perhaps it’s that history that is haunting you. The change by your ISP to a different server may solve the issue.

    I have not needed and do not want to add any kind of captcha to my sites. It is often difficult for people to read and comprehend — my mother just spent an hour trying to sign into a website she wanted to subscribe to that had a captcha, but because her vision is no longer the best, she never could get the letters right and gave up in frustration.

  5. Squeaky says:

    Lyndi,

    I would check to see if your site’s IP address is shared or dedicated, first. If it is shared, I would talk to the hosting provider and see if you can get a dedicated IP address. If you can’t obtain one, then I would change hosting providers and find one that will give you a dedicated IP.

    That is where I would start, first.

    • Lyndi says:

      I have asked to have the domain moved from one server to another within the same hosting company. Unfortunately I cannot afford dedicated hosting right now.

      • Squeaky says:

        Lyndi,

        The lowest dedicated server package I have found is at http://www.colopronto.com/, starting out with a $24.95 per month package. I don’t know if that is in your budget, but if you add AdSense at some point in the future, the money you make from that per month will pay for the dedicated server.
        Madmouse is a PR3 and it does very well with AdSense, so I don’t see why your site wouldn’t do the same.

  6. BioTecK says:

    Well.. I have Peter’s Custom Anti-Spam installed and I’m very happy with it. (http://www.theblog.ca/anti-spam)
    On my blog I have the “Comment author must have a previously approved comment” option checked, and if I get a spam comment I simply add the IP of the spammer to the Comment Blacklist under Settings –> Discussion.
    This might help you too.. give it a try! ;)

  7. Nihar says:

    Lyndi,

    I don’t agree with some other readers. Math question is simple to answer and is a must.

    I am also using it.

    Regarding Honeypot, I will also try checking my IP.

    • Lyndi says:

      I hope you are luckier than I was when checking your blog. I am looking into a dedicated IP for my domain but it is a bit too expensive.

  8. Jim Sefton says:

    Hi Lyndi,

    First of all be VERY careful with hosting. That $24 hosting is too cheap. I pay 4 times that and I do think you get what you pay for. For starters, that price is if you provide your own server and ship it to them!

    If you need hosting advice please do email me, I have been down this road many times and know the business inside out. There really are no short-cuts.

    It is highly unlikely your site has a dedicated IP, it is simply not big enough to justify it. That is not necessarily a bad thing though, most of my sites share an IP.

    Out of interest, take the IP you put into Honeypot and try some of its neighbours, i.e. if the IP was x.x.x.34 then try 33 and 35. I would think there is a chance they are blocked also.

    If you want me to look into this for you just let me know, but beware of taking advice about a host from someone who is not actually hosted with that company, you really do get what you pay for ;)

    • Squeaky says:

      Jim,

      Thanks for pointing that out. I must have been a little overtired and I totally missed that it is for colocated servers. That wouldn’t help Lyndi out at all. Sorry about that.

      • Jim Sefton says:

        No worries at all, my main point is that hosting is such an important thing, especially if you have multiple accounts, that recommendation from someone who uses a service is almost essential, as there are so many deals out there that are “too good to be true”.

        I use Gigenet in the U.S. They are not the cheapest out there, but the reliability is fantastic, the speed is really good, and the support is second to none (ok, maybe second to Rackspace!)

    • Lyndi says:

      Thanks Jim, it sounds as if you really know this stuff. I have not changed anything on the hosting side yet. After what you have just said here, I will not do anything with regards hosting without getting back to you first. Thanks a lot for the advice.

  9. Hicham says:

    I recommend Akismet + Captcha, yet I have another tactic that might add a third layer of protection.

    Apache web servers, which most PHP hosting runs on, rely on a file called ‘.htaccess’, which holds “rules” that tell the server how to deal with visitors. For example, self-hosted WordPress writes rules to this file to handle ‘Permalinks’.

    The idea is that you can place rules within this file to prevent bots from attacking your website from the very beginning, before they reach it. So in a nutshell, search for ‘.htaccess’ on Google and see how you can use it.

    p.s. sorry for the long comment!

    • Lyndi says:

      Thanks for the tip, I will definitely investigate. This is one site where you can have your say, the length of the comment is not a factor.

  10. Jim Sefton says:

    That’s not a bad idea actually, but I would say that is a good way of deterring a particular bot, if that’s what you are getting trouble from.

    Best bet is to take a look at the Apache logs if you have access to them and see if you get recurring instances of a bot in there. If so then it’s not too difficult to add the DENY line to .htaccess to prevent the bot next time it visits.
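
As a rough sketch of that log check (not something Jim posted), the snippet below counts repeat visitors in access-log lines and prints a deny line for each. It assumes the log is in Apache's combined format and that the server accepts Apache 2.2-style `Deny from` rules; the function names, the `min_hits` threshold and the sample lines are invented for the illustration.

```python
import re
from collections import Counter

# Combined Log Format: the client IP comes first and the user agent
# is the last quoted field on the line.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .* "(?P<agent>[^"]*)"\s*$')


def recurring_clients(lines, min_hits=3):
    """Return (ip, user_agent, hits) for every client seen at least min_hits times."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group("ip"), match.group("agent"))] += 1
    return [(ip, agent, hits)
            for (ip, agent), hits in counts.most_common()
            if hits >= min_hits]


def deny_lines(ips):
    """Format one Apache 2.2-style deny rule per IP address."""
    return [f"Deny from {ip}" for ip in ips]


if __name__ == "__main__":
    # Hypothetical sample lines; in practice you would read the real access log.
    sample = [
        '203.0.113.5 - - [24/Sep/2008:18:16:00 +0200] '
        '"POST /wp-comments-post.php HTTP/1.1" 403 512 "-" "BadBot/1.0"',
    ] * 3
    for rule in deny_lines(ip for ip, _agent, _hits in recurring_clients(sample)):
        print(rule)
```

If the hits come from addresses all over the place, as often happens with comment spam, a list like this gets long quickly and is mostly useful for spotting the few persistent offenders.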

    • Lyndi says:

      I checked the mentioned logs. This muck is definitely coming from bots but it is coming from all over the show. At present all attempts are blocked before I even get to see them, this suits me.

  11. nukeit says:

    You might want to implement a bad bot ban script.
    Basically, you place a link to a PHP script that writes a deny entry in your .htaccess. Put the URL to the script in your robots.txt:

    User-agent: *
    Disallow: /bansciptdir/

    Place a hidden link to it in your header or footer. When a bad bot scrapes your page without reading the robots.txt, it will automatically add its IP to your .htaccess preventing further scraping. These bots are typically used to find and maximize the effectiveness of spamming blogs… among other things ;)
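
The trap logic described above can be sketched as follows. The comment talks about a PHP script; this Python version is only an illustration of the same idea, with a made-up function name `handle_request`, and it assumes Apache 2.2-style `Deny from` rules. The directory name is copied from the robots.txt lines above.

```python
import ipaddress
from pathlib import Path

# Same directory as in the Disallow line; well-behaved bots never request it.
TRAP_PREFIX = "/bansciptdir/"


def handle_request(path, client_ip, htaccess_path):
    """Append a deny rule for client_ip if it requested the trap URL.

    Returns True when a new rule was written, False otherwise.
    """
    if not path.startswith(TRAP_PREFIX):
        return False
    # Validate the address so nothing unexpected is ever written to the file.
    ip = ipaddress.ip_address(client_ip)  # raises ValueError on bad input
    rule = f"Deny from {ip}"
    htaccess = Path(htaccess_path)
    existing = htaccess.read_text().splitlines() if htaccess.exists() else []
    if rule in existing:
        return False  # already banned
    with htaccess.open("a") as handle:
        handle.write(rule + "\n")
    return True
```

One thing to watch with this approach: anyone who follows the hidden link, including the site owner or a legitimate crawler that ignores robots.txt, gets banned as well.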

    • Lyndi says:

      Thanks for the detailed explanation. It seems to work a bit too well, I am getting a 403 error when trying to go to your site. Please let me know when you have the problem sorted out.

  12. Turnip says:

    I don’t know if this was mentioned, but Bad Behavior now contains Project Honeypot support built right in; you just need an API key.

    • Lyndi says:

      Thanks Turnip, this is great news. I do have a Project Honeypot API key so maybe I should give this a try.

    • Lyndi says:

      I have just set up Bad Behavior with the Project Honeypot API and it simply blocked me from seeing my own site. There is definitely something wrong here - I will have to investigate.