Anti-Spam Test: Introduction

I have been thinking recently about porting Drupal’s ‘Antibot’ module to Backdrop CMS. On the five Backdrop sites I’ve built to date I use the ‘reCAPTCHA’ module, but I’d like to get away from needing a separate (Google) account to avoid spam, and from requiring users to take an extra step to submit a form (ticking a checkbox to prove they’re not a robot). Antibot looks good in this case, as it doesn’t require any third-party accounts and is completely hidden from the user. But before I start porting Antibot to Backdrop, I’d like to know how effective it is in relation to other anti-spam modules (e.g. Honeypot). I’ve therefore decided to test various anti-spam Drupal modules to see how they perform, and will document the process here.

The Plan

The plan is to set up different forms in Drupal, each with a different spam protection method in place. I’ll leave the site running for a month, then check how much spam has accumulated via each of the forms.

This requires:

  • Setting up a new Drupal site
  • Setting up different email addresses to receive the (spam) submissions
  • Setting up different forms in Drupal
  • Selecting and configuring the anti-spam modules
  • Analysing the results

Drupal Setup

I started off by setting up a new Drupal website. I’m using Drupal 7 as it’s the last Drupal version I used before moving to Backdrop, and because Drupal 7 modules are easier to port to Backdrop. The site setup is very minimal, with only the following core modules installed:

  • Database logging (so I can see any errors/issues)
  • Update manager (so I get notified when a module needs updating; I want this site to stay up-to-date while it’s in use, to simulate a real website)

I also installed the following contrib modules, for my own benefit/ease-of-use:

  • Administration menu (w/ Administration menu Toolbar style) (for easy access to admin areas of the site)
  • Module Filter (to make the module list easier to use/manage)
  • Devel (mainly because I like to dpm() things when I’m writing code; you can view my custom code below)
  • PackWeb (a custom module where I add my custom code (see below))

There are no nodes, content types, users (apart from me) or enabled permissions (my forms have 'access callback' => TRUE so anyone can access them). The rest of the site/settings are more-or-less the defaults, with some things tweaked for my benefit (e.g. timezone, date formats, etc.).

Email Addresses

To test each anti-spam method, I wanted to set up different email addresses - one address for each method (e.g. antibot@example.com). That way I could simply look in each address’s inbox and count the spam, and so compare how effective each method was at minimising spam.

I started searching for a way to register multiple email accounts. There are a few options out there, but:

  • I needed lots of storage space (in case I got lots of spam over the course of a month)
  • I didn’t want to register for a general account with a company just to get an email address (e.g. Gmail, Yahoo, etc.)
  • I didn’t want to pay anything
  • The email addresses needed to work for at least a month (no disposable addresses that only last a day or so)
  • I didn’t want any spam filters built into the account already (that’d be pointless for my purposes)

I couldn’t find anything that suited my needs. Even setting up aliases at my own personal address wouldn’t really work, as there were spam filters enabled that’d get in the way, and I didn’t want to have to keep a month’s worth of spam in my email client.

Then it hit me! I didn’t need any email addresses at all. All I needed to do was log all the submissions for the contact forms in Drupal or on the server. Form submissions for Drupal’s Contact form are already logged to the database, but the database log isn’t permanent storage (I needed a month’s worth of logs), so I decided to write some custom code that logged all submissions to a file on the server. See below for my solution to that.

Contact forms

Each anti-spam method I wanted to compare needed to protect its own form without other methods getting in the way. I therefore needed a different form for each method.

The first thought I had was to use the User account contact forms built into core’s Contact module. I thought I could create a user account with a method-specific email address, then assign that anti-spam method to that user’s contact form. However, I eventually realised that every user’s contact form has the same form ID (contact_personal_form), so there was no way to assign different anti-spam methods to them.

I then tried the Contact Forms module, but even though it produces forms with unique URLs, they all still share the same form ID.

I finally realised that I’d have to create my own forms programmatically (other options, like using the Webform module, seemed like too much work to set up individual forms). I researched Drupal’s API docs and found a way to create multiple forms, each with its own unique form ID, based on a single form constructor function (the same way that Drupal’s core Node module creates node/add forms for different content types).

I wrote some code in my custom PackWeb module that does just this. I first made an array of all the different anti-spam modules/methods I wanted to test (see below), then looped over that to make a form for each one on its own page. The forms are all based on the one base form, which has the following fields:

  • Name
  • Email
  • Subject
  • Message

The form code is mostly copied from core’s Contact module. I included a notice at the top of the form warning humans that this form doesn’t actually send emails, so don’t use it (hoping to minimise the non-spam emails I’d have to wade through later). My code also includes a home page of links to each form (so search engines can find them all, and so I could test them more easily).
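To make the "one constructor, many form IDs" idea concrete, here is a minimal, language-neutral sketch in Python. It is not my actual module code (that is Drupal 7 PHP, where this kind of mapping is normally done through hook_forms()); the method list, the base form ID and the page paths are all hypothetical names chosen for illustration.

```python
# Illustrative sketch only: the method names, BASE_FORM_ID and the
# "contact/<method>" paths are assumptions, not taken from the real module.
METHODS = ["control", "antibot", "honeypot", "recaptcha"]
BASE_FORM_ID = "spamtest_contact_form"
FIELDS = ["name", "email", "subject", "message"]


def form_id_for(method):
    """Derive a unique form ID per method from the shared base ID."""
    return f"{BASE_FORM_ID}_{method}"


def build_form(method):
    """Single constructor shared by every form; only the ID and path differ."""
    return {
        "form_id": form_id_for(method),
        "path": f"contact/{method}",
        "notice": "This form does not send email. Please do not use it.",
        "fields": list(FIELDS),
    }


def build_forms(methods):
    """Loop over the methods and build one form per method, keyed by form ID."""
    return {form_id_for(m): build_form(m) for m in methods}


def home_page_links(forms):
    """Paths for a home page that links to every form."""
    return [form["path"] for form in forms.values()]


if __name__ == "__main__":
    forms = build_forms(METHODS)
    for form_id, form in forms.items():
        print(form_id, "->", form["path"])
```

Because every form has its own ID, an anti-spam module that is configured per form ID can be attached to exactly one of them, which is the whole point of the exercise.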

As I was writing this blog post, it occurred to me that I also need a ‘control’ form that has no anti-spam protection. This will help to more accurately analyse the results later (e.g. if the log files don’t show any spam after a month, then it’s because spambots couldn’t find/access my site, not necessarily that the anti-spam modules were all doing their jobs perfectly). So I’ve also added an unprotected ‘control’ form to the list.
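The reasoning behind the control form can be sketched in a few lines of Python. This is only an illustration of how I might read the numbers later, under the assumption (mine, and not guaranteed) that every form receives roughly the same amount of spambot traffic; the function name and the "blocked fraction" measure are hypothetical.

```python
def interpret(counts, control="control"):
    """Interpret per-method spam counts relative to the unprotected form.

    Returns None when the control form received nothing, because the test
    is then inconclusive: the spambots never arrived. Otherwise returns,
    for each protected method, the estimated fraction of spam blocked
    compared with the control form.
    """
    control_count = counts.get(control, 0)
    if control_count == 0:
        return None  # no spam on the unprotected form: nothing to compare

    blocked = {}
    for method, count in counts.items():
        if method == control:
            continue
        # Clamp at 0 in case a protected form happens to get more spam
        # than the control form.
        blocked[method] = max(0.0, 1 - count / control_count)
    return blocked


if __name__ == "__main__":
    print(interpret({"control": 0, "antibot": 0}))
    print(interpret({"control": 4, "antibot": 1, "honeypot": 4}))
```

An empty log for a protected form only means something if the control form’s log is not empty as well.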

The real fun happens in the form’s submit handler. I collect all submission data into an array, then write that to a method-specific CSV file on the server. No email is actually sent. If the anti-spam modules do their job, the submit handler will never fire and no spam will be written to the log files. If the modules fail to catch spam submissions, they’ll be written to the log files and I’ll count them up later to compare each method (I’ll probably manually go through each method’s log file to remove any ‘legitimate’ submissions first).
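As a rough illustration of that logging approach (again in Python rather than the Drupal PHP I actually used), the sketch below appends each submission to a per-method CSV file and counts the rows afterwards. The column set, including the timestamp and IP address, and the function names are assumptions made for the example.

```python
import csv
import time
from pathlib import Path

# Assumed columns; the real handler may record different data.
COLUMNS = ["timestamp", "ip", "name", "email", "subject", "message"]


def log_submission(log_dir, method, values, ip, now=None):
    """Append one submission to <log_dir>/<method>.csv and return the path."""
    path = Path(log_dir) / f"{method}.csv"
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    row = {"timestamp": int(now if now is not None else time.time()), "ip": ip}
    for key in ("name", "email", "subject", "message"):
        row[key] = values.get(key, "")
    # newline="" lets the csv module handle line breaks inside messages.
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return path


def count_submissions(log_dir):
    """Return {method: number of logged submissions} for every CSV file."""
    counts = {}
    for path in sorted(Path(log_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            counts[path.stem] = sum(1 for _ in csv.DictReader(handle))
    return counts
```

Reading the files back with a CSV parser, rather than counting lines, matters because spam messages often contain commas and line breaks.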

You can view all my custom code here: https://gist.github.com/BWPanda/649088c170608bf7c02f86692924144b

Anti-Spam Modules

Now to choose which anti-spam modules to actually test. I went to Drupal’s Modules page and ran a search for ‘spam’ against all 7.x modules, ordered by ‘most installed’. I then went through the first two pages of results and decided to test the following modules:

There were some modules I couldn’t/wouldn’t test. They were:

  • Mollom: It reached EOL on 2nd April 2018 and no longer works
  • Invisimail: It protects/obfuscates email addresses, not forms
  • SpamSpan filter: It protects/obfuscates email addresses, not forms
  • Spambot: It only protects the user registration form
  • CAPTCHA Pack: It is unsupported, with the last release being in 2011
  • http:BL: It blocks spambots site-wide based on their IP address; not possible to test on my setup without affecting other forms/methods
  • AntiSpam: It uses Akismet which requires you to have a WordPress account; I don’t have/want one
  • Google Captcha: It has been merged with reCAPTCHA (which is one of the modules I am testing above)
  • Bad Behavior: It blocks spambots site-wide based on their HTTP requests; not possible to test on my setup without affecting other forms/methods
  • Graceful Email Obfuscation Filter: It protects/obfuscates email addresses, not forms

So those are the 11 anti-spam modules I decided to test. Some modules have a few methods built in, so in total I’m testing 13 anti-spam methods. Thankfully each of them allows you to specify the form ID that you’d like to protect, which made it easy to add individual methods to individual forms.

I tried to stick with the default settings for each module, but some needed tweaking, so for those I’ve tried to use sensible settings that I’d employ on a real site. I may provide a list of each module’s settings in the analysis to come.

Analysis

This will have to wait a month or so, with the site live and accepting submissions via its contact forms, to collect enough data to analyse later. I’ll be posting a follow-up blog post in a month to share all the juicy results with you.

The site goes live today at http://spamtest.panda.id.au. Feel free to check it out, but please don’t submit any of the forms as I don’t want to skew the results, or have to remove too many non-spam submissions later. I’ll make sure Google’s robots find and index the site, so hopefully spambots will too ;-)

Contact me if you’d like me to add any other modules to the test, or if you have any suggestions to make the test better/more accurate.

Update (31st July 2018)

So it’s three months later and I’ve not had a single email submission from the website at all. Sad. I’ll keep the site running in the hopes that it’ll attract some spam eventually, but for now the experiment is on hold.