For years now we have run spamassassin to manage the small amount of spam that gets through the various layers of defenses on our email system at home, where email for three domains converges for Helen, her business and me.  And the terriers have email accounts there too.  Postfix does an excellent job with various checks on inbound email, and the art lies in ensuring that spam or dodgy email is detected as early as possible in the SMTP process, and with the least processing. The mailgraph package for Postfix shows how effective this is: Postfix blocks well over 90% of the unwanted email that would otherwise have to be dealt with by an anti-spam system, in our case spamassassin.  Spamassassin works well, but it can be resource hungry, and I have had to draw up various scripts over the years to keep its accuracy up to scratch.  To give an idea of the memory required: on our server, which runs everything needed for an office, spamassassin accounts for a quarter of the used memory (not total memory, but still a big chunk).

mailgraph.cgi.png

The last year's spam. No viruses? Not checked, as we run Linux desktops. Postfix defenses working well, and the rest mopped up with spamassassin

But recently I have been thinking about the capabilities of the astonishing little Raspberry Pi to run these services.  It would save us quite a lot of power, and in general, our server does not need a lot of processing power.  The latest Pi 2, with its quad cores, would be fine.  I have also been wondering about the convenience of having your own Pi at a hosting company.  I take advantage of Edis hosting's generous offer of free hosting, though we send a donation too, but these days a number of hosting companies will sell you a Pi at a reasonable price, install it in their racks, and host it with generous bandwidth allowances for very little money, around €36 a year.  My experience with the first generation Pi at Edis has shown how useful having full control of your own server can be.  In addition, our community has a number of small projects going on for which various internet services would be helpful, but which may not have the budget or the inclination to run their own systems. A shared Pi may be a good solution for such community projects.

For an example of how convenient a hosted Pi can be, we went away at the beginning of the year, and as we left, I unplugged the wifi access point, forgetting that I had temporarily plugged the server into it.  I discovered we had no access when 250 miles away, and as we were planning to be away for a week, lost email was going to be inevitable. Then I remembered my Pi.  It did not take long to configure it as a fallback MX server, and within a short space of time, I could see email flowing into it.  When we got home, and plugged the main server back in, it was with relief and gratitude that I saw all the email backed up on the Pi flowing in, with nothing lost. 
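The fallback worked because sending servers try a domain's MX hosts in order of preference and move on to the next one when the first does not answer. Here is a minimal Python sketch of that selection logic. The host names and preference values are made up for illustration, and `pick_mx` is a hypothetical helper, not part of any mail software.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical MX records as (preference, host); lower preference is tried first.
MX_RECORDS: List[Tuple[int, str]] = [
    (10, "mail.example.org"),  # main server at home (made-up name)
    (20, "pi.example.net"),    # hosted Pi acting as fallback (made-up name)
]


def pick_mx(
    records: List[Tuple[int, str]],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable host in preference order, or None."""
    for _preference, host in sorted(records):
        if is_reachable(host):
            return host
    # Nothing answered: a real sender would queue the message and retry later.
    return None


# Simulate the home server being unplugged: only the Pi answers.
def home_unplugged(host: str) -> bool:
    return host != "mail.example.org"


print(pick_mx(MX_RECORDS, home_unplugged))  # pi.example.net
```

Once the main server answers again, the fallback host relays what it has queued to the lower-preference-number host, which is the flow of backed-up email described above.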

We also run the Prosody jabber XMPP server on the Pi, allowing safe and secure instant messaging for us and all our devices.  Prosody was the easiest to set up, and is reasonable on the Pi's meagre resources.

So what can be done with a Pi 2, with four times the memory, four times as many processor cores and at least 6 times the performance?  As always, data security is the biggest issue, with having to rely on microSD cards and USB sticks, but at least one hosting company offers a gig of backup space as part of their service, more than enough for comfort.  As long as you don't need to guarantee five nines of uptime, a hosted Pi might offer flexibility and freedoms unavailable with virtual hosting offerings, at a fraction of the cost of a conventional hosted system.

But running spamassassin on a Pi would consume a chunk of its resources, so I started taking a look at dspam, which seems to meet the requirement for a lighter footprint.  There are other approaches and other free software packages that deal with spam, but I would not want to abandon the layered approach that the unix way offers, with separate programs each running one service and working together.  And as the levels of spam recently were at their lowest for some time, the risks of running for a while without anti-spam were low.  In the past, I have swapped out live email systems without users being aware, so how hard could this be?

Very hard, as it turns out. DSpam has had a varied history, but is fully free software. It has not seen much development for some time, but it has an excellent reputation for dealing with spam well, and an equally strong reputation for being difficult to set up and get stable.  It stores its information in a hash database, or sqlite, or mysql or postgresql. It hooks into postfix (in this case) in a variety of ways.  And dovecot, my IMAP server of choice, has a plugin that should work well with it, making training a lot easier.  DSpam adds a signature into the body or, more sanely, the header, to identify the spamminess of a message.

The trouble started when the first aliased emails came in, and I tested dovecot's training option.  DSpam was unable to find the signature, as it was looking under the aliased name, while the dovecot plugin assumes it is the primary user's signature that needs to be found.  Further problems of a similar nature arise when you support multiple domains.  Most of the documentation on the net assumes you use virtual users, but that's overkill on a small system, and anyway, does not seem to solve the underlying design problem.  I think I have now found a solution to the problem through the use of shared groups, but there is no doubt that the dspam project really could use some attention, especially to documentation.  This would be a mammoth task, though, as there are so many ways that it can be implemented.
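To make the mismatch concrete, here is a toy Python model of it. This is not how dspam stores signatures internally, and it is not the shared groups fix mentioned above. It only illustrates why a signature recorded under one name cannot be found under another, and how mapping every address to one owner before storing would avoid that. The alias table, the addresses and the `SignatureStore` class are all invented for the example.

```python
from typing import Dict, Set

# Hypothetical alias table: delivery address -> primary local user.
ALIASES: Dict[str, str] = {
    "helen@example.org": "helen",
    "info@example.com": "helen",
}


def primary_user(address: str) -> str:
    """Map an address to its primary user; unknown addresses map to themselves."""
    addr = address.lower()
    return ALIASES.get(addr, addr)


class SignatureStore:
    """Toy stand-in for a per-user signature database."""

    def __init__(self) -> None:
        self._by_owner: Dict[str, Set[str]] = {}

    def record(self, owner: str, signature: str) -> None:
        self._by_owner.setdefault(owner, set()).add(signature)

    def has(self, owner: str, signature: str) -> bool:
        return signature in self._by_owner.get(owner, set())


# The problem: the signature is filed under the aliased address,
# but retraining looks it up under the primary user.
naive = SignatureStore()
naive.record("info@example.com", "sig-001")
print(naive.has("helen", "sig-001"))  # False

# One way out: resolve the alias before filing the signature.
normalised = SignatureStore()
normalised.record(primary_user("info@example.com"), "sig-001")
print(normalised.has("helen", "sig-001"))  # True
```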

For example, the easiest option is to have postfix treat dspam as a content filter.  This works really well, but it means even outgoing messages are scanned by dspam, an unnecessary and indeed pointless exercise, as spam is an inbound problem.  You can route around this, but the workarounds mainly rely on obscurity, which is never a good choice. But you can also persuade postfix to hand email to a filter as part of its checks on inbound email, which is where the various layers of defences sit. If you make the last such layer a filter which sends the email to dspam, only inbound email is scanned and the problem is solved. In other installations, setting dspam up as a content filter, a native mechanism in postfix, may well be preferable, but defining all use cases in documentation would be a nightmare.
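The idea of putting the filter at the end of the inbound checks can be pictured with a small Python sketch. It is a toy model of an ordered list of checks where the first definite verdict wins, not Postfix's actual evaluation engine or configuration syntax. The client addresses, domains and check names are all assumptions made up for the example.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
# A check returns "PERMIT", "REJECT" or "FILTER", or None for "no opinion".
Check = Callable[[Message], Optional[str]]

TRUSTED_CLIENTS = {"127.0.0.1", "192.168.1.10"}  # hypothetical local senders
LOCAL_DOMAINS = {"example.org", "example.com"}   # hypothetical hosted domains


def permit_trusted_clients(msg: Message) -> Optional[str]:
    # Mail from our own machines is accepted here and never reaches the filter.
    return "PERMIT" if msg["client"] in TRUSTED_CLIENTS else None


def reject_foreign_recipients(msg: Message) -> Optional[str]:
    # Outsiders may only send to domains we host.
    domain = msg["rcpt"].rsplit("@", 1)[-1].lower()
    return None if domain in LOCAL_DOMAINS else "REJECT"


def filter_through_dspam(msg: Message) -> Optional[str]:
    # Last layer: whatever survived the earlier checks goes to the spam filter.
    return "FILTER"


INBOUND_CHECKS: List[Check] = [
    permit_trusted_clients,
    reject_foreign_recipients,
    filter_through_dspam,
]


def evaluate(msg: Message, checks: List[Check]) -> str:
    """Run checks in order and return the first definite verdict."""
    for check in checks:
        verdict = check(msg)
        if verdict is not None:
            return verdict
    return "PERMIT"


outgoing = {"client": "192.168.1.10", "rcpt": "friend@elsewhere.test"}
incoming = {"client": "203.0.113.5", "rcpt": "helen@example.org"}
print(evaluate(outgoing, INBOUND_CHECKS))  # PERMIT
print(evaluate(incoming, INBOUND_CHECKS))  # FILTER
```

Because outgoing mail is accepted by an earlier layer, it never reaches the filtering step, which is the whole point of ordering the layers this way.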

Anyway, we are now in the early stages of training the new dspam installation. It uses less than a third of the memory of spamassassin but, while it has as much flexibility as spamassassin, and probably more, it is blunter in how it handles spam. In particular, spamassassin tags spam with a subject line change, but also gives each message a score.  You can get it to deliver spam, usually straight into the user's Junk folder, but you can also get it to simply discard spam with a score so high that there is just no doubt that it is spam, so the user need never see it.  In practice, this nicety may not be necessary with dspam, but it would be nice to have a similar facility.
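The score-based handling described here boils down to two thresholds. Below is a small Python sketch of that decision. The junk threshold of 5.0 matches spamassassin's default required score, but the discard threshold of 15.0 and the `disposition` function itself are assumptions for illustration, not settings taken from our server.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver to inbox"
    JUNK = "deliver to Junk folder"
    DISCARD = "discard silently"


def disposition(score: float, junk_at: float = 5.0, discard_at: float = 15.0) -> Action:
    """Decide what to do with a message based on its spam score.

    junk_at and discard_at are illustrative thresholds; tune them to taste.
    """
    if score >= discard_at:
        return Action.DISCARD  # no doubt it is spam, the user never sees it
    if score >= junk_at:
        return Action.JUNK     # probably spam, but let the user check
    return Action.DELIVER


for score in (1.2, 7.5, 23.0):
    print(score, disposition(score).value)
```

A plain spam or not-spam verdict only gives you the middle cut, which is why the second threshold is the part that would be nice to have.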

And there is no doubt that in a setup such as running a Pi for private cloud-type services (i.e., just running your own services) dspam is an interesting and valid option, and may be preferable to spamassassin.  I'm looking forward to trying it further.